Downstream report: https://bugzilla.suse.com/show_bug.cgi?id=1215602 LKML link: https://lore.kernel.org/all/047d3c51-0a9e-4c3e-beef-625a7aa4f3c3@kernel.org/ according to logs, since 6.3 (up to 6.5.4 now), I repeatedly see: > WARNING: CPU: 14 PID: 962 at drivers/acpi/platform_profile.c:74 > platform_profile_show+0xb1/0x100 [platform_profile] > Modules linked in: ccm michael_mic ... > CPU: 14 PID: 962 Comm: power-profiles- Kdump: loaded Not tainted > 6.5.4-6-default #1 openSUSE Tumbleweed (unreleased) > dd37106c593be78644bb80e3c1534d801bf4cb36 > Hardware name: LENOVO 21CRS0K83K/21CRS0K83K, BIOS R22ET60W (1.30 ) 02/09/2023 > RIP: 0010:platform_profile_show+0xb1/0x100 [platform_profile] > Code: d0 a8 ... > RSP: 0018:ffff9c1ac0b97db0 EFLAGS: 00010296 > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000008fc35be0 > RDX: 0000000000000000 RSI: ffff9c1ac0b97db4 RDI: ffffffffc0a8b0a0 > RBP: ffff8955ca540000 R08: ffff895b9f1ed180 R09: ffff895559ea1bc0 > R10: 00000000031a400e R11: 000000000003f680 R12: ffff895b9f1ed180 > R13: ffff9c1ac0b97e50 R14: 0000000000000001 R15: ffff9c1ac0b97ee8 > FS: 00007f71b0e71900(0000) GS:ffff895b9f100000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fe402ea3400 CR3: 000000012004c000 CR4: 0000000000750ee0 > PKRU: 55555554 > Call Trace: > <TASK> > sysfs_kf_seq_show+0xab/0x100 > seq_read_iter+0x123/0x480 > vfs_read+0x1b8/0x300 It's: WARN_ON((profile < 0) || (profile >= ARRAY_SIZE(profile_names)))) So I put there one more print: dev_warn(dev, "profile=%d profile_get=%ps\n", profile, cur_profile->profile_get); and I see: : profile=-1883022368 profile_get=dytc_profile_get [thinkpad_acpi] : profile=-1510173440 profile_get=dytc_profile_get [thinkpad_acpi] : profile=-1510173440 profile_get=dytc_profile_get [thinkpad_acpi] : profile=-966231712 profile_get=dytc_profile_get [thinkpad_acpi] : profile=-1578420592 profile_get=dytc_profile_get [thinkpad_acpi] : profile=-1578420592 profile_get=dytc_profile_get [thinkpad_acpi] : profile=-1578420592 profile_get=dytc_profile_get [thinkpad_acpi] I wonder about dev passed to dytc_profile_get() having empty name (nothing before colon above)? Is that expected? Ah, convert_dytc_to_profile()'s retval is not checked in dytc_profile_refresh(). Adding: --- a/drivers/platform/x86/thinkpad_acpi.c +++ b/drivers/platform/x86/thinkpad_acpi.c @@ -10418,7 +10418,14 @@ static void dytc_profile_refresh(void) return; perfmode = (output >> DYTC_GET_MODE_BIT) & 0xF; - convert_dytc_to_profile(funcmode, perfmode, &profile); + err = convert_dytc_to_profile(funcmode, perfmode, &profile); + if (err) { + pr_warn("%s: mmc=%u psc=%u mmc_get=%u funcmode=%d output=0x%x perfmode=%d\n", + __func__, !!(dytc_capabilities & BIT(DYTC_FC_MMC)), + !!(dytc_capabilities & BIT(DYTC_FC_PSC)), + dytc_mmc_get_available, funcmode, output, perfmode); + return; + } if (profile != dytc_current_profile) { dytc_current_profile = profile; platform_profile_notify(); fixes the warning, of course. The output is: dytc_profile_refresh: mmc=0 psc=1 mmc_get=0 funcmode=0 output=0x1f001 perfmode=15 So using psc mode, it retrieves DYTC (0x1f001) as follows: 0x1.... -> VSTD=1 -> STD mode 0x.f... -> CICM=0xf, DYTC_GET_MODE_BIT(12) -> dunno what it is in STD 0x..0.. -> CICF=0x0, DYTC_GET_FUNCTION_BIT(8) -> DYTC_FUNCTION_STD 0x....1 -> GOOD But convert_dytc_to_profile() doesn't handle this at all. Do I have a newer DYTC interface? Or a broken one? Case (0x00) { Local1 = 0x0100 Local1 |= 0x80000000 Local1 |= 0x00 Local1 |= 0x01 } I.e. version 8.0, it seems. My DYTC for CMD_GET looks like: Case (0x02) { Local5 = VSTD /* \VSTD */ Local5 |= (VCQL << 0x01) Local5 |= (VSTP << 0x04) Local5 |= (VADM << 0x07) Local5 |= (VTMS << 0x09) Local5 |= (VDLS << 0x0A) Local5 |= (VMSC << 0x0C) Local5 |= (VPSC << 0x0D) Local1 = (CICF << 0x08) If ((CICF == 0x03)) { CICM = SMYH /* \SMYH */ } ElseIf ((CICF == 0x0B)) { CICM = SMMC /* \SMMC */ } ElseIf ((CICF == 0x0D)) { CICM = SPSC /* \SPSC */ } ElseIf ((CICF == 0x0F)) { CICM = SAMT /* \SAMT */ } ElseIf ((CICF == 0x07)) { CICM = SADM /* \SADM */ } Else { CICM = 0x0F } Local1 |= (CICM << 0x0C) Local1 |= (Local5 << 0x10) Local1 |= 0x01 }
Created attachment 305147 [details] DSDT
Hi, Can I confirm this is on a T14s G3 AMD platform? Thanks for the detailed report - very helpful and very much appreciated. I'll need to reproduce this but I don't have one of these platforms myself so will need to ask a colleague for help. It looks like this line is causing problems on this platform: funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF; Normally that should return PSC mode (in this case - unless you have AMT mode, which you don't on that gen platform). But it's saying it is in standard mode instead....which is peculiar. The FW is doing something non-standard to other platforms. Do you still get the warning when: - you're in performance or low-power modes? - you're in lap mode? As a quick test could you try adding: /* Check if we are PSC mode, or have AMT enabled */ funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF; +if (funcmode == DYTC_FUNCTION_STD) + funcmode = DYTC_FUNCTION_PSC; To see if it behaves correctly please? I'll have to check with the FW team and find out why this change in behaviour...I suspect I need to do a kernel patch to better handle this AMT case though. Mark
(In reply to Mark Pearson from comment #2) > Can I confirm this is on a T14s G3 AMD platform? Yes. > It looks like this line is causing problems on this platform: > funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF; > > Normally that should return PSC mode (in this case - unless you have AMT > mode, which you don't on that gen platform). > But it's saying it is in standard mode instead....which is peculiar. The FW > is doing something non-standard to other platforms. Interestingly, it happens only initially and during suspend. Other than that, it's in PSC mode likely. > Do you still get the warning when: > - you're in performance or low-power modes? I am on always in low-power mode. This notebook is crap, it's noisy and very hot in balanced or performance modes even when idling. > - you're in lap mode? What is that? > As a quick test could you try adding: > > /* Check if we are PSC mode, or have AMT enabled */ > funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF; > +if (funcmode == DYTC_FUNCTION_STD) > + funcmode = DYTC_FUNCTION_PSC; > > To see if it behaves correctly please? Will try.
(In reply to Jiri Slaby from comment #3) > > As a quick test could you try adding: > > > > /* Check if we are PSC mode, or have AMT enabled */ > > funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF; > > +if (funcmode == DYTC_FUNCTION_STD) > > + funcmode = DYTC_FUNCTION_PSC; > > > > To see if it behaves correctly please? > > Will try. It only changes the error to EINVAL. 0xf is not handled as a correct state in PSC. BTW the first (bad) log: thinkpad_acpi: dytc_profile_refresh: err=-22 mmc=0 psc=1 mmc_get=0 funcmode=13 output=0x1f001 perfmode=15 subsequent refreshes (good): thinkpad_acpi: dytc_profile_refresh: err=0 mmc=0 psc=1 mmc_get=0 funcmode=13 output=0x20013d01 perfmode=3
Ah yeah - need to add: case DYTC_MODE_MMC_BALANCE: underneath case DYTC_MODE_PSC_BALANCE: in function convert_dytc_to_profile (same as is done for the MMC block above) I think that will cure that error. Still tracking down HW here :( Mark
(In reply to Mark Pearson from comment #5) > Ah yeah - need to add: > case DYTC_MODE_MMC_BALANCE: > underneath > case DYTC_MODE_PSC_BALANCE: > in function convert_dytc_to_profile > (same as is done for the MMC block above) > > I think that will cure that error. Of course: dytc_profile_refresh: err=0 mmc=0 psc=1 mmc_get=0 funcmode=13 output=0x1f001 perfmode=15
OK - so (just to be sure) with those changes it is working correctly? Mark
(In reply to Mark Pearson from comment #7) > OK - so (just to be sure) with those changes it is working correctly? It appears so.
Gentle ping: any updates on this?
Thanks for the ping - I had completely forgotten about this one. I'll check on the status. My notes (internal ticket LO-2242) say we're waiting on the FW team for a fix, they did do one version but it didn't work. I've no idea why it's taking so long. Mark