Bug 217947 - WARNING at drivers/acpi/platform_profile.c:74 in platform_profile_show()
Summary: WARNING at drivers/acpi/platform_profile.c:74 in platform_profile_show()
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Platform_x86 (show other bugs)
Hardware: All Linux
: P3 normal
Assignee: drivers_platform_x86@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-25 09:12 UTC by Jiri Slaby
Modified: 2023-10-26 17:21 UTC (History)
1 user (show)

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
DSDT (677.43 KB, text/plain)
2023-09-25 09:12 UTC, Jiri Slaby
Details

Description Jiri Slaby 2023-09-25 09:12:13 UTC
Downstream report:
https://bugzilla.suse.com/show_bug.cgi?id=1215602

LKML link:
https://lore.kernel.org/all/047d3c51-0a9e-4c3e-beef-625a7aa4f3c3@kernel.org/

according to logs, since 6.3 (up to 6.5.4 now), I repeatedly see:
> WARNING: CPU: 14 PID: 962 at drivers/acpi/platform_profile.c:74
> platform_profile_show+0xb1/0x100 [platform_profile]
> Modules linked in: ccm michael_mic ...
> CPU: 14 PID: 962 Comm: power-profiles- Kdump: loaded Not tainted
> 6.5.4-6-default #1 openSUSE Tumbleweed (unreleased)
> dd37106c593be78644bb80e3c1534d801bf4cb36
> Hardware name: LENOVO 21CRS0K83K/21CRS0K83K, BIOS R22ET60W (1.30 ) 02/09/2023
> RIP: 0010:platform_profile_show+0xb1/0x100 [platform_profile]
> Code: d0 a8 ...
> RSP: 0018:ffff9c1ac0b97db0 EFLAGS: 00010296
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000008fc35be0
> RDX: 0000000000000000 RSI: ffff9c1ac0b97db4 RDI: ffffffffc0a8b0a0
> RBP: ffff8955ca540000 R08: ffff895b9f1ed180 R09: ffff895559ea1bc0
> R10: 00000000031a400e R11: 000000000003f680 R12: ffff895b9f1ed180
> R13: ffff9c1ac0b97e50 R14: 0000000000000001 R15: ffff9c1ac0b97ee8
> FS:  00007f71b0e71900(0000) GS:ffff895b9f100000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fe402ea3400 CR3: 000000012004c000 CR4: 0000000000750ee0
> PKRU: 55555554
> Call Trace:
>  <TASK>
>  sysfs_kf_seq_show+0xab/0x100
>  seq_read_iter+0x123/0x480
>  vfs_read+0x1b8/0x300

It's:
WARN_ON((profile < 0) || (profile >= ARRAY_SIZE(profile_names))))

So I put there one more print:
dev_warn(dev, "profile=%d profile_get=%ps\n",
         profile, cur_profile->profile_get);

and I see:
: profile=-1883022368 profile_get=dytc_profile_get [thinkpad_acpi]
: profile=-1510173440 profile_get=dytc_profile_get [thinkpad_acpi]
: profile=-1510173440 profile_get=dytc_profile_get [thinkpad_acpi]
: profile=-966231712 profile_get=dytc_profile_get [thinkpad_acpi]
: profile=-1578420592 profile_get=dytc_profile_get [thinkpad_acpi]
: profile=-1578420592 profile_get=dytc_profile_get [thinkpad_acpi]
: profile=-1578420592 profile_get=dytc_profile_get [thinkpad_acpi]

I wonder about dev passed to dytc_profile_get() having empty name (nothing before colon above)? Is that expected?

Ah, convert_dytc_to_profile()'s retval is not checked in dytc_profile_refresh(). Adding:
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -10418,7 +10418,14 @@ static void dytc_profile_refresh(void)
                return;

        perfmode = (output >> DYTC_GET_MODE_BIT) & 0xF;
-       convert_dytc_to_profile(funcmode, perfmode, &profile);
+       err = convert_dytc_to_profile(funcmode, perfmode, &profile);
+       if (err) {
+               pr_warn("%s: mmc=%u psc=%u mmc_get=%u funcmode=%d output=0x%x perfmode=%d\n",
+                          __func__, !!(dytc_capabilities & BIT(DYTC_FC_MMC)),
+                          !!(dytc_capabilities & BIT(DYTC_FC_PSC)),
+                          dytc_mmc_get_available, funcmode, output, perfmode);
+               return;
+       }
        if (profile != dytc_current_profile) {
                dytc_current_profile = profile;
                platform_profile_notify();

fixes the warning, of course.

The output is:
dytc_profile_refresh: mmc=0 psc=1 mmc_get=0 funcmode=0 output=0x1f001 perfmode=15

So using psc mode, it retrieves DYTC (0x1f001) as follows:
0x1.... -> VSTD=1 -> STD mode
0x.f... -> CICM=0xf, DYTC_GET_MODE_BIT(12) -> dunno what it is in STD
0x..0.. -> CICF=0x0, DYTC_GET_FUNCTION_BIT(8) -> DYTC_FUNCTION_STD
0x....1 -> GOOD

But convert_dytc_to_profile() doesn't handle this at all. Do I have a newer DYTC interface? Or a broken one?
                Case (0x00)
                {
                    Local1 = 0x0100
                    Local1 |= 0x80000000
                    Local1 |= 0x00
                    Local1 |= 0x01
                }

I.e. version 8.0, it seems.

My DYTC for CMD_GET looks like:
   Case (0x02)
   {
       Local5 = VSTD /* \VSTD */
       Local5 |= (VCQL << 0x01)
       Local5 |= (VSTP << 0x04)
       Local5 |= (VADM << 0x07)
       Local5 |= (VTMS << 0x09)
       Local5 |= (VDLS << 0x0A)
       Local5 |= (VMSC << 0x0C)
       Local5 |= (VPSC << 0x0D)
       Local1 = (CICF << 0x08)
       If ((CICF == 0x03))
       {
           CICM = SMYH /* \SMYH */
       }
       ElseIf ((CICF == 0x0B))
       {
           CICM = SMMC /* \SMMC */
       }
       ElseIf ((CICF == 0x0D))
       {
           CICM = SPSC /* \SPSC */
       }
       ElseIf ((CICF == 0x0F))
       {
           CICM = SAMT /* \SAMT */
       }
       ElseIf ((CICF == 0x07))
       {
           CICM = SADM /* \SADM */
       }
       Else
       {
           CICM = 0x0F
       }

       Local1 |= (CICM << 0x0C)
       Local1 |= (Local5 << 0x10)
       Local1 |= 0x01
   }
Comment 1 Jiri Slaby 2023-09-25 09:12:31 UTC
Created attachment 305147 [details]
DSDT
Comment 2 Mark Pearson 2023-09-25 12:54:13 UTC
Hi,

Can I confirm this is on a T14s G3 AMD platform?

Thanks for the detailed report - very helpful and very much appreciated. I'll need to reproduce this but I don't have one of these platforms myself so will need to ask a colleague for help.

It looks like this line is causing problems on this platform:
funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF;

Normally that should return PSC mode (in this case - unless you have AMT mode, which you don't on that gen platform). 
But it's saying it is in standard mode instead....which is peculiar. The FW is doing something non-standard to other platforms.

Do you still get the warning when:
 - you're in performance or low-power modes?
 - you're in lap mode?

As a quick test could you try adding:

 /* Check if we are PSC mode, or have AMT enabled */
 funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF;
+if (funcmode == DYTC_FUNCTION_STD)
+  funcmode = DYTC_FUNCTION_PSC;

To see if it behaves correctly please?

I'll have to check with the FW team and find out why this change in behaviour...I suspect I need to do a kernel patch to better handle this AMT case though.

Mark
Comment 3 Jiri Slaby 2023-09-26 06:57:50 UTC
(In reply to Mark Pearson from comment #2)
> Can I confirm this is on a T14s G3 AMD platform?

Yes.

> It looks like this line is causing problems on this platform:
> funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF;
> 
> Normally that should return PSC mode (in this case - unless you have AMT
> mode, which you don't on that gen platform). 
> But it's saying it is in standard mode instead....which is peculiar. The FW
> is doing something non-standard to other platforms.

Interestingly, it happens only initially and during suspend. Other than that, it's in PSC mode likely.

> Do you still get the warning when:
>  - you're in performance or low-power modes?

I am on always in low-power mode. This notebook is crap, it's noisy and very hot in balanced or performance modes even when idling.

>  - you're in lap mode?

What is that?

> As a quick test could you try adding:
> 
>  /* Check if we are PSC mode, or have AMT enabled */
>  funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF;
> +if (funcmode == DYTC_FUNCTION_STD)
> +  funcmode = DYTC_FUNCTION_PSC;
> 
> To see if it behaves correctly please?

Will try.
Comment 4 Jiri Slaby 2023-09-26 10:32:09 UTC
(In reply to Jiri Slaby from comment #3)
> > As a quick test could you try adding:
> > 
> >  /* Check if we are PSC mode, or have AMT enabled */
> >  funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF;
> > +if (funcmode == DYTC_FUNCTION_STD)
> > +  funcmode = DYTC_FUNCTION_PSC;
> > 
> > To see if it behaves correctly please?
> 
> Will try.

It only changes the error to EINVAL. 0xf is not handled as a correct state in PSC.

BTW the first (bad) log:
thinkpad_acpi: dytc_profile_refresh: err=-22 mmc=0 psc=1 mmc_get=0 funcmode=13 output=0x1f001 perfmode=15

subsequent refreshes (good):
thinkpad_acpi: dytc_profile_refresh: err=0 mmc=0 psc=1 mmc_get=0 funcmode=13 output=0x20013d01 perfmode=3
Comment 5 Mark Pearson 2023-09-27 00:06:00 UTC
Ah yeah - need to add:
case DYTC_MODE_MMC_BALANCE:
underneath 
case DYTC_MODE_PSC_BALANCE:
in function convert_dytc_to_profile
(same as is done for the MMC block above)

I think that will cure that error.

Still tracking down HW here :(
Mark
Comment 6 Jiri Slaby 2023-09-27 06:34:04 UTC
(In reply to Mark Pearson from comment #5)
> Ah yeah - need to add:
> case DYTC_MODE_MMC_BALANCE:
> underneath 
> case DYTC_MODE_PSC_BALANCE:
> in function convert_dytc_to_profile
> (same as is done for the MMC block above)
> 
> I think that will cure that error.

Of course:
dytc_profile_refresh: err=0 mmc=0 psc=1 mmc_get=0 funcmode=13 output=0x1f001 perfmode=15
Comment 7 Mark Pearson 2023-09-27 17:09:26 UTC
OK - so (just to be sure) with those changes it is working correctly?
Mark
Comment 8 Jiri Slaby 2023-09-28 06:10:40 UTC
(In reply to Mark Pearson from comment #7)
> OK - so (just to be sure) with those changes it is working correctly?

It appears so.
Comment 9 Jiri Slaby 2023-10-26 08:59:01 UTC
Gentle ping: any updates on this?
Comment 10 Mark Pearson 2023-10-26 17:21:41 UTC
Thanks for the ping - I had completely forgotten about this one.

I'll check on the status. My notes (internal ticket LO-2242) say we're waiting on the FW team for a fix, they did do one version but it didn't work. I've no idea why it's taking so long.

Mark

Note You need to log in before you can comment on or make changes to this bug.