Bug 217833

Summary: Re: [REGRESSION] [BISECTED] power button stopped working on lenovo ideapad 5 since a855724dc08b
Product: Drivers Reporter: Luca Pigliacampo (lucapgl2001)
Component: OtherAssignee: drivers_other
Status: RESOLVED UNREPRODUCIBLE    
Severity: low CC: mario.limonciello, shyam-sundar.s-k
Priority: P3    
Hardware: AMD   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:
Attachments: dumps of acpi tables
content of /sys/kernel/debug/gpio in version 6.4.4
content of /sys/kernel/debug/gpio in version 6.5-rc7 before pressing power button
content of /sys/kernel/debug/gpio in version 6.5-rc7 after pressing power button
possible patch v1
possible patch v2
messages received after enabling dynamic debug
log from pressing power after using trackpad
log of unrecognized press followed by succesful press
log of a specific sequence of presses
state of gpio0 while using the power button
possible patch v3
output of dmidecode -t system
gpio before sleep
gpio after sleep
gpio before sleep in 6.4.4
gpio after sleep in 6.4.4
log of kernel messages across suspend
log across suspension without touching power before
log when using the power button to suspend on patch v4

Description Luca Pigliacampo 2023-08-27 19:57:50 UTC
continuing from: https://lore.kernel.org/regressions/a715e893-733b-4ffe-8646-87af49998cc2@amd.com/

sure thing,

here are the files,
i used the version 6.5-rc7 because the next mainline isn't available yet, i'll add it tomorrow probably,
i got the contents of the gpio file both before and after pressing the power button because some values changed only in the newer version and it seems related,
also sorry, but i don't have windows in dual boot, i'll see if i can use windows to go or something similar.

also i haven't updated the bios for a while, maybe that could be related to this?
Comment 1 Luca Pigliacampo 2023-08-27 20:01:33 UTC
Created attachment 304954 [details]
dumps of acpi tables
Comment 2 Luca Pigliacampo 2023-08-27 20:02:42 UTC
Created attachment 304955 [details]
content of /sys/kernel/debug/gpio in version 6.4.4
Comment 3 Luca Pigliacampo 2023-08-27 20:04:15 UTC
Created attachment 304956 [details]
content of /sys/kernel/debug/gpio in version 6.5-rc7 before pressing power button
Comment 4 Luca Pigliacampo 2023-08-27 20:04:42 UTC
Created attachment 304957 [details]
content of /sys/kernel/debug/gpio in version 6.5-rc7 after pressing power button
Comment 5 Mario Limonciello (AMD) 2023-08-28 01:00:22 UTC
My observations from your logs so far:
1) Your system doesn't use _AEI to program any GPIOs.  The values programmed to the firmware are what persist.

2) In your 6.5 logs it seems that bit 30 is active after you've pressed the button.

Bit 30 being asserted for GPIO 0 means that the power button was pressed for less than 2 seconds you can reference page 948 of this document:
https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/55072_AMD_Family_15h_Models_70h-7Fh_BKDG.pdf

---

My next step asks:

1) Can you please turn on the dynamic debugging statements for pinctrl-amd.c and see if you get messages when you press the button?  Given the interrupt is "disabled" I think you won't, but I'd like to confirm.

2) Can you try what I suggested with holding the power button longer than 2s but less than 10s to see if it triggers?

3) If nothing from 2, can you try from >10s?

4) I'd like to understand what source you "normally" see the power button from as well.  An ACPI device?

5) I notice in the documentation that software needs to clear InterruptSts to clear LessThan2s.  As the interrupt isn't enabled (since you don't have _AEI), I'm not sure if that will work.  Let me see your findings from 1-4 first, and I can follow up with a patch later if I think this can work for your case.
Comment 6 Mario Limonciello (AMD) 2023-08-28 02:56:01 UTC
Created attachment 304963 [details]
possible patch v1

Here's a possible patch for my idea.  It's only compile tested, and please still provide the logs for the other things I asked to make sure it makes sense.
Comment 7 Mario Limonciello (AMD) 2023-08-28 02:58:10 UTC
Created attachment 304964 [details]
possible patch v2
Comment 8 Luca Pigliacampo 2023-08-28 08:21:20 UTC
Created attachment 304966 [details]
messages received after enabling dynamic debug
Comment 9 Luca Pigliacampo 2023-08-28 08:46:14 UTC
thanks, i'll test the patch now,

about 1) when i enabled dynamic debug it started spamming the message i'm attaching in the system journal, multiple times every second, and pressing the power button didin't cause any different messages

about 2) sorry i forgot about it before,
keeping the power button pressed for a bit longer (between 2 and 4 seconds) didn't make a difference compared to a short press,
after keeping it pressed for more than 4 seconds the press is recognized as KEY_LEFTMETA on the keyboard, rather than power button,

about 3) keeping the power button pressed for more than 10 seconds cuts power to the computer and causes it to shut down abruptly.

about 4) sorry, but i'm not sure what you mean, i know evtest is reading events for the power button from the file at /dev/input/event2, i found that it is also referenced from /sys/devices/LNXSYSTM\:00/LNXPWRBN\:00/, that seems related to acpi, so i think it's an acpi device?

about 5) i hope these were useful
Comment 10 Luca Pigliacampo 2023-08-28 09:24:09 UTC
I'm testing the patch right now and certainly it changed something,
before the power button would be ignored until a reboot,
now it is recognized again after i log out and log back in,

so i can press it once, log out, log in, and i can press it a second time,
then i need again to log out and back in to press it a third time
Comment 11 Luca Pigliacampo 2023-08-28 09:57:46 UTC
also it seems that in some cases the power button can be actually pressed multiple times by waiting a few seconds between presses, but i couldn't find a way to reproduce it consistently
Comment 12 Mario Limonciello (AMD) 2023-08-28 13:31:58 UTC
(In reply to Luca Pigliacampo from comment #10)
> I'm testing the patch right now and certainly it changed something,
> before the power button would be ignored until a reboot,
> now it is recognized again after i log out and log back in,
> 
> so i can press it once, log out, log in, and i can press it a second time,
> then i need again to log out and back in to press it a third time

I believe the patch is the right direction then.  I think that what you're probably observing is related to the debounce behavior and the driver is expected to be clearing it.  The patch might be correct, but I'd still like some more checks (see below).

(In reply to Luca Pigliacampo from comment #9)
> thanks, i'll test the patch now,
> 
> about 1) when i enabled dynamic debug it started spamming the message i'm
> attaching in the system journal, multiple times every second, and pressing
> the power button didin't cause any different messages

It's tied to movement from your touchpad.  It makes sense.

> 
> about 2) sorry i forgot about it before,
> keeping the power button pressed for a bit longer (between 2 and 4 seconds)
> didn't make a difference compared to a short press,
> after keeping it pressed for more than 4 seconds the press is recognized as
> KEY_LEFTMETA on the keyboard, rather than power button,

Yup, that's exactly what I expected to happen when it's more than 2 and less than 10.  It's probably a combination (left meta + something else).

> 
> about 3) keeping the power button pressed for more than 10 seconds cuts
> power to the computer and causes it to shut down abruptly.

OK.

> about 4) sorry, but i'm not sure what you mean, i know evtest is reading
> events for the power button from the file at /dev/input/event2, i found that
> it is also referenced from /sys/devices/LNXSYSTM\:00/LNXPWRBN\:00/, that
> seems related to acpi, so i think it's an acpi device?

OK.

> 
> about 5) i hope these were useful

Very much so.

(In reply to Luca Pigliacampo from comment #11)
> also it seems that in some cases the power button can be actually pressed
> multiple times by waiting a few seconds between presses, but i couldn't find
> a way to reproduce it consistently

Is it possible that it only starts working again after the trackpad is used?

The patch introduces a new dynamic debug statement.  If you turn it on for the whole module while look at this we should be able to better understand it.

Can you please try to correlate the exact things you do with power button and  touchpad movement along with the logs?  What I'm wondering is if the power button only starts working again after you've moved the touchpad.
Comment 13 Luca Pigliacampo 2023-08-28 21:38:43 UTC
Created attachment 304975 [details]
log from pressing power after using trackpad
Comment 14 Mario Limonciello (AMD) 2023-08-28 21:47:10 UTC
> log from pressing power after using trackpad

So that message definitely comes up, but it's not possible to tell purely from the log if it was tied to the touchpad movement or if the interrupt fired on it's own.

Can you compare the GPIO0 line from /sys/kernel/debug/gpio in this specific series of events:
1. Boot up (no power button pressed)
2. Press power button once.  DON'T touch the touchpad.
3. Touch the touchpad.
4. Press power button once.  DON'T touch the touchpad.
5. Press power button once more.  DON'T touch the touchpad.
6. Touch the touchpad.

If after 2, 4, and 5 bit 30 clears on it's own then I think the patch does everything I hope it would.  If bit 30 only clears after steps 3 and 6 then it's not an "ideal" fix, but might be good enough.
Comment 15 Luca Pigliacampo 2023-08-28 21:58:57 UTC
Created attachment 304976 [details]
log of unrecognized press followed by succesful press
Comment 16 Mario Limonciello (AMD) 2023-08-28 22:06:12 UTC
> log of unrecognized press followed by succesful press

Do I follow this right?

1. You pressed once (worked).
2. You pressed again (didn't work).
3. You touched touchpad.
4. You pressed again (worked).
5. You touched touchpad.
Comment 17 Luca Pigliacampo 2023-08-28 22:12:03 UTC
> Do I follow this right?

> 1. You pressed once (worked).
> 2. You pressed again (didn't work).
> 3. You touched touchpad.
> 4. You pressed again (worked).
> 5. You touched touchpad.

not exactly,

1. i pressed once (worked)
2. activated debug messages
3. pressed again (didn't work)
4. touched touchpad
5. pressed again (worked)
6. touched touchpad

so the first press shouldn't appear i think,
still now i'm going to upload the sequence you asked
Comment 18 Luca Pigliacampo 2023-08-28 22:14:35 UTC
Created attachment 304977 [details]
log of a specific sequence of presses
Comment 19 Luca Pigliacampo 2023-08-28 22:17:34 UTC
timings of my actions:

00:03:40 - pressed power
00:03:45 - touched trackpad
00:03:50 - pressed power
00:04:00 - pressed power
00:04:05 - touched trackpad
Comment 20 Mario Limonciello (AMD) 2023-08-28 22:18:08 UTC
> log of a specific sequence of presses

Sorry I might have been unclear.  I'm looking specifically for what happens to this command in that sequence of events.

grep "#0" /sys/kernel/debug/gpio

So capture that command's output 6 times
Comment 21 Luca Pigliacampo 2023-08-28 22:21:39 UTC
> Is it possible that it only starts working again after the trackpad is used?

yes, it does
Comment 22 Luca Pigliacampo 2023-08-28 22:24:12 UTC
(In reply to Mario Limonciello (AMD) from comment #20)
> > log of a specific sequence of presses
> 
> Sorry I might have been unclear.  I'm looking specifically for what happens
> to this command in that sequence of events.
> 
> grep "#0" /sys/kernel/debug/gpio
> 
> So capture that command's output 6 times

sorry, i really should pay more attention
Comment 23 Luca Pigliacampo 2023-08-28 22:33:13 UTC
Created attachment 304978 [details]
state of gpio0 while using the power button
Comment 24 Luca Pigliacampo 2023-08-28 22:34:37 UTC
here it is
Comment 25 Mario Limonciello (AMD) 2023-08-28 22:38:20 UTC
Created attachment 304979 [details]
possible patch v3

No worries.

The patch you bisected to we definitely can't revert as it fixes other bugs with power button handling relative to what the platform expects.

I think what we should do is:
1. Submit that patch that helps when there is another GPIO interrupt firing.
2. Write a new patch to offer a module parameter

Can you please try this attached patch?  This should let you configure the power button behavior back the way it was before (pinctrl_amd.powerbtn=0 on your kernel command line).

If that works and you're happy enough with it I'll submit the two patches.
Comment 26 Luca Pigliacampo 2023-08-28 22:47:19 UTC
thank you!
i'll try that patch tomorrow, at the moment i feel like i'm going to fall asleep while standing.
good night
Comment 27 Mario Limonciello (AMD) 2023-08-29 01:31:38 UTC
OK.  Please also attach a dmidecode, if the module parameter approach works I'll add a 3rd patch that sets the default behavior for your system to the old behavior with it.
Comment 28 Mario Limonciello (AMD) 2023-08-29 01:47:24 UTC
Actually scratch that, I think I found a dmidecode online to match your system.  Here's a branch as I'd like it tested on your system.

https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci/kbz-217833/

It should detect your system and automatically put it into the previous behavior.
If it doesn't, please try pinctrl_amd.powerbtn=0 on kernel command line to force it and share the dmidecode.
Comment 29 Luca Pigliacampo 2023-08-29 10:12:07 UTC
Great!

the patch works, except my system's product name is 82LM rather than 82FG,

attached is the output of dmidecode -t system

i'm going to use this computer normally for a few days to see if there's any problem,
sorry i didn't realize it was so close to the merge window when i reported the bug

Thanks!
Comment 30 Luca Pigliacampo 2023-08-29 10:13:03 UTC
Created attachment 304983 [details]
output of dmidecode -t system
Comment 31 Mario Limonciello (AMD) 2023-08-29 15:40:58 UTC
A bug is a bug; IMO it doesn't matter when it happens.  I'll submit the patches to the mailing list for review.
Comment 32 Mario Limonciello (AMD) 2023-08-31 04:26:35 UTC
Per Hans' feedback on my v1 series, I have some updated patches for you to try.

https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci%2Fkbz-217833-v2

Please swap out the v1 ones for these two.
Comment 33 Luca Pigliacampo 2023-08-31 13:47:51 UTC
hi,
i've been running them since this morning,
it works well and it doesn't seem to cause problems on my other system.

also it's nice that now it doesn't need to be enabled for specific hardware
Comment 34 Mario Limonciello (AMD) 2023-08-31 13:58:58 UTC
OK thanks!  I am going to make one more change to it so it doesn't run on ALL hardware but only the ones that need it.  I expect it will still work for you, but please test it instead of v2.

https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci%2Fkbz-217833-v3
Comment 35 Luca Pigliacampo 2023-08-31 18:45:44 UTC
sorry, but there's another problem, the presses of the power button stop being recognized after putting the system into s3 sleep, i didn't think of testing this before, but now i'll check the previous version of the patch
Comment 36 Mario Limonciello (AMD) 2023-08-31 18:46:57 UTC
No worries.  Can you please upload /sys/kernel/debug/gpio after resuming from suspend?  I have an idea what this is, but I want to confirm your registers.
Comment 37 Mario Limonciello (AMD) 2023-08-31 18:48:25 UTC
Also, did you resume using the power button or another source when the failure happened?
Comment 38 Luca Pigliacampo 2023-08-31 19:09:25 UTC
i woke up the system using the power button
now i'll post the contents of gpio before and after the sleep
Comment 39 Luca Pigliacampo 2023-08-31 19:09:59 UTC
Created attachment 304994 [details]
gpio before sleep
Comment 40 Luca Pigliacampo 2023-08-31 19:10:29 UTC
Created attachment 304995 [details]
gpio after sleep
Comment 41 Mario Limonciello (AMD) 2023-08-31 19:31:31 UTC
mmm, they're "the same" for GPIO0 and the master register.  I don't know what we can do here.

Has this always been there for wakeup from power button?
Comment 42 Luca Pigliacampo 2023-08-31 19:59:02 UTC
nevermind, it's an unrelated issue, just checked and version 6.4.4, which is before your commits, is also affected.

i normally shut the computer down, so i didn't notice it, now i saw it by chance
Comment 43 Mario Limonciello (AMD) 2023-08-31 20:04:50 UTC
OK.  In that case, does v3 work as correctly as expected?
Comment 44 Luca Pigliacampo 2023-08-31 20:05:58 UTC
yes, it works just fine
Comment 45 Luca Pigliacampo 2023-08-31 20:15:01 UTC
nonono, sorry,
i booted in the wrong kernel,
in 6.4.4 it works
Comment 46 Mario Limonciello (AMD) 2023-08-31 20:20:13 UTC
Can you capture /sys/kernel/debug/gpio before/after suspend on the working kernel?
Comment 47 Luca Pigliacampo 2023-08-31 20:37:15 UTC
sure thing,
attaching them now,
next i'll see if the first patch also was the same
Comment 48 Luca Pigliacampo 2023-08-31 20:37:59 UTC
Created attachment 304997 [details]
gpio before sleep in 6.4.4
Comment 49 Luca Pigliacampo 2023-08-31 20:38:29 UTC
Created attachment 304998 [details]
gpio after sleep in 6.4.4
Comment 50 Luca Pigliacampo 2023-08-31 20:45:22 UTC
so, the power button after sleep works in the first patch you proposed, so the issue was probably introduced in the patch v2
Comment 51 Mario Limonciello (AMD) 2023-08-31 20:46:14 UTC
Can I see the kernel log across a suspend cycle with dynamic debug turned on for pinctrl-amd with version 3?
Comment 52 Mario Limonciello (AMD) 2023-08-31 21:06:24 UTC
And does moving the touchpad fix it after resume on v2/v3?
Comment 53 Luca Pigliacampo 2023-08-31 21:06:38 UTC
Created attachment 304999 [details]
log of kernel messages across suspend
Comment 54 Luca Pigliacampo 2023-08-31 21:07:18 UTC
> And does moving the touchpad fix it after resume on v2/v3?
sadly no
Comment 55 Mario Limonciello (AMD) 2023-08-31 21:08:35 UTC
(In reply to Luca Pigliacampo from comment #53)
> Created attachment 304999 [details]
> log of kernel messages across suspend

It looks like you probably pressed the power button to 'enter' suspend.  Is that right?  If so; does this also reproduce if you only use suspend using software but resume using power button?
Comment 56 Luca Pigliacampo 2023-08-31 21:11:46 UTC
> It looks like you probably pressed the power button to 'enter' suspend.
Yes
Comment 57 Mario Limonciello (AMD) 2023-08-31 21:14:21 UTC
OK let's try this.  Explicitly at suspend let's check if those bits weren't cleared yet and try to clear them again.

https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci%2Fkbz-217833-v4

If this reproduces with using power-button for suspend and resume can you share the log again?
Comment 58 Luca Pigliacampo 2023-08-31 21:18:51 UTC
ok, got the log without touching power before sleep,
i only saw messages from the trackpad
Comment 59 Luca Pigliacampo 2023-08-31 21:19:41 UTC
Created attachment 305000 [details]
log across suspension without touching power before
Comment 60 Luca Pigliacampo 2023-08-31 21:22:29 UTC
> It looks like you probably pressed the power button to 'enter' suspend.  Is
> that right?  If so; does this also reproduce if you only use suspend using
> software but resume using power button?

it also reproduces using software

now i'm compiling the new patch
Comment 61 Luca Pigliacampo 2023-08-31 21:36:22 UTC
> If this reproduces with using power-button for suspend and resume can you
> share the log again?

can reproduce bot using and not using the power button

attached is the log when using the power button
Comment 62 Luca Pigliacampo 2023-08-31 21:38:34 UTC
Created attachment 305001 [details]
log when using the power button to suspend on patch v4
Comment 63 Mario Limonciello (AMD) 2023-09-02 15:05:26 UTC
Well I'm awfully confused now.  You are absolutely sure it resolved with just the patch that clears it when you moved the touchpad?

Here's one more idea, let's try to force to ensure it clears on suspend.
https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci%2Fkbz-217833-v5
Comment 64 Luca Pigliacampo 2023-09-02 15:35:42 UTC
no, my finger slipped, i meant the one at

https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci/kbz-217833/

that added the commandline option
Comment 65 Luca Pigliacampo 2023-09-02 15:37:46 UTC
that apparently just restores the old behaviour for specific hardware didn't it?
Comment 66 Mario Limonciello (AMD) 2023-09-05 16:25:17 UTC
OK.  How about this - let's try to clear the bit for suspend and restore it afterwards.

https://gitlab.freedesktop.org/superm1/linux/-/commits/mlimonci%2Fkbz-217833-v6

Can you test this?
Comment 67 Luca Pigliacampo 2023-09-07 08:30:21 UTC
sure,
it fixes the issue with suspending,

but now i think my pc is messing with me,
i just realised that rebooting and powering off and on again are different,

and in all patches since the patch v2 the power button is also not recognized
after shutting down and turning the pc on again, it works only after a reboot.

if it's of any use, the power button also works if i rebooted from the bootloader (refind)
before starting the operating system.

i'm sorry i keep missing stuff i should have tested

also there seems to be no difference in the contents of /sys/kernel/debug/gpio
before and after rebooting
Comment 68 Mario Limonciello (AMD) 2023-09-07 14:26:10 UTC
You're killing me here! :)

If /sys/kernel/debug/gpio is the same before/after reboot that's quite baffling.

Is there any chance that you can get me information from Windows?  Windows To Go is fine as long as you get the AMD drivers installed.

If you can do that, please install R/W everything in Windows and follow these steps to get the register.

For using R/W everything in Windows you will use the "MMIO" function.  
1. Press the MMIO button and then hit the "+" button.
2. Add a mapping for the base address (FED81500) and save it.
3. Add a line item for the master register (FC)

Share the value it shows.

I'd like to see this value both from cold boot, warm boot, and after suspend in Windows if you can.
Comment 69 Luca Pigliacampo 2023-09-12 08:02:12 UTC
sorry for the wait, i the last few days i didn't have much access to my computer,

the value R/W everything shows is 0xFF00A000 after both booting from powered off, rebooting, and waking from sleep,

but it seems that in certain conditions the power button doesn't work in windows as well,
i'll have to do more tests
Comment 70 Mario Limonciello (AMD) 2023-09-12 21:37:26 UTC
> the value R/W everything shows is 0xFF00A000 after both booting from powered
> off, rebooting, and waking from sleep,

OK - then that means that Windows isn't really messing with this value.

> but it seems that in certain conditions the power button doesn't work in
> windows as well, i'll have to do more tests

OK.

Please be cognizant of avoiding a warm boot from Linux so that it doesn't influence registers.  Cold boot into Windows, or Warm boot from Windows to Windows.
Comment 71 Luca Pigliacampo 2023-09-13 21:14:53 UTC
now i did some more tests and confirmed that windows also has that issue, specifically the power button stops working after suspending and resuming windows,

then i updated the bios and the issue was fixed both on windows and every linux version i tried (6.4.5, all your patches, 6.5.3, 6.5.3-arch1-1),

so apparently it was a bios issue all along and the old behaviour hid the problem

i know it's not advised to change too much stuff while investigating an issue, but it seemed too strange that both systems failed almost in the exact same way,

so, I'm sorry for wasting your time on an issue that wasn't even in the kernel
Comment 72 Mario Limonciello (AMD) 2023-09-13 21:17:56 UTC
Wow, great!