Bug 216101
Summary: | lost acpi events after resume from suspend - AMD Ryzen 6800H | ||
---|---|---|---|
Product: | ACPI | Reporter: | Catalin (catalin) |
Component: | Other | Assignee: | Mario Limonciello (AMD) (mario.limonciello) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | mario.limonciello, paul, rui.zhang, shyam-sundar.s-k, travisghansen |
Priority: | P1 | ||
Hardware: | AMD | ||
OS: | Linux | ||
Kernel Version: | 5.18.2 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | patch 1/4, patch 2/4, patch 3/4, patch 4/4 | ||
Description
Catalin
2022-06-09 10:26:56 UTC
Correction: closing the lid works on a fresh start with 5.18; after resume it no longer works.
OS: Linux Mint 20.3

Can you please share your kernel config and your full kernel log with /sys/power/pm_debug_messages set before you suspend?

Sorry, I missed your reply ...
I found the same issue, also on Ryzen 6000, here: https://bbs.archlinux.org/viewtopic.php?id=279102
I only have s2idle, not S3.
Here is the config for 5.19.7: https://antebit.com/kernel-5.19.7-config.txt
and this is the kernel log after
echo 1 > /sys/power/pm_debug_messages && dmesg -c && systemctl suspend:
https://antebit.com/pm_debug_messages.txt
Thank you,
Catalin

> I found same issue, also Ryzen 6000 here:
> https://bbs.archlinux.org/viewtopic.php?id=279102
Interesting finding.

> I only have s2idle not S3.
Right - to be expected.

> Here you have config for 5.19.7:
OK good, you have the CONFIG_PINCTRL_AMD driver in place. And by testing with 5.19.7 you've picked up the commit I was hoping was there (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/pinctrl?h=linux-5.19.y&id=4d8e2fa66adb5514380f5b680796ab0586140447)

Can you share your whole kernel log and an acpidump too? Please attach them to this bug report so that they stay accessible in case your host goes down.

Maybe this is the same on another system too: https://forums.lenovo.com/t5/Other-Linux-Discussions/Firmware-regression-No-more-udev-power-supply-events/m-p/5166407

This is the log, fresh restart plus one suspend: https://antebit.com/kernel_log.txt (the text is too large to post here).
My laptop is Asus, not Lenovo.
To be more clear: the widget that reads the charging status is not working after resume. The laptop is charging. Also, when I close/open the lid I need to rely on /proc/acpi/button/lid/*/state and put the laptop to sleep with a script.
power_supply events seem to be ok:
sudo udevadm monitor --subsystem-match="power_supply"
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent

KERNEL[212.640913] change /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ACAD (power_supply)
UDEV [212.643702] change /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ACAD (power_supply)
KERNEL[212.658193] change /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ACAD (power_supply)
UDEV [212.659753] change /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ACAD (power_supply)
KERNEL[223.347727] change /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ACAD (power_supply)
UDEV [223.356580] change /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ACAD (power_supply)
KERNEL[223.372205] change /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ACAD (power_supply)
UDEV [223.378661] change /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0003:00/power_supply/ACAD (power_supply)

acpi_listen, plug and unplug the power - fresh restart:
ac_adapter ACPI0003:00 00000002 00000000
ac_adapter ACPI0003:00 00000080 00000000
ac_adapter ACPI0003:00 00000080 00000000
 0B3CBB35-E3C2- 000000ff 00000000
============ acpi event
battery PNP0C0A:00 00000081 00000001
battery PNP0C0A:00 00000080 00000001
ac_adapter ACPI0003:00 00000002 00000001
ac_adapter ACPI0003:00 00000080 00000001
ac_adapter ACPI0003:00 00000080 00000001
 0B3CBB35-E3C2- 000000ff 00000000
battery PNP0C0A:00 00000081 00000001
battery PNP0C0A:00 00000080 00000001
battery PNP0C0A:00 00000081 00000001
battery PNP0C0A:00 00000080 00000001

after suspend:
ac_adapter ACPI0003:00 00000002 00000000
ac_adapter ACPI0003:00 00000080 00000000
ac_adapter ACPI0003:00 00000002 00000001
ac_adapter ACPI0003:00 00000080 00000001

acpi_dump: https://antebit.com/acpi_dump.txt, server is always up, posts are too large...
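The difference between the two acpi_listen captures above is easier to see when the events are grouped by source. The following is only a rough sketch, not part of the original report: it assumes that every event line ends in two 8-digit hexadecimal fields and that the first token identifies the source (for the line that starts with a blank, that token is the truncated GUID). The helper names are hypothetical.

```python
import string
from collections import Counter

# Captures copied from the report above.
FRESH_BOOT = """\
ac_adapter ACPI0003:00 00000002 00000000
ac_adapter ACPI0003:00 00000080 00000000
ac_adapter ACPI0003:00 00000080 00000000
 0B3CBB35-E3C2- 000000ff 00000000
============ acpi event
battery PNP0C0A:00 00000081 00000001
battery PNP0C0A:00 00000080 00000001
ac_adapter ACPI0003:00 00000002 00000001
ac_adapter ACPI0003:00 00000080 00000001
ac_adapter ACPI0003:00 00000080 00000001
 0B3CBB35-E3C2- 000000ff 00000000
battery PNP0C0A:00 00000081 00000001
battery PNP0C0A:00 00000080 00000001
battery PNP0C0A:00 00000081 00000001
battery PNP0C0A:00 00000080 00000001
"""

AFTER_RESUME = """\
ac_adapter ACPI0003:00 00000002 00000000
ac_adapter ACPI0003:00 00000080 00000000
ac_adapter ACPI0003:00 00000002 00000001
ac_adapter ACPI0003:00 00000080 00000001
"""


def _is_hex8(field):
    """True for an 8-character hexadecimal field such as 00000080."""
    return len(field) == 8 and all(c in string.hexdigits for c in field)


def count_sources(capture):
    """Count events per source (first token of each event line)."""
    counts = Counter()
    for line in capture.splitlines():
        fields = line.split()
        # Assumption: an event line ends in two 8-digit hex fields;
        # anything else (blank lines, the "====" marker) is skipped.
        if len(fields) < 3 or not all(_is_hex8(f) for f in fields[-2:]):
            continue
        counts[fields[0]] += 1
    return counts


def missing_after(fresh, resumed):
    """Sources that produced events on a fresh boot but none after resume."""
    return sorted(set(count_sources(fresh)) - set(count_sources(resumed)))


print(missing_after(FRESH_BOOT, AFTER_RESUME))
```

With the captures above, the sources that disappear after resume are the battery device and the truncated `0B3CBB35-E3C2-` GUID, while `ac_adapter` events keep arriving.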
I did not pick the commit for CONFIG_PINCTRL_AMD, it was in the .config file from Ubuntu.
Catalin

OK I think I see what might be going on.

In the PEP device there is an extra call in the 11e00d56-ce64-47ce-837b-1f898f9aa461 case for modern standby exit:

Case (0x08)
{
    M000 (0x3E08)
    If (CondRefOf (\_SB.PCI0.GPP7.DEV0))
    {
        M460 (" Notify (\\_SB.PCI0.GPP7.DEV0, 0x1)\n", Zero, Zero, Zero, Zero, Zero, Zero)
        Notify (\_SB.PCI0.GPP7.DEV0, One) // Device Check
    }

    Return (Zero)
}

In the e3f32452-febc-43ce-9039-932122d37721 case (which is used by default in Linux) I don't see that call in the matching exit routines:

Case (0x03)
{
    M000 (0x3E05)
    Return (Zero)
}

or

Case (0x05)
{
    M000 (0x3E03)
    Return (Zero)
}

I would hypothesize this is the reason for the problem. Please have a try with this change. It's not upstreamable like this, but given it's a firmware bug it would at least prove the correct root cause and we can think about how to do it better.

diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c
index f9ac12b778e6..c9a7dd474892 100644
--- a/drivers/acpi/x86/s2idle.c
+++ b/drivers/acpi/x86/s2idle.c
@@ -394,11 +394,6 @@ static int lps0_device_attach(struct acpi_device *adev,
 			lps0_dsm_func_mask = (lps0_dsm_func_mask << 1) | 0x1;
 			acpi_handle_debug(adev->handle, "_DSM UUID %s: Adjusted function mask: 0x%x\n",
 					  ACPI_LPS0_DSM_UUID_AMD, lps0_dsm_func_mask);
-		} else if (lps0_dsm_func_mask_microsoft > 0 &&
-				(!strcmp(hid, "AMDI0007") ||
-				 !strcmp(hid, "AMDI0008"))) {
-			lps0_dsm_func_mask_microsoft = -EINVAL;
-			acpi_handle_debug(adev->handle, "_DSM Using AMD method\n");
 		}
 	} else {
 		rev_id = 1;

Those 2 lines were missing on my side:
> - (!strcmp(hid, "AMDI0007") ||
> - !strcmp(hid, "AMDI0008"))) {

But the patch works!!!
Very well, all acpi events are detected: fn keys, close/open lid (I can get rid of the script which handles it!), plug/unplug power. Suspend/resume works flawlessly and very fast; battery drain in suspend is 2% per hour.
Extra: I would add that asus_wmi_ec_sensors (which is set to be removed from the kernel) must be removed or blacklisted since 5.19.x. It also causes problems, although the module is not used. On my side, and also on another Asus laptop model, the kernel no longer detects the lid close action via /proc/acpi/button/lid/*/state, and there are also some other battery problems: https://bugs.archlinux.org/task/75653
I do not know if there is any connection with what is here.
Thank you!

I found another small issue. I do not know if it's related, or if it is caused by nvidia (515 driver), or whether I should open another bug here: when I have external monitors attached, hdmi or usb-c, which are connected to the nvidia chip, the laptop resumes from standby very quickly. In dmesg I have:

[ 1917.472805] PM: suspend-to-idle == suspend
[ 1918.094089] ACPI: PM: Wakeup unrelated to ACPI SCI ==== here is the wakeup
[ 1918.094094] PM: resume from suspend-to-idle
[ 1918.131991] ACPI: EC: interrupt unblocked
[ 1918.651205] PM: noirq resume of devices complete after 519.472 msecs
[ 1918.655315] PM: early resume of devices complete after 3.986 msecs
[ 1918.655574] asus_wmi: Unknown key code 0xc0
[ 1918.656635] Timekeeping suspended for 0.381 seconds

Created attachment 301770 [details]
patch 1/4
Created attachment 301771 [details]
patch 2/4
Created attachment 301772 [details]
patch 3/4
Created attachment 301773 [details]
patch 4/4
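The effect of the test patch earlier in this thread can be modelled in a few lines. This is only an illustrative sketch, not the kernel's code: it reproduces just the branch that the diff removes, the mask value 0x3ff is made up, and the final step (preferring the Microsoft GUID whenever its function mask stays positive) is an assumption about how the suspend path chooses between the two _DSM GUIDs.

```python
EINVAL = 22

# HIDs named in the removed branch of the test patch.
AMD_HIDS = {"AMDI0007", "AMDI0008"}

AMD_GUID = "e3f32452-febc-43ce-9039-932122d37721"
MICROSOFT_GUID = "11e00d56-ce64-47ce-837b-1f898f9aa461"


def microsoft_mask_after_attach(hid, microsoft_mask, patched):
    """Model only the branch the diff removes from lps0_device_attach()."""
    if not patched and microsoft_mask > 0 and hid in AMD_HIDS:
        # Unpatched kernel: "_DSM Using AMD method"
        return -EINVAL
    # Patched kernel (or other HIDs): the Microsoft mask is left alone.
    return microsoft_mask


def preferred_guid(hid, microsoft_mask, patched):
    """Assumption: the Microsoft GUID is used when its mask stays positive."""
    mask = microsoft_mask_after_attach(hid, microsoft_mask, patched)
    return MICROSOFT_GUID if mask > 0 else AMD_GUID


# 0x3ff is a hypothetical function mask, chosen only to be positive.
print(preferred_guid("AMDI0007", 0x3ff, patched=False))
print(preferred_guid("AMDI0007", 0x3ff, patched=True))
```

Under these assumptions the unpatched path ends up on the AMD GUID, whose exit routines lack the extra Notify, and the patched path ends up on the Microsoft GUID, which is consistent with the events coming back once the patch is applied.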
> But the patch works!!!
Well that's great news. It confirms this is an ASUS BIOS bug. For now I've attached a series that adds a quirk for this bug that I think is more likely upstreamable. Can you please test it on top of 6.0-rc4 and see if things work now?

If they don't, I probably got the DMI data for your system wrong. In that case, please add this to your kernel command line:

acpi.prefer_microsoft_guid=1 pm_debug_messages acpi.dyndbg='file drivers/acpi/x86/s2idle.c +p'

and try again. If that works, please share your dmidecode output so I can get the correct strings and respin patch 4/4. If that doesn't work, please share your full kernel log.

> Extra:
I don't see any connection; this is a separate issue that you should work on with the owners of those drivers.

> but when I have external monitors attached, hdmi or usb-c, which are
> connected to nvidia chip,
You mean that connecting an external monitor causes the system to wake up?

> For now I've attached a series that adds a quirk for this bug that I think
> is more likely upstreamable. If you can please test it on top of 6.0-rc4
> and see if things work now?
Works as expected!
I'll keep it under observation for a few days, just in case.

> You mean that connecting an external monitor causes the system to wake up?
No. When an external monitor is connected (the signal is coming from the nvidia chip) and I want to put the laptop in standby, it comes back right away.
When it is on hdmi, there is always a signal on the monitor when it comes back.
When it is on usb-c (display port), the monitor is off when it comes back. If I try again it works, because it does not know that the monitor is connected.
I need to plug it in again to have a signal on usb-c.
In logs I have

[ 1917.472805] PM: suspend-to-idle == suspend
[ 1918.094089] ACPI: PM: Wakeup unrelated to ACPI SCI ==== here is the wakeup
[ 1918.094094] PM: resume from suspend-to-idle
[ 1918.131991] ACPI: EC: interrupt unblocked
[ 1918.651205] PM: noirq resume of devices complete after 519.472 msecs
[ 1918.655315] PM: early resume of devices complete after 3.986 msecs
[ 1918.655574] asus_wmi: Unknown key code 0xc0
[ 1918.656635] Timekeeping suspended for 0.381 seconds

> Works as expected!
> I'll keep it under observation for a few days, just in case.
OK. Let me review that series with some other guys.

> When external monitor is connected ( signal is coming from nvidia chip )
> and want to put it in standby it comes back right away.
> When is on hdmi, when it comes back there is always signal on monitor.
> When it is on usb-c ( display port) when it comes back monitor is off. If I try
> again works, because it does not know that the monitor is connected.
> I need to plug again to have signal on usb-c.
This is a different unrelated issue. You won't get any help on the kernel bug tracker for out of tree modules. If you can reproduce it with nouveau, you should open a separate bug for it for that.

> This is a different unrelated issue. You won't get any help on the kernel
> bug tracker for out of tree modules. If you can reproduce it with nouveau,
> you should open a separate bug for it for that.
It's not so important, maybe the fix will come one day..
Nouveau does not work at all on external monitors...
Thank you once again for your support!

I have a Lenovo Slim 7 ProX 14ARH7 that I think may be experiencing this same issue. I have applied the patch on top of 6.0rc4 and it does not however seem to solve the issue for me (I added s2idle.prefer_microsoft_guid=1 to boot options).
This thread https://www.reddit.com/r/Lenovo/comments/w57eeg/comment/in64lbw/ mentions /sys/firmware/acpi/interrupts/gpe09 not incrementing after suspend.
Some other (perhaps useless) info here: https://bbs.archlinux.org/viewtopic.php?pid=2056912#p2056912

Please open your own issue and let's debug it there. If it's the same root cause we can add you to the quirk list, but there is zero information to indicate so right now.

I have opened this: https://bugzilla.kernel.org/show_bug.cgi?id=216473
I think there are some strong similarities for sure:
- the symptoms are nearly identical (works on boot, fails after suspend)
- the same keys/events/etc fail (plug/unplug may function in my case)
- both are Rembrandt

The kernel solution has been queued up for 6.1 (https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=bleeding-edge&id=d0f61e89f08dd46a090da50f5d747204673f70ea)

I have the 15.6" version of this laptop with an AMD Ryzen 6800H that exhibits the same behaviour, but I notice that the patch for 6.1 is only to the "ASUS TUF Gaming A17". Understandable at this point of course.
The model of my laptop is: Asus TUF A15 FA507RM

Paul - can you please test with 6.1-rc7 and the latest BIOS from ASUS? If you're still affected please:
1) Open your own issue and CC me.
2) Attach to the issue a dmesg log, acpidump and dmidecode output.
We will determine next steps after that.

Thanks Mario, I have submitted a new issue here: https://bugzilla.kernel.org/show_bug.cgi?id=216768
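
For anyone who wants to check the GPE counter mentioned above (/sys/firmware/acpi/interrupts/gpe09 not incrementing after suspend), a small sketch follows. It assumes the first whitespace-separated field of the sysfs file is the interrupt count and that the remaining fields are status flags; the function names are hypothetical, and gpe09 is simply the GPE named in the linked Reddit thread, which may differ on other machines.

```python
from pathlib import Path

# Directory holding the per-GPE counters.
GPE_DIR = Path("/sys/firmware/acpi/interrupts")


def parse_count(text):
    """Return the counter from a line such as '     125  EN STS enabled      unmasked'."""
    # Assumption: the first field is the count, the rest are flags.
    return int(text.split()[0])


def read_count(name, base=GPE_DIR):
    """Read the current count for one GPE file, e.g. 'gpe09'."""
    return parse_count((base / name).read_text())


def stalled(before, after):
    """True if the counter did not advance between two readings."""
    return after <= before
```

A possible way to use it: call `read_count("gpe09")` once after a fresh boot, press a hotkey or close and open the lid, and read it again to confirm the counter advances; then repeat the same two readings after a suspend/resume cycle and pass them to `stalled()`.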