Bug 218265
Summary: | Immediate wake after S2 suspend, intermittently, with all /proc/acpi/wakeup sources disabled | ||
---|---|---|---|
Product: | ACPI | Reporter: | Mark Haun (haunma) |
Component: | Power-Sleep-Wake | Assignee: | acpi_power-sleep-wake |
Status: | NEW --- | ||
Severity: | normal | CC: | bagasdotme, haunma, ulf |
Priority: | P3 | ||
Hardware: | AMD | ||
OS: | Linux | ||
Kernel Version: | 6.7-rc6 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg from suspend which immediately wakes
dmesg from suspend which works correctly amd_s2idle.py log Full dmesg |
Created attachment 305600 [details]
dmesg from suspend which works correctly
(In reply to Mark Haun from comment #1) > Created attachment 305600 [details] > dmesg from suspend which works correctly From what version? (In reply to Bagas Sanjaya from comment #2) > (In reply to Mark Haun from comment #1) > > Created attachment 305600 [details] > > dmesg from suspend which works correctly > > From what version? I'm sorry, I don't understand. Version of what? On 12/15/23 22:31, bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=218265 > > --- Comment #3 from Mark Haun (haunma@keteu.org) --- > (In reply to Bagas Sanjaya from comment #2) >> (In reply to Mark Haun from comment #1) >>> Created attachment 305600 [details] >>> dmesg from suspend which works correctly >> >> From what version? > > I'm sorry, I don't understand. Version of what? > From what kernel version you attached that dmesg? Confused... (In reply to Bagas Sanjaya from comment #4) > On 12/15/23 22:31, bugzilla-daemon@kernel.org wrote: > > https://bugzilla.kernel.org/show_bug.cgi?id=218265 > > > > --- Comment #3 from Mark Haun (haunma@keteu.org) --- > > (In reply to Bagas Sanjaya from comment #2) > >> (In reply to Mark Haun from comment #1) > >>> Created attachment 305600 [details] > >>> dmesg from suspend which works correctly > >> > >> From what version? > > > > I'm sorry, I don't understand. Version of what? > > From what kernel version you attached that dmesg? > Confused... 6.6.6 like it says in the original report---why? Is there something unexpected in the dmesg? It would be great if there were. I can't see anything wrong, unfortunately :( Can you also check current mainline (v6.7-rc6)? Did this work better with earlier kernel version? Or did you just get the machine? And can you please attach a full dmesg (e.g. with everything since boot in it). I just got the machine (and Phoenix 7840 U/S/HS APUs are fairly new anyway), so can't comment on earlier kernels. I am now running 6.7-rc6 and need to run a few more days to see if the problem will recur. Suspend seems to work fine many times in a row (after manually disabling all wakeups), then the problem pops up again. We shall see. Coming back to the original report, is there any way to log/trace ACPI wakeup events? Some option I could turn on to start logging them in dmesg? It seems weird that the only way to test is "circumstantial" i.e. experimentally testing which actions cause a wake and which don't. This is obviously problematic for any intermittent failure. I have observed that the built-in trackpad and/or keyboard will usually---but not always!---wake the system even when all /proc/acpi/wakeup sources show as disabled. The power button will *always* wake the system when all /proc/acpi/wakeup sources show as disabled... one of which is "sleep button". It would sure be nice to know which interrupt/event actually causes the wakeup, unless this is obfuscated by the BIOS? Investigating some more today while I wait to see if there are still failures. Here is my /proc/acpi/wakeup (after manually disabling everything): Device S-state Status Sysfs node GPP0 S0 *disabled GPP1 S4 *disabled GPP2 S3 *disabled GPP5 S0 *disabled pci:0000:00:02.1 GPP6 S4 *disabled pci:0000:00:02.2 GP11 S4 *disabled SWUS S4 *disabled GP12 S4 *disabled pci:0000:00:04.1 SWUS S4 *disabled XHC0 S3 *disabled pci:0000:64:00.3 XHC1 S4 *disabled pci:0000:64:00.4 XHC2 S4 *disabled pci:0000:66:00.0 NHI0 S3 *disabled XHC3 S3 *disabled pci:0000:66:00.3 NHI1 S4 *disabled pci:0000:66:00.6 XHC4 S3 *disabled pci:0000:66:00.4 LID S4 *disabled platform:PNP0C0D:00 SLPB S3 *disabled platform:PNP0C0E:00 I see that my touchpad is on an I2C bus. Here is the entry in /sys/bus/hid: 0018:04F3:320B.0008 -> ../../../devices/platform/AMDI0010:01/i2c-19/i2c-ELAN0688:00/0018:04F3:320B.0008 I can't tell if the controller is on the PCI bus. Does "AMDI" stand for one of these (lspci output): 65:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 14ec 66:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 14ec If the touchpad I2C is not in /proc/acpi/wakeup, that would explain why I [usually] am able to wake the system by vigorously wiping the touchpad. If there are ever any spurious interrupts, could this also explain unwanted wakes? (In reply to Mark Haun from comment #8) > I just got the machine (and Phoenix 7840 U/S/HS APUs are fairly new anyway), > so can't comment on earlier kernels. Then this is nothing for me. But FWIW, devs sometimes use this script to get down to the root of the problem: https://gitlab.freedesktop.org/drm/amd/-/blob/master/scripts/amd_s2idle.py Created attachment 305650 [details]
amd_s2idle.py log
Created attachment 305651 [details]
Full dmesg
Update: I have not seen a recurrence of the immediate wake issue on 6.7-rc6 as long as all of the /proc/acpi/wakeup sources are set to disabled. The issue does happen most of the time with the boot-up defaults, so I guess it may be worth documenting/logging the results. Maybe someone can comment if that is appropriate here, or in a new bug report, or not considered a bug? At Thorstens suggestion I tried running the amd_s2idle.py script. Somehow it is able to tell me which interrupt/event caused the wake, information not available from dmesg, so that was helpful. In my case, this output ○ Wakeups triggered from IRQs: [7] ○ Woke up from IRQ 7 (IR-IO-APIC 7-fasteoi pinctrl_amd) ○ gpe07 increased from 13618 to 13632 ❌ Userspace suspended for 0:00:03.870541 (< minimum expected 0:00:09) led me to examine /sys/firmware/acpi/interrupts/gpe07: $ cat /sys/firmware/acpi/interrupts/gpe07 12843 EN enabled unmasked I spent some time testing with gpe07 masked/unmasked and disabled/enabled but the results were inconclusive. Sometimes the system remains in suspend and sometimes it doesn't. In *both* cases, the script usually reports that gpe07 was the cause of the wakeup, regardless of the mask or enable status of /sys/firmware/acpi/interrupts/gpe07. It seems like that is just a placebo AFAICT. Hope this helps. In further testing, I've discovered some interactions with userspace. Specifically, I am normally running elogind (on a Void Linux runit-based system), and "sudo zzz" often (but not always) fails to stay in suspend if elogind is running. But elogind isn't the whole story. With elogind disabled, zzz seems quite reliable at a VT after boot. But after starting X (with a window manager, no DE) and firefox, zzz no longer works consistently, even without elogind running. So my theory that elogind was doing something nefarious in the background doesn't quite hold up :( In any case, I think this is still in the kernel-bug category, as suspend should never fail due to unrelated userspace processes (excepting those which may be attempting to manipulate power states). as the demsg shows: --- [ 7.100182] ath11k_pci 0000:02:00.0: fw_version 0x110b196e fw_build_timestamp 2022-12-22 12:54 fw_build_id WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23 --- this is possibly related to: https://bugzilla.kernel.org/show_bug.cgi?id=217239 |
Created attachment 305599 [details] dmesg from suspend which immediately wakes On a new Lenovo Thinkpad P16s with kernel 6.6.6, suspend is immediately followed by wake, about half of the time. (If sleep lasts for three seconds, it will last indefinitely.) There is no apparent difference between lid close, `zzz`, and `loginctl suspend`. I see nothing suspicious in dmesg, and I see no difference in the messages between the fail case (immediate wake-up) and the success case (wake-up later). All ACPI wake sources are already set to "disabled" in /proc/acpi/wakeup. All potential wake options, like wake on LAN, are disabled in the BIOS. Observability is poor because I know of no way to log or trace the actual wake event. E.g. even with "everything" disabled, the laptop's touchpad or keyboard still wake the system, but I only know that by human observation, not proper logging. Is there some way to make the kernel log the wake source?