Bug 206413
Description
Tsuchiya Yuto (kitakar5525)
2020-02-04 18:01:06 UTC
Created attachment 287115 [details]
acpidump from my SB1
Created attachment 287117 [details]
s2idle: unload battery module before suspend
Here is a log where I unloaded battery module before going into s2idle.
No automatic wakeup issue.
Created attachment 287119 [details]
the whole dmesg log after a fresh boot (5.5.1)
Here is the whole dmesg log after fresh boot (does not contain suspend/resume
sequence), for what it's worth.
Yes, it looks like the system is woken up by battery events while suspended, but this doesn't seem to be related to the EC GPE. It looks like the battery events are triggered by another GPE which is also a wakeup one. Please run this: $ grep -r '.*' /sys/firmware/acpi/interrupts/ before and after a suspend-to-idle cycle and (if the system wakes up from it immediately due to a battery event) attach the output of it from both runs. Created attachment 287167 [details]
contents of /sys/firmware/acpi/interrupts/ before s2idle
Created attachment 287169 [details]
contents of /sys/firmware/acpi/interrupts/ after s2idle
Data was collected after s2idle when an automatic wakeup occurred.
Thank you for looking into this issue. And here is the diff of the contents of /sys/firmware/acpi/interrupts/ before and after a suspend-to-idle cycle --- acpi_interrupts_before_s2idle.txt 2020-02-06 20:40:12.380455383 +0900 +++ acpi_interrupts_after_s2idle.txt 2020-02-06 20:40:49.436671990 +0900 @@ -33,7 +33,7 @@ /sys/firmware/acpi/interrupts/gpe76: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe38: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe0B: 0 invalid unmasked -/sys/firmware/acpi/interrupts/gpe66: 10 EN enabled unmasked +/sys/firmware/acpi/interrupts/gpe66: 18 EN enabled unmasked /sys/firmware/acpi/interrupts/gpe28: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe56: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe18: 0 invalid unmasked @@ -46,7 +46,7 @@ /sys/firmware/acpi/interrupts/error: 0 /sys/firmware/acpi/interrupts/gpe54: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe16: 0 invalid unmasked -/sys/firmware/acpi/interrupts/sci: 1058 +/sys/firmware/acpi/interrupts/sci: 1708 /sys/firmware/acpi/interrupts/gpe44: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe7E: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe72: 0 invalid unmasked @@ -70,7 +70,7 @@ /sys/firmware/acpi/interrupts/gpe2E: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe22: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe5C: 0 invalid unmasked -/sys/firmware/acpi/interrupts/gpe50: 1048 EN enabled unmasked +/sys/firmware/acpi/interrupts/gpe50: 1692 EN enabled unmasked /sys/firmware/acpi/interrupts/gpe1E: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe79: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe12: 0 invalid unmasked @@ -97,7 +97,7 @@ /sys/firmware/acpi/interrupts/gpe67: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe00: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe3A: 0 invalid unmasked -/sys/firmware/acpi/interrupts/gpe_all: 1058 +/sys/firmware/acpi/interrupts/gpe_all: 1710 /sys/firmware/acpi/interrupts/gpe29: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe57: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe2A: 0 invalid unmasked @@ -132,6 +132,6 @@ /sys/firmware/acpi/interrupts/gpe05: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe3F: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe33: 0 invalid unmasked -/sys/firmware/acpi/interrupts/gpe6D: 0 EN enabled unmasked +/sys/firmware/acpi/interrupts/gpe6D: 0 disabled unmasked /sys/firmware/acpi/interrupts/ff_slp_btn: 0 invalid unmasked /sys/firmware/acpi/interrupts/gpe61: 0 EN enabled unmasked Thanks! Please also do (as root) # echo 1 > /sys/power/pm_debug_messages without enabling the ACPI debug and attach the output of dmesg run after a suspend-to-idle cycle in which the immediate wakeup has occurred. Created attachment 287179 [details]
dmesg: s2idle cycle with pm_debug_messages
The PC does not necessarily wake up immediately. Sometimes it wakes up immediately,
sometimes after about 30 seconds and sometimes wakes up after minutes.
The sure way of waking up the PC automatically is plugging/unplugging AC adapter.
So, this time, I plugged the AC adapter into PC to cause the automatic wake up.
According to the information collected so far, the premature wakeup is caused by the EC. Now, this means that setting the ec_no_wakeup parameter of the acpi module should make the problem go away, so please try to do that. The ec_no_wakeup module parameter is writable, so you should be able to set it at run time (as root) by echoing 1 to /sys/modules/acpi/parameters/ec_no_wakeup. Created attachment 287201 [details]
Suspend-to-idle wakeup debug patch
Nevertheless, let's try to find out why calling pm_system_cancel_wakeup() from acpi_s2idle_wake() is not sufficient.
Please apply this patch, retest and attach the output of dmesg including at least one suspend-to-idle and resume cycle (without ACPI debug and with pm_debug_messages set to 1).
(In reply to Rafael J. Wysocki from comment #10) > According to the information collected so far, the premature wakeup is > caused by the EC. Now, this means that setting the ec_no_wakeup parameter > of the acpi module should make the problem go away, so please try to do that. > > The ec_no_wakeup module parameter is writable, so you should be able to set > it at run time (as root) by echoing 1 to > /sys/modules/acpi/parameters/ec_no_wakeup. That should be: /sys/module/acpi/parameters/ec_no_wakeup Sorry for the typo. Created attachment 287203 [details]
dmesg: s2idle-wake-debug
I guess you intended to do the following thing instead
(I just removed the second hunk):
Index: linux-pm/drivers/base/power/wakeup.c
===================================================================
--- linux-pm.orig/drivers/base/power/wakeup.c
+++ linux-pm/drivers/base/power/wakeup.c
@@ -924,11 +924,18 @@ bool pm_wakeup_pending(void)
return ret || atomic_read(&pm_abort_suspend) > 0;
}
-void pm_system_wakeup(void)
+void __pm_system_wakeup(void)
{
atomic_inc(&pm_abort_suspend);
s2idle_wake();
}
+
+void pm_system_wakeup(void)
+{
+ __pm_system_wakeup();
+
+ dump_stack();
+}
EXPORT_SYMBOL_GPL(pm_system_wakeup);
void pm_system_cancel_wakeup(void)
---
With the above patch, I did s2idle two times.
- The first time, I observed the automatic wakeup.
- The socond time, with ec_no_wakeup, and no automatic wakeup can be seen
(when I plug/unplug the AC adapter). I woke up the PC by lid (Power
buttons also uses EC, I can't wake it up by the button anymore).
(In reply to Tsuchiya Yuto (kitakar5525) from comment #13) > Created attachment 287203 [details] > dmesg: s2idle-wake-debug > > I guess you intended to do the following thing instead > (I just removed the second hunk): I did exactly what I wanted, but never mind, your modification probably saved us a round of back-and-forth comments. :-) Created attachment 287223 [details]
PM: s2idle: ACPI: Rework the handling of EC wakeup events
So the theory goes that the EC GPE may re-trigger immediately after acpi_s2idle_sync() and when the SCI is re-armed after that, it will re-trigger too right away and that may happen before the pm_wakeup_pending() check in s2idle_loop() and the wakeup may be counted as a genuine one.
So I have this patch (which I only have tested very lightly), so please try it (with ec_no_wakeup unset) and report back.
(In reply to Rafael J. Wysocki from comment #14) > (In reply to Tsuchiya Yuto (kitakar5525) from comment #13) > > Created attachment 287203 [details] > > dmesg: s2idle-wake-debug > > > > I guess you intended to do the following thing instead > > (I just removed the second hunk): > > I did exactly what I wanted, but never mind, your modification probably > saved us a round of back-and-forth comments. :-) (Ah, I got it. The non-underscored function "pm_system_wakeup()" will be called from another location, too and you expected it. I missed the fact. But why I removed the second hunk is that, When I tried the debug patch from Comment 11 without modification, the "dump_stack()" was not called when I observed the automatic wakeup.) (I'll try your patch from Comment 15 later!) Created attachment 287229 [details] dmesg: acpi-s2idle-rework-wake dmesg log with patch "PM: s2idle: ACPI: Rework the handling of EC wakeup events" from Comment 15 (with s2idle-wake-debug patch from Comment 13). Hmm. It still automatically wakes up with the same dmesg output as before. Created attachment 287267 [details] PM: s2idle: ACPI: Rework the handling of EC wakeup events This is a replacement for the patch from comment #15 that I'm going to apply. The race addressed by it is real, so it needs fixing regardless of how unlikely it is to occur in practice. Created attachment 287269 [details] ACPI: PM: Check status of all GPEs in acpi_s2idle_wake() And here's another patch to try on top of the one from comment #18. It is possible that the platform triggers a spurious SCI without any GPEs set, in which case it will be treated as a genuine wakeup, so avoid that by checking if any GPEs are set at all before returning 'true' from acpi_s2idle_wake(). Created attachment 287273 [details] dmesg: s2idle with patches from c18 and c19 (plus debug patch from c13) I applied patches from Comment 18 and Comment 19 (plus debug patch from Comment 13) on Linux 5.5.2 and 5.6-rc1 and I can confirm that no automatic wakeup happened at least when I plugged/unplugged AC adapter several times. I attached dmesg log on 5.6-rc1 for reference. I'll put SB1 into s2idle overnight to see if the automatic wakeup occurs and report back to you. Thanks! Please also verify that the lid wakeup still works with these patches applied. (In reply to Rafael J. Wysocki from comment #21) > Thanks! Please also verify that the lid wakeup still works with these > patches applied. Yes, it still works (with the non-upstream lid driver that I mentioned in dmesg log attached in Comment 13). Hmm... Actually, I observed automatic wakeup two times with patches from Comment 18, Comment 19 and non-upstream patches (including the lid driver). I put SB1 into s2idle overnight with only the patches from c18 and c19 (plus debug patch from c13), but the automatic wakeup did not occur. So, it might be possible that enabling wakeup from lid might be related. I couldn't get useful dmesg log at the time when the automatic wakeup occurred. I'll try to collect useful dmesg log, but it may take some time because I can't find the way to reproduce the issue yet. So I've posted these two patches to replace the patch from comment 19: https://patchwork.kernel.org/patch/11375931/ https://patchwork.kernel.org/patch/11375931/ that (together) should be functionally equivalent to it, and I'm going to push them for merging, because the "spurious SCI wakeup" problem needs to be addressed anyway. Now, if there are no premature wakeup issues on your system any more with these patches applied on top of the upstream code, I will be inclined to mark this bug as "resolved". Sorry, pasted the same URL twice in the previous comment. The other patch is at: https://patchwork.kernel.org/patch/11375923/ No premature wakeup was observed with this two-day use (5.6-rc1 + your patches + some non-upstream drivers including the lid driver). So, I'm still not sure why I encountered the wakeup two times (Comment 23). I'll report back here if I see the wakeup again and also if the wakeup seems to be still related to this bug that I originally filed. Anyway, thank you for your work! Hi Rafael, while stress testing on 5.6.0-rc2 I've discovered that the kernel hangs on freeze (it refuses to wake up from rtc, wake on wlan, keypress, or power press), and upon git bisect the culprit appears to be this patch: commit fdde0ff8590b4c1c41b3227f5ac4265fccccb96b Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Date: Tue Feb 11 17:53:52 2020 +0100 ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system If the platform triggers a spurious SCI even though the status bit is not set for any GPE when the system is suspended to idle, it will be treated as a genuine wakeup, so avoid that by checking if any GPEs are active at all before returning 'true' from acpi_s2idle_wake(). Link: https://bugzilla.kernel.org/show_bug.cgi?id=206413 Fixes: 56b991849009 ("PM: sleep: Simplify suspend-to-idle control flow") Reported-by: Tsuchiya Yuto <kitakar@gmail.com> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> If I reverse the patch and rebuild 5.6-rc2 without it freeze starts working again, so this patch clearly has an issue. follow-up: It's difficult to say for sure that my original issue (premature wakeups) has been completely resolved or not because I don't have any way to confirm this issue other than suspend computer longer and longer. However, at least I can say that I haven't observed premature wakeups even once with overnight s2idle suspends on recent kernel versions (with that aforementioned Lid driver). Previously, even with commit fdde0ff8590b4 ("ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system"), I sometimes (but rare case) observed wakeups (Comment 23). After the commit, several commits that have reference to the commit have been made: $ git log --pretty=format:"%h - %s" --grep "fdde0ff8" 7b301750f7f8f - ACPI: EC: PM: Avoid premature returns from acpi_s2idle_wake() 767191db8220d - platform/x86: intel_int0002_vgpio: Use acpi_register_wakeup_handler() ddfd9dcf270ce - ACPI: PM: Add acpi_[un]register_wakeup_handler() d5406284ff803 - ACPI: PM: s2idle: Refine active GPEs check 63fb9623427fb - ACPI: PM: s2idle: Check fixed wakeup events in acpi_s2idle_wake() I guess those commits additionally improved the situation for Surface Book 1, too. (In reply to Todd Brandt from comment #27) > Hi Rafael, while stress testing on 5.6.0-rc2 I've discovered that the kernel > hangs on freeze (it refuses to wake up from rtc, wake on wlan, keypress, or > power press), and upon git bisect the culprit appears to be this patch: > > commit fdde0ff8590b4c1c41b3227f5ac4265fccccb96b > Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > Date: Tue Feb 11 17:53:52 2020 +0100 > > ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system > [...] > > If I reverse the patch and rebuild 5.6-rc2 without it freeze starts working > again, so this patch clearly has an issue. What about the stress test with the recent kernel version? I think the following commit has fixed the issue found by stress test. commit 63fb9623427fbb44e3782233b6e4714057b76ff2 Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com> AuthorDate: Fri Feb 21 01:46:18 2020 +0100 Commit: Linus Torvalds <torvalds@linux-foundation.org> CommitDate: Fri Feb 21 10:01:25 2020 -0800 ACPI: PM: s2idle: Check fixed wakeup events in acpi_s2idle_wake() Commit fdde0ff8590b ("ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system") overlooked the fact that fixed events can wake up the system too and broke RTC wakeup from suspend-to-idle as a result. Fix this issue by checking the fixed events in acpi_s2idle_wake() in addition to checking wakeup GPEs and break out of the suspend-to-idle loop if the status bits of any enabled fixed events are set then. Fixes: fdde0ff8590b ("ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system") Reported-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug closed. |