Bug 198389 - s2idle: Surface Pro 3 suspend/resume issues
Summary: s2idle: Surface Pro 3 suspend/resume issues
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend (show other bugs)
Hardware: Intel Linux
: P1 normal
Assignee: Rafael J. Wysocki
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-08 11:21 UTC by Valentin Manea
Modified: 2018-01-11 00:18 UTC (History)
2 users (show)

See Also:
Kernel Version: 4.13
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg (55.13 KB, text/plain)
2018-01-08 19:07 UTC, Valentin Manea
Details
dmidecode (7.64 KB, text/plain)
2018-01-08 19:07 UTC, Valentin Manea
Details
acpidump (352.44 KB, text/plain)
2018-01-08 19:07 UTC, Valentin Manea
Details
Patch to test (648 bytes, patch)
2018-01-09 00:14 UTC, Rafael J. Wysocki
Details | Diff
Patch to test v2 (919 bytes, patch)
2018-01-09 11:37 UTC, Rafael J. Wysocki
Details | Diff
Debug patch (1.22 KB, patch)
2018-01-09 22:33 UTC, Rafael J. Wysocki
Details | Diff
dmesg with debug enabled (58.79 KB, text/plain)
2018-01-09 23:14 UTC, Valentin Manea
Details
Debug patch v2 (2.32 KB, patch)
2018-01-09 23:14 UTC, Rafael J. Wysocki
Details | Diff
logs with new patch (60.70 KB, text/plain)
2018-01-09 23:24 UTC, Valentin Manea
Details

Description Valentin Manea 2018-01-08 11:21:41 UTC
Somewhere between 4.12 and 4.13 the s2idle doesn't work anymore on the
Surface Pro 3.
  After some bisecting I found 3 patches that seem to be at fault:
ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems ->
0f1a83833a0ba93d6986d9e30c8fe35d7d45c17c
ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle ->
33e4f80ee69b5168badf37edbfed796eb48434b9
ACPI / PM: Clean up device wakeup enable/disable code ->
235d81a630ca2d39818da96f0c14bc960ffbaeb5

  It doesn't seem to be just one patch, I've tried combinations of the
3, but only with all three reverted s2idle suspend and resume starts
working again.
Comment 1 Valentin Manea 2018-01-08 11:22:12 UTC
I tried both the DE(KDE) suspend functionality but also:
echo memory  > /sys/power/state
and the behaviour is the same: display goes dark, system is not responding - the power button which previously wakes up the system doesn't work anymore. It needs a hard reset from this point

The home button has a slight vibration when device is fully running, in this state the vibration doesn't happen at all so I suppose it went through some sleep stages.

For reference I tried with iommu=off booting because of another bug that made the rounds but that never helped. The problem seems to be in the 3 patches I mentioned in my top email
Comment 2 Valentin Manea 2018-01-08 11:22:48 UTC
When I try pm_test(platform,devices,freezer) with almost any value the system goes to sleep/resumes fine.
Comment 3 Rafael J. Wysocki 2018-01-08 12:54:51 UTC
Please attach:
1. dmesg output after a fresh boot (please use 4.15-rc7 for that if possible)
2. The output of dmidecode
3. The output of acpidump
from the affected machine.

Please test if rtcwake wakes up the system from sleep both in the working and non-working cases.
Comment 4 Rafael J. Wysocki 2018-01-08 12:59:25 UTC
For the collection of dmesg output in 1. above, please boot the kernel with

dyndbg="file sleep.c +p"

(as per file:///scratch/rafael/work/build/hornet/Documentation/output/admin-guide/dynamic-debug-howto.html).
Comment 5 Rafael J. Wysocki 2018-01-08 13:01:13 UTC
Sorry, passed a local link above, should be

https://www.kernel.org/doc/html/latest/admin-guide/dynamic-debug-howto.html
Comment 6 Valentin Manea 2018-01-08 19:07:01 UTC
Created attachment 273477 [details]
dmesg

dmesg with dyndebug for sleep.c enabled(incl rtc wake)
Comment 7 Valentin Manea 2018-01-08 19:07:22 UTC
Created attachment 273479 [details]
dmidecode
Comment 8 Valentin Manea 2018-01-08 19:07:46 UTC
Created attachment 273481 [details]
acpidump
Comment 9 Valentin Manea 2018-01-08 19:09:12 UTC
acpidump, dmidecode and dmesg for kernel(including rtcwake sleep/resume) included.

Everything is running kernel 4.15-rc7.

rtcwak -m freeze -s 20 works perfectly fine
Comment 10 Valentin Manea 2018-01-08 19:10:57 UTC
Sorry dmesg doesn't include the sleep sequence but I can't see anything new there.
Comment 11 Rafael J. Wysocki 2018-01-09 00:14:35 UTC
Created attachment 273487 [details]
Patch to test

OK, let's try a long shot.

Please check if this patch makes the power button wakeup from s2idle work.
Comment 12 Valentin Manea 2018-01-09 07:21:53 UTC
Same thing, system doesn't wake up after applying the patch
Comment 13 Rafael J. Wysocki 2018-01-09 11:37:35 UTC
Created attachment 273495 [details]
Patch to test v2

Obviously, it won't work without initializing wakeup.

Please try this one.
Comment 14 Valentin Manea 2018-01-09 17:44:51 UTC
Doesn't seem to wakeup the device either.
Comment 15 Valentin Manea 2018-01-09 19:31:43 UTC
I should note that the lid event doesn't seem to be working either anymore. Before the patches I mentioned in the report the lid would be able to wake up the device as well but not anymore.
Comment 16 Rafael J. Wysocki 2018-01-09 22:21:46 UTC
The lid thing may not be directly related to the power button, though.

What does happen when you press the power button in the working state (ie. while the machine is not suspended)?  It should generate input events, so does it do that?
Comment 17 Rafael J. Wysocki 2018-01-09 22:33:23 UTC
Created attachment 273503 [details]
Debug patch

OK, so we need to dig deeper.

Please apply this patch (it will print all Surface Pro 3 button events to the kernel log in addition to what the previous patch did).

Then, boot the kernel, enable dynamic debug in sleep.c and device_pm.c (from the kernel command line or via debugfs) and run

echo 1 > /sys/power/pm_debug_messages

Next, press the power button for a couple of times, trigger a system suspend with rtcwake (use a reasonably long sleep time), press the power button for a couple of times while suspended and when it resumes, collect the dmesg output and attach it here.
Comment 18 Valentin Manea 2018-01-09 23:14:56 UTC
Created attachment 273505 [details]
dmesg with debug enabled

Outside suspend the Power button works fine. From the logs I gather the events are queued and delivered after the rtc alarm.

I used 30s for rtcwake
Comment 19 Rafael J. Wysocki 2018-01-09 23:14:59 UTC
Created attachment 273507 [details]
Debug patch v2

Actually, please apply this patch instead of the one from the previous comment and do what I asked for in that comment.

I sort of already know what was going to happen with the previous patch.
Comment 20 Rafael J. Wysocki 2018-01-09 23:19:19 UTC
(In reply to Valentin Manea from comment #18)
> Created attachment 273505 [details]
> dmesg with debug enabled

Thanks!

> Outside suspend the Power button works fine.

OK

> From the logs I gather the events are queued and delivered after the rtc
> alarm.

Yes, they are, but that's too late. :-)

Please run the same test with the patch from comment #19 applied.
Comment 21 Valentin Manea 2018-01-09 23:24:34 UTC
Created attachment 273509 [details]
logs with new patch

Now it works correctly, resumes on first power button press.
Comment 22 Rafael J. Wysocki 2018-01-09 23:30:57 UTC
OK

What about the lid?
Comment 23 Valentin Manea 2018-01-09 23:38:41 UTC
The lid seems to be working as well.
Comment 24 Rafael J. Wysocki 2018-01-09 23:51:34 UTC
OK

Two patches seem to be needed, one for sleep.c and one for the Surface3 Pro button driver.

Let me prepare and post them.
Comment 25 Rafael J. Wysocki 2018-01-10 12:40:43 UTC
Patches posted:

https://patchwork.kernel.org/patch/10155053/
https://patchwork.kernel.org/patch/10155047/

Please test them and let me know if they work for you (they are just pieces of the big debug patch you tested before, so they together should work unless I messed up something).
Comment 26 Valentin Manea 2018-01-10 16:01:55 UTC
They seem to work fine, no problems anymore.
Comment 27 Rafael J. Wysocki 2018-01-11 00:18:02 UTC
OK, thanks!

Note You need to log in before you can comment on or make changes to this bug.