Bug 192591

Summary: Suspend to idle & ram issues on Dell XPS 13 9365
Product: Power Management Reporter: davidnmfarrell
Component: Hibernation/Suspend Assignee: Rafael J. Wysocki (rjw)
Status: RESOLVED PATCH_ALREADY_AVAILABLE    
Severity: normal CC: AxelMKlein, bpshacklett, chris.kotfila, Esokrarkose, jnruby, kernel, lenb, mapengyu, mario_limonciello, matt, patrik.kullman, paulepanter, peter.hutterer, rui.zhang, samshepard90, srinivas.pandruvada, yu.c.chen
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 4.11.0-rc3 Tree: Mainline
Regression: No
Attachments: dmesg
lspci
lsmod
boot-config
lspci
dmesg after suspend resume
proc-acpi-wakeup
dmesg-before-suspend
dmesg-after-suspend-resume
mod-params
analyse-suspend-dmesg
dmesg-pre-str
dmesg-post-str
acpi-interrupts-pre-str
acpi-interrupts-post-str
acpidump
dmesg-after-pm-trace
dmesg-after-rtcwake
dmesg rtc wake
dmesg after STR
dmesg.txt
/proc/bus/input/devices
proc-bus-input-devices
lsmod
Successful resume dmesg
ACPI / EC: Add module parameter to disable system wakeup
PM: Avoid printing suspend debug messages by default
dmesg with 4.13-rc1 + no_ec_wakeup
Benchmark of kernel 4.9

Description davidnmfarrell 2017-01-15 01:46:43 UTC
Created attachment 251631 [details]
dmesg

I've installed Fedora 25 on a Dell XPS 13 9365 (2 in 1) laptop. It is dual-booted with Windows, secure boot is enabled, SATA mode is AHCI, and it boots fine. Everything works - sound, trackpad, wifi etc, without issue.

However suspending the machine does not work; the screen goes blank and it appears that the machine is entering sleep mode, but the power button remains lit, and it never seems to completely finish suspending. The behavior is the same with STI and STR, via these commands:

    echo freeze > /sys/power/state
    echo mem > /sys/power/state

Booting the kernel into runlevel 3 makes no difference - the behavior is the same. Disabling async module power down also did not change the behavior; using this command:

    echo 0 > /sys/power/pm_async

The kernel was booted with the following additional options: initcall_debug no_console_suspend ignore_loglevel 3 dyndbg="file suspend.c +p".

No change in behavior was seen. Attached are the outputs of various
Comment 1 davidnmfarrell 2017-01-15 01:47:15 UTC
Created attachment 251641 [details]
lspci
Comment 2 davidnmfarrell 2017-01-15 01:47:35 UTC
Created attachment 251651 [details]
lsmod
Comment 3 davidnmfarrell 2017-01-15 01:48:20 UTC
Created attachment 251661 [details]
boot-config
Comment 4 davidnmfarrell 2017-01-15 01:48:38 UTC
Created attachment 251671 [details]
lspci
Comment 5 davidnmfarrell 2017-01-15 01:52:14 UTC
Exact same behavior is seen under kernel 4.8.6-300.fc25.x86_64.

I should add: I can press the power button for a few seconds, and the system "resumes" (it doesn't appear to ever finish suspending). But closing/opening the lid, pressing keys or moving the trackpad does not resume. Only pressing the power button triggers the "resume".
Comment 6 Chen Yu 2017-01-16 02:05:35 UTC
Could you please check whether it still exists on the latest upstream kernel? We mainly track the upstream version rather than distribution versions.
If it is still there, you can use /sys/power/pm_test to figure out at which phase it fails. Thanks.
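
A minimal sketch of the pm_test approach, assuming a kernel built with CONFIG_PM_DEBUG and a root shell. The `POWER_DIR` variable and the `run_pm_tests` function are illustrative names, not part of any existing tool. Only the three shallowest documented levels are walked here; the deeper levels (processors, core) are left out.

```shell
# Directory holding the PM control files; overridable so the loop can be
# dry-run against a scratch directory (illustrative variable name).
POWER_DIR="${POWER_DIR:-/sys/power}"

# Step through pm_test levels from the shallowest (freezer) downwards.
# The first level that hangs or fails points at the phase to investigate.
# $1 is the sleep state to request (default: mem).
run_pm_tests() {
    state="${1:-mem}"
    for level in freezer devices platform; do
        echo "pm_test level: $level"
        echo "$level" > "$POWER_DIR/pm_test" || return 1
        # In test mode the kernel waits a few seconds, then resumes by itself.
        echo "$state" > "$POWER_DIR/state" || return 1
    done
    # Restore normal suspend behaviour.
    echo none > "$POWER_DIR/pm_test"
}

# Intended use (as root):
#   run_pm_tests freeze
#   run_pm_tests mem
```

Checking dmesg after each level shows whether that phase completed and resumed cleanly.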
Comment 7 Zhang Rui 2017-01-16 02:06:00 UTC
(In reply to davidnmfarrell from comment #0)
> However suspending the machine does not work; the screen goes blank and it
> appears that the machine is entering sleep mode, but the power button
> remains lit,

blink or lit?


(In reply to davidnmfarrell from comment #5)
> Exact same behavior is seen under kernel 4.8.6-300.fc25.x86_64.
> 
> I should add: I can press the power button for a few seconds, and the system
> "resumes" (it doesn't appear to ever finish suspending).

please attach the dmesg after the system "resumes"
we can check kernel log to see if it is a real suspend.

BTW, you said press the power button for a few seconds? what if you press the button and release it immediately?

> But closing/opening
> the lid, pressing keys or moving the trackpad does not resume. Only pressing
> the power button triggers the "resume".

For STI only or for both STR and STI?
Comment 8 davidnmfarrell 2017-01-16 02:38:58 UTC
(In reply to Zhang Rui from comment #7)

> blink or lit?

lit, constant light (no blinking)

> BTW, you said press the power button for a few seconds? what if you press
> the button and release it immediately?

Nothing happens

> For STI only or for both STR and STI?

For both STI and STR

I also see the same issue under newer kernels:

4.9.3-200.fc25.x86_64
4.10.0-0.rc3.fc26.x86_64
Comment 9 davidnmfarrell 2017-01-16 02:56:26 UTC
Created attachment 251761 [details]
dmesg after suspend resume
Comment 10 Zhang Rui 2017-01-16 02:56:56 UTC
Please attach the output of "cat /proc/acpi/wakeup" as well as the dmesg output after a long power button press.
Comment 11 davidnmfarrell 2017-01-16 02:58:19 UTC
(In reply to Zhang Rui from comment #7)

> please attach the dmesg after the system "resumes"
> we can check kernel log to see if it is a real suspend.

Sure thing, I've uploaded the file "dmesg after suspend resume"
Comment 12 Zhang Rui 2017-01-16 03:02:29 UTC
(In reply to davidnmfarrell from comment #9)
> Created attachment 251761 [details]
> dmesg after suspend resume

I don't think this is the dmesg output after suspend/resume, as I can see nothing indicating that a suspend was started.
Please double-check.
You can also save the dmesg output before echo mem/freeze > /sys/power/state, then attach the dmesg output after the system is back and tell me if there is something new.
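
One way to do the before/after comparison is sketched below. The function name `new_dmesg_lines` and the file names are made up for illustration; the approach assumes the earlier messages are still in the kernel ring buffer when the second capture is taken.

```shell
# Print the lines of the second log that do not appear verbatim in the first.
# Kernel timestamps make dmesg lines effectively unique, so a whole-line
# fixed-string match is enough for this purpose.
new_dmesg_lines() {
    grep -Fxv -f "$1" "$2"
}

# Intended use (as root):
#   dmesg > before.txt
#   echo mem > /sys/power/state
#   dmesg > after.txt
#   new_dmesg_lines before.txt after.txt
```

If nothing is printed, the kernel logged no new messages between the two captures, which suggests the suspend never started.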
Comment 13 Zhang Rui 2017-01-16 03:05:25 UTC
(In reply to davidnmfarrell from comment #9)
> Created attachment 251761 [details]
> dmesg after suspend resume

For me, this looks like a dmesg output after a fresh boot.

(In reply to davidnmfarrell from comment #5)
> Exact same behavior is seen under kernel 4.8.6-300.fc25.x86_64.
> 
> I should add: I can press the power button for a few seconds, and the system
> "resumes" (it doesn't appear to ever finish suspending).

when you say "resumes", does the system restore back to the same state before "suspend", or it just boots?
Comment 14 davidnmfarrell 2017-01-16 03:12:37 UTC
Created attachment 251771 [details]
proc-acpi-wakeup
Comment 15 davidnmfarrell 2017-01-16 03:13:28 UTC
Created attachment 251781 [details]
dmesg-before-suspend
Comment 16 davidnmfarrell 2017-01-16 03:13:58 UTC
Created attachment 251791 [details]
dmesg-after-suspend-resume
Comment 17 davidnmfarrell 2017-01-16 03:16:19 UTC
(In reply to Zhang Rui from comment #12)
> I don't think this is the dmesg output after suspend/resume as I can see
> nothing indicating a suspend is started.
> please make a double check.

No problem  - I uploaded before & after dmesg files
Comment 18 davidnmfarrell 2017-01-16 03:17:51 UTC
(In reply to Zhang Rui from comment #13)

> when you say "resumes", does the system restore back to the same state
> before "suspend", or it just boots?

It restores to the same state as before suspending.
Comment 19 davidnmfarrell 2017-01-16 03:48:49 UTC
Created attachment 251811 [details]
mod-params

All loaded modules and their parameters
Comment 20 davidnmfarrell 2017-01-16 03:50:19 UTC
I tried removing modules and then suspending, but it made no difference:

modprobe -r iwlmvm cfg80211 mac80211 iwlwifi                                                                                                                                  
modprobe -r wacom                                                                                                                                                             
modprobe -r acpi_als intel_lpss_acpi int3400_thermal acpi_pad acpi_thermal_rel
Comment 21 davidnmfarrell 2017-01-16 04:18:14 UTC
I tried:

analyze_suspend.py -rtcwake 30 -f -m mem

As detailed here:
https://01.org/blogs/rzhang/2015/best-practice-debug-linux-suspend/hibernate-issues

Unusually it looked to me like the suspend worked - the screen went blank, the power button turned off, etc. I'll upload the output files.
Comment 22 davidnmfarrell 2017-01-16 04:19:28 UTC
Created attachment 251821 [details]
analyse-suspend-dmesg
Comment 23 davidnmfarrell 2017-01-16 05:55:25 UTC
Ok, I had a chance to compare the behavior of my machine (9365) against another XPS 13 (9360) that works with STI/STR. tldr; suspend seems to work, it looks like resume works, but the screen does not turn back on.

STI
9360: Screen turns off, power button light remains on. Pressing the power button immediately resumes to original state.

9365: Screen turns off, power button light remains on. Pressing power button seems to have no effect. Pressing and holding power button for ~8 seconds the screen comes back on, original state is resumed (run level 3 it stays on, run level 5 screen immediately goes off again).

STR
9360: Screen turns off, power button light goes off. Pressing the power button immediately resumes: power button light comes on, screen comes back. Original state is resumed.

9365: Screen turns off, power button light goes off. Pressing power button, the power button light comes back on, but the screen remains off. Pressing and holding power button for ~8 seconds the screen comes back on (run level 3 it stays on, run level 5 it immediately goes off again).

I'm attaching dmesg logs for pre and post STR.
Comment 24 davidnmfarrell 2017-01-16 05:56:22 UTC
Created attachment 251851 [details]
dmesg-pre-str
Comment 25 davidnmfarrell 2017-01-16 05:56:46 UTC
Created attachment 251861 [details]
dmesg-post-str
Comment 26 Zhang Rui 2017-01-16 06:29:14 UTC
At runtime, please attach the output of "grep . /sys/firmware/acpi/interrupts/*" and "cat /proc/interrupts":
1. before doing anything
2. after pressing and releasing the power button
3. after pressing and holding the power button for 8 seconds.
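
The three captures can be compared mechanically. This is a sketch only: `snapshot_acpi_irqs` and `changed_acpi_irqs` are hypothetical helper names, and the parsing assumes each line of the grep output has the form "<path>: <count> <flags...>".

```shell
# Save the current ACPI interrupt counters to the given file.
snapshot_acpi_irqs() {
    grep . /sys/firmware/acpi/interrupts/* > "$1"
}

# Print every counter whose value differs between two snapshots,
# as "<path>: <old> -> <new>".
changed_acpi_irqs() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && $2 != before[$1] { print $1, before[$1], "->", $2 }' "$1" "$2"
}

# Intended use:
#   snapshot_acpi_irqs idle.txt
#   (press and release the power button)
#   snapshot_acpi_irqs short-press.txt
#   changed_acpi_irqs idle.txt short-press.txt
```

A counter that increments only on the long press would show which event source the firmware uses for the power button.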
Comment 27 Zhang Rui 2017-01-16 06:30:50 UTC
To me, this seems like a power button interrupt issue instead of an ACPI problem.
Comment 28 davidnmfarrell 2017-01-16 07:07:52 UTC
Created attachment 251921 [details]
acpi-interrupts-pre-str
Comment 29 davidnmfarrell 2017-01-16 07:08:19 UTC
Created attachment 251931 [details]
acpi-interrupts-post-str
Comment 30 davidnmfarrell 2017-01-16 07:10:27 UTC
I uploaded the pre/post STR output. I wasn't able to run grep after pressing and releasing the power button as the screen and keyboard were disabled.
Comment 31 Zhang Rui 2017-01-17 08:51:27 UTC
I thought I had seen that you uploaded the acpidump somewhere, but I cannot find it.
Please attach the acpidump output to this bug report.
Comment 32 davidnmfarrell 2017-01-17 21:17:49 UTC
Created attachment 252141 [details]
acpidump
Comment 33 davidnmfarrell 2017-01-17 21:34:37 UTC
Created attachment 252151 [details]
dmesg-after-pm-trace
Comment 34 davidnmfarrell 2017-01-17 23:05:46 UTC
Created attachment 252181 [details]
dmesg-after-rtcwake

One clue: rtcwake works fine, both STI and STR. I'm running it like this:

    sudo rtcwake -m mem -s 10
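
To exercise both sleep states in one go, the same check can be wrapped in a small loop. This is a sketch: `rtcwake_cycle` and the `RTCWAKE` override are illustrative names, and `freeze` and `mem` are the rtcwake modes assumed to correspond to STI and STR.

```shell
# Command used to arm the RTC alarm and suspend; overridable for a dry run
# (illustrative variable name).
RTCWAKE="${RTCWAKE:-rtcwake}"

# Suspend once per sleep state and let the RTC alarm wake the machine
# after the given number of seconds (default 10).
rtcwake_cycle() {
    secs="${1:-10}"
    for mode in freeze mem; do
        "$RTCWAKE" -m "$mode" -s "$secs" || return 1
    done
}

# Intended use (as root):
#   rtcwake_cycle 10
```

If both cycles resume cleanly, the suspend and resume paths themselves are working and the remaining suspect is the wakeup source.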
Comment 35 davidnmfarrell 2017-01-18 01:09:41 UTC
Created attachment 252221 [details]
dmesg rtc wake
Comment 36 davidnmfarrell 2017-01-18 01:10:21 UTC
Created attachment 252231 [details]
dmesg after STR
Comment 37 davidnmfarrell 2017-01-18 01:14:26 UTC
I uploaded two more dmesg outputs: I've edited them to make them easier to diff. These were running on upstream at run level 3 with these kernel options:

- initcall_debug
- no_console_suspend
- ignore_loglevel
- dyndbg="file suspend.c +p"

I don't see any major differences between them, but maybe you can spot something awry with the STR one.
Comment 38 Zhang Rui 2017-01-18 11:54:18 UTC
For the long button press issue, I think we have some update in this thread.
http://marc.info/?l=linux-pm&m=148467282503082&w=2
Comment 39 davidnmfarrell 2017-01-19 14:56:08 UTC
Thanks for the link:

I rebuilt upstream, reverted 08b98d329165. STR seemed to work, the power light went off. But pressing the power button again crashed the machine: the main light on the front started flashing orange and white, and I was unable to get the monitor or anything else to work.

As this is an issue in 4.9, I'm not sure this is a 4.10 regression, or maybe it is, but reverting that commit is only part of the solution.
Comment 40 davidnmfarrell 2017-01-19 16:54:32 UTC
Before the failed STR:

~ cat /sys/power/mem_sleep
[s2idle] deep
Comment 41 Zhang Rui 2017-01-20 06:30:09 UTC
(In reply to davidnmfarrell from comment #39)
> Thanks for the link:
> 
> I rebuilt upstream, reverted 08b98d329165. STR seemed to work,

you mean suspend works, but resume crashes, right?
Comment 42 Zhang Rui 2017-01-20 06:30:42 UTC
(In reply to davidnmfarrell from comment #40)
> Before the failed STR:
> 
> ~ cat /sys/power/mem_sleep
> [s2idle] deep

what if you "echo deep > /sys/power/mem_sleep" before suspend?
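
For reference, the bracketed entry in /sys/power/mem_sleep is the mode that "echo mem > /sys/power/state" will use. A small sketch for reading it out follows; `active_sleep_mode` is a hypothetical helper name.

```shell
# Extract the bracketed (currently selected) entry from a mem_sleep string
# such as "[s2idle] deep".
active_sleep_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Intended use (as root):
#   active_sleep_mode "$(cat /sys/power/mem_sleep)"
#   echo deep > /sys/power/mem_sleep
#   echo mem > /sys/power/state
```

Recording the selected mode alongside each dmesg capture makes it clear which kind of suspend a given log belongs to.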
Comment 43 davidnmfarrell 2017-01-21 02:38:50 UTC
(In reply to Zhang Rui from comment #41)
> (In reply to davidnmfarrell from comment #39)
> > Thanks for the link:
> > 
> > I rebuilt upstream, reverted 08b98d329165. STR seemed to work,
> 
> you mean suspend works, but resume crashes, right?

Yes, that's right.

(In reply to Zhang Rui from comment #42)
> (In reply to davidnmfarrell from comment #40)
> > Before the failed STR:
> > 
> > ~ cat /sys/power/mem_sleep
> > [s2idle] deep
> 
> what if you "echo deep > /sys/power/mem_sleep" before suspend?

That's what I did. If you need anything else, just let me know
Comment 44 Nathaniel McCallum 2017-01-26 23:55:42 UTC
I have the same problem.

I was able to get the backlight to come back on when pressing (not holding) the power button by doing: acpi_osi=! acpi_osi="Windows 2015". However, this breaks a few other things.
Comment 45 Nathaniel McCallum 2017-01-27 07:40:26 UTC
I'm running 4.10.0-0.rc5.git0.1.fc26.x86_64.
Comment 46 Srinivas Pandruvada 2017-02-07 01:02:22 UTC
Does this help?
https://patchwork.kernel.org/patch/9538381/

This is probably a Win10 device; the BIOS may process the power button differently, since changing the OSI string helps.
Comment 47 davidnmfarrell 2017-02-07 03:59:16 UTC
I updated to 4.10.0-rc6 and the same behavior is seen as reported previously.
Comment 48 davidnmfarrell 2017-02-07 03:59:49 UTC
(In reply to Srinivas Pandruvada from comment #46)
> Does this help?
> https://patchwork.kernel.org/patch/9538381/
> 
> This is probably Win10 device the processing of power button by BIOS may be
> different as changing OSI string helps.

Thanks for this. I applied the patch, rebuilt 4.10-rc6 but saw no change in behavior.
Comment 49 davidnmfarrell 2017-02-07 04:01:04 UTC
Created attachment 254431 [details]
dmesg.txt

dmesg output on 4.10.0-rc6 following STR and STI
Comment 50 davidnmfarrell 2017-02-07 04:03:26 UTC
These are two other issues I'm seeing on this machine: 
1) reboot fails https://bugzilla.kernel.org/show_bug.cgi?id=192651
2) ACPI Error messages whenever a power cable is plugged in or removed https://bugzilla.kernel.org/show_bug.cgi?id=194031

I don't know if they're related to this issue or not, but thought it might be helpful to point them out, just in case.
Comment 51 davidnmfarrell 2017-03-26 19:53:09 UTC
This issue still occurs on the latest upstream (4.11.0-rc3). Is there any more information I can provide to help diagnose the problem?

The behavior is as-before: suspend works, but the laptop does not resume via physical input (power button, keyboard, opening screen). However rtcwake suspend/resume works fine.
Comment 52 Zhang Rui 2017-03-27 04:09:02 UTC
(In reply to davidnmfarrell from comment #51)
> This issue still occurs on the latest upstream (4.11.0-rc3). Is there any
> more information I can provide to help diagnose the problem?
> 
> The behavior is as-before: suspend works, but the laptop does not resume via
> physical input (power button, keyboard, opening screen). However rtcwake
> suspend/resume works fine.

so, to me, the real problem is that we're lacking wakeup interrupts from the keyboard drivers.

Do you know what button driver you're using, say what driver controls your power button, keyboard, on this laptop?
Comment 53 incarnis 2017-03-28 07:43:35 UTC
Created attachment 255591 [details]
/proc/bus/input/devices

I'm here also with a Dell XPS 13 9365 and am experiencing the same issues. It seems that I'm able to get the laptop to resume after pressing the power button enough times but the trackpad does not come back online. The touch screen functions however. I'm running kernel 4.11.0.0 RC4 on fedora 25 from the vanilla mainline repository. I've heard reports of this in kernel 4.10 as well.
Comment 54 incarnis 2017-03-28 07:46:15 UTC
Zhang Rui, see my reply above with /proc/bus/input/devices and let me know if I can provide any further details.
Comment 55 Zhang Rui 2017-03-28 07:57:01 UTC
first of all, they are actually different problems, because the power button, keyboard, trackpad and touch screen are controlled by different drivers.

For the power button, I see there is ACPI power button, and power button has been verified to work well, right?

For the trackpad, it seems that the driver is not working after resume, this is usually a driver issue, can you please try to unload the trackpad driver before suspend and reload it after resume and see if it still works?
Comment 56 davidnmfarrell 2017-03-28 12:42:39 UTC
Created attachment 255603 [details]
proc-bus-input-devices
Comment 57 davidnmfarrell 2017-03-28 12:45:03 UTC
(In reply to Zhang Rui from comment #52)
> (In reply to davidnmfarrell from comment #51)
> > This issue still occurs on the latest upstream (4.11.0-rc3). Is there any
> > more information I can provide to help diagnose the problem?
> > 
> > The behavior is as-before: suspend works, but the laptop does not resume
> via
> > physical input (power button, keyboard, opening screen). However rtcwake
> > suspend/resume works fine.
> 
> so, to me, the real problem is that we're lacking wakeup interrupts from the
> keyboard drivers.
> 
> Do you know what button driver you're using, say what driver controls your
> power button, keyboard, on this laptop?

Hi Zhang,

I've uploaded the output of cat /proc/bus/input/devices. Mine is very similar to incarnis'. Just let me know if you need something else.

David
Comment 58 davidnmfarrell 2017-03-28 12:47:02 UTC
(In reply to incarnis from comment #53)
> Created attachment 255591 [details]
> /proc/bus/input/devices
> 
> I'm here also with a Dell XPS 13 9365 and am experiencing the same issues.
> It seems that I'm able to get the laptop to resume after pressing the power
> button enough times but the trackpad does not come back online. The touch
> screen functions however. I'm running kernel 4.11.0.0 RC4 on fedora 25 from
> the vanilla mainline repository. I've heard reports of this in kernel 4.10
> as well.

Looks like we have slightly different configurations - does your webcam work? (mine doesn't). Also, does the microphone channel work on the audio jack?(mine doesn't).
Comment 59 incarnis 2017-03-28 16:58:29 UTC
(In reply to davidnmfarrell from comment #58)
> (In reply to incarnis from comment #53)
> > Created attachment 255591 [details]
> > /proc/bus/input/devices
> > 
> > I'm here also with a Dell XPS 13 9365 and am experiencing the same issues.
> > It seems that I'm able to get the laptop to resume after pressing the power
> > button enough times but the trackpad does not come back online. The touch
> > screen functions however. I'm running kernel 4.11.0.0 RC4 on fedora 25 from
> > the vanilla mainline repository. I've heard reports of this in kernel 4.10
> > as well.
> 
> Looks like we have slightly different configurations - does your webcam
> work? (mine doesn't). Also, does the microphone channel work on the audio
> jack?(mine doesn't).

Interesting. My webcam works in Camorama and in Skype but not in Cheese. And the microphone channel doesn't show up in gnome, with or without supporting headphones inserted. Onboard mic does work great though.
Comment 60 incarnis 2017-03-28 17:04:48 UTC
(In reply to Zhang Rui from comment #55)
> first of all, they are actually different problems, because the power
> button, keyboard, trackpad and touch screen are controlled by different
> drivers.
> 
> For the power button, I see there is ACPI power button, and power button has
> been verified to work well, right?
> 
> For the trackpad, it seems that the driver is not working after resume, this
> is usually a driver issue, can you please try to unload the trackpad driver
> before suspend and reload it after resume and see if it still works?

I do have to hold the power button down for at least 3 seconds before the laptop will properly resume and the display comes on. That doesn't happen in windows so something odd is going on there (although it does seem as if the laptop has its own suspend issues in windows as well). 

Any ideas which module would contain the trackpad driver or how to find out? Ran lsmod through every grep search I could think of and don't see anything obvious either. 

Attaching output of lsmod in the following comment...
Comment 61 incarnis 2017-03-28 17:10:42 UTC
Created attachment 255615 [details]
lsmod
Comment 62 Zhang Rui 2017-03-29 02:11:22 UTC
(In reply to incarnis from comment #60)
> (In reply to Zhang Rui from comment #55)
> > first of all, they are actually different problems, because the power
> > button, keyboard, trackpad and touch screen are controlled by different
> > drivers.
> > 
> > For the power button, I see there is ACPI power button, and power button
> has
> > been verified to work well, right?
> > 
> > For the trackpad, it seems that the driver is not working after resume,
> this
> > is usually a driver issue, can you please try to unload the trackpad driver
> > before suspend and reload it after resume and see if it still works?
> 
> I do have to hold the power button down for at least 3 seconds before the
> laptop will properly resume and the display comes on.

so, if you press the power button once and then release it and wait for 10 seconds, nothing happens?
and if you press the power button for 3 seconds and release it, the laptop resume properly?

> That doesn't happen in
> windows so something odd is going on there (although it does seem as the
> laptop has it's own suspend issues in windows as well). 
>

There is an ACPI button device on this laptop, and the intel_vbtn driver is supposed to control the power button in another way. I'm not sure which one is actually working, but let's give this lower priority as long as it still works.
 
> Any ideas which module would contain the trackpad driver or how to find out?
> Ran lsmod through every grep search I could think of and don't see anything
> obvious either. 
> 
TBH, I don't know either.
I just googled and found a tool, xinput, which lists all the X input devices. You can disable one with "xinput disable id" and see whether the trackpad still works or not.
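
A sketch of that xinput check, assuming an X session and that the trackpad's entry in "xinput list" contains the word "touchpad" (the actual device name on this laptop is not confirmed here). `xinput_id_from_line` is a hypothetical helper name.

```shell
# Pull the numeric id out of one line of "xinput list" output,
# i.e. the number following "id=".
xinput_id_from_line() {
    printf '%s\n' "$1" | sed -n 's/.*id=\([0-9][0-9]*\).*/\1/p'
}

# Intended use inside an X session:
#   line="$(xinput list | grep -i touchpad | head -n 1)"
#   id="$(xinput_id_from_line "$line")"
#   xinput disable "$id"   # pointer should stop responding
#   xinput enable "$id"    # and respond again
```

If disabling and re-enabling the device has no effect after resume, the problem is more likely below the X input layer, in the kernel driver.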
Comment 63 incarnis 2017-03-29 06:41:43 UTC
(In reply to Zhang Rui from comment #62)
> (In reply to incarnis from comment #60)
> > (In reply to Zhang Rui from comment #55)
> > > first of all, they are actually different problems, because the power
> > > button, keyboard, trackpad and touch screen are controlled by different
> > > drivers.
> > > 
> > > For the power button, I see there is ACPI power button, and power button
> > has
> > > been verified to work well, right?
> > > 
> > > For the trackpad, it seems that the driver is not working after resume,
> > this
> > > is usually a driver issue, can you please try to unload the trackpad
> driver
> > > before suspend and reload it after resume and see if it still works?
> > 
> > I do have to hold the power button down for at least 3 seconds before the
> > laptop will properly resume and the display comes on.
> 
> so, if you press the power button once and then release it and wait for 10
> seconds, nothing happens?
> and if you press the power button for 3 seconds and release it, the laptop
> resume properly?

Correct, when just pressing the power button normally the system will not fully resume. The keyboard backlights and LEDs come on but there is no display. Then the only way from there is to power off and reboot. So the only way to resume with a functioning display is to hold the power button just long enough to trigger some unknown process but not too long as to trigger a shutdown-- it's a bit of a challenge to accomplish. Plugging into AC power will also trigger a partial resume and again leave you in the same non-functional state.
Comment 64 Zhang Rui 2017-03-29 08:20:09 UTC
(In reply to incarnis from comment #63)
> (In reply to Zhang Rui from comment #62)
> > (In reply to incarnis from comment #60)
> > > (In reply to Zhang Rui from comment #55)
> > > > first of all, they are actually different problems, because the power
> > > > button, keyboard, trackpad and touch screen are controlled by different
> > > > drivers.
> > > > 
> > > > For the power button, I see there is ACPI power button, and power
> button
> > > has
> > > > been verified to work well, right?
> > > > 
> > > > For the trackpad, it seems that the driver is not working after resume,
> > > this
> > > > is usually a driver issue, can you please try to unload the trackpad
> > driver
> > > > before suspend and reload it after resume and see if it still works?
> > > 
> > > I do have to hold the power button down for at least 3 seconds before the
> > > laptop will properly resume and the display comes on.
> > 
> > so, if you press the power button once and then release it and wait for 10
> > seconds, nothing happens?
> > and if you press the power button for 3 seconds and release it, the laptop
> > resume properly?
> 
> Correct, when just pressing the power button normally the system will not
> fully resume. The keyboard backlights and LEDs come on but there is no
> display. Then the only way from there is to power off and reboot.

Can you log in to the system remotely at this time?
What do you get if you boot with the kernel parameters 'nomodeset' and 'no_console_suspend'?
Comment 65 Paul Menzel 2017-04-04 20:00:00 UTC
I experienced the same issue with some Linux 4.10 release candidates on the Dell XPS 13 9360. Could you please make sure that it’s not the same problem as described in the LKML thread *Regression on Dell XPS13*.

[1] https://lkml.org/lkml/2017/1/17/609
Comment 66 Brennan Shacklett 2017-04-23 18:55:42 UTC
I still experience this issue on the latest kernel from git. Using the latest BIOS from Dell, I am unable to get the laptop to resume properly from suspend in any way. There are 3 possible outcomes:

1) After suspending, I hit the power button and release it relatively quickly (less than 3 seconds), in this case either nothing happens, or the keyboard backlight comes on and nothing else happens (even though I have the keyboard backlight turned off in linux).

2) I hold the power button down for more than 6 seconds: sometimes the screen comes on, but as soon as I release the power button the screen turns off again. This can be repeated indefinitely, but if the power button is held for too long the machine simply shuts off.

3) I hit the power button and immediately release: the light on the front of the laptop blinks orange and white, and I have to hold to the power button to force it to turn off.

What makes this especially frustrating is that the power button is very difficult to press and hold down for an extended period of time.

I see this bug is marked NEEDINFO, what can I provide to help get this fixed? Or can someone point me to where I need to look to get this laptop to turn on when I open the lid, or even turn on with a single press of the power button?
Comment 67 Paul Menzel 2017-04-23 20:01:39 UTC
(In reply to Brennan Shacklett from comment #66)

[…]

> I see this bug is marked NEEDINFO, what can I provide to help get this
> fixed? Or can someone point me to where I need to look to get this laptop to
> turn on when I open the lid, or even turn on with a single press of the
> power button?

See comment 65.

> can you login it the system remotely at this time?
> what do you get if you boot with kernel parameter 'nomodeset' and
> 'no_console_suspend'.
Comment 68 Brennan Shacklett 2017-04-23 20:14:10 UTC
Hi Paul,

How can I make sure that it isn't the same problem as described in the LKML thread? I saw that the patch was reverted, was that functionality added back in the latest git?

I will try booting with those options shortly.
Comment 69 Paul Menzel 2017-04-23 20:24:54 UTC
Dear Brennan,


(In reply to Brennan Shacklett from comment #68)

> How can I make sure that it isn't the same problem as described in the LKML
> thread? I saw that the patch was reverted, was that functionality added back
> in the latest git?

Sorry, I meant comment 64.

Regarding comment 65, I guess checking the content of `/sys/power/mem_sleep` should be enough.

> I will try booting with those options shortly.

Great.
Comment 70 Brennan Shacklett 2017-04-23 21:07:52 UTC
Booting with nomodeset and no_console_suspend didn't seem to have a positive impact. Something seems to have changed (possibly due to a reboot into windows?), now I can't get the screen to appear by holding down the power button like I could previously, I seem to always go into the flashing orange and white led mode. I can't ssh in when that is happening. 

The contents of /sys/power/mem_sleep is "s2idle [deep]".
Comment 71 Brennan Shacklett 2017-04-23 21:22:04 UTC
Looking at comment 23, I managed to get the machine to randomly resume from suspend: I suspended to idle, which caused the screen to go dark but the power button to remain on, I then held the power button for 8 seconds, which caused the screen to come on and then turn off immediately when I released the power button. The power button light also turned off at this point. I repeated the hold for 8 seconds, screen comes on, release power button, screen goes off, power light goes off cycle a couple more times, and had given up, but a few seconds later the screen came back on, with everything except the touchpad functional. I'll attach the dmesg from this process, anything else I should get?
Comment 72 Brennan Shacklett 2017-04-23 21:24:16 UTC
Created attachment 255957 [details]
Successful resume dmesg
Comment 73 Paul Menzel 2017-04-23 21:26:24 UTC
Dear Brennan,


(In reply to Brennan Shacklett from comment #70)
> Booting with nomodeset and no_console_suspend didn't seem to have a positive
> impact. Something seems to have changed (possibly due to a reboot into
> windows?), now I can't get the screen to appear by holding down the power
> button like I could previously, I seem to always go into the flashing orange
> and white led mode. I can't ssh in when that is happening. 
> 
> The contents of /sys/power/mem_sleep is "s2idle [deep]".

Thank you for trying the suggestions. I guess Zhang will take over again, just three more comments.

1.  The new portable devices are missing a serial port and also an Ethernet device, so getting messages out via serial console or netconsole doesn't work. An EHCI debug gadget [1] will (probably) also not work as all USB ports are probably xHCI ports. You could look for xHCI debug cables.

2.  Please contact Dell and make them aware of the problem. That’s really important in my opinion.

3.  You could try `systemctl suspend; sleep 10; systemctl poweroff`. Hopefully messages are stored by the systemd journal, and on new start you can retrieve them with `journalctl -a -b -1`.
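
For point 3, the previous boot's kernel messages can be narrowed down after the restart. This is a sketch that assumes the journal is stored persistently; `filter_pm_lines` is a hypothetical helper and its pattern is a starting point rather than an exhaustive filter.

```shell
# Keep only log lines that mention power management, ACPI,
# or the words "suspend" / "resume".
filter_pm_lines() {
    grep -E 'PM:|ACPI|suspend|resume'
}

# Intended use after rebooting from the failed resume:
#   journalctl -k -b -1 --no-pager | filter_pm_lines
```

The last few matching lines before the log ends show how far the suspend got before the machine was powered off.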


[1] https://www.coreboot.org/EHCI_Gadget_Debug
Comment 74 Paul Menzel 2017-04-23 21:32:39 UTC
(In reply to Brennan Shacklett from comment #71)
> Looking at comment 23, I managed to get the machine to randomly resume from
> suspend: I suspended to idle, which caused the screen to go dark but the
> power button to remain on, I then held the power button for 8 seconds, which
> caused the screen to come on and then turn off immediately when I released
> the power button. The power button light also turned off at this point. I
> repeated the hold for 8 seconds, screen comes on, release power button,
> screen goes off, power light goes off cycle a couple more times, and had
> given up, but a few seconds later the screen came back on, with everything
> except the touchpad functional. I'll attach the dmesg from this process,

Nice.

Please report a separate issue for the messages below to the PCI folks.

```
[   98.723960] pcieport 0000:00:1c.4: AER: Corrected error received: id=00e4
[   98.723977] pcieport 0000:00:1c.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e4(Transmitter ID)
[   98.723992] pcieport 0000:00:1c.4:   device [8086:9d14] error status/mask=00001000/00002000
[   98.724001] pcieport 0000:00:1c.4:    [12] Replay Timer Timeout
```

So resume is only successful with the messages below?

```
[  349.905768] ACPI Error: Thread 2338328000 cannot release Mutex [PATM] acquired by thread 2305275648 (20170119/exmutex-416)
[  349.905788] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.ECDV._Q66] (Node ffff88048d0dec08), AE_AML_NOT_OWNER (20170119/psparse-543)
[  353.903645] ACPI Error: Thread 2338328000 cannot release Mutex [PATM] acquired by thread 2305275648 (20170119/exmutex-416)
[  353.903665] ACPI Error: Method parse/execution failed [\_SB.PCI0.LPCB.ECDV._Q66] (Node ffff88048d0dec08), AE_AML_NOT_OWNER (20170119/psparse-543)
```

> anything else I should get?

I didn’t see what operating system you use. Do you use Dell’s Ubuntu installation? Does resume work with that?
Comment 75 Brennan Shacklett 2017-04-23 21:36:05 UTC
Unfortunately there is no version of this laptop with Ubuntu preinstalled (only the 9360 has one), so Dell may not be willing to support it. The ACPI errors actually happen quite often and seem to be related to the USB ports and charging, since this laptop is charged through its USB-C ports. It's worth noting that all my tests were done on battery only. The PCI errors also happen pretty often; I assumed they were not a problem since they all say "Corrected".
Comment 76 Zhang Rui 2017-04-24 01:27:51 UTC
@lv.zheng@intel.com
please check the acpica error messages

For the power button issue during suspend, AFAICS, this problem is under investigation, by both Dell and Intel, and we're expecting a solution some time later.
Comment 77 Brennan Shacklett 2017-04-24 01:31:50 UTC
Does the power button issue also encapsulate the issue of the laptop not turning on when the lid is opened? Or is that a separate issue? If so I would say getting the laptop to turn on when the lid is opened (like in windows), would be higher priority.
Comment 78 Srinivas Pandruvada 2017-04-24 04:05:17 UTC
Dell XPS 13 9365: On Windows, suspend to RAM is not used on this platform. But since the ACPI elements still exist, Linux still allows suspend to RAM. Windows always uses connected standby, which is the equivalent of suspend to idle on Linux.

Patches will be posted soon to make suspend to idle work on this platform, which will make the power button work. If you then change the kernel default from "mem" to "s2idle", it should also work for lid close.
Comment 79 Paul Menzel 2017-04-24 07:23:25 UTC
(In reply to Srinivas Pandruvada from comment #78)
> Dell XPS 13 9365: On Windows platform suspend to RAM is not used. But since
> the ACPI elements still exists, Linux still allows suspend to RAM. Windows
> will always connected standby which is suspend to idle Linux equivalent.

So it’s a firmware issue then?

I would still prefer ACPI S3 over ACPI S0ix, since it saves more power because more devices are put to sleep.

Is there a way to test ACPI S3 in Microsoft Windows? The Dell firmware advertises ACPI S3, so there should be a way. Then you could contact Dell.

> Soon patches will be posted to make suspend to idle work on this platform,
> which will make power button work. Then change the kernel default from "mem"
> to "s2idle", it should also work for lid close.

It would be great if at least these patches got out for people to test. The current situation is quite bad, especially as this problem has been known for several months now, cf. the LKML thread [1].


[1] https://lkml.org/lkml/2017/1/17/609
Comment 80 Srinivas Pandruvada 2017-04-26 19:58:22 UTC
Got confirmation that S3 is not supported on this platform.
Comment 81 Paul Menzel 2017-04-27 05:47:07 UTC
(In reply to Srinivas Pandruvada from comment #80)
> Got confirmation that S3 is not supported on this platform.

Thank you for following up on that. But what do you mean by “platform”? The specific *laptop* in question?
Comment 82 Mario Limonciello 2017-04-27 15:37:23 UTC
Hi Paul,

Srinivas means the Dell XPS 9365 does not support S3.

Microsoft indicates in their documentation for Windows 10 that the FADT bit will take precedence over the S3.
"The system ACPI firmware must not provide an S3 object in the root of the namespace. Windows supports a platform exposing either the S3 object or the ACPI_S0_LOW_POWER_IDLE FADT flag, but not both at the same time. Note The FADT bit takes precedence over an S3 object." [1]

To your question about whether or not S3 can be used in Windows, you would have to reinstall Windows with modern standby blocked to be able to test this.
"You cannot switch between S3 and Modern Standby by changing a setting in the BIOS. Switching the power model is not supported in Windows without a complete OS re-install." [2]

[1] https://msdn.microsoft.com/en-us/windows/hardware/commercialize/design/device-experiences/platform-design-for-modern-standby
[2] https://msdn.microsoft.com/en-us/windows/hardware/commercialize/design/device-experiences/modern-standby
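
To see which way a given machine's firmware leans, the FADT flag mentioned in the quote can be read from the table Linux exposes. This is a sketch under stated assumptions: the table is readable at /sys/firmware/acpi/tables/FACP (normally root only), the Flags field sits at byte offset 112, and "Low Power S0 Idle Capable" is bit 21, as laid out in the ACPI specification's FADT description. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: fadt_low_power_s0_idle is a made-up helper name.
# Assumes the FADT Flags field is the little-endian 32-bit value at byte
# offset 112 and that bit 21 means "Low Power S0 Idle Capable".

fadt_low_power_s0_idle() {
    table="${1:-/sys/firmware/acpi/tables/FACP}"
    # Bit 21 of the Flags field is bit 5 (mask 32) of the byte at offset 114.
    byte=$(od -An -tu1 -j114 -N1 "$table" 2>/dev/null | tr -d ' \n')
    [ -n "$byte" ] && [ $(( byte & 32 )) -ne 0 ]
}

# Usage (as root):
#   fadt_low_power_s0_idle && echo "firmware advertises low-power S0 idle"
```

A firmware that sets this flag and also exposes S3 falls under the "not both at the same time" case from the quoted documentation.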
Comment 83 Mario Limonciello 2017-04-27 15:44:03 UTC
Also, as Srinivas alluded to, there have been patches posted to LKML recently that will allow the XPS 9365 to work properly if /sys/power/mem_sleep is adjusted to "s2idle".

https://patchwork.kernel.org/patch/9702069/
https://patchwork.kernel.org/patch/9702081/
https://patchwork.kernel.org/patch/9702071/
https://patchwork.kernel.org/patch/9702059/
https://patchwork.kernel.org/patch/9702055/

Paul, this patch series will also fix the power button problem that you and I discussed some time back.  I expect with this series applied on top of the latest 4.11rc the power consumption on the XPS 9360 under s2idle should be significantly improved too.  There are a variety of factors that should have improved it.
Comment 84 Patrik Kullman 2017-06-11 20:04:55 UTC
Hi, I'm also very frustrated with the state of suspend on the 9365.
I can (often) resume it by holding the power button long (5s) twice in a row, but it's not always successful, and sometimes the touchpad doesn't work after resume.

The patchset above seems promising, except that patchwork says patches 3-5 are "Deferred", whatever that means practically?

When are 1-2 expected to be included, for 4.12 or later?

So /sys/power/mem_sleep needs to be adjusted from "s2idle [deep]" to "s2idle"?
How is that carried out once the patches get merged?
Comment 85 Mario Limonciello 2017-06-12 17:59:35 UTC
@patrik,

Avoid using S3 at all.  Although S3 is advertised in the ACPI tables, the system should be using S2I and there are bugs with S3 that won't be fixed.

Following the status of these patches is a little difficult because a lot has changed in the last month.

Some of that series was adopted and then reverted due to a regression:
https://github.com/torvalds/linux/commit/eed4d47efe9508b855b09754cf6de4325d8a2f0d
https://github.com/torvalds/linux/commit/f3b7eaae1b35eb8077610eb7c7db042c9b0645e1

It was recently resubmitted as a 6 patch series:
https://patchwork.kernel.org/patch/9773625/
https://patchwork.kernel.org/patch/9773647/
https://patchwork.kernel.org/patch/9773655/
https://patchwork.kernel.org/patch/9773619/
https://patchwork.kernel.org/patch/9773643/
https://patchwork.kernel.org/patch/9773673/

Really the best status to follow is from the s2idle-dell-test branch @ linux-pm https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/

It's been including the latest available from various reworks and regression fixes.

There are some other things that need to be submitted after that 6 patch series makes it:
* A fix for turning off the power LED in S2I
* A way to default to S2I on systems that should support it instead of S3 (via FADT low power idle bit and/or a whitelist)

For now, you can carry something on your kernel command line that will let you set the default to s2idle though.
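
For anyone wondering what that looks like in practice, here is a minimal sketch. It assumes a kernel that exposes /sys/power/mem_sleep (the file quoted earlier in this thread as "s2idle [deep]") and accepts the mem_sleep_default= parameter; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only; current_mem_sleep and select_s2idle are hypothetical helpers.

# Print the active suspend variant, i.e. the entry shown in [brackets].
# For a file containing "s2idle [deep]" this prints: deep
current_mem_sleep() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "${1:-/sys/power/mem_sleep}"
}

# Select s2idle for the current boot only (root is needed for the sysfs file).
select_s2idle() {
    echo s2idle > "${1:-/sys/power/mem_sleep}"
}

# To make it the default on every boot, add this to the kernel command line
# (for example through the bootloader configuration):
#   mem_sleep_default=s2idle
```

After select_s2idle, writing "mem" to /sys/power/state (which is what a lid close or GUI suspend normally ends up doing) should use suspend to idle instead of S3.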
Comment 86 Patrik Kullman 2017-06-13 05:53:43 UTC
Thanks for your quick and informative answer!

I'll keep an eye out and hope that the patches are included in 4.12-rc6!

Finally found the "mem_sleep_default" parameter and it seems to work a lot better, now I can consistently resume with a 6s press to the power button.

I had to turn off "Suspend on power button press" in GNOME Shell though, otherwise it was an almost impossibly brief period during which I had to let the button go to avoid suspending again.

Thanks for the great work!
Comment 87 Zhang Rui 2017-06-17 07:37:42 UTC
Now, let's forget suspend to RAM and stick with suspend to idle.

Please check whether all the problems are gone with Rafael's testing branch:
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git testing
Comment 88 Brennan Shacklett 2017-06-17 20:39:13 UTC
The testing branch still has numerous problems for me:

The laptop does come back on with an extended press to the power button, but sometimes the touchpad doesn't work.

Reopening the laptop or hitting keys on the keyboard doesn't wake up the laptop.

Sometimes the first press of the power button doesn't work and to wake it back up requires holding the button down, releasing and holding it down again. Like Patrik, I had to turn off Suspend on power button press, otherwise the laptop immediately suspends after waking up.
Comment 89 Mario Limonciello 2017-06-19 13:17:22 UTC
@Brennan,
To be clear - are you using s2idle or s3?
Comment 90 Len Brown 2017-06-19 22:42:10 UTC
We need to blacklist S3 on the Dell 9365, due to lack of working BIOS support.
Comment 91 Brennan Shacklett 2017-06-20 01:00:38 UTC
It seems that closing the laptop puts it into S3. I tested s2idle with echo freeze > /sys/power/state, and there is a different set of issues:

- The power light does not turn off
- Keyboard, trackpad still don't wake the laptop back up
- Power button instantly turns the screen back on, but the laptop is totally frozen.
Comment 92 Mario Limonciello 2017-06-20 17:32:59 UTC
@Brennan,

Please adjust the default the kernel uses by kernel command line parameter mem_sleep_default.  This will let the correct action occur on the XPS 9365.

I agree with Len, either S3 should be blacklisted on 9365 or default policy should be quirked to s2idle.

Power light not turning off is an understood problem, and updates should be coming soon regarding that.

That last issue on freezing is news to me.
Comment 93 Matt Beland 2017-07-05 18:37:59 UTC
Adding a comment that with 4.12, mem_sleep_default=s2idle on the boot command line, and the latest linux-firmware package (1.167_all) the behavior of my XPS 13 9365 2 in 1 is close to "as expected" - suspend works normally, through GUI menu, command line, or closing the lid. Power button LED does eventually turn off, but I'm not sure how long it takes - more than one hour, less than six. Does not fully wake up without holding the power button for several (>5) seconds; opening the lid, pressing keys or trackpad turns on the keyboard backlight and sometimes the screen backlight but no interactive activity until the power button is pressed. However, once the power button is pressed for that several seconds, everything functions as expected.
Comment 94 Patrik Kullman 2017-07-05 19:48:41 UTC
Same experience as Matt. I'm also excited about 4.13, since the last(?) fixes needed for a smooth experience (acpi-pm-test branch) seem to have been merged in the 4.13-rc1 tree: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tag/?h=pm-4.13-rc1
Comment 95 Patrik Kullman 2017-07-16 06:00:03 UTC
Using 4.13-rc1 from http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.13-rc1/, suspend and resume work beautifully!

Thanks for the great work!
Comment 96 Patrik Kullman 2017-07-16 19:36:48 UTC
Actually it doesn't seem to suspend properly, I get a lot of these while suspended:

[48773.433130] Suspended for 1.736 seconds
[48773.487290] PM: noirq resume of devices complete after 54.007 msecs
[48773.584996] PM: noirq suspend of devices complete after 92.106 msecs
[48773.640021] PM: noirq resume of devices complete after 55.019 msecs
[48773.693010] PM: noirq suspend of devices complete after 52.909 msecs
[48774.772925] Suspended for 0.739 seconds
[48774.827173] PM: noirq resume of devices complete after 54.107 msecs
[48774.924789] PM: noirq suspend of devices complete after 92.105 msecs
[48774.979256] PM: noirq resume of devices complete after 54.460 msecs
[48775.040788] PM: noirq suspend of devices complete after 61.453 msecs
[48776.118507] Suspended for 1.732 seconds
[48776.173052] PM: noirq resume of devices complete after 54.401 msecs
[48776.274384] PM: noirq suspend of devices complete after 96.105 msecs
[48776.328516] PM: noirq resume of devices complete after 54.124 msecs
[48776.386397] PM: noirq suspend of devices complete after 57.801 msecs
[48777.458479] Suspended for 0.731 seconds
[48777.512360] PM: noirq resume of devices complete after 53.748 msecs
[48777.578332] PM: noirq suspend of devices complete after 60.094 msecs
[48777.639708] PM: noirq resume of devices complete after 61.369 msecs
[48777.698355] PM: noirq suspend of devices complete after 58.569 msecs
[48778.803968] Suspended for 0.760 seconds
[48778.858508] PM: noirq resume of devices complete after 54.374 msecs
[48778.955839] PM: noirq suspend of devices complete after 92.111 msecs
[48779.010944] PM: noirq resume of devices complete after 55.097 msecs
[48779.067844] PM: noirq suspend of devices complete after 56.823 msecs
[48780.143970] Suspended for 1.736 seconds
[48780.198045] PM: noirq resume of devices complete after 53.932 msecs
[48780.299838] PM: noirq suspend of devices complete after 96.105 msecs
[48780.354429] PM: noirq resume of devices complete after 54.583 msecs
[48780.415834] PM: noirq suspend of devices complete after 61.318 msecs
[48781.489487] Suspended for 0.728 seconds
[48781.543593] PM: noirq resume of devices complete after 53.953 msecs
[48781.641374] PM: noirq suspend of devices complete after 92.110 msecs
[48781.702691] PM: noirq resume of devices complete after 61.310 msecs
[48781.761389] PM: noirq suspend of devices complete after 58.618 msecs
[48782.829505] Suspended for 0.727 seconds
[48782.883630] PM: noirq resume of devices complete after 53.983 msecs
[48782.985371] PM: noirq suspend of devices complete after 96.111 msecs
[48783.039782] PM: noirq resume of devices complete after 54.403 msecs
[48783.089373] PM: noirq suspend of devices complete after 49.510 msecs
[48784.175236] Suspended for 1.740 seconds
[48784.230309] PM: noirq resume of devices complete after 54.926 msecs
[48784.323101] PM: noirq suspend of devices complete after 88.100 msecs
[48784.377473] PM: noirq resume of devices complete after 54.365 msecs
[48784.435113] PM: noirq suspend of devices complete after 57.558 msecs
[48785.515050] Suspended for 0.740 seconds
[48785.569140] PM: noirq resume of devices complete after 53.939 msecs
[48785.666922] PM: noirq suspend of devices complete after 92.106 msecs
[48785.721166] PM: noirq resume of devices complete after 54.237 msecs
[48785.774929] PM: noirq suspend of devices complete after 53.676 msecs
[48786.860605] Suspended for 0.739 seconds
[48786.914870] PM: noirq resume of devices complete after 54.121 msecs
[48787.012472] PM: noirq suspend of devices complete after 92.104 msecs
[48787.067280] PM: noirq resume of devices complete after 54.801 msecs
[48787.124480] PM: noirq suspend of devices complete after 57.120 msecs
[48788.200622] Suspended for 1.736 seconds

And on resume I get the following, which is most likely completely unrelated and purely cosmetic:

[49569.968486] intel-vbtn INT33D6:00: unknown event index 0xcd
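
To put a number on the churn in a log like the one above, the "Suspended for N seconds" lines can be counted and averaged. This is only a sketch: it assumes each such line corresponds to one wakeup from suspend to idle, which is an interpretation of the log rather than something documented here, and the function name is made up.

```shell
#!/bin/sh
# Sketch only: summarise_s2idle_wakeups is a hypothetical helper.
# Reads dmesg text on stdin and summarises "Suspended for N seconds" lines.

summarise_s2idle_wakeups() {
    awk '/Suspended for [0-9.]+ seconds/ {
             n++
             total += $(NF - 1)      # the number just before "seconds"
         }
         END {
             if (n) printf "%d wakeups, mean %.3f s suspended\n", n, total / n
             else   print "no wakeups found"
         }'
}

# Usage:
#   dmesg | summarise_s2idle_wakeups
```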
Comment 97 Rafael J. Wysocki 2017-07-17 11:53:33 UTC
(In reply to Patrik Kullman from comment #96)
> Actually it doesn't seem to suspend properly, I get a lot of these while
> suspended:

Well, that's how the platform works, unfortunately.

The EC needs to be enabled to wake up the system for the power button wakeup to work, but if the EC is enabled to wake up the system, it will wake it up every once a while, so either-or.

I would argue that this is an EC firmware bug on the XPS13 9365.

> And on resume I get a, most likely completely unrelated and purely cosmetic:
> 
> [49569.968486] intel-vbtn INT33D6:00: unknown event index 0xcd

I'm quite confident that this is not the event that woke you up, but can you paste more lines from dmesg around this for more context?
Comment 98 Patrik Kullman 2017-07-17 12:26:29 UTC
(In reply to Rafael J. Wysocki from comment #97)
> (In reply to Patrik Kullman from comment #96)
> > Actually it doesn't seem to suspend properly, I get a lot of these while
> > suspended:
> 
> Well, that's how the platform works, unfortunately.
> 
> The EC needs to be enabled to wake up the system for the power button wakeup
> to work, but if the EC is enabled to wake up the system, it will wake it up
> every once a while, so either-or.

"every once a while" being every 0.7 - 1.7 seconds? :)
And I assume there's no difference between the power button and the lid opening? (Looking for workarounds..)

> I would argue that this is an EC firmware bug on the XPS13 9365.

That it happens too frequently or at all?

> > And on resume I get a, most likely completely unrelated and purely
> > cosmetic:
> > 
> > [49569.968486] intel-vbtn INT33D6:00: unknown event index 0xcd
> 
> I'm quite confident that this is not the event that woke you up, but can you
> paste more lines from dmesg around this for more context?

Here's a complete suspend/resume:

[  432.314171] wlp60s0: deauthenticating from 08:60:6e:cb:73:18 by local choice (Reason: 3=DEAUTH_LEAVING)
[  432.332499] wlp60s0: failed to remove key (2, ff:ff:ff:ff:ff:ff) from hardware (-22)
[  432.344754] IPv6: ADDRCONF(NETDEV_UP): wlp60s0: link is not ready
[  432.367450] ACPI Error: Thread 334907008 cannot release Mutex [PATM] acquired by thread 196679552 (20170531/exmutex-416)
[  432.367461] ACPI Error: Method parse/execution failed \_SB.PCI0.LPCB.ECDV._Q66, AE_AML_NOT_OWNER (20170531/psparse-550)
[  433.755730] PM: Syncing filesystems ... done.
[  433.759972] PM: Preparing system for sleep (freeze)
[  433.761479] Freezing user space processes ... (elapsed 0.002 seconds) done.
[  433.764445] OOM killer disabled.
[  433.764446] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  433.766170] PM: Suspending system (freeze)
[  433.766172] Suspending console(s) (use no_console_suspend to debug)
[  434.196204] psmouse serio1: Failed to disable mouse on isa0060/serio1
[  435.796225] PM: suspend of devices complete after 1815.500 msecs
[  435.816250] PM: late suspend of devices complete after 20.016 msecs
[  435.864129] PM: noirq suspend of devices complete after 45.221 msecs
[  435.864133] PM: suspend-to-idle
[  436.223170] Suspended for 0.049 seconds
[  436.276728] PM: noirq resume of devices complete after 53.389 msecs
[  436.375016] PM: noirq suspend of devices complete after 92.109 msecs
[  436.431696] PM: noirq resume of devices complete after 56.675 msecs
[  436.491007] PM: noirq suspend of devices complete after 59.227 msecs
[  437.568679] Suspended for 0.732 seconds
[  437.622933] PM: noirq resume of devices complete after 54.114 msecs
[  437.720561] PM: noirq suspend of devices complete after 92.110 msecs
[  437.774979] PM: noirq resume of devices complete after 54.411 msecs
[  437.832567] PM: noirq suspend of devices complete after 57.513 msecs
[  438.908663] Suspended for 0.735 seconds
[  438.962854] PM: noirq resume of devices complete after 54.043 msecs
[  439.060527] PM: noirq suspend of devices complete after 92.103 msecs
[  439.114780] PM: noirq resume of devices complete after 54.247 msecs
[  439.172531] PM: noirq suspend of devices complete after 57.676 msecs
[  440.263427] Suspended for 1.736 seconds
[  440.317668] PM: noirq resume of devices complete after 54.102 msecs
[  440.415294] PM: noirq suspend of devices complete after 92.101 msecs
[  440.469416] PM: noirq resume of devices complete after 54.116 msecs
[  440.527303] PM: noirq suspend of devices complete after 57.812 msecs
[  441.594159] Suspended for 0.735 seconds
[  441.648044] PM: noirq resume of devices complete after 53.738 msecs
[  441.742030] PM: noirq suspend of devices complete after 88.107 msecs
[  441.796508] PM: noirq resume of devices complete after 54.472 msecs
[  441.854034] PM: noirq suspend of devices complete after 57.451 msecs
[  442.038178] PM: noirq resume of devices complete after 53.834 msecs
[  442.140105] PM: noirq suspend of devices complete after 96.037 msecs
[  442.216149] PM: noirq resume of devices complete after 76.036 msecs
[  442.216225] PM: resume from suspend-to-idle
[  442.280660] PM: early resume of devices complete after 62.146 msecs
[  442.286025] Suspended for 0.438 seconds
[  442.460394] PM: resume of devices complete after 179.731 msecs
[  442.460864] PM: Finishing wakeup.
[  442.460865] OOM killer enabled.
[  442.460866] Restarting tasks ... done.
[  442.552172] thermal thermal_zone11: failed to read out thermal zone (-5)
[  442.553302] [drm] RC6 on
[  442.630048] IPv6: ADDRCONF(NETDEV_UP): wlp60s0: link is not ready
[  442.690004] intel-vbtn INT33D6:00: unknown event index 0xcd
[  442.841939] ACPI Error: Thread 334907008 cannot release Mutex [PATM] acquired by thread 196679552 (20170531/exmutex-416)
[  442.841945] ACPI Error: Method parse/execution failed \_SB.PCI0.LPCB.ECDV._Q66, AE_AML_NOT_OWNER (20170531/psparse-550)
[  442.896705] IPv6: ADDRCONF(NETDEV_UP): wlp60s0: link is not ready
[  443.148883] IPv6: ADDRCONF(NETDEV_UP): wlp60s0: link is not ready
[  443.240308] IPv6: ADDRCONF(NETDEV_UP): wlp60s0: link is not ready
[  446.710455] IPv6: ADDRCONF(NETDEV_UP): wlp60s0: link is not ready
[  446.754511] wlp60s0: authenticate with 08:60:6e:cb:73:1c
[  446.763516] wlp60s0: send auth to 08:60:6e:cb:73:1c (try 1/3)
[  446.769219] wlp60s0: authenticated
[  446.771978] wlp60s0: associate with 08:60:6e:cb:73:1c (try 1/3)
[  446.772923] wlp60s0: RX AssocResp from 08:60:6e:cb:73:1c (capab=0x11 status=0 aid=2)
[  446.774490] wlp60s0: associated
[  446.774549] IPv6: ADDRCONF(NETDEV_CHANGE): wlp60s0: link becomes ready
[  446.795836] wlp60s0: Limiting TX power to 23 (23 - 0) dBm as advertised by 08:60:6e:cb:73:1c
Comment 99 Rafael J. Wysocki 2017-07-17 12:40:34 UTC
(In reply to Patrik Kullman from comment #98)
> (In reply to Rafael J. Wysocki from comment #97)
> > (In reply to Patrik Kullman from comment #96)
> > > Actually it doesn't seem to suspend properly, I get a lot of these while
> > > suspended:
> > 
> > Well, that's how the platform works, unfortunately.
> > 
> > The EC needs to be enabled to wake up the system for the power button
> > wakeup to work, but if the EC is enabled to wake up the system, it will
> > wake it up every once a while, so either-or.
> 
> "every once a while" being every 0.7 - 1.7 seconds? :)

Yes, in this particular case ...

> And I assume that it's no difference between the power button or the lid
> opening? (Looking for workarounds..)

I'm not sure about the lid to be honest, let me try to dig into that somewhat.

> > I would argue that this is an EC firmware bug on the XPS13 9365.
> 
> That it happens too frequently or at all?

It shouldn't wake you up at all in the state we have requested from it, but if it did that every minute, say, it probably wouldn't really matter.

> > > And on resume I get a, most likely completely unrelated and purely
> > > cosmetic:
> > > 
> > > [49569.968486] intel-vbtn INT33D6:00: unknown event index 0xcd
> > 
> > I'm quite confident that this is not the event that woke you up, but can
> > you paste more lines from dmesg around this for more context?
> 
> Here's a complete suspend/resume:

[cut]

> [  442.460866] Restarting tasks ... done.
> [  442.552172] thermal thermal_zone11: failed to read out thermal zone (-5)
> [  442.553302] [drm] RC6 on
> [  442.630048] IPv6: ADDRCONF(NETDEV_UP): wlp60s0: link is not ready
> [  442.690004] intel-vbtn INT33D6:00: unknown event index 0xcd

Well, so this happens when user space runs again, so it's not the same event.

Most likely there is one more event coming from the EC in addition to the first wakeup one and sure enough it is not in the driver's keymap.  Again, I need to have a look into the ACPI tables of this machine to see what may be going on here.
Comment 100 Rafael J. Wysocki 2017-07-17 22:10:32 UTC
Let's set the lid thing aside.

I will ask you to test a couple of things, but first, after doing this:

# echo enabled > /sys/devices/platform/i8042/serio0/power/wakeup

you should be able to use the keyboard to wake up the machine from suspend-to-idle.  Please check if that works for you.

If it works, it is sufficient to do the above once per boot from the init scripts.
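
A minimal sketch of what such an init-script fragment could look like, assuming the keyboard is the i8042 serio0 device as in the command above. The function name is made up, and where to hook it in (rc.local, a systemd oneshot unit, ...) is left to the distribution.

```shell
#!/bin/sh
# Sketch only: enable_wakeup is a hypothetical helper, not an existing tool.

# Mark a device as a wakeup source through its power/wakeup attribute.
# Fails without writing anything if the attribute is absent or not writable.
enable_wakeup() {
    attr="${1:-/sys/devices/platform/i8042/serio0/power/wakeup}"
    [ -w "$attr" ] || { echo "cannot write $attr" >&2; return 1; }
    echo enabled > "$attr"
}

# Usage, once per boot as root:
#   enable_wakeup
```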
Comment 101 Rafael J. Wysocki 2017-07-17 22:15:56 UTC
Created attachment 257577 [details]
ACPI / EC: Add module parameter to disable system wakeup

If the keyboard wakeup works for you (as per the previous comment), this patch will allow you to go back to the non-functional EC wakeup (then you will have to use the keyboard to wake up the system), for example by doing

# echo Y > /sys/module/acpi/parameters/ec_no_wakeup

as root (the default setting can be restored by writing N to this file).

If the keyboard wakeup works, please try that and attach dmesg after suspend/resume.
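
For repeated testing, the toggle above can be wrapped so that a missing parameter file (for example, a kernel without the attached patch) is reported instead of failing silently. This is a sketch; the function name is made up, and only the sysfs path and the Y/N values come from this comment.

```shell
#!/bin/sh
# Sketch only: set_ec_no_wakeup is a hypothetical helper.

# Write Y or N to the ec_no_wakeup module parameter added by the patch.
set_ec_no_wakeup() {
    value="$1"
    param="${2:-/sys/module/acpi/parameters/ec_no_wakeup}"
    case "$value" in
        Y|N) ;;
        *) echo "usage: set_ec_no_wakeup Y|N [path]" >&2; return 2 ;;
    esac
    [ -e "$param" ] || { echo "$param missing (patch not applied?)" >&2; return 1; }
    echo "$value" > "$param"
}

# Usage, as root:
#   set_ec_no_wakeup Y    # EC no longer wakes the system; use the keyboard
#   set_ec_no_wakeup N    # restore the default
```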
Comment 102 Rafael J. Wysocki 2017-07-17 22:17:17 UTC
The patch from the previous comment should apply on top of 4.13-rc1 (it will not apply to the plain 4.12 in particular).
Comment 103 Rafael J. Wysocki 2017-07-18 02:02:07 UTC
Created attachment 257579 [details]
PM: Avoid printing suspend debug messages by default

This, in turn, should make all of the ugly debug stuff go away from dmesg, but of course it won't reduce the churn, just the dmesg noise resulting from it.
Comment 104 Patrik Kullman 2017-07-18 12:29:24 UTC
(In reply to Rafael J. Wysocki from comment #100)
> Let's set the lid thing aside.
> 
> I will ask you to test a couple of things, but first, after doing this:
> 
> # echo enabled > /sys/devices/platform/i8042/serio0/power/wakeup
> 
> you should be able to use the keyboard to wake up the machine from
> suspend-to-idle.  Please check if that works for you.
> 
> If it works, it is sufficient to do the above once per boot from the init
> scripts.

Ok, I have tried this and it works.

It was a while ago I compiled my own kernel but I'll have a look at it tonight.

As another route, would it be possible to fix the EC platform bug with a BIOS upgrade if Dell/Intel were informed of this? Would you know who to talk to?
Comment 105 Rafael J. Wysocki 2017-07-18 16:26:43 UTC
(In reply to Patrik Kullman from comment #104)
> (In reply to Rafael J. Wysocki from comment #100)
> > Let's set the lid thing aside.
> > 
> > I will ask you to test a couple of things, but first, after doing this:
> > 
> > # echo enabled > /sys/devices/platform/i8042/serio0/power/wakeup
> > 
> > you should be able to use the keyboard to wake up the machine from
> > suspend-to-idle.  Please check if that works for you.
> > 
> > If it works, it is sufficient to do the above once per boot from the init
> > scripts.
> 
> Ok, I have tried this and it works.

OK

> It was a while ago I compiled my own kernel but I'll have a look at it
> tonight.

Cool, thanks!

> As another route, would it be possible to fix the EC platform bug with a
> BIOS upgrade if Dell/Intel was informed of this? Would you know who to talk
> to ?

I'd love that to happen and Dell is aware of the problem already.
Comment 106 Patrik Kullman 2017-07-18 21:17:00 UTC
Created attachment 257591 [details]
dmesg with 4.13-rc1 + no_ec_wakeup

I got kind help from jsalisbury on the #ubuntu-kernel IRC channel, who built me a kernel with the patch (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1705099), and I'm happy to say that it works perfectly!

I'm attaching the dmesg; the last suspend/resume in it was done with keyboard wakeup enabled and ec_no_wakeup=Y, and it slept through it all! :)

Do as you please with the other patch, but please make the first one go into 4.13-rc2 :) This feels like the "perfect workaround" right now.
Comment 107 Rafael J. Wysocki 2017-07-19 00:58:42 UTC
(In reply to Patrik Kullman from comment #106)
> Created attachment 257591 [details]
> dmesg with 4.13-rc1 + no_ec_wakeup
> 
> I got the nice help from jsalisbury from #ubuntu-kernel IRC to build me a
> kernel with the patch
> (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1705099) and I'm happy
> to say that it works perfectly!
> 
> Attaching the dmesg and the last suspend/resume was with keyboard wake on
> and ec_no_wakeup=Y.. and it slept through it all! :)
> 
> Do as you please with the other patch, but please make the first one go into
> 4.13-rc2 :) This feels like the "perfect workaround" right now.

Posted as https://patchwork.kernel.org/patch/9850147/ and we'll see what happens.

I may need to make changes to it, so -rc2 may not be a realistic target, but I think the patch is fair enough in principle.
Comment 108 Axel Klein 2017-07-19 22:11:49 UTC
Hello all,

I own an XPS 13 9365 and every day I enjoy the improvements you are bringing to the usability of the Linux-on-XPS-13-9365 "package"!
I mostly use the hibernate feature because it works best.

I just want to thank you guys for the work you are doing.


You are doing a great job and I hope suspend will also work flawlessly,
although I fear that s2idle does not save as much energy as S3 could.
As far as I understand, Windows also uses s2idle rather than S3.
Is this correct?
And if so, do you have an idea why a vendor (Dell, Intel) would create a mobile system that cannot use a suspend mode that keeps the battery drain low?

Thank you in advance for a potential answer and of course again for your great work.

Best regards
Axel
Comment 109 Patrik Kullman 2017-07-20 05:46:28 UTC
A few last comments,

For anyone who finds this bug and wants to adjust these settings: "once per boot from the init scripts" isn't that easy anymore with systemd :)

The cleanest way I found was to install sysfsutils and add this to /etc/sysfs.conf:

devices/platform/i8042/serio0/power/wakeup = enabled
module/acpi/parameters/ec_no_wakeup = Y 
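
For readers without sysfsutils, the two entries above correspond to plain writes under /sys. The following is only a minimal sketch, assuming root privileges and a kernel that carries the ec_no_wakeup patch. The `SYSFS_ROOT` variable and the `apply_wakeup_settings` function are hypothetical names used for illustration, not part of any tool:

```shell
#!/bin/sh
# Sketch: apply the same two settings as the sysfs.conf entries above.
# SYSFS_ROOT and apply_wakeup_settings are hypothetical names (assumptions).
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

apply_wakeup_settings() {
    kbd="$SYSFS_ROOT/devices/platform/i8042/serio0/power/wakeup"
    ec="$SYSFS_ROOT/module/acpi/parameters/ec_no_wakeup"

    # Let the keyboard wake the machine from suspend-to-idle.
    if [ -w "$kbd" ]; then
        echo enabled > "$kbd"
    else
        echo "skipping $kbd (missing or not writable)" >&2
    fi

    # Ask the ACPI EC driver not to wake the system while suspended.
    if [ -w "$ec" ]; then
        echo Y > "$ec"
    else
        echo "skipping $ec (missing or not writable)" >&2
    fi
}
```

Calling `apply_wakeup_settings` as root once per boot should have the same effect as the sysfs.conf entries. Paths that do not exist on a given kernel are skipped with a message rather than treated as errors.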


Also, Rafael, what about the unknown events? Should I create a separate bug for those:

intel-vbtn INT33D6:00: unknown event index 0xcd
Comment 110 Rafael J. Wysocki 2017-07-20 14:50:46 UTC
(In reply to Patrik Kullman from comment #109)
> A few last comments,
> 
> For anyone finding this bug to adjust these settings, "once per boot from
> the init scripts" isn't that easy anymore with systemd :)
> 
> The cleanest way I found was to install sysfsutils and adding this into
> /etc/sysfs.conf:
> 
> devices/platform/i8042/serio0/power/wakeup = enabled
> module/acpi/parameters/ec_no_wakeup = Y 

You could also use a udev rule for that I think.

> Also, Rafael, what about the unknown events? Should I create a separate bug
> for those:
> 
> intel-vbtn INT33D6:00: unknown event index 0xcd

That's just a genuinely unknown event, so it's a separate issue, but I guess it's better to send e-mail to the driver maintainer/author in this case.  You can CC me too. :-)
Comment 111 Paul Menzel 2017-07-20 17:48:27 UTC
(In reply to Rafael J. Wysocki from comment #110)
> (In reply to Patrik Kullman from comment #109)
> > A few last comments,
> > 
> > For anyone finding this bug to adjust these settings, "once per boot from
> > the init scripts" isn't that easy anymore with systemd :)
> > 
> > The cleanest way I found was to install sysfsutils and adding this into
> > /etc/sysfs.conf:
> > 
> > devices/platform/i8042/serio0/power/wakeup = enabled
> > module/acpi/parameters/ec_no_wakeup = Y 
> 
> You could also use a udev rule for that I think.

Does passing `acpi.ec_no_wakeup=1` on the Linux command line work?

[…]
Comment 112 Mario Limonciello 2017-07-20 19:40:36 UTC
> intel-vbtn INT33D6:00: unknown event index 0xcd

Please CC me on the bugs you file regarding this too.  I think there are some events missing related to when you switch tablet/regular mode and this most likely corresponds to one of them.


> And if yes, do you have an idea why one (Dell, Intel) creates a mobile system
> that is not capable of using a suspend mode that keeps the battery-drain low?

In theory, S2I/Modern Standby should have battery consumption similar to S3 when all the (moving) parts are properly optimized.
Comment 113 Rafael J. Wysocki 2017-07-20 21:41:37 UTC
(In reply to Paul Menzel from comment #111)
> (In reply to Rafael J. Wysocki from comment #110)
> 
> Does passing `acpi.ec_no_wakeup=1` on the Linux command line work?
> 
> […]

Yes, it should.
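
To confirm after a reboot that the parameter actually reached the kernel, one can look at /proc/cmdline. A minimal sketch follows; the `has_cmdline_param` helper is a hypothetical name introduced here for illustration:

```shell
#!/bin/sh
# Sketch: report whether acpi.ec_no_wakeup was passed on the kernel command line.
# has_cmdline_param is a hypothetical helper (an assumption, not an existing tool).
has_cmdline_param() {
    # $1: parameter name, $2: full command line string
    for word in $2; do
        case "$word" in
            "$1"|"$1"=*) return 0 ;;
        esac
    done
    return 1
}

if has_cmdline_param acpi.ec_no_wakeup "$(cat /proc/cmdline)"; then
    echo "acpi.ec_no_wakeup is on the kernel command line"
else
    echo "acpi.ec_no_wakeup is not on the kernel command line"
fi
```

The helper only matches whole words, so a parameter whose name merely starts with the same prefix is not reported as a match.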
Comment 114 Patrik Kullman 2017-07-24 20:42:30 UTC
Rafael, I can't really follow patchwork and the git web client.
Did it make it into rc2? I don't see any objections.
Comment 115 Rafael J. Wysocki 2017-07-24 21:15:50 UTC
No, it is in linux-next right now and I'm going to push it for -rc3.
Comment 116 Patrik Kullman 2017-07-25 11:38:03 UTC
Oh ok great!
Comment 117 Sam Shepard 2017-09-17 13:42:20 UTC
(In reply to Patrik Kullman from comment #116)
> Oh ok great!

Dear Patrik and Rafael, thank you both for your efforts on solving this issue. As a Linux user who is not as tech savvy, would it be possible to get an idiot's explanation of how to apply this fix? Is it just a case of updating the kernel? I have updated the kernel to version 4.13 on Ubuntu 16.04 and the issue still persists.
Comment 118 Mario Limonciello 2017-09-18 13:57:48 UTC
@Sam,

The change to pick S2I over S3 by default didn't land in 4.13; it is going to be in 4.14, however (unless reverted).  What you'll want to do is change it on your system like this:

echo "s2idle" | sudo tee /sys/power/mem_sleep

That selects s2idle until the next reboot.  If that works well for you, you can add mem_sleep_default=s2idle to your kernel command line.  You can do this by modifying GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and running update-grub.
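
To check which mode is currently selected, read /sys/power/mem_sleep. The kernel lists the available modes there and puts the active one in brackets, for example "s2idle [deep]". A minimal sketch for extracting it; the `active_mem_sleep` helper is a hypothetical name introduced for illustration:

```shell
#!/bin/sh
# Sketch: print the sleep mode currently selected in /sys/power/mem_sleep.
# active_mem_sleep is a hypothetical helper (an assumption, not an existing tool).
active_mem_sleep() {
    # $1: contents of mem_sleep, e.g. "s2idle [deep]"
    # Prints the bracketed entry, or nothing if no entry is bracketed.
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

if [ -r /sys/power/mem_sleep ]; then
    echo "active mode: $(active_mem_sleep "$(cat /sys/power/mem_sleep)")"
fi
```

If this reports s2idle after a reboot, the mem_sleep_default parameter took effect.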
Comment 119 Sam Shepard 2017-09-18 15:42:04 UTC
(In reply to Mario Limonciello from comment #118)
> @Sam,
> 
> The behavior to pick S2I over S3 by default didn't land in 4.13, it's going
> to be in 4.14 however (unless reverted).  What you'll want to do is change
> it on your system like this:
> 
> echo "s2idle" | sudo tee /sys/power/mem_sleep
> 
> That will default to s2idle for your running session.  If that works well
> for you you can add to your kernel command line mem_sleep_default=s2idle. 
> You can do this by modifying GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub
> and running update-grub.

Thanks Mario! The laptop now wakes automatically when opened. I have successfully added the above parameter to my kernel command line.

I was just struggling to understand the instructions here (https://liveandletlearn.net/post/dell-xps13-9365-dual-boot/) about installing sysfsutils and how to add the mentioned file to my system.

I assume that these additional steps are no longer necessary. Is there anything else I need to do other than running 4.13.2 and adding that parameter to my kernel command line? For example, what about the last two settings: "Enable key press to wakeup" (devices/platform/i8042/serio0/power/wakeup = enabled) and "Disable heartbeat wakeups during suspend" (module/acpi/parameters/ec_no_wakeup = Y)?
Comment 120 Patrik Kullman 2017-09-18 16:45:14 UTC
Well, that blog post refers to my comment https://bugzilla.kernel.org/show_bug.cgi?id=192591#c109, and in my opinion it is better to follow those instructions than to pass boot parameters, since they're less invasive.

The other two options will dramatically increase battery life, since they prevent micro wakeups every 1-2 seconds.
Comment 121 Axel Klein 2017-09-21 18:08:27 UTC
Having

- added 'mem_sleep_default=s2idle' to GRUB_CMDLINE_LINUX_DEFAULT,
- installed sysfsutils, and
- added 'devices/platform/i8042/serio0/power/wakeup = enabled' and
  'module/acpi/parameters/ec_no_wakeup = Y' to the /etc/sysfs.conf of my Dell XPS-13 9365,

I see a fairly consistent battery drain of 5% per hour in suspend with these settings.
Does this match other users' experience?

I am curious whether this is the 'performance' we can expect.

Best regards
Axel
Comment 122 Axel Klein 2018-01-14 15:06:20 UTC
Hi,

compared to this 5% per hour battery drain while asleep, I've found a maximum battery drain of 2-3%, independent of sleep time, when the 9365 runs Windows.

I assume they have a mechanism in place that changes the state to 'hibernate' after a certain time. For example, when I wake the 9365 after half an hour of sleep, it comes back very quickly, and when I wake it after one hour of sleep, it goes through a boot process that takes much longer.

If this is true, could we have similar behavior when running Linux on these machines?
To me such a procedure looks very sensible: the machine wakes up very quickly within half an hour or so, and the battery drain is limited to a couple of percent.
This is exactly what I would wish for.

By the way: Is this the right location to ask such a question?

Best regards
Comment 123 Mario Limonciello 2018-01-15 15:03:42 UTC
@axel,

This comparison matches the behavior I've been seeing on some other models too.  I'd expect more improvements in the future so that the drain is lower and the two match more closely.

As for your question, I believe that functionality is possible to achieve too. 
That functionality is best delegated to userspace, however (such as systemd performing a hybrid suspend when it detects s2idle).  It would be best to bring that up on the systemd mailing list.
Comment 124 Axel Klein 2018-02-03 12:22:33 UTC
@Mario

Thank you, Mario!

I posted this to the systemd-devel mailing list.
Comment 125 Esokrarkose 2018-02-12 17:41:46 UTC
Created attachment 274135 [details]
Benchmark of kernel 4.9

I have used the 4.9.x kernel series for a long time and patched the kernel. Back then the patch was VERY effective, as you can see (25ms for acpi_pm_finish).
Comment 126 Esokrarkose 2018-02-12 17:42:37 UTC
Sorry, posted to the wrong bug :-(.