Bug 199689 - s2idle does not work in Dell XPS 9370
Status: ASSIGNED
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: All Linux
Importance: P1 normal
Assignee: Srinivas Pandruvada
Duplicates: 201523
 
Reported: 2018-05-11 04:54 UTC by James Roper
Modified: 2019-09-09 15:12 UTC
CC List: 28 users

Kernel Version: 4.14
Tree: Mainline
Regression: No


Attachments
Blacklist XPS 13 9370 from s2idle (1.54 KB, patch)
2018-05-11 05:04 UTC, James Roper
turbostat output on 2018-06-18, after sleeping 1 min (systemctl suspend) (44.46 KB, text/plain)
2018-06-18 18:35 UTC, Alyssa Hung
turbostat output (systemctl suspend) (43.17 KB, text/plain)
2018-06-19 01:06 UTC, Alyssa Hung
turbostat output (echo mem > /sys/power/state) (56.15 KB, text/plain)
2018-06-19 01:08 UTC, Alyssa Hung
turbostat output (echo mem > /sys/power/state) (30.40 KB, text/plain)
2018-06-24 19:04 UTC, Alyssa Hung
attachment-24371-0.html (1.47 KB, text/html)
2018-07-28 12:57 UTC, Mario.Limonciello
Turbostat output after running LTR script (28.77 KB, text/plain)
2018-08-15 15:53 UTC, Theodore Tso
Turbostat output after running LTR script w/o Satechi power monitor (20.90 KB, text/plain)
2018-08-16 01:40 UTC, Theodore Tso
smime.p7s (5.05 KB, application/pkcs7-signature)
2018-09-07 11:57 UTC, Paul Menzel
attachment-29801-0.html (502 bytes, text/html)
2019-02-07 09:59 UTC, Mario.Limonciello
attachment-26080-0.html (3.06 KB, text/html)
2019-08-12 02:17 UTC, Ryan J Schave

Description James Roper 2018-05-11 04:54:19 UTC
I believe this is the same problem as mentioned in https://bugzilla.kernel.org/show_bug.cgi?id=196907, but for the latest Dell XPS 13, the 9370.

Resuming from suspend often doesn't work, and even when it does, the battery has often been significantly drained during suspend (e.g., overnight, drained to 50%). The default is s2idle; switching to deep solves it.
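
For readers who want to check which suspend variant is active: the kernel lists the available modes in /sys/power/mem_sleep and marks the current one with square brackets. The following is a minimal sketch of reading that value; the helper name current_mem_sleep is made up for illustration, and it only parses text passed to it.

```shell
#!/bin/sh
# Print the active suspend mode from a mem_sleep-style line such as
# "s2idle [deep]". The helper name is illustrative, not part of any tool.
current_mem_sleep() {
    # $1: contents of /sys/power/mem_sleep
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Typical use on a real system (the write needs root):
#   current_mem_sleep "$(cat /sys/power/mem_sleep)"
#   echo deep > /sys/power/mem_sleep
```

A change made by writing to the file lasts until the next reboot; the mem_sleep_default=deep kernel parameter is the usual way to make it persistent.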
Comment 1 James Roper 2018-05-11 05:04:42 UTC
Created attachment 275913
Blacklist XPS 13 9370 from s2idle

Here's a patch that implements the same fix as for the XPS 13 9360. I've tested it on my machine, and it works for me.
Comment 2 James Roper 2018-05-11 05:08:17 UTC
Here's an additional discussion from other XPS 13 9370 users who have experienced the same problem:

https://www.reddit.com/r/Dell/comments/8b6eci/xp_13_9370_battery_drain_while_suspended/
Comment 3 Mario Limonciello 2018-05-15 20:56:41 UTC
@James,

At least locally feel free to change the policy to "deep", but it's actually intentional to be using s2idle on this machine with the latest upstream kernel.  

Rather than swing the giant hammer around to swap back to S3, I would prefer that we find the problems in the kernel preventing you from getting into deep enough C states to not burn too much battery.  

Can you please start with the following:

1) run powertop --autotune

This will reconfigure many of the defaults from the kernel to "better" values for power management purposes.

See if that helps in a measurable way.  If it's not helping in a significant way then this will require some more debugging.

Can you please note whether you have an NVMe SSD or a SATA SSD in your XPS 9370?
Comment 4 Cédric Bellegarde 2018-05-18 05:11:43 UTC
Same problem here:
- TLP enabled so I guess equivalent to powertop --autotune
- NVMe SSD

Give me any command to run on this laptop to help you debug.
Comment 5 Mario Limonciello 2018-05-18 12:39:50 UTC
Although TLP does many similar things to powertop autotune, some of its defaults are not adequate.

For example I filed this as a result:
https://github.com/linrunner/TLP/issues/344

That means that TLP will behave incorrectly both on AC and battery.

So please explicitly check with powertop autotune.
Comment 6 Srinivas Pandruvada 2018-05-18 16:29:30 UTC
After doing the steps suggested in comment #3, what is the difference in the following counters before and after wakeup from suspend-to-idle?

/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us

Also run this beforehand, so that we get more PM debug messages in dmesg:
echo 1 > /sys/power/pm_debug_messages
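
The before/after comparison requested here can be sketched as a pair of small helpers. This is only an illustration: the function names and the CPUIDLE_DIR variable are made up, and the two counter files are the ones named above.

```shell
#!/bin/sh
# Base directory of the cpuidle residency counters named in this comment.
# CPUIDLE_DIR is an illustrative override, not a standard variable.
CPUIDLE_DIR="${CPUIDLE_DIR:-/sys/devices/system/cpu/cpuidle}"

# Print the "cpu system" residency counters (microseconds) on one line.
read_residency() {
    cpu=$(cat "$CPUIDLE_DIR/low_power_idle_cpu_residency_us")
    sys=$(cat "$CPUIDLE_DIR/low_power_idle_system_residency_us")
    printf '%s %s\n' "$cpu" "$sys"
}

# Print how much a counter grew between two readings.
residency_delta() {
    # $1: value before suspend, $2: value after resume
    echo $(( $2 - $1 ))
}

# Typical use around a suspend cycle:
#   before=$(read_residency); <suspend and resume>; after=$(read_residency)
#   residency_delta "${before%% *}" "${after%% *}"   # CPU counter
#   residency_delta "${before##* }" "${after##* }"   # system counter
```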
Comment 7 Zhang Rui 2018-05-30 04:53:31 UTC
ping...
Comment 8 Timur Kristóf 2018-06-11 18:25:36 UTC
Hi,

I also have an XPS 13 9370 so I hope I can help with investigating this.
The system has an NVMe SSD (at least the device node is called /dev/nvme0).

powertop --autotune says: powertop: unrecognized option '--autotune' (using powertop-2.9-8.fc28.x86_64 here).

So, at the beginning it looks like this:
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us is 0
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us is 0

Then I put the laptop to sleep for ~10 minutes. After that:
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us is 0
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us is 80717

Then I enabled tlp and put the laptop to sleep for ~20 minutes. After that:
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us is 0
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us is still 80717

Not sure exactly what those numbers mean, but it looks suspicious that it stayed the same.
Comment 9 Srinivas Pandruvada 2018-06-11 18:27:33 UTC
The option is 
--auto-tune

not --autotune
Comment 10 Timur Kristóf 2018-06-11 18:30:33 UTC
(In reply to Srinivas Pandruvada from comment #9)
> The option is 
> --auto-tune
> 
> not --autotune

Sorry! I did check powertop after enabling tlp, and all powersaving options were enabled (except the VM writeback timeout), so I think we should be seeing the same result with both. But, if you want, I can re-run the numbers with powertop --auto-tune - would that help?
Comment 11 Srinivas Pandruvada 2018-06-11 19:02:51 UTC
Those numbers are very low. When you say sleep, I assume you did a suspend (echo mem > /sys/power/state, or similar using some tool).
So for 10 minutes of suspend it was in the low-power state for only about 80 ms (80717 us). Please try with powertop --auto-tune. You can just let it sleep for 1 minute and see what you get.
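
To put such a counter value in proportion, it can be expressed as a share of the time spent suspended. A minimal sketch, assuming awk is available; the function name is made up for illustration.

```shell
#!/bin/sh
# Share of a suspend interval spent in the low-power state, in percent.
residency_percent() {
    # $1: residency gained in microseconds, $2: suspend duration in seconds
    awk -v us="$1" -v s="$2" 'BEGIN { printf "%.4f\n", us / (s * 1000000) * 100 }'
}
```

For the roughly 10 minute run from comment 8, `residency_percent 80717 600` prints 0.0135, that is, about a hundredth of a percent of the suspend time.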
Comment 12 Alyssa Hung 2018-06-14 19:12:42 UTC
I have a Dell XPS 13 9370 (2018), also with an NVMe SSD.

This is the output from powertop --auto-tune:

$ sudo powertop --auto-tune
modprobe cpufreq_stats failedLoaded 5 prior measurements
Cannot load from file /var/cache/powertop/saved_parameters.powertop
File will be loaded after taking minimum number of measurement(s) with battery only 
RAPL device for cpu 0
RAPL Using PowerCap Sysfs : Domain Mask f
RAPL device for cpu 0
RAPL Using PowerCap Sysfs : Domain Mask f
Devfreq not enabled
glob returned GLOB_ABORTED
Cannot load from file /var/cache/powertop/saved_parameters.powertop
File will be loaded after taking minimum number of measurement(s) with battery only 
To show power estimates do 304 measurement(s) connected to battery only
Leaving PowerTOP

I checked the Tunables tab in powertop to confirm that all tunables were "Good". My laptop was then suspended overnight with s2idle. These are the numbers resulting from that:

$ date; cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us 
Thu Jun 14 03:06:01 PDT 2018
0
0

$ date; cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us 
Thu Jun 14 11:55:14 PDT 2018
0
0

The battery was 100% charged prior to suspending. It was down to 26% when resuming.
Comment 13 Mario Limonciello 2018-06-14 20:35:30 UTC
@Alyssa Hung,

Thanks, can you please confirm which kernel you tested and got that result?
Comment 14 Alyssa Hung 2018-06-14 20:39:24 UTC
$ uname -a
Linux xeli 4.16.13-2-ARCH #1 SMP PREEMPT Sat Jun 9 02:32:29 PDT 2018 x86_64 GNU/Linux
Comment 15 Srinivas Pandruvada 2018-06-18 16:40:40 UTC
First, check whether there is any error about the firmware download:

# dmesg | grep -i i915

If you don't see any error, try this:

# for i in {0..32}; do echo $i > ltr_ignore; done

# turbostat

# echo mem > /sys/power/state

Wait for 1 minute and wake up the system, then wait for a few sample updates of the turbostat output on screen.

Attach the output of turbostat.
Comment 16 Alyssa Hung 2018-06-18 18:35:25 UTC
Created attachment 276649
turbostat output on 2018-06-18, after sleeping 1 min (systemctl suspend)

I wasn't able to produce useful output exactly as requested.

# echo mem > /sys/power/state

would put the laptop to sleep (screen off, power indicator off) for about a second, after which it would re-wake itself. In order to make the laptop stay asleep, I had to use

# systemctl suspend

The attached output reflects that scenario.

Additionally, I have made some changes to the system since the last time I commented. This is the current kernel:

$ uname -a
Linux xeli 4.17.2-1-ARCH #1 SMP PREEMPT Sat Jun 16 11:08:59 UTC 2018 x86_64 GNU/Linux

This boot parameter was added:

i915.enable_guc=1

And TLP was installed. powertop --auto-tune was _not_ run prior to the attached output. powertop Tunables tab indicated that all tunables were set to "Good" (by TLP) except for "VM writeback timeout".

This is updated output from the files I cat-ed in a previous comment:

$ cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us 
14699
0
Comment 17 Srinivas Pandruvada 2018-06-18 23:03:59 UTC
Sorry, I was not clear. You need to change to the pmc_core debugfs directory first:
# cd /sys/kernel/debug/pmc_core
# for i in {0..32}; do echo $i > ltr_ignore; done


Do you mean you couldn't run turbostat?

You don't need to set enable_guc=1.
That residency is quite low.
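
The same loop can be written with an explicit path so that the current directory does not matter. This is a sketch only; the function name and its optional arguments are made up for illustration, and on a real system it needs root and a mounted debugfs.

```shell
#!/bin/sh
# Write every LTR index from 0 to max into <dir>/ltr_ignore.
ignore_all_ltr() {
    # $1: pmc_core directory (optional), $2: highest index (optional)
    dir="${1:-/sys/kernel/debug/pmc_core}"
    max="${2:-32}"
    if [ ! -w "$dir/ltr_ignore" ]; then
        echo "cannot write $dir/ltr_ignore (need root and mounted debugfs?)" >&2
        return 1
    fi
    i=0
    while [ "$i" -le "$max" ]; do
        echo "$i" > "$dir/ltr_ignore" || return 1
        i=$((i + 1))
    done
}

# As root: ignore_all_ltr
```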
Comment 18 Mario Limonciello 2018-06-19 00:16:57 UTC
turbostat output was in the previous attachment, but it wasn't clear if that was run across the suspend-to-idle cycle or not.

So Alyssa, when you respond and do the LTR adjustment command from that directory, please start turbostat in one terminal tab, enter S2I in another, sleep for 1 minute, wait a few seconds after wakeup for some more turbostat collection, and then attach that.
Comment 19 Alyssa Hung 2018-06-19 01:06:47 UTC
Created attachment 276671
turbostat output (systemctl suspend)

Sorry, I meant to annotate the output I attached to indicate when the suspend-and-resume happened, but forgot to save changes to the file before uploading it.

I can't suspend the laptop using the command:

  # echo mem > /sys/power/state

because that only causes the screen to flicker briefly off, then back on. The laptop doesn't stay asleep.

The output attached to this comment was produced by putting the laptop to sleep with the command:

  # systemctl suspend

Search for the string "--- suspend and resume ---" to see where the break happened.
Comment 20 Alyssa Hung 2018-06-19 01:08:16 UTC
Created attachment 276673
turbostat output (echo mem > /sys/power/state)

For completeness's sake, I repeated the process using the command

  # echo mem > /sys/power/state

by re-running the command every time the laptop woke itself back up. Each time I had to re-run the command, that is annotated with "--- suspend and resume ---".
Comment 21 Paul Menzel 2018-06-19 06:40:44 UTC
Alyssa, could you please report a new bug for S2I not working for you with 4.17.2, and reference it here? Please attach the Linux messages (that is, the output of `dmesg`) there.

Could this bug report please be renamed to *Battery drained during s2idle on Dell XPS 13 9370*?
Comment 22 Rafael J. Wysocki 2018-06-19 16:48:24 UTC
@Alyssa: Can you please do (a) reboot the system and then (b)

# echo 1 > /sys/power/pm_debug_messages
# echo mem > /sys/power/state

as root and attach a dmesg output after that?
Comment 23 Srinivas Pandruvada 2018-06-19 17:08:36 UTC
I think attachment 276671 is from after the LTR adjustment. I see 92.68% residency in the CPU's lowest power state in 1 minute. This is a good number.
Did you run this?
# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us

They should show good numbers. If they do, then we can identify which device's LTR is the problem.
Comment 24 Srinivas Pandruvada 2018-06-19 18:27:21 UTC
I just updated my 9370 to the latest BIOS and 4.18-rc1. In a minute of suspend, both the CPU and the total system were at the lowest power state for 59.7 seconds.
Comment 25 Rafael J. Wysocki 2018-06-20 07:36:57 UTC
OK, thanks!

The problem doesn't affect all of the 9370's then, so basically we need to find out what the differences between them are and why they matter.
Comment 26 Mario Limonciello 2018-06-20 12:46:32 UTC
@srinivas,

So you had no LTR adjustments needed and are seeing good power state residency on your configuration?
Comment 27 Srinivas Pandruvada 2018-06-20 17:17:58 UTC
@Mario,

I didn't do any LTR adjustment.
Comment 28 Srinivas Pandruvada 2018-06-21 22:14:07 UTC
I want to know from the reporters of this issue: is any device connected to a Type-C port (mouse, keyboard, USB Ethernet, etc.)?
Comment 29 Alyssa Hung 2018-06-22 02:55:34 UTC
@Rafael: I'm out of town all week, and can't seem to reproduce the problem (where echo mem > /sys/power/state can't make the laptop stay asleep) tonight. There may be something in my home environment that is a factor. I'll try to reproduce and provide the requested output next week.

@Srinivas: No devices were connected to any of the laptop's ports when I witnessed the overnight battery drain while suspended.
Comment 30 Timur Kristóf 2018-06-22 08:59:52 UTC
Hi,

@Srinivas: No, none of the Type-C ports were plugged in when I did the test. The device was running on battery.

Are you suggesting that this might be fixed on 4.18?

I'm also running on kernel 4.17.2 (Fedora), and I haven't seen a problem with 'echo mem > /sys/power/state'.
Comment 31 Alyssa Hung 2018-06-24 19:04:01 UTC
Created attachment 276787
turbostat output (echo mem > /sys/power/state)

I don't know what it is about my environment that changed, but 'echo mem >/sys/power/state' is working as expected now.

I repeated the test with 'for i in {0..32}; do echo $i >ltr_ignore; done'. The turbostat output is attached.

Residency numbers are much higher than before:

# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
62693152
0

If I do the test without first doing the ltr_ignore thing, then residency numbers are both 0. What can I do to help figure out which LTR(s) are causing problems?

Kernel and boot params in use:

[    0.000000] Linux version 4.17.2-1-ARCH (builduser@heftig-9574) (gcc version 8.1.1 20180531 (GCC)) #1 SMP PREEMPT Sat Jun 16 11:08:59 UTC 2018
[    0.000000] Command line: initrd=\intel-ucode.img initrd=\initramfs-linux.img root=PARTUUID=6d264517-7c51-4454-8b8c-2efff2bd878e rw i915.enable_guc=3
Comment 32 Marco 2018-06-24 19:18:54 UTC
Having the same issue with XPS 13 9365 (2-in-1). Resume from suspend (s2idle) never works, including by pressing the power button for 6sec+.  Machine requires a hard reboot every time.
Comment 33 Marco 2018-06-24 20:47:10 UTC
Actually, please ignore the above post.  The machine would go to s3 after an extended suspend duration, which caused the problem.  Forcing it to stay at s2 only solved it.  I guess the problem on the 9365 is the inverse of the 9370's.
Comment 34 Srinivas Pandruvada 2018-06-25 15:51:41 UTC
Marco,
9365 can't be woken up from S3, so it has to be suspend to idle only.
Comment 35 Mario Limonciello 2018-06-25 15:53:38 UTC
Also unrelated to this issue, please keep this issue specifically around 9370 and s2idle power consumption.  Anything around a different system or a different behavior should be a different issue.
Comment 36 Srinivas Pandruvada 2018-06-25 16:27:18 UTC
Alyssa Hung,

We need to find the device which is causing this issue. You can run powertop in another window and see if some device is keeping the system busy.

Try this.

After a fresh boot and powertop --auto-tune, you can try putting something like this in a script:


#!/bin/bash

counter=0

until [ $counter -gt 32 ]
do
echo $counter > /sys/kernel/debug/pmc_core/ltr_ignore
echo "LTR ignore for" $counter

rtcwake  -m freeze -s 10

residency=$(cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us)
echo "residency is" $residency

if [ $residency -eq 0 ]; then
        echo "Residency is non zero!"
        break
fi

((counter++))

sleep 2

done
Comment 37 Mario Limonciello 2018-06-29 17:48:41 UTC
There's a minor typo in the above script.

if [ $residency -eq 0 ]; then

Should be

if [ $residency -gt 0 ]; then


At least on my configuration, which previously wasn't showing residency, I started to see residency after running powertop --auto-tune and ignoring LTR "0" and "1".
Comment 38 Srinivas Pandruvada 2018-06-29 19:10:06 UTC
Thanks Mario. The corrected script:

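#!/bin/bash

# Ignore LTR indices one at a time and stop at the first suspend-to-idle
# cycle that shows low-power residency.
counter=0

until [ "$counter" -gt 32 ]
do
    echo "$counter" > /sys/kernel/debug/pmc_core/ltr_ignore
    echo "LTR ignore for" "$counter"

    # Suspend to idle for 10 seconds, woken by the RTC alarm
    rtcwake -m freeze -s 10

    residency=$(cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us)
    echo "residency is" "$residency"

    if [ "$residency" -gt 0 ]; then
        echo "Residency is non zero!"
        break
    fi

    counter=$((counter + 1))

    sleep 2
done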
Comment 39 Alyssa Hung 2018-07-01 06:38:03 UTC
Hi, sorry again for taking so long to respond. Here is the result of running the corrected version of the script:

# sh ltr-test.sh 
LTR ignore for 0
rtcwake: wakeup from "freeze" using /dev/rtc0 at Sun Jul  1 06:29:16 2018

residency is 0
LTR ignore for 1
rtcwake: wakeup from "freeze" using /dev/rtc0 at Sun Jul  1 06:29:30 2018
residency is 9763589
Residency is non zero!



Just to confirm the results, I re-ran it again after rebooting:

# sh ltr-test.sh 
LTR ignore for 0
rtcwake: wakeup from "freeze" using /dev/rtc0 at Sun Jul  1 06:35:37 2018

residency is 0
LTR ignore for 1
rtcwake: wakeup from "freeze" using /dev/rtc0 at Sun Jul  1 06:35:50 2018
residency is 9231413
Residency is non zero!


LTR 1 both times.
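
When comparing several such runs, the index can be pulled out of the log mechanically. A minimal sketch, assuming the log format shown above; the function name is made up for illustration.

```shell
#!/bin/sh
# Read the test script's log on stdin and print the LTR index that was
# being ignored when the residency counter first became non-zero.
first_ltr_with_residency() {
    awk '
        /^LTR ignore for/ { idx = $4 }
        /^residency is/   { if ($3 + 0 > 0) { print idx; exit } }
    '
}

# Usage: sh ltr-test.sh | first_ltr_with_residency
```

As far as I understand, indices written to ltr_ignore stay ignored until reboot, so the printed index is the last one added to the ignored set rather than proof that it alone is responsible.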
Comment 40 Mario Limonciello 2018-07-02 15:14:42 UTC
Alyssa,

Can you please try to go into BIOS setup and disable Thunderbolt?  After doing this please re-run the test script to confirm if it's helped.
Comment 41 Alyssa Hung 2018-07-04 02:35:22 UTC
Disabling Thunderbolt support does seem to have helped. Script output with Thunderbolt completely disabled in the firmware setup:

LTR ignore for 0
rtcwake: wakeup from "freeze" using /dev/rtc0 at Wed Jul  4 02:15:36 2018

residency is 9436585
Residency is non zero!


With Thunderbolt enabled, but Thunderbolt Boot Support disabled, residency was lower at 8627152.

Disabling Thunderbolt completely is not something I'm able to do long-term, as I have docks and dongles that work only with Thunderbolt.
Comment 42 Mario Limonciello 2018-07-04 04:16:49 UTC
I see turning Thunderbolt off as a debugging tactic. Once we have this fully understood, I believe we should be leaving it turned on.

Do you mean that if you have Thunderbolt boot support turned off but Thunderbolt on, you are still seeing residency without running the LTR ignore script?

The other thing that I would like to know is whether the power consumption seems reasonable to you in this configuration (which is what this issue was originally about).
Comment 43 Alyssa Hung 2018-07-04 06:31:26 UTC
Yes, I meant that having Thunderbolt turned on but Thunderbolt boot support turned off results in non-zero residency without running the LTR ignore script.

With that configuration, I saw battery decrease 7% over 121 minutes while sleeping with s2idle. Subjectively, that seems unreasonable to me. I think the drain was closer to 1% per hour with deep sleep.
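
To compare the two figures on the same scale, the s2idle measurement can be converted to percentage points per hour. A minimal sketch, assuming awk is available; the function name is made up for illustration.

```shell
#!/bin/sh
# Average battery drain in percentage points per hour.
drain_pct_per_hour() {
    # $1: percentage points lost, $2: elapsed minutes
    awk -v pct="$1" -v min="$2" 'BEGIN { printf "%.2f\n", pct * 60 / min }'
}
```

`drain_pct_per_hour 7 121` prints 3.47, roughly three and a half times the 1% per hour recalled for deep sleep.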
Comment 44 Mario Limonciello 2018-07-19 13:53:33 UTC
> With that configuration, I saw battery decrease 7% over 121 minutes while
> sleeping with s2idle. Subjectively, that seems unreasonable to me. I think
> the drain was closer to 1% per hour with deep sleep.

I suspect there is a second issue then here for your configuration.

Are you sure that you had run --auto-tune with powertop (or used TLP to effect the same changes) in that test with TBT on but TBT boot off?

We know for sure right now that TBT boot in BIOS setup causes problems with LTR.  On a system with SATA I was able to use these two patches to make sure SATA got to deepest sleep state when --auto-tune was used:
https://patchwork.kernel.org/patch/10502285/
https://patchwork.kernel.org/patch/10502287/

And then confirmed across a 7-hour span to have a 4% drop in battery.  This was with one of the 4.18-rcX kernels (Sorry I forget if it was RC2 or RC5 and have both installed right now).
Comment 45 Theodore Tso 2018-07-28 12:56:55 UTC
Mario Limonciello from Dell suggested that I join this bug if I still had problems with s2idle.   I was using 4.18-rc2 (about to upgrade to 4.18-rc6 if that's going to make a difference).  I have a Dell XPS model 9370, with NVMe 1TB flash attached.   I do have TBT boot turned off.

I did run "powertop --auto-tune" before suspending.  (In fact I trigger it out of a systemd unit at boot, and I double-checked that all of the powertop settings were "Good" before I did the suspend.)    After an 11 hour (668 minute) suspend, the batteries declined from 6486 mAh to 3331 mAh.  That works out to roughly 2.3 W per hour drain, and Mario suggested that if it was more than 1 W per hour, that I join this bug.
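
The arithmetic behind that estimate can be sketched as follows. The charge readings and the duration are the ones reported above; the pack voltage is not given in this report, so the 7.6 V used in the usage note is purely an assumption, and the function names are made up for illustration.

```shell
#!/bin/sh
# Average discharge current (mA) from two charge readings.
avg_drain_ma() {
    # $1: start charge in mAh, $2: end charge in mAh, $3: elapsed minutes
    awk -v a="$1" -v b="$2" -v min="$3" 'BEGIN { printf "%.1f\n", (a - b) * 60 / min }'
}

# Convert a current to power at an ASSUMED pack voltage (not from the report).
ma_to_watts() {
    # $1: current in mA, $2: assumed voltage in V
    awk -v ma="$1" -v v="$2" 'BEGIN { printf "%.2f\n", ma * v / 1000 }'
}
```

`avg_drain_ma 6486 3331 668` prints 283.4, and `ma_to_watts 283.4 7.6` prints 2.15; a higher assumed voltage moves the result closer to the 2.3 W quoted above.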

# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
0

Since this is a NVMe system, there shouldn't be any SATA issues....
Comment 46 Mario.Limonciello 2018-07-28 12:57:05 UTC
Created attachment 277585
attachment-24371-0.html

Hello,

I am currently out on Paternity leave.
Comment 47 Mario Limonciello 2018-08-14 04:28:40 UTC
@Ted,

Did you have anything plugged into USB ports over the suspend to idle run?  Particularly of interest would be if anything was plugged into the Thunderbolt port.

If you did - can you please compare results with nothing plugged in?
Comment 48 Theodore Tso 2018-08-14 12:06:17 UTC
Nothing was plugged in; there aren't enough USB ports, alas, for me to use a USB-C Nano Yubikey or anything like that.   (My two USB-A Yubikeys are attached to a Hootoo mini-USB C dock that looks like a massively oversized dongle, and which is *not* plugged in when my laptop is in transit, for obvious reasons.)
Comment 49 Mario Limonciello 2018-08-14 13:08:41 UTC
OK, thanks for confirming.  Can you please try the LTR ignore script that was shared above?  Comment 38.  See if you get any different results.
Comment 50 Theodore Tso 2018-08-15 15:53:15 UTC
Created attachment 277879
Turbostat output after running LTR script

I tried using 

# for i in {0..32}; do echo $i > ltr_ignore; done

... and it didn't seem to help.  After running the above and collecting the data, I also tried this:

# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
0

I was also measuring power utilization (with the battery fully charged) using a Satechi USB-C power monitor, and while doing a mem sleep, the Dell XPS (9370) was pulling 0.12-0.13 amps at 20 volts.   This is compared to the 0.02 amps at 20V when in a deep sleep.    I don't mind burning 2.5 Watts while the lid is closed when walking to a conference room, but if I'm taking my laptop home, and leaving it unconnected to power for ~12 hours, this is highly unfortunate....
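
For reference, the meter readings translate to power as current times voltage. A minimal sketch of that conversion, assuming awk is available; the function name is made up for illustration.

```shell
#!/bin/sh
# Power in watts from a current (A) and a voltage (V) reading.
power_watts() {
    # $1: current in amps, $2: voltage in volts
    awk -v i="$1" -v v="$2" 'BEGIN { printf "%.1f\n", i * v }'
}
```

`power_watts 0.12 20` and `power_watts 0.13 20` print 2.4 and 2.6, against 0.4 for `power_watts 0.02 20`, so the s2idle draw measured here is about six times the deep-sleep draw.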
Comment 51 Theodore Tso 2018-08-15 15:54:30 UTC
P.S.  All of this was running 4.18 plus the ext4.git and random.git patches that have been submitted to Linus for the 4.19 merge window:

# uname -a
Linux cwcc 4.18.0-00043-gfdade4115840 #35 SMP Sun Aug 12 21:33:40 EDT 2018 x86_64 GNU/Linux
Comment 52 Mario Limonciello 2018-08-15 16:14:02 UTC
> I was also measuring power utilization (with the battery fully charged) using
> a Satechi USB-C power monitor

When you say "also", does that mean it was a separate test?  Or was the power monitor also connected during the LTR ignore run?
Comment 53 Mario Limonciello 2018-08-15 16:17:05 UTC
According to your turbostat output you're not getting past PC2/PC3.  Are you able to read debugfs pmc_core output (IIRC /sys/kernel/debug/pmc_core/pch_ip_gating or something similar)?  Can you cat that to see what it claims is gating the PMC?
Comment 54 Theodore Tso 2018-08-16 01:25:20 UTC
I had the Satechi attached while I was doing the LTR ignore run.    I can do a run without the Satechi if you think it might be what was causing it to not drop into lower states.  (I doubt it, because my laptop has been draining without the Satechi attached.)

Side note: one of the annoying things about the Dell XPS is that I can't tell if it is suspended while it is closed.  One of the nice things about the Thinkpad is that there is an LED which is on solid when the laptop is running, and slowly blinks when it is suspended, and is totally off when the laptop is powered down.   With the Dell XPS, there is no way to tell the status of the laptop (which is why I use the Satechi power monitor most of the time when I'm at work --- what's especially annoying is when I try to suspend the laptop, and sometimes when the networking is up and chrome is running, something will cause the kernel to fail to suspend, and the only way I can tell is by watching the power utilization meter --- or by checking the temperature of my laptop bag when I get home.  :-P )


# cat  /sys/kernel/debug/pmc_core/pch_ip_power_gating_status 
PCH IP: 0  - PMC                             	State: On
PCH IP: 1  - OPI-DMI                         	State: On
PCH IP: 2  - SPI / eSPI                      	State: On
PCH IP: 3  - XHCI                            	State: On
PCH IP: 4  - SPA                             	State: On
PCH IP: 5  - SPB                             	State: Off
PCH IP: 6  - SPC                             	State: Off
PCH IP: 7  - GBE                             	State: Off
PCH IP: 8  - SATA                            	State: Off
PCH IP: 9  - HDA-PGD0                        	State: Off
PCH IP: 10 - HDA-PGD1                        	State: Off
PCH IP: 11 - HDA-PGD2                        	State: Off
PCH IP: 12 - HDA-PGD3                        	State: Off
PCH IP: 13 - RSVD                            	State: Off
PCH IP: 14 - LPSS                            	State: Off
PCH IP: 15 - LPC                             	State: Off
PCH IP: 16 - SMB                             	State: Off
PCH IP: 17 - ISH                             	State: Off
PCH IP: 18 - P2SB                            	State: Off
PCH IP: 19 - DFX                             	State: Off
PCH IP: 20 - SCC                             	State: Off
PCH IP: 21 - RSVD                            	State: Off
PCH IP: 22 - FUSE                            	State: On
PCH IP: 23 - CAMERA                          	State: Off
PCH IP: 24 - RSVD                            	State: Off
PCH IP: 25 - USB3-OTG                        	State: Off
PCH IP: 26 - EXI                             	State: Off
PCH IP: 27 - CSE                             	State: Off
PCH IP: 28 - CSME_KVM                        	State: Off
PCH IP: 29 - CSME_PMT                        	State: Off
PCH IP: 30 - CSME_CLINK                      	State: Off
PCH IP: 31 - CSME_PTIO                       	State: Off
PCH IP: 32 - CSME_USBR                       	State: Off
PCH IP: 33 - CSME_SUSRAM                     	State: Off
PCH IP: 34 - CSME_SMT                        	State: Off
PCH IP: 35 - RSVD                            	State: Off
PCH IP: 36 - CSME_SMS2                       	State: Off
PCH IP: 37 - CSME_SMS1                       	State: Off
PCH IP: 38 - CSME_RTC                        	State: Off
PCH IP: 39 - CSME_PSF                        	State: Off
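
With forty entries, it can help to list only the blocks that are not power gated. A minimal sketch, assuming the output format shown above; the function name is made up for illustration.

```shell
#!/bin/sh
# Print the names of the PCH IP blocks reported as "State: On".
# Reads pch_ip_power_gating_status-style text on stdin.
list_ungated_ips() {
    awk '/State: On/ {
        sub(/^PCH IP: *[0-9]+ *- */, "")    # drop the "PCH IP: N - " prefix
        sub(/[ \t]*State: On.*$/, "")       # drop padding and the state column
        print
    }'
}

# As root:
#   list_ungated_ips < /sys/kernel/debug/pmc_core/pch_ip_power_gating_status
```

Applied to the output above, this leaves six names: PMC, OPI-DMI, SPI / eSPI, XHCI, SPA and FUSE.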
Comment 55 Theodore Tso 2018-08-16 01:40:35 UTC
Created attachment 277881
Turbostat output after running LTR script w/o Satechi power monitor

Here's a turbostat / LTR ignore run without the Satechi power monitor.  (The laptop was powered via an Apple USB-C power adapter at the time, though).
Comment 56 Mario Limonciello 2018-08-16 18:36:19 UTC
> PCH IP: 3  - XHCI                             State: On

The part standing out to me is that XHCI is "On".  Without your power adapter plugged in, is that the same result in the PMC debugging read?
Comment 57 Theodore Tso 2018-08-16 19:02:53 UTC
Yes, XHCI appears to be always on.  I tried doing a reboot and then unplugged the power, and it's still on:

<tytso.root@cwcc> {/usr/projects/linux/ext4-fsverity}, level 2   (master)
998# cat /sys/kernel/debug/pmc_core/pch_ip_power_gating_status  | grep XHCI
PCH IP: 3  - XHCI                            	State: On
<tytso.root@cwcc> {/usr/projects/linux/ext4-fsverity}, level 2   (master)
998# uname -a
Linux cwcc 4.18.0-00043-gfdade4115840 #35 SMP Sun Aug 12 21:33:40 EDT 2018 x86_64 GNU/Linux
<tytso.root@cwcc> {/usr/projects/linux/ext4-fsverity}, level 2   (master)
999# fwupdmgr get-devices
XPS 13 9370 System Firmware
  DeviceId:             8a21cacfb0a8d2b30c5ee9290eb71db021619f8b
  Guid:                 7ceaf7a8-0611-4480-9e30-64d8de420c7c
  Guid:                 230c8b18-8d9b-53ec-838b-6cfc0383493a
  Plugin:               uefi
  Flags:                internal|updatable|require-ac|supported|registered|needs-reboot
  Version:              0.1.4.0
  VersionLowest:        0.1.4.0
  Icon:                 computer
  Created:              2018-08-16

XPS 9370 Thunderbolt Controller
  DeviceId:             40ed9997af3dc4b0fda197fd2e4f1243afa74c5b
  Guid:                 4eeb9d07-a96c-56d6-92d3-4a23ee7a6e4a
  Summary:              Unmatched performance for high-speed I/O
  Plugin:               thunderbolt
  Flags:                internal|updatable|supported|registered
  Vendor:               Dell
  VendorId:             TBT:0x00D4
  Version:              28.00
  Icon:                 computer
  Created:              2018-08-16
Comment 58 Theodore Tso 2018-08-29 20:56:06 UTC
Ping?  Is there anything else you'd like me to try?
Comment 59 Srinivas Pandruvada 2018-08-30 19:33:52 UTC
Does this system have an NVMe or a SATA SSD? One of the PCI bridges is on.
I wonder if there is a problem with APST on this card.
Comment 60 Theodore Tso 2018-08-30 19:44:08 UTC
My system has an NVMe SSD:

# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     Y77S10C8TYAT         KXG50ZNV1T02 NVMe TOSHIBA 1024GB         1           1.02  TB /   1.02  TB    512   B +  0 B   AADA4102

I can attach the dmidecode output if that would be helpful.
Comment 61 Mario Limonciello 2018-08-30 19:48:45 UTC
I think
# nvme id-ctrl

output would be more useful
Comment 62 Theodore Tso 2018-08-30 21:10:30 UTC
Here you go:

# nvme id-ctrl /dev/nvme0
NVME Identify Controller:
vid       : 0x1179
ssvid     : 0x1179
sn        : Y77S10C8TYAT        
mn        : KXG50ZNV1T02 NVMe TOSHIBA 1024GB        
fr        : AADA4102
rab       : 3
ieee      : 00080d
cmic      : 0
mdts      : 9
cntlid    : 0
ver       : 10201
rtd3r     : 186a0
rtd3e     : 7a120
oaes      : 0
ctratt    : 0
rrls      : 0
oacs      : 0x17
acl       : 3
aerl      : 7
frmw      : 0x14
lpa       : 0x2
elpe      : 127
npss      : 4
avscc     : 0
apsta     : 0x1
wctemp    : 351
cctemp    : 355
mtfa      : 20
hmpre     : 0
hmmin     : 0
tnvmcap   : 1024209543168
unvmcap   : 0
rpmbs     : 0
edstt     : 36
dsto      : 1
fwug      : 0
kas       : 0
hctma     : 0
mntmt     : 0
mxtmt     : 0
sanicap   : 0
hmminds   : 0
hmmaxd    : 0
nsetidmax : 0
sqes      : 0x66
cqes      : 0x44
maxcmd    : 0
nn        : 1
oncs      : 0x5f
fuses     : 0x1
fna       : 0
vwc       : 0x1
awun      : 31
awupf     : 0
nvscc     : 0
acwu      : 31
sgls      : 0
subnqn    : nqn.2017-03.jp.co.toshiba:KXG50ZNV1T02 NVMe TOSHIBA 1024GB:Y77S10C8TYAT
ioccsz    : 0
iorcsz    : 0
icdoff    : 0
ctrattr   : 0
msdbd     : 0
ps    0 : mp:6.00W operational enlat:0 exlat:0 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-
ps    1 : mp:2.40W operational enlat:0 exlat:0 rrt:1 rrl:1
          rwt:1 rwl:1 idle_power:- active_power:-
ps    2 : mp:1.90W operational enlat:0 exlat:0 rrt:2 rrl:2
          rwt:2 rwl:2 idle_power:- active_power:-
ps    3 : mp:0.0500W non-operational enlat:1500 exlat:1500 rrt:3 rrl:3
          rwt:3 rwl:3 idle_power:- active_power:-
ps    4 : mp:0.0030W non-operational enlat:50000 exlat:80000 rrt:4 rrl:4
          rwt:4 rwl:4 idle_power:- active_power:-
Comment 63 Srinivas Pandruvada 2018-08-31 15:41:14 UTC
Did you install a new NVMe yourself, or was this system shipped with this disk? Can you still boot Windows?
Comment 64 Theodore Tso 2018-08-31 19:01:09 UTC
It's the original NVMe, and I installed Debian instead of Windows.  I had tried to transfer the Windows to a USB drive, but it didn't quite work and the Windows installation on the USB toasted itself after an update got confused because it was on the USB drive.   I do have a Windows on a USB drive that was originally installed on a Lenovo laptop, and for which I had downloaded the Dell XPS 13 drivers.   I was using this to update the XPS 13 BIOS before I had managed to get fwupdmgr working.  (Short version: fwupdmgr is unhappy if you boot in Legacy BIOS mode, but the Debian installer was not able to install UEFI boot on the XPS 13.   So I had to do a Legacy BIOS mode installation, and then manually set up the UEFI boot partition and set up UEFI boot by hand.)   UEFI boot and fwupdmgr are working now and I've done at least one or two BIOS updates using fwupdmgr, so I don't think it's related to my current issue with power management --- especially since S3 suspend works just fine.

I can try booting the Windows from a USB flash drive, but that's Windows 10 booting in Legacy mode.  Microsoft doesn't seem to like people booting off of external USB devices, so between the fact that it's an external boot device, and it's in Legacy mode, some things might not work.  But if you want me to perform an experiment using the external Windows 10 system, I can give it a try.

I haven't needed to boot Windows in months, though, so there's a possibility that some Microsoft auto-update will end up trashing the Windows on a USB flash disk setup.   At which point I can pull the Lenovo T470 out of storage, update Windows on it, and then transfer the Windows to the USB stick, and then copy over the Dell drivers..... what a mess.  I don't miss Windows.  :-)
Comment 65 Srinivas Pandruvada 2018-08-31 19:13:05 UTC
It is the original NVMe, so I guess Windows would have entered low power states. After powertop --auto-tune, do you ever get

PCH IP: 4  - SPA                             	State: On
as
PCH IP: 4  - SPA                             	State: Off

when you run the following several times, with some wait between calls?
cat  /sys/kernel/debug/pmc_core/pch_ip_power_gating_status
Comment 66 Theodore Tso 2018-09-02 02:22:00 UTC
# powertop --auto-tune
modprobe cpufreq_stats failedLoaded 750 prior measurements
RAPL device for cpu 0
RAPL Using PowerCap Sysfs : Domain Mask f
RAPL device for cpu 0
RAPL Using PowerCap Sysfs : Domain Mask f
Devfreq not enabled
glob returned GLOB_ABORTED
Leaving PowerTOP
# grep SPA /sys/kernel/debug/pmc_core/pch_ip_power_gating_status
PCH IP: 4  - SPA                             	State: On
# grep SPA /sys/kernel/debug/pmc_core/pch_ip_power_gating_status
PCH IP: 4  - SPA                             	State: On
# grep SPA /sys/kernel/debug/pmc_core/pch_ip_power_gating_status
PCH IP: 4  - SPA                             	State: On
#

I've tried waiting a while and it's always "On", and never "Off".
Comment 67 Szilárd Páll 2018-09-05 14:51:33 UTC
Ping, is there any more feedback that can aid in getting to the bottom of this? I see ~0.7-0.8 Wh drain in sleep which is really unsustainable.

BTW, I've tried the PCH IP power gating check, and in my case I do see it switching:

 $ while true; do sudo grep SPA /sys/kernel/debug/pmc_core/pch_ip_power_gating_status; sleep 10s; done 
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: On
PCH IP: 4  - SPA                             	State: Off
PCH IP: 4  - SPA                             	State: On
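Eyeballing a long run of samples like the one above is error-prone, so here is a minimal sketch that tallies how often one IP block was reported On versus Off. It assumes the line format shown in the output above; the function name `tally_states` and the regular expression are illustrative and not part of any existing tool.

```python
import re
from collections import Counter

# Matches lines such as "PCH IP: 4  - SPA      <tab>State: On"
LINE_RE = re.compile(r"^PCH IP:\s*(\d+)\s*-\s*(\S+)\s+State:\s*(On|Off)\s*$")


def tally_states(lines, ip_name="SPA"):
    """Count the On/Off samples seen for one PCH IP block."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group(2) == ip_name:
            counts[m.group(3)] += 1
    return counts


if __name__ == "__main__":
    samples = [
        "PCH IP: 4  - SPA                             \tState: On",
        "PCH IP: 4  - SPA                             \tState: Off",
        "PCH IP: 4  - SPA                             \tState: Off",
    ]
    counts = tally_states(samples)
    print("On:", counts["On"], "Off:", counts["Off"])
```

A block that never shows "Off" across many samples (as in comment 66) would give an Off count of zero, while a block that toggles gives a mix of both.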
Comment 68 Srinivas Pandruvada 2018-09-05 15:16:12 UTC
Páll:
You may have a different issue. Did you run powertop --auto-tune and the script in comment 38?
Comment 69 Srinivas Pandruvada 2018-09-05 15:18:15 UTC
Tso:
In your case I think the NVMe is keeping the root bridge busy. I don't have such an NVMe card.
Mario,
Do you happen to have such a card?
Comment 70 Paul Menzel 2018-09-05 15:24:10 UTC
(In reply to Srinivas Pandruvada from comment #69)
> Tso:
> In your case I think the NVMe is keeping the root bridge busy. I don't have
> such an NVMe card.
> Mario,
> Do you happen to have such a card?

Could this be the same issue as bug #196907 [1], where there are problems with a PC300 NVMe SK hynix 512GB?

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=196907
Comment 71 Szilárd Páll 2018-09-05 16:37:55 UTC
(In reply to Srinivas Pandruvada from comment #68)
> Páll:
> You may have a different issue. Did you run powertop --auto-tune and the
> script in comment 38?

Yes and no. I run powertop --auto-tune at startup, so that's been done.

When it comes to low_power_idle_cpu_residency_us, I get a non-zero value reported. I'm not entirely sure whether that makes my issue different.

BTW, I have the XPS 13 9370 with the same NVME drive as T. Tso just 512 GB in size (model KXG50ZNV512G NVMe TOSHIBA 512GB).
Comment 72 Srinivas Pandruvada 2018-09-05 18:02:51 UTC
Paul Menzel: I can't say. I don't see turbostat output in that bug to check whether, when not doing suspend-to-idle, the system has a residency lower than PC3.

But it won't hurt to try the blacklist.
Can anybody try?
Comment 73 Mario Limonciello 2018-09-05 18:17:20 UTC
Drive falling off the bus over s2idle and a drive staying awake are two different problems to me.

Blacklisting s2idle will certainly work around high power consumption, but there appears to still be at least one (maybe two) real cases of higher power consumption with these particular NVMe SSDs.

With NVMe, the expectation is that APST (Autonomous Power State Transition) is used to put the SSD into lower power states.  If you look at Ted's output you'll notice two "non-operational" states that have a much lower power consumption (ps3 and ps4):

ps    3 : mp:0.0500W non-operational enlat:1500 exlat:1500 rrt:3 rrl:3
          rwt:3 rwl:3 idle_power:- active_power:-
ps    4 : mp:0.0030W non-operational enlat:50000 exlat:80000 rrt:4 rrl:4
          rwt:4 rwl:4 idle_power:- active_power:-

In order for the PCH to show s0 residency, the SSD needs to spend enough time idle to enter those states automatically; the entry latency (enlat) is given in microseconds.

APST has been supported since kernel 4.11 with this commit.
https://github.com/torvalds/linux/commit/c5552fde102fcc3f2cf9e502b8ac90e3500d8fdf

The kernel did adjust the max latency it would allow to enter these states with this commit in 4.12:
https://github.com/torvalds/linux/commit/9947d6a09cd71937dade2fc14640e4843ae19802

Once configured, the drives are supposed to work autonomously.  If they're idle long enough, they stop using power.  If something prods them they wake up.

So I would wonder if something is causing periodic activity on those disks even over s2idle?
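To make the power-state discussion above easier to check on other drives, here is a minimal sketch that pulls the "ps" lines out of `nvme id-ctrl` text and lists the non-operational states with their entry and exit latencies. It assumes the text layout shown in comment 62; the `PowerState` type and `parse_power_states` function are hypothetical names for illustration, and the sketch does not reproduce the kernel's own APST selection logic.

```python
import re
from typing import NamedTuple


class PowerState(NamedTuple):
    index: int
    max_power_w: float
    operational: bool
    enlat_us: int  # entry latency, microseconds
    exlat_us: int  # exit latency, microseconds


# First line of each "ps" entry in `nvme id-ctrl` output (layout assumed
# from the paste in comment 62).
PS_RE = re.compile(
    r"^ps\s+(\d+)\s*:\s*mp:([\d.]+)W\s+(non-operational|operational)"
    r"\s+enlat:(\d+)\s+exlat:(\d+)"
)


def parse_power_states(text):
    """Return the power states found in `nvme id-ctrl` output."""
    states = []
    for line in text.splitlines():
        m = PS_RE.match(line.strip())
        if m:
            states.append(
                PowerState(
                    index=int(m.group(1)),
                    max_power_w=float(m.group(2)),
                    operational=(m.group(3) == "operational"),
                    enlat_us=int(m.group(4)),
                    exlat_us=int(m.group(5)),
                )
            )
    return states


if __name__ == "__main__":
    sample = (
        "ps    3 : mp:0.0500W non-operational enlat:1500 exlat:1500 rrt:3 rrl:3\n"
        "          rwt:3 rwl:3 idle_power:- active_power:-\n"
        "ps    4 : mp:0.0030W non-operational enlat:50000 exlat:80000 rrt:4 rrl:4\n"
        "          rwt:4 rwl:4 idle_power:- active_power:-\n"
    )
    for ps in parse_power_states(sample):
        if not ps.operational:
            total = ps.enlat_us + ps.exlat_us
            print(f"ps{ps.index}: {ps.max_power_w} W, enter+exit {total} us")
```

For the drive in this bug, the round trip into and out of ps4 adds up to 130000 us, which is why the drive needs fairly long idle periods before the deepest state pays off.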
Comment 74 Srinivas Pandruvada 2018-09-05 19:14:37 UTC
We have a 9370 with the same NVMe as Szilárd Páll. This system can go to low power and shows both counts > 0:
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
and
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us

Here the system has 4.19-rc1. But it used to work even with older kernels.

So please check if you get both counts during s2idle. Remove any devices connected to the system to avoid dependency on peripherals.

The power you are measuring is wall power, I guess, so this is not the power the system is consuming during s2idle. The power brick also consumes power and may be charging too.

So your issue is not the same as Theodore Tso's.
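The before/after check on the two counters can be scripted. The following is a rough sketch that reads both sysfs files named above and reports how much residency was gained across a suspend cycle; the helper names are made up for illustration, and reading the files may require root on some systems.

```python
from pathlib import Path

# Directory and file names as given in the comment above.
CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpuidle")
COUNTERS = (
    "low_power_idle_cpu_residency_us",
    "low_power_idle_system_residency_us",
)


def read_counters(base=CPUIDLE_DIR):
    """Read both residency counters (microseconds) into a dict."""
    return {
        name: int((Path(base) / name).read_text().strip()) for name in COUNTERS
    }


def residency_gained(before, after):
    """Per-counter increase between two snapshots."""
    return {name: after[name] - before[name] for name in COUNTERS}


def reached_low_power(delta):
    """True only if both the CPU and the system counter increased."""
    return all(value > 0 for value in delta.values())
```

Typical use would be to call `read_counters()` once before `systemctl suspend` and once after resume, then pass both snapshots to `residency_gained()`. If only the CPU counter grows while the system counter stays flat, something outside the CPU package is probably blocking the platform from reaching its lowest state.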
Comment 75 Szilárd Páll 2018-09-07 10:48:34 UTC
I'm having a seriously hard time following the discussion and matching replies to the messages they reply to (bugzilla ftw), so sorry if I sound confused.

$ cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
0
$ systemctl suspend 
$ cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us; cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
6408462
6365970

I have nothing connected -- never had during the experimentation and reporting here. I seem to get both counters to show residency, but I still get serious battery drain during sleep.

> The power you are measuring is wall power, I guess, so this is not the power
> the system is consuming during s2idle. The power brick also consumes power
> and may be charging too.

Not sure if this is to me, but I'm confused: the energy spent during idle on battery has nothing to do with the charger. I simply look at the change in energy field reported in /org/freedesktop/UPower/devices/battery_BAT0 and calculate the energy decrease per unit of time.

> So your issue is not the same as Theodore Tso's.

I don't see why, but fine, it may well be. Should I file a separate bug report then? Any suggestions for short to mid-term mitigation -- this issue takes me back to the state of Linux on laptops from 15 years ago and I'd really like to snap back to the present and be able to use my laptop as it's intended to be used. :)
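The "energy decrease per unit of time" calculation described above is simple enough to write down. This is only a sketch: the function name is hypothetical, and the numbers in the example are invented to show the arithmetic, not measurements from this bug.

```python
def average_drain_watts(energy_before_wh, energy_after_wh, elapsed_seconds):
    """Average power draw in watts from two battery energy readings (Wh)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    used_wh = energy_before_wh - energy_after_wh
    # Wh divided by hours gives W; convert the elapsed seconds to hours.
    return used_wh * 3600.0 / elapsed_seconds


if __name__ == "__main__":
    # Invented example: 50.0 Wh before an 8 hour suspend, 44.0 Wh after.
    print(average_drain_watts(50.0, 44.0, 8 * 3600))
```

Because both readings are taken on battery, the charger does not enter into the result, which matches the point made in the comment.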
Comment 76 Paul Menzel 2018-09-07 11:57:12 UTC
Created attachment 278367 [details]
smime.p7s

On 09/07/18 12:48, bugzilla-daemon@bugzilla.kernel.org wrote:
> I don't see why, but fine, it may well be. Should I file a separate
> bug report then?

Yes, please, and document the bug number here.

> Any suggestions for short to mid-term mitigation -- this issue takes
> me back to the state of Linux on laptops from 15 years ago and I'd
> really like to snap back to the present and be able to use my laptop
> as it's intended to be used. :)

Can’t you just disable s2idle, and enable ACPI S3 by setting
`mem_sleep_default=deep` on the Linux kernel command line?
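Before and after changing the kernel command line it helps to confirm which suspend mode is actually selected. The kernel lists the available modes in /sys/power/mem_sleep and marks the active one with square brackets (for example "s2idle [deep]"). A small sketch for reading that format follows; the function name is an illustrative choice.

```python
def parse_mem_sleep(text):
    """Return (active_mode, available_modes) from /sys/power/mem_sleep text."""
    active = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets marking the active mode
            active = token
        available.append(token)
    return active, available


if __name__ == "__main__":
    active, available = parse_mem_sleep("s2idle [deep]\n")
    print("active:", active, "available:", available)
```

If "deep" does not appear in the list at all, the firmware does not offer S3 and the `mem_sleep_default=deep` workaround is not available on that machine.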
Comment 77 Mario Limonciello 2018-09-07 13:19:27 UTC
I agree that's the proper short term mitigation, and we probably have two different (but similar) issues happening here.
Comment 78 Erik Rigtorp 2018-10-30 06:03:23 UTC
*** Bug 201523 has been marked as a duplicate of this bug. ***
Comment 79 Erik Rigtorp 2018-11-29 12:29:24 UTC
If you have the SK Hynix SSD this could be the cause of high power draw: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1801875
Comment 80 Ondřej Caletka 2019-02-07 09:58:55 UTC
Just for the record, my system was suffering from this issue of high power drain during S2IDLE. My XPS 13 9370 has NVMe model KXG50ZNV512G, shipped with firmware AADA4102, which is mentioned as problematic on ArchWiki [https://wiki.archlinux.org/index.php/Dell_XPS_13_(9370)#Storage]. Yesterday I upgraded the firmware to AADA4106, released by Dell on Jan 22, 2019, and switched back to s2idle sleep. Overnight it drained around 10 percent, so I guess this issue is fixed for me by the NVMe firmware upgrade.

As I don't have Windows, I managed to extract the firmware out of the .exe file supplied by Dell and do the NVMe upgrade using nvme-cli on Linux: https://gist.github.com/klingtnet/22ab0b907e2d9d20f98c72c93ea5dd37#gistcomment-2830279
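For anyone wanting to check whether their drive matches the combination reported here, the following sketch reads the model (`mn`) and firmware revision (`fr`) fields from `nvme id-ctrl` text. The rule in `reported_problematic` is an assumption drawn only from the reports in this thread (Toshiba KXG50ZNV models on firmware AADA4102); it is not an official list of affected drives.

```python
def parse_id_ctrl(text):
    """Parse the 'key : value' lines of `nvme id-ctrl` output into a dict."""
    fields = {}
    for line in text.splitlines():
        if line[:1].isspace():
            continue  # continuation line of a power-state entry
        key, sep, value = line.partition(":")
        if sep and not key.startswith("ps "):
            fields[key.strip()] = value.strip()
    return fields


def reported_problematic(fields):
    """Assumption from this thread: KXG50ZNV* drives on firmware AADA4102."""
    model = fields.get("mn", "")
    return model.startswith("KXG50ZNV") and fields.get("fr") == "AADA4102"


if __name__ == "__main__":
    sample = (
        "mn        : KXG50ZNV1T02 NVMe TOSHIBA 1024GB        \n"
        "fr        : AADA4102\n"
    )
    print(reported_problematic(parse_id_ctrl(sample)))
```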
Comment 81 Mario.Limonciello 2019-02-07 09:59:13 UTC
Created attachment 281035 [details]
attachment-29801-0.html

Hello,

I'm home sick Feb 6, expect delayed response..
Comment 82 Adam Caldwell 2019-03-06 03:50:01 UTC
Should I assume whatever is going on here is also why my 9380 experiences high power consumption during sleep (maybe 5-10% battery per hour instead of the 1-2% I'd expect)?

Just to clarify, I have no issues entering or resuming from sleep, just the high power consumption during sleep. I am using powertop --auto-tune at boot.
Comment 83 Mario Limonciello 2019-03-06 13:32:10 UTC
@Adam, without digging into the details it's impossible to know if it's the same root cause for your particular issue.

Would you please open a separate bug and attach the same kinds of things that were requested from various folks in this bug, so we can see?

Also, if you can, please make sure you are checking with the latest kernel release.
Comment 84 Timur Kristóf 2019-06-01 09:17:29 UTC
I can also confirm that the high power drain is caused by the SSD that the 9370 comes with.

A little while ago I replaced the original Toshiba SSD with a Samsung 970 EVO 1TB model, and now the power drain in s2idle is significantly lower. I think Ondřej's solution should also work for those using the original SSD.

Since Ondřej was able to upgrade the SSD firmware with nvme-cli, it looks like it is upgradeable from Linux. Is there a possibility to release the updated SSD firmware through fwupd?
Comment 85 Alejandro Díaz-Caro 2019-06-13 20:39:34 UTC
Hi, I have an HP Spectre x360 13t-ap000, with the exact same problem. I also have NVMe and it drains lots of power under s2idle, and the only option I have in /sys/power/mem_sleep is that one.

I am running Arch Linux.

$ uname -a
Linux behemoth 5.1.8-arch1-1-ARCH #1 SMP PREEMPT Sun Jun 9 20:28:28 UTC 2019 x86_64 GNU/Linux
Comment 86 Alejandro Díaz-Caro 2019-06-13 20:53:55 UTC
fyi

$ sudo lshw -class storage
  *-storage                 
       description: Non-Volatile memory controller
       product: SK hynix
       vendor: SK hynix
       physical id: 0
       bus info: pci@0000:6d:00.0
       version: 00
       width: 64 bits
       clock: 33MHz
       capabilities: storage pm pciexpress msix nvm_express bus_master cap_list
       configuration: driver=nvme latency=0
       resources: irq:16 memory:a0000000-a0003fff
Comment 87 Sebastian 2019-07-08 08:31:00 UTC
I am having battery drain on the Dell XPS13 9370 as well. That's why some time ago I changed the kernel command line to mem_sleep_default=deep. Lately that causes some issues on Ubuntu 18.04 LTS with gdm3, as unlocking after suspending the notebook makes gdm3 freeze.

I am on kernel 4.18.0-25 and have an NVMe model SSDPEKKF512G8 NVMe INTEL 512GB.
Comment 88 Paul Menzel 2019-07-08 09:31:22 UTC
(In reply to Sebastian from comment #87)
> I am having battery drain on the Dell XPS13 9370 as well. That's why some
> time ago I changed the kernel command line to mem_sleep_default=deep. Lately
> that causes some issues on Ubuntu 18.04 LTS with gdm3, as unlocking after
> suspending the notebook makes gdm3 freeze.
> 
> I am on kernel 4.18.0-25 and have an NVMe model SSDPEKKF512G8 NVMe INTEL
> 512GB.

Your issue is unrelated to this bug report. Please report your issue to the Ubuntu bug tracker [1]. You might want to try the latest Linux kernel [2] as a data point on whether this is Linux kernel related and has been fixed in the meantime.


[1]: https://bugs.launchpad.net/
[2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
Comment 89 Sebastian 2019-07-08 09:37:31 UTC
I am not talking here about the gdm3 issue but the battery drain in S2 on a XPS13 9370. Why is this unrelated to this bug report?
Comment 90 Paul Menzel 2019-07-08 09:45:39 UTC
(In reply to Sebastian from comment #89)
> I am not talking here about the gdm3 issue but the battery drain in S2 on a
> XPS13 9370. Why is this unrelated to this bug report?

Sorry, I misunderstood that then. Please check whether the battery drain issue with s2idle still exists with the latest Linux version. (You should also contact Dell support if you bought the device with a GNU/Linux distribution.)
Comment 91 Mario Limonciello 2019-07-08 13:22:46 UTC
@All,

I'd like to mention that regarding battery drain on 9370 there is a patch series for putting NVME into proper sleep state that should be merged into 5.3.  This should hopefully help the remaining power drain issues on 9370 that have been seen in S2I.

You can either test this branch:
https://git.kernel.org/pub/scm/linux/kernel/git/kbusch/linux.git/log/?h=nvme-power or wait for the first 5.3rc to be cut.
Comment 92 Mario Limonciello 2019-07-11 02:11:55 UTC
The patches are in Linus' tree now, you can test from there and battery drain from NVME should be resolved in S2I.
Comment 93 Andy Wang 2019-08-12 01:44:14 UTC
I'm seeing a possibly related issue with my new XPS 13 9380.
I haven't measured loss on s2idle because I can't get it to stay sleeping.
I have a yubikey 5c nano, and whenever it's installed, the xps 13 9380 wakes up within a couple of minutes of suspending and stays awake.  The kernel logs seem to show that it's attempting to sleep every few minutes but wakes up immediately (or doesn't actually sleep, I haven't been able to tell).  Also, as soon as I plug in power (usb-c power) it wakes up and doesn't go to sleep.  The bios option of waking up on ac power is disabled.

Setting the sleep mode to deep instead of s2idle seems to solve this problem.  Is this related? or something entirely separate?
Comment 94 Ryan J Schave 2019-08-12 02:17:37 UTC
Created attachment 284329 [details]
attachment-26080-0.html


I will be out of the office until Monday, August 12th.  I will have limited access to email during this time. If you need immediate assistance please call my office at 586.263.1775 and press 1 for support or email support@eclipse-online.com.

Thanks,

Ryan
Comment 95 Mario Limonciello 2019-08-12 13:47:55 UTC
@Andy Wang:

With a USB device plugged in the CPU package will not go into as deep of a state, but it should still be using less power than most "active" use cases.

I believe, however, that a plugged-in USB device causing certain behaviors is a separate case from those reported on this bug.

Those in this bug I believe have issues with one of two things:
1) NVME not going into proper state (which is resolved in kernel 5.3)
2) ASPM not configured properly (Which can be caused by using TLP 1.1 or less or configuring the kernel ASPM policy to anything but "default").

@All others:
(Btw) It would be good if anyone who was affected by this bug could confirm using 5.3rc3 or later that they don't have any remaining issues so we can close this bug.
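The two conditions listed above (a kernel at 5.3 or later and the ASPM policy left at "default") can be checked with a short script. This is a sketch only: the helper names are hypothetical, and the policy text format assumed here is the bracketed list that the kernel normally exposes in /sys/module/pcie_aspm/parameters/policy.

```python
import re


def kernel_at_least(release, minimum=(5, 3)):
    """Compare the major.minor part of a kernel release string to a minimum."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(m.group(1)), int(m.group(2))) >= minimum


def active_aspm_policy(text):
    """Return the bracketed (active) entry from the ASPM policy text."""
    m = re.search(r"\[(\w+)\]", text)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(kernel_at_least("5.3.0-rc4"))
    print(active_aspm_policy("[default] performance powersave powersupersave\n"))
```

On a live system the release string could come from `platform.release()` and the policy text from reading the sysfs file mentioned above. A distribution kernel with the NVMe patches backported would still report an older version, so a failed version check is only a hint.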
Comment 96 Nitin 2019-08-30 19:03:27 UTC
(In reply to Mario Limonciello from comment #95)
> @Andy Wang:
> 
> With a USB device plugged in the CPU package will not go into as deep of a
> state, but it should still be using less power than most "active" use cases.
> 
> I believe, however, that a plugged-in USB device causing certain behaviors is
> a separate case from those reported on this bug.
> 
> Those in this bug I believe have issues with one of two things:
> 1) NVME not going into proper state (which is resolved in kernel 5.3)
> 2) ASPM not configured properly (Which can be caused by using TLP 1.1 or
> less or configuring the kernel ASPM policy to anything but "default").
> 
> @All others:
> (Btw) It would be good if anyone who was affected by this bug could confirm
> using 5.3rc3 or later that they don't have any remaining issues so we can
> close this bug.

I can confirm that this problem is fixed on my 9370 with kernel version 5.3rc4
Comment 97 Mario Limonciello 2019-09-03 00:41:03 UTC
@Srinivas, can you close this based on #96?
Comment 98 Timur Kristóf 2019-09-03 15:00:49 UTC
I can confirm that S2 sleep is considerably better with 5.3 than previously (though I don't use the original SSD that came with the 9370 anymore). The 9370 can now actually go a couple of days in S2.
Comment 99 Leho Kraav 2019-09-09 15:08:36 UTC
(In reply to Mario Limonciello from comment #95)
> 
> Those in this bug I believe have issues with one of two things:
> 1) NVME not going into proper state (which is resolved in kernel 5.3)
> 2) ASPM not configured properly (Which can be caused by using TLP 1.1 or
> less or configuring the kernel ASPM policy to anything but "default").
> 
> @All others:
> (Btw) It would be good if anyone who was affected by this bug could confirm
> using 5.3rc3 or later that they don't have any remaining issues so we can
> close this bug.

Hi @Mario.

Does this patchset have a chance of getting backported to 4.19?
Comment 100 Mario Limonciello 2019-09-09 15:12:05 UTC
At least right now not via 4.19.y.  Distros certainly can backport it.  I know that ChromeOS has done this for their 4.19, so you can reference that if you want to try.
