Bug 211879

Summary: S0ix: Unable to achieve S0ix on Dell XPS 13 9310
Product: Power Management
Reporter: Stelian Pop (stelian)
Component: Hibernation/Suspend
Assignee: Rafael J. Wysocki (rjw)
Severity: normal
CC: eugeniu, kvalo, Rondom, rui.zhang, wendy.wang, wil.thomason, wt
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.12
Tree: Mainline
Regression: No
Attachments: ts.out
dmesg with drm.debug=0xe
ts.out with drm.debug=0xe
BIOS v2.0.0 DSDT
Kernel output
Turbostat output
Working turbostat output
Working dmesg output

Description Stelian Pop 2021-02-21 17:42:47 UTC
Created attachment 295385 [details]

This is on the late 2020 9310 XPS edition, with the AX500 Wifi card.

s2idle enter / exit seem to work without any obvious problems.

However, S0ix residency is not achieved (if my understanding is correct), so the battery drains at about 5% per hour (50% overnight).

Some debug traces:

# cat /sys/class/drm/card0/power/rc6_residency_ms

# cat /sys/kernel/debug/pmc_core/package_cstate_show
Package C2 : 815452306
Package C3 : 440007007
Package C6 : 0
Package C7 : 0
Package C8 : 0
Package C9 : 0
Package C10 : 0

# dmesg -C && ./turbostat -o ts.out rtcwake -m freeze -s 60 && dmesg > dmesg.out

./turbostat -o ts-idle.out sleep 60
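
The package_cstate_show counters above can be summarized by the deepest package C-state that has a non-zero counter. The following is a minimal sketch, not part of the original report: the function name deepest_pc_state is hypothetical, and it assumes the "Package C<n> : <count>" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the deepest package C-state whose counter
# is non-zero. Reads "Package C<n> : <count>" lines from stdin.
deepest_pc_state() {
    awk -F' *: *' '
        /^Package C[0-9]+/ {
            n = $1
            sub(/^Package C/, "", n)          # keep only the state number
            if ($2 + 0 > 0 && n + 0 > max) {  # non-zero counter, deeper state
                max = n + 0
                found = 1
            }
        }
        END { if (found) print "C" max; else print "none" }
    '
}

# Example (as root, debugfs mounted):
#   deepest_pc_state < /sys/kernel/debug/pmc_core/package_cstate_show
```

With the counters from this report the helper would print C3, which matches the observation that the package never goes deeper than PC3.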

Feel free to ask for further information if needed!

Thanks !
Comment 1 Stelian Pop 2021-02-21 17:43:28 UTC
Created attachment 295387 [details]
Comment 2 Stelian Pop 2021-02-21 20:54:47 UTC
I should also mention that the BIOS has been updated to the latest available version from DELL at this date (v2.0.0).
Comment 3 Stelian Pop 2021-02-22 11:39:12 UTC
/sys/kernel/debug/dri/0/i915_dmc_info shows zero DC5 -> DC6 transitions, and the counter for DC3 -> DC5 transitions keeps incrementing even across suspend cycles. From my reading of the documentation ("By design, DC5 value is reset after DC9 entry."), this would mean that DC9 is never entered.

I've booted with drm.debug=0xe and captured a new dmesg log over the suspend cycle. Just before going into suspend, I see:

i915 0000:00:02.0: [drm:intel_display_power_suspend_late [i915]] Enabling DC9
i915 0000:00:02.0: [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 08

I'm not sure how to interpret this DC5/DC9 discrepancy.

Logs attached.
Comment 4 Stelian Pop 2021-02-22 11:39:51 UTC
Created attachment 295393 [details]
dmesg with drm.debug=0xe
Comment 5 Stelian Pop 2021-02-22 11:40:24 UTC
Created attachment 295395 [details]
ts.out with drm.debug=0xe
Comment 6 Stelian Pop 2021-02-25 09:02:48 UTC
It looks like it worked with a pre-2.0.0 BIOS: http://lists.infradead.org/pipermail/ath11k/2021-February/001092.html

Is there anything I can do other than wait for Dell to release an updated BIOS?

DSDT attached.
Comment 7 Stelian Pop 2021-02-25 09:03:52 UTC
Created attachment 295437 [details]
BIOS v2.0.0 DSDT
Comment 8 Warren Turkal 2021-03-07 04:43:18 UTC
I have a question. Does this happen on the XPS 13 without the AX500 card (the developer model) and the 2.0.0 bios?

FWIW, I have one of these laptops, and suspend is totally broken for me. It also appears to not reboot via software trigger (like by running `systemctl reboot`).

I am running the very tip of Linus' master branch right now.

Suspend was working on earlier kernels with an earlier BIOS version. I am not sure if this is the BIOS or the OS. Do we happen to know if suspend is broken on the Windows OS that these laptops ship with? I have blown Windows away on my laptop, so I actually don't know.
Comment 9 Stelian Pop 2021-03-07 08:26:52 UTC
I kept W10 on my laptop ("just in case"), but I never use it.

If somebody knows how to check if S0ix residency is achieved under W10, I can easily test.
Comment 10 Stelian Pop 2021-03-07 17:03:42 UTC
I discovered the W10 tool powercfg.exe /SleepStudy, and its output (covering about 2 hours with the laptop lid closed) seems to show that sleep works well.

Comment 11 Stelian Pop 2021-03-07 17:04:14 UTC
Created attachment 295699 [details]
W10 SleepStudy output
Comment 12 EP 2021-03-19 03:13:13 UTC
I am experiencing a similar issue: ~2.5 W drain when suspended.
Also late 2020 XPS 9310 with 2.0 BIOS on 5.11.7-051107 kernel.

With or without powertop --auto-tune I get zero low power residency before and after suspend:
# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us 
# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
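
Since the two low_power_idle_*_residency_us files hold cumulative counters in microseconds, the residency for one suspend cycle is the difference between a reading taken before and one taken after. The sketch below is an illustration only: the function name lpi_residency_pct is hypothetical, and the elapsed time is assumed to be supplied by the caller (for example the rtcwake duration).

```shell
#!/bin/sh
# Hypothetical helper: percentage of an interval spent in low-power idle.
# Usage: lpi_residency_pct BEFORE_US AFTER_US ELAPSED_SECONDS
lpi_residency_pct() {
    awk -v b="$1" -v a="$2" -v s="$3" 'BEGIN {
        if (s <= 0) { print "n/a"; exit }   # avoid division by zero
        printf "%.1f\n", (a - b) / (s * 1000000) * 100
    }'
}

# Example (as root, around a 60 s suspend):
#   f=/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
#   before=$(cat "$f"); rtcwake -m freeze -s 60; after=$(cat "$f")
#   lpi_residency_pct "$before" "$after" 60
```

A counter that does not move across the suspend cycle gives 0.0, which is the failure described in this comment.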

# dmesg | grep i915
[    1.476200] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/tgl_dmc_ver2_08.bin (v2.8)
[    1.503412] i915 0000:00:02.0: [drm] Panel advertises DPCD backlight support, but VBT disagrees. If your backlight controls don't work try booting with i915.enable_dpcd_backlight=1. If your machine needs this, please file a _new_ bug report on drm/i915, see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details.
[    2.752189] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    2.763008] fbcon: i915drmfb (fb0) is primary device
[    2.810732] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[    6.454859] i915 0000:00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun
[    9.767374] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    9.993838] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])

Finally, the suspend/resume sequence:
# dmesg
[ 5401.965882] PM: suspend entry (s2idle)
[ 5401.986966] Filesystems sync: 0.021 seconds
[ 5401.986971] PM: Preparing system for sleep (s2idle)
[ 5401.996100] Freezing user space processes ... (elapsed 0.002 seconds) done.
[ 5401.998501] OOM killer disabled.
[ 5401.998502] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 5401.999785] PM: Suspending system (s2idle)
[ 5401.999787] printk: Suspending console(s) (use no_console_suspend to debug)
[ 5402.705170] mhi mhi0: Allowing M3 transition
[ 5402.705192] mhi mhi0: Wait for M3 completion
[ 5402.706157] PM: suspend of devices complete after 705.843 msecs
[ 5402.706166] PM: start suspend of devices complete after 706.347 msecs
[ 5402.730405] PM: late suspend of devices complete after 24.230 msecs
[ 5402.757914] ACPI: EC: interrupt blocked
[ 5402.807653] PM: noirq suspend of devices complete after 71.799 msecs
[ 5402.810959] PM: suspend-to-idle
[ 5402.811139] PM: ACPI EC GPE dispatched
[ 5414.626652] PM: Timekeeping suspended for 10.995 seconds
[ 5414.627284] PM: ACPI EC GPE dispatched
[ 5414.632167] PM: Wakeup after ACPI Notify sync
[ 5414.632169] PM: resume from suspend-to-idle
[ 5414.636464] ACPI: EC: interrupt unblocked
[ 5415.109435] PM: noirq resume of devices complete after 473.633 msecs
[ 5415.130725] PM: early resume of devices complete after 1.719 msecs
[ 5415.139405] mhi mhi0: Entered with PM state: M3, MHI state: M3
[ 5415.246215] pcieport 10000:e0:06.0: can't derive routing for PCI INT A
[ 5415.246217] nvme 10000:e1:00.0: PCI INT A: no GSI
[ 5415.402532] nvme nvme0: 8/0/0 default/read/poll queues
[ 5415.501255] PM: resume of devices complete after 370.508 msecs
[ 5415.510871] PM: Finishing wakeup.
Comment 13 Stelian Pop 2021-04-08 21:27:19 UTC
Some news: I have updated tonight to BIOS v2.1.1, and seen no change in behaviour: S0ix is still broken here.

FWIW, I'm running 5.11.12 at the moment.
Comment 14 Stelian Pop 2021-04-23 19:27:39 UTC
Updated again to BIOS v2.2.0, Linux v5.11.16.

Still no change :(
Comment 15 EP 2021-04-27 01:25:34 UTC
I'm on 5.12.0-051200-generic, with v2.2.0 BIOS.

Still zero low power residency before and after suspend:
# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us 
# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
A more accurate estimate of the power drain during suspend is 1.6W.

On the bright side, my sound card is now recognized out of the box.
Comment 16 wendy.wang 2021-05-25 06:48:00 UTC
Do you still see only PC3 in the output of "cat /sys/kernel/debug/pmc_core/package_cstate_show"?

In order to get S0ix, the first step is to enter PC10. It looks like your gfx driver and DMC FW are loaded.
Can you please double-check that your audio driver is loaded correctly?
Please provide the "lspci -vvv" output.

I also suggest checking the PCI device D3 status.
The commands are below:
echo -n "file pci-driver.c +p" > /sys/kernel/debug/dynamic_debug/control
echo N > /sys/module/printk/parameters/console_suspend
echo 1 > /sys/power/pm_debug_message
turbostat -o tc.out rtcwake -m freeze -s 60
After resume, check the turbostat log (tc.out) and the dmesg log: dmesg | grep "PCI PM"
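
Once the PCI PM debug messages are enabled as described above, the dmesg output can be filtered for devices that stayed in D0 during suspend. This is a rough sketch assuming the "PCI PM: Suspend power state: <state>" message format quoted later in this thread; the function name pci_devices_left_in_d0 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list "<pci-address> <driver>" for every device whose
# suspend power state was logged as D0. Reads dmesg text from stdin.
pci_devices_left_in_d0() {
    awk '/PCI PM: Suspend power state: D0$/ {
        sub(/^\[[^]]*\] */, "")   # drop the "[ timestamp]" prefix if present
        dev = $2
        sub(/:$/, "", dev)        # strip the colon after the PCI address
        print dev, $1
    }'
}

# Example:
#   dmesg | pci_devices_left_in_d0
```

A device listed here is only a candidate to look at, not necessarily what blocks S0ix.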
Comment 17 wendy.wang 2021-05-25 06:50:29 UTC
Regarding the audio driver: you are currently running snd_hda_intel, which is a legacy driver.

You can try the SOF driver instead.
In /etc/modprobe.d/alsa-base.conf, add one line:
options snd-intel-dspcfg dsp_driver=3

Then reboot and double check the PC10 status.
Comment 18 Stelian Pop 2021-05-25 08:57:07 UTC
I have modified the sound config, and confirmed it used the correct driver:

0000:00:1f.3 Multimedia audio controller: Intel Corporation Tiger Lake-LP Smart Sound Technology Audio Controller (rev 20)
	Subsystem: Dell Tiger Lake-LP Smart Sound Technology Audio Controller
	Flags: bus master, fast devsel, latency 64, IRQ 230, IOMMU group 17
	Memory at 60552d8000 (64-bit, non-prefetchable) [size=16K]
	Memory at 6055000000 (64-bit, non-prefetchable) [size=1M]
	Capabilities: [50] Power Management version 3
	Capabilities: [80] Vendor Specific Information: Len=14 <?>
	Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Kernel driver in use: sof-audio-pci-intel-tgl
	Kernel modules: snd_hda_intel, snd_sof_pci_intel_tgl

Still no luck:

# cat /sys/kernel/debug/pmc_core/package_cstate_show
Package C2 : 971414694
Package C3 : 143067721
Package C6 : 0
Package C7 : 0
Package C8 : 0
Package C9 : 0
Package C10 : 0

I have activated the PCI PM debug messages; the results are attached below. The (slightly edited) dmesg contains several suspend attempts: in the first ones my USB-C switch was connected (with ethernet, HDMI, power), and in the last one the switch was disconnected.
Comment 19 Stelian Pop 2021-05-25 08:58:56 UTC
Created attachment 296975 [details]
Kernel output
Comment 20 Stelian Pop 2021-05-25 08:59:26 UTC
Created attachment 296977 [details]
Turbostat output
Comment 21 wendy.wang 2021-05-25 11:57:51 UTC
Thanks for the log.
From dmesg, ISH stays in D0 and does not enter D3. Would you mind disabling ISH in the BIOS and rechecking the Package C-state?
[ 1098.925936] intel_ish_ipc 0000:00:12.0: PCI PM: Suspend power state: D0
[ 1098.925939] intel_ish_ipc 0000:00:12.0: PCI PM: Skipped
Comment 22 Stelian Pop 2021-05-25 20:43:54 UTC

That ISH device seems to be an ALS.

I looked all over the BIOS, there is nothing obviously related to a light sensor. I tried disabling other things like keyboard illumination etc, to no avail.

So what next ?
Comment 23 wendy.wang 2021-05-26 03:15:52 UTC

I compared your PCI log with mine and realized that ISH is not the problem (I did not notice this before, sorry about that); your PCIe root ports in D3hot are the likely issue.

[ 1098.929935] pcieport 10000:e0:06.0: PCI PM: Suspend power state: D3hot
[ 1098.966037] pcieport 0000:00:1c.0: PCI PM: Suspend power state: D3hot
[ 1098.966050] pcieport 0000:00:1d.0: PCI PM: Suspend power state: D3hot

Meanwhile, I notice that your system has the VMD device enabled. Can you please show your PCI topology with the command: lspci -tvv

Please check whether you can disable VMD in the BIOS and whether that gives any deeper Package C-state.
For S0ix, the mainline kernel VMD Linux driver currently has an issue putting the NVMe PCIe root port into D3cold; a fix is a work in progress.
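
One quick hint that VMD is active is visible in the logs in this thread: the NVMe device and its root port appear under PCI domain 10000 (for example 10000:e0:06.0) rather than domain 0000. The sketch below extracts such addresses from any text; it is a heuristic only, the function name pci_nonzero_domain_devices is hypothetical, and it assumes that devices behind VMD are the only ones outside domain 0000 on this machine.

```shell
#!/bin/sh
# Hypothetical helper: print the unique PCI addresses found on stdin whose
# domain is not 0000. Relies on "grep -o", available in GNU and BSD grep.
pci_nonzero_domain_devices() {
    grep -oE '[0-9a-f]{4,}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' |
        grep -v '^0000:' |
        sort -u
}

# Example:
#   dmesg | pci_nonzero_domain_devices
```

An empty result suggests that no device sits behind VMD; checking the lspci output requested above remains the more direct way to confirm.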
Comment 24 Stelian Pop 2021-05-26 10:01:27 UTC
Many thanks, changing the disk mode in the BIOS from RAID to AHCI did the job.

It seems to work perfectly now, even with my external HDMI/ethernet/USB switch attached.

# cat /sys/kernel/debug/pmc_core/package_cstate_show
Package C2 : 38306888
Package C3 : 23144768
Package C6 : 26142871
Package C7 : 3306
Package C8 : 432121
Package C9 : 0
Package C10 : 117466089

# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us 

# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us

I'll attach below the new turbostat output and dmesg (first suspend is without any external devices connected, second one is with the external USB-C switch attached).
Comment 25 Stelian Pop 2021-05-26 10:02:07 UTC
Created attachment 296983 [details]
Working turbostat output
Comment 26 Stelian Pop 2021-05-26 10:02:38 UTC
Created attachment 296985 [details]
Working dmesg output
Comment 27 wendy.wang 2021-05-26 14:38:26 UTC
I'm glad to hear you finally got S0ix working on your machine.
I checked your working dmesg output, and it shows the following: NVMe and its PCIe root port are in D0, yet S0ix works? I'm a little confused.

[  219.572569] nvme 0000:01:00.0: PCI PM: Suspend power state: D0
[  219.572585] nvme 0000:01:00.0: PCI PM: Skipped
[  219.572665] ACPI: EC: interrupt blocked
[  219.584712] i801_smbus 0000:00:1f.4: PCI PM: Suspend power state: D0
[  219.584715] i801_smbus 0000:00:1f.4: PCI PM: Skipped
[  219.590232] intel_ish_ipc 0000:00:12.0: PCI PM: Suspend power state: D0
[  219.590234] intel_ish_ipc 0000:00:12.0: PCI PM: Skipped
[  219.591707] i915 0000:00:02.0: PCI PM: Suspend power state: D3hot
[  219.592516] pcieport 0000:00:06.0: PCI PM: Suspend power state: D0
[  219.592517] pcieport 0000:00:06.0: PCI PM: Skipped
[  219.605385] intel-lpss 0000:00:1e.0: PCI PM: Suspend power state: D3hot
[  219.609187] rtsx_pci 0000:73:00.0: PCI PM: Suspend power state: D3hot
[  219.609345] ath11k_pci 0000:72:00.0: PCI PM: Suspend power state: D3hot
[  219.609415] mei_me 0000:00:16.0: PCI PM: Suspend power state: D3hot
[  219.610393] intel-lpss 0000:00:15.0: PCI PM: Suspend power state: D3hot
[  219.611014] intel-lpss 0000:00:15.1: PCI PM: Suspend power state: D3hot
[  219.612414] proc_thermal 0000:00:04.0: PCI PM: Suspend power state: D3hot
[  219.612967] xhci_hcd 0000:00:14.0: PCI PM: Suspend power state: D3hot
[  219.616413] xhci_hcd 0000:00:0d.0: PCI PM: Suspend power state: D3cold
[  219.628496] pcieport 0000:00:1c.0: PCI PM: Suspend power state: D3hot
[  219.629573] thunderbolt 0000:00:0d.2: PCI PM: Suspend power state: D3cold
[  219.630786] thunderbolt 0000:00:0d.3: PCI PM: Suspend power state: D3cold
[  219.632486] pcieport 0000:00:1d.0: PCI PM: Suspend power state: D3hot
Comment 28 Stelian Pop 2021-05-26 15:42:27 UTC
Turbostat output says CPU%LPI and SYS%LPI are at 95% or so, so I think S0ix works.

Maybe the power consumption isn't great if some devices are left in D0, I don't know.

I'll leave the laptop overnight in suspend, and see the battery percentage tomorrow morning.
Comment 29 wendy.wang 2021-05-27 02:17:57 UTC
Talked to our kernel developer, NVMe D0 should be expected for Kabylake to enter S0ix. Something else on the board blocks the S0ix.
Our engineer will try to reproduce the problem first, then update here.
Comment 30 wendy.wang 2021-05-27 02:28:05 UTC
(In reply to wendy.wang from comment #29)
> Talked to our kernel developer, NVMe D0 should be expected for Kabylake to
> enter S0ix. Something else on the board blocks the S0ix.
> Our engineer will try to reproduce the problem first, then update here.

Please ignore this message.

I learned from our kernel developer that NVMe D0 is expected in AHCI mode for the Dell XPS 13 9310 (Tiger Lake) to enter S0ix.
We can close this bug since S0ix is achieved and residency is high.
If you are interested in the VMD mode status, here is the bugzilla link to track the VMD driver update: https://bugzilla.kernel.org/show_bug.cgi?id=213047
Comment 31 Stelian Pop 2021-05-27 07:12:52 UTC
For the record, overnight (10 hours span) the battery went down from 98% to 93%, so s2idle S0ix power saving definitely works (before switching from RAID to AHCI my laptop barely survived overnight).
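
The figures above work out to (98 - 93) / 10 = 0.5% per hour, compared with the roughly 5% per hour reported in the description. A small sketch of that arithmetic follows; the function name drain_pct_per_hour is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: average battery drain in percent per hour.
# Usage: drain_pct_per_hour START_PCT END_PCT HOURS
drain_pct_per_hour() {
    awk -v s="$1" -v e="$2" -v h="$3" 'BEGIN {
        if (h <= 0) { print "n/a"; exit }   # avoid division by zero
        printf "%.2f\n", (s - e) / h
    }'
}

# Example with the overnight figures from this comment:
#   drain_pct_per_hour 98 93 10
```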

I still haven't been able to make opportunistic S0ix work (achieve S0ix residency during normal usage), so it's not perfect yet.