Bug 215367

Summary: S0ix: not able to enter S0ix, low_power_idle_system_residency_us 0 - XPS 13 7390 ICL
Product: Power Management
Reporter: Leho Kraav (leho)
Component: Hibernation/Suspend
Assignee: David Box (david.e.box)
Status: NEEDINFO
Severity: normal
CC: felash, lenb, rajvi.jingar, rui.zhang
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 5.17.2
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: get_pcie_port_link_status

Description Leho Kraav 2021-12-19 17:47:50 UTC
This seems like a regression somewhere, and after running S0ixSelftestTool through multiple kernel versions, I'm starting to think it may even be at DELL BIOS level.

I recently started noticing the laptop running warm and the battery depleting fast while in suspend. Having gained s2idle chops at bug 204867, I knew where to look.

PROBLEM: indeed, low_power_idle_system_residency_us stays at 0.

I just tested kernels 5.10.87, 5.12.8 (my previous uptime record holder at 80 days, so definitely worked right), 5.12.19, 5.15.{5,7,10} - same SYS%LPI 0 failure now with all of them.

DELL BIOS updates may be involved: over the past month or so, I think I updated to both 1.11.0 and 1.12.0.

Unfortunately, today BIOS stopped me from downgrading back to 1.11.0, even though I have "BIOS Downgrade" setting enabled - not sure what's up with that.

(Learning: must set up continuous s2idle performance monitoring - quite difficult to tell what the likely culprit is now, half-blind.)
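A minimal sketch of what such monitoring could look like, assuming the counter is read once before and once after each suspend cycle. The sysfs path is the standard location of low_power_idle_system_residency_us; the function names and the 50% threshold are hypothetical and only for illustration.

```python
from pathlib import Path

# Cumulative time spent in S0ix, in microseconds (counter named in this report).
LPI_PATH = Path("/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us")


def read_lpi_us(path: Path = LPI_PATH) -> int:
    """Read the cumulative S0ix residency counter, in microseconds."""
    return int(path.read_text().strip())


def residency_percent(before_us: int, after_us: int, suspended_s: float) -> float:
    """Share of the suspend interval that was spent in S0ix, as a percentage."""
    if suspended_s <= 0:
        raise ValueError("suspend duration must be positive")
    return 100.0 * (after_us - before_us) / (suspended_s * 1_000_000)


def check_cycle(before_us: int, after_us: int, suspended_s: float,
                min_percent: float = 50.0):
    """Return (percent, ok). min_percent is an arbitrary, hypothetical threshold."""
    pct = residency_percent(before_us, after_us, suspended_s)
    return pct, pct >= min_percent
```

Calling read_lpi_us() from a pre-suspend and a post-resume hook and logging the result of check_cycle() would make a regression like this one visible on the first bad cycle rather than weeks later.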

Anyone with any thoughts on what to try next?

Current https://github.com/intel/S0ixSelftestTool/ output:

---Check S2idle path S0ix Residency---:

The system OS Kernel version is:
Linux papaya 5.15.10-gentoo+ #54 SMP PREEMPT Sat Dec 18 16:07:29 EET 2021 x86_64 Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz GenuineIntel GNU/Linux

---Check whether your system supports S0ix or not---:

Low Power S0 Idle is:1
Your system supports low power S0 idle capability.



---Check whether intel_pmc_core sysfs files exit---:

The pmc_core debug sysfs files are OK on your system.



---Judge PC10, S0ix residency available status---:
Test system does not support S0ix.y substate

Turbostat output: 
15.052794 sec
CPU%c1	CPU%c6	CPU%c7	GFX%rc6	Pkg%pc2	Pkg%pc3	Pkg%pc6	Pkg%pc7	Pkg%pc8	Pkg%pc9	Pk%pc10	SYS%LPI
1.43	0.00	97.99	20901.02	1.74	3.43	0.26	0.25	1.21	0.00	89.15	0.00
1.24	0.00	97.05	20900.83	1.74	3.43	0.26	0.25	1.21	0.00	89.15	0.00
2.57
1.52	0.00	97.97
1.49
1.12	0.00	98.55
1.05
1.22	0.00	98.37
1.25

CPU Core C7 residency after S2idle is: 97.99
GFX RC6 residency after S2idle is: 20901.02
CPU Package C-state 2 residency after S2idle is: 1.74
CPU Package C-state 3 residency after S2idle is: 3.43
CPU Package C-state 8 residency after S2idle is: 1.21
CPU Package C-state 9 residency after S2idle is: 0.00
CPU Package C-state 10 residency after S2idle is: 89.15
S0ix residency after S2idle is: 0.00

Need to debug which IP blocked S0ix since PC10 is observed.

---Debug S0ix failure scenario--PCH IP power gating check---:

Your system south port controller did not meet S0ix requirement: SPC

---Debug S0ix failure scenario--Setting No ACPI DSM Callback---:

Setting no ACPI DSM callback is not helpful to the S0ix residency.

---Debug PCIeport D states and link PM states---

Checking PCI Devices D3 States:
[59414.404733] nvme 0000:01:00.0: PCI PM: Suspend power state: D0
[59414.404750] nvme 0000:01:00.0: PCI PM: Skipped
[59414.406044] i801_smbus 0000:00:1f.4: PCI PM: Suspend power state: D0
[59414.406046] i801_smbus 0000:00:1f.4: PCI PM: Skipped
[59414.406051] intel_ish_ipc 0000:00:12.0: PCI PM: Suspend power state: D0
[59414.406052] intel_ish_ipc 0000:00:12.0: PCI PM: Skipped
[59414.407458] pcieport 0000:00:1d.0: PCI PM: Suspend power state: D0
[59414.407460] pcieport 0000:00:1d.0: PCI PM: Skipped
[59414.410835] snd_hda_intel 0000:00:1f.3: PCI PM: Suspend power state: D3hot
[59414.410850] i915 0000:00:02.0: PCI PM: Suspend power state: D3hot
[59414.417708] mei_me 0000:00:16.0: PCI PM: Suspend power state: D3hot
[59414.418077] intel-lpss 0000:00:15.1: PCI PM: Suspend power state: D3hot
[59414.418877] xhci_hcd 0000:00:14.0: PCI PM: Suspend power state: D3hot
[59414.418969] iwlwifi 0000:00:14.3: PCI PM: Suspend power state: D3hot
[59414.419679] proc_thermal 0000:00:04.0: PCI PM: Suspend power state: D3hot
[59414.420029] xhci_hcd 0000:00:0d.0: PCI PM: Suspend power state: D3cold


Checking PCI Devices tree diagram:
-[0000:00]-+-00.0  Intel Corporation Ice Lake-LP Processor Host Bridge/DRAM Registers
           +-02.0  Intel Corporation Iris Plus Graphics G7
           +-04.0  Intel Corporation Device 8a03
           +-05.0  Intel Corporation Image Signal Processor
           +-0d.0  Intel Corporation Ice Lake Thunderbolt 3 USB Controller
           +-12.0  Intel Corporation Ice Lake-LP Integrated Sensor Solution
           +-14.0  Intel Corporation Ice Lake-LP USB 3.1 xHCI Host Controller
           +-14.2  Intel Corporation Ice Lake-LP DRAM Controller
           +-14.3  Intel Corporation Ice Lake-LP PCH CNVi WiFi
           +-15.0  Intel Corporation Ice Lake-LP Serial IO I2C Controller #0
           +-15.1  Intel Corporation Ice Lake-LP Serial IO I2C Controller #1
           +-15.3  Intel Corporation Ice Lake-LP Serial IO I2C Controller #3
           +-16.0  Intel Corporation Ice Lake-LP Management Engine
           +-1d.0-[01]----00.0  KIOXIA Corporation Device 0001
           +-1f.0  Intel Corporation Ice Lake-LP LPC Controller
           +-1f.3  Intel Corporation Ice Lake-LP Smart Sound Technology Audio Controller
           +-1f.4  Intel Corporation Ice Lake-LP SMBus Controller
           \-1f.5  Intel Corporation Ice Lake-LP SPI Controller

The pcieport 0000:00:1d.0 ASPM enable status:
		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+

Pcieport is not in D3cold:
0000:00:1d.0


The PCIe bridge link power management state is:
0000:00:1d.0 Link is in L0

The link power management state of PCIe bridge: 0000:00:1d.0 is not expected. 
which is expected to be L1.1 or L1.2, or user would run this script again.


The L1SubCap of the failed 0000:00:1d.0 is:


The L1SubCtl1 of the failed 0000:00:1d.0 is:




The pcieroot port 0000:00:1d.0 ASPM setting is Enabled, its D state and Link PM are not expected,
please investigate or report a bug.


Checking slp_s0_debug_status:

Your system ModPHY lane Core domain has issue blocks S0ix.

Isolation suggestions:     
Check ModPHY related high speed I/O controller list:     
covering from XHCI, XDCI, SATA, PCIe (all instances), Gbe and SCC (UFS)

Your system Main PLL or Oscillator Crystal PLL has issue to power off during S2idle, which may block S0ix.     
Failed PLL:  OC_PLL_OFF
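
The "PCI PM: Suspend power state" lines in the log above can be summarized mechanically. The following is a rough sketch, not part of S0ixSelftestTool: it assumes the dmesg line format shown in this report, and the function names are hypothetical. A device listed as remaining in D0 is a candidate to look at, not proof of a fault.

```python
import re

# Matches e.g. "[59414.404733] nvme 0000:01:00.0: PCI PM: Suspend power state: D0"
PM_LINE = re.compile(
    r"\]\s+(?P<driver>\S+)\s+"
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+"
    r"PCI PM: Suspend power state: (?P<state>\S+)"
)


def suspend_states(dmesg_text: str) -> dict:
    """Map PCI address -> (driver, D-state) for every matching dmesg line."""
    states = {}
    for line in dmesg_text.splitlines():
        m = PM_LINE.search(line)
        if m:
            states[m["bdf"]] = (m["driver"], m["state"])
    return states


def stuck_in_d0(dmesg_text: str) -> list:
    """Sorted PCI addresses of devices that were left in D0 across suspend."""
    return sorted(
        bdf for bdf, (_driver, state) in suspend_states(dmesg_text).items()
        if state == "D0"
    )
```

Fed the "Checking PCI Devices D3 States" section of this report, stuck_in_d0() returns the NVMe drive, the SMBus controller, the ISH and the root port 0000:00:1d.0.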
Comment 1 Leho Kraav 2021-12-19 18:14:15 UTC
PCIe port 0000:00:1d.0 leads to the NVMe drive, and there is a firmware upgrade for this drive: https://www.dell.com/support/home/en-us/drivers/driversdetails?driverid=14dfw&oscode=wt64a&productcode=xps-13-7390-2-in-1-laptop. However, there seems to be no way to apply it on Linux; fwupd / LVFS doesn't seem to provide it.
Comment 2 Leho Kraav 2021-12-19 18:51:59 UTC
If there's something good that came out of this, it's that people discovered disabling BIOS "Signs of Life > Early DELL Logo" option allows S3 deep sleep to work, so I switched to that for time being.
Comment 3 Julien Wajsberg 2022-04-08 14:11:13 UTC
(In reply to Leho Kraav from comment #2)
> If there's something good that came out of this, it's that people discovered
> disabling BIOS "Signs of Life > Early DELL Logo" option allows S3 deep sleep
> to work, so I switched to that for time being.

This trick didn't work for me (XPS 15 9510 from 2021), but the patch in bug 215063 did.
Comment 4 Leho Kraav 2022-04-08 20:14:18 UTC
Still not working on 5.17.2

leho@papaya S0ixSelftestTool $ [git:main+?] sudo ./s0ix-selftest-tool.sh -s

---Check S2idle path S0ix Residency---:

The system OS Kernel version is:
Linux papaya 5.17.2-gentoo+ #63 SMP PREEMPT Fri Apr 8 21:18:26 EEST 2022 x86_64 Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz GenuineIntel GNU/Linux

---Check whether your system supports S0ix or not---:

Low Power S0 Idle is:1
Your system supports low power S0 idle capability.



---Check whether intel_pmc_core sysfs files exit---:

The pmc_core debug sysfs files are OK on your system.



---Judge PC10, S0ix residency available status---:
cat: /sys/kernel/debug/pmc_core/substate_residencies: No such file or directory
grep: /sys/kernel/debug/pmc_core/substate_residencies: No such file or directory
Test system does not support S0ix.y substate

Turbostat output: 
15.705451 sec
CPU%c1	CPU%c6	CPU%c7	GFX%rc6	Pkg%pc2	Pkg%pc3	Pkg%pc6	Pkg%pc7	Pkg%pc8	Pkg%pc9	Pk%pc10	SYS%LPI
1.97	0.00	97.17	8772.69	3.25	4.56	0.03	0.14	0.20	0.00	86.09	0.00
1.96	0.00	97.44	8770.97	3.25	4.56	0.03	0.14	0.20	0.00	86.09	0.00
1.83
1.86	0.00	96.71
2.58
1.61	0.00	97.23
2.09
2.15	0.00	97.32
1.67

CPU Core C7 residency after S2idle is: 97.17
GFX RC6 residency after S2idle is: 8772.69
CPU Package C-state 2 residency after S2idle is: 3.25
CPU Package C-state 3 residency after S2idle is: 4.56
CPU Package C-state 8 residency after S2idle is: 0.20
CPU Package C-state 9 residency after S2idle is: 0.00
CPU Package C-state 10 residency after S2idle is: 86.09
S0ix residency after S2idle is: 0.00
cat: /sys/kernel/debug/pmc_core/substate_residencies: No such file or directory

Need to debug which IP blocked S0ix since PC10 is observed.

---Debug S0ix failure scenario--PCH IP power gating check---:

Your system south port controller did not meet S0ix requirement: SPC
SPD
PCH IP: 6  - SPC                             	State: On
PCH IP: 13 - SPD                             	State: On

---Debug S0ix failure scenario--Setting No ACPI DSM Callback---:

Setting no ACPI DSM callback is not helpful to the S0ix residency.

---Debug PCIeport D states and link PM states---

Checking PCI Devices D3 States:
[ 1400.642437] nvme 0000:57:00.0: PCI PM: Suspend power state: D0
[ 1400.642440] nvme 0000:57:00.0: PCI PM: Skipped
[ 1400.646412] i801_smbus 0000:00:1f.4: PCI PM: Suspend power state: D0
[ 1400.646414] i801_smbus 0000:00:1f.4: PCI PM: Skipped
[ 1400.646417] intel_ish_ipc 0000:00:12.0: PCI PM: Suspend power state: D0
[ 1400.646419] intel_ish_ipc 0000:00:12.0: PCI PM: Skipped
[ 1400.647213] snd_hda_intel 0000:00:1f.3: PCI PM: Suspend power state: D3hot
[ 1400.651201] pcieport 0000:00:1d.0: PCI PM: Suspend power state: D0
[ 1400.651202] pcieport 0000:00:1d.0: PCI PM: Skipped
[ 1400.651208] i915 0000:00:02.0: PCI PM: Suspend power state: D3hot
[ 1400.659528] mei_me 0000:00:16.0: PCI PM: Suspend power state: D3hot
[ 1400.660903] intel-lpss 0000:00:15.1: PCI PM: Suspend power state: D3hot
[ 1400.663038] rtsx_pci 0000:58:00.0: PCI PM: Suspend power state: D3hot
[ 1400.663146] pcieport 0000:00:1d.7: PCI PM: Suspend power state: D0
[ 1400.663149] pcieport 0000:00:1d.7: PCI PM: Skipped
[ 1400.663564] proc_thermal 0000:00:04.0: PCI PM: Suspend power state: D3hot
[ 1400.663744] iwlwifi 0000:00:14.3: PCI PM: Suspend power state: D3hot
[ 1400.664751] xhci_hcd 0000:00:0d.0: PCI PM: Suspend power state: D3cold
[ 1400.664784] xhci_hcd 0000:00:14.0: PCI PM: Suspend power state: D3hot
[ 1400.824961] thunderbolt 0000:00:0d.3: PCI PM: Suspend power state: D3cold
[ 1400.825990] thunderbolt 0000:00:0d.2: PCI PM: Suspend power state: D3cold


Checking PCI Devices tree diagram:
-[0000:00]-+-00.0  Intel Corporation Ice Lake-LP Processor Host Bridge/DRAM Registers
           +-02.0  Intel Corporation Iris Plus Graphics G7
           +-04.0  Intel Corporation Device 8a03
           +-05.0  Intel Corporation Image Signal Processor
           +-07.0-[01-2b]--
           +-07.2-[2c-56]--
           +-0d.0  Intel Corporation Ice Lake Thunderbolt 3 USB Controller
           +-0d.2  Intel Corporation Ice Lake Thunderbolt 3 NHI #0
           +-0d.3  Intel Corporation Ice Lake Thunderbolt 3 NHI #1
           +-12.0  Intel Corporation Ice Lake-LP Integrated Sensor Solution
           +-14.0  Intel Corporation Ice Lake-LP USB 3.1 xHCI Host Controller
           +-14.2  Intel Corporation Ice Lake-LP DRAM Controller
           +-14.3  Intel Corporation Ice Lake-LP PCH CNVi WiFi
           +-15.0  Intel Corporation Ice Lake-LP Serial IO I2C Controller #0
           +-15.1  Intel Corporation Ice Lake-LP Serial IO I2C Controller #1
           +-15.3  Intel Corporation Ice Lake-LP Serial IO I2C Controller #3
           +-16.0  Intel Corporation Ice Lake-LP Management Engine
           +-1d.0-[57]----00.0  KIOXIA Corporation Device 0001
           +-1d.7-[58]----00.0  Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader
           +-1f.0  Intel Corporation Ice Lake-LP LPC Controller
           +-1f.3  Intel Corporation Ice Lake-LP Smart Sound Technology Audio Controller
           +-1f.4  Intel Corporation Ice Lake-LP SMBus Controller
           \-1f.5  Intel Corporation Ice Lake-LP SPI Controller

The pcieport 0000:00:1d.0 ASPM enable status:
		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+

Pcieport is not in D3cold:          
0000:00:1d.0

The pcieport 0000:00:1d.7 ASPM enable status:
		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+

Pcieport is not in D3cold:          
0000:00:1d.7

Available bridge device: 0000:00:07.0 0000:00:07.2 0000:00:1d.0 0000:00:1d.7

0000:00:07.0 Link is in Detect

The link power management state of PCIe bridge: 0000:00:07.0 is OK.

0000:00:07.2 Link is in Detect

The link power management state of PCIe bridge: 0000:00:07.2 is OK.

The PCIe bridge link power management state is:
0000:00:1d.0 Link is in L1

The link power management state of PCIe bridge: 0000:00:1d.0 is not expected. 
which is expected to be L1.1 or L1.2, or user would run this script again.


The L1SubCap of the failed 0000:00:1d.0 is:


The L1SubCtl1 of the failed 0000:00:1d.0 is:




The pcieroot port 0000:00:1d.0 ASPM setting is Enabled, its D state and Link PM are not expected,
please investigate or report a bug.


Checking slp_s0_debug_status:

Your system ModPHY lane Core domain has issue blocks S0ix.

Isolation suggestions:     
Check ModPHY related high speed I/O controller list:     
covering from XHCI, XDCI, SATA, PCIe (all instances), Gbe and SCC (UFS)

Your system Main PLL or Oscillator Crystal PLL has issue to power off during S2idle, which may block S0ix.     
Failed PLL:  OC_PLL_OFF
Comment 5 Rajvi Jingar 2022-04-23 01:53:14 UTC
Hi @leho, to get the pcieport link state, can you please run the attached script in a loop (for a minute, 10 times a sec)?

Also, please enable warn_on_s0ix_failures before the suspend test and provide the log.
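
For reference, the requested sampling pattern (10 times a second for one minute) could be sketched as below. This is only an illustration of the loop: the contents of the attached script are not shown in this thread, so sample stands in for whatever reads the link state, and all names here are hypothetical.

```python
import time


def poll(sample, duration_s=60.0, rate_hz=10.0,
         clock=time.monotonic, sleep=time.sleep):
    """Call sample() rate_hz times per second for duration_s seconds.

    Returns a list of (elapsed_seconds, value) pairs. clock and sleep are
    injectable so the loop can be exercised without real waiting.
    """
    interval = 1.0 / rate_hz
    start = clock()
    results = []
    ticks = 0
    while True:
        now = clock()
        if now - start >= duration_s:
            break
        results.append((now - start, sample()))
        ticks += 1
        # Sleep until the next tick measured from start, so drift does not add up.
        delay = start + ticks * interval - clock()
        if delay > 0:
            sleep(delay)
    return results
```

With the defaults this collects roughly 600 samples; passing a callable that runs the attached script and returns its output would produce a timestamped trace of the link state.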
Comment 6 Rajvi Jingar 2022-04-23 01:54:55 UTC
Created attachment 300794 [details]
get_pcie_port_link_status

script to get link status for pcie ports.
Comment 7 Rajvi Jingar 2022-06-02 17:57:19 UTC
Hi Leho, do you have more details or any updates on this?
Comment 8 Leho Kraav 2022-06-21 10:02:54 UTC
Hi. I've been lacking bandwidth, but will get to it and provide the data.
Comment 9 Leho Kraav 2024-02-08 20:12:06 UTC
Been living with deep sleep for a while, but this week updated to Kernel 6.6, and decided to give this another try. Latest S0ixSelftestTool build output looks promising, but let's see how it does in overnight sleep next.


```
⌁ [leho:S0ixSelftestTool]└2 main+* 1 ± sudo ./s0ix-selftest-tool.sh -s

---Check S2idle path S0ix Residency---:

The system OS Kernel version is:
Linux papaya 6.6.16-gentoo+ #2 SMP PREEMPT_DYNAMIC Thu Feb  8 20:20:41 EET 2024 x86_64 Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz GenuineIntel GNU/Linux

---Check whether your system supports S0ix or not---:

Low Power S0 Idle is:1
Your system supports low power S0 idle capability.



---Check whether intel_pmc_core sysfs files exit---:

The pmc_core debug sysfs files are OK on your system.



---Judge PC10, S0ix residency available status---:
cat: /sys/kernel/debug/pmc_core/substate_residencies: No such file or directory
grep: /sys/kernel/debug/pmc_core/substate_residencies: No such file or directory
Test system does not support S0ix.y substate

Turbostat output: 
15.055275 sec
CPU%c1  CPU%c6  CPU%c7  GFX%rc6 Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 Pkg%pc8 Pkg%pc9 Pk%pc10 SYS%LPI
1.53    0.00    97.50   16002.02        3.32    7.02    0.11    0.05    0.03    0.00    84.78   80.74
1.64    0.00    97.61   16002.59        3.32    7.02    0.11    0.05    0.03    0.00    84.79   80.74
1.59
1.68    0.00    96.60
2.25
1.09    0.00    98.12
1.20
1.51    0.00    97.66
1.32

CPU Core C7 residency after S2idle is: 97.50
GFX RC6 residency after S2idle is: 16002.02
CPU Package C-state 2 residency after S2idle is: 3.32
CPU Package C-state 3 residency after S2idle is: 7.02
CPU Package C-state 8 residency after S2idle is: 0.03
CPU Package C-state 9 residency after S2idle is: 0.00
CPU Package C-state 10 residency after S2idle is: 84.78
S0ix residency after S2idle is: 80.74
cat: /sys/kernel/debug/pmc_core/substate_residencies: No such file or directory

Congratulations! Your system achieved S2idle S0ix residency: 80.74
```
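
For comparing runs like the ones in this thread, the turbostat summary can be parsed rather than read by eye. A small sketch, assuming the layout shown above (a header line containing SYS%LPI followed by the system-wide summary row); the function names and the 50% PC10 threshold are hypothetical.

```python
def parse_turbostat_summary(text: str) -> dict:
    """Return {column: value} from the header line and the first data row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for i, ln in enumerate(lines):
        cols = ln.split()  # handles both tab- and space-separated output
        if "SYS%LPI" in cols and i + 1 < len(lines):
            values = lines[i + 1].split()
            if len(values) != len(cols):
                raise ValueError("summary row does not match header")
            return {c: float(v) for c, v in zip(cols, values)}
    raise ValueError("no turbostat header with SYS%LPI found")


def pc10_without_s0ix(summary: dict, pc10_min: float = 50.0) -> bool:
    """True when package C10 is reached but S0ix residency is still zero."""
    return summary["Pk%pc10"] >= pc10_min and summary["SYS%LPI"] == 0.0
```

The 5.15.10 and 5.17.2 outputs earlier in this report satisfy pc10_without_s0ix(), which is the "PC10 is observed, S0ix blocked" case the tool goes on to debug; the 6.6.16 output does not.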