Bug 212355

Summary: S0ix: no s0ix residency during suspend with VMD mode - TGL-H
Product: Power Management Reporter: KaiChuan-Hsieh (kaichuan.hsieh)
Component: Hibernation/Suspend Assignee: David Box (david.e.box)
Status: NEEDINFO ---    
Severity: blocking CC: bugzilla, david.e.box, kaichuan.hsieh, leho, rajvi.jingar, rui.zhang
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 5.12.0-rc3 Subsystem:
Regression: No Bisected commit-id:
Attachments: kernel log shows s0ix failed

Description KaiChuan-Hsieh 2021-03-19 07:20:09 UTC
Created attachment 295951 [details]
kernel log shows s0ix failed

A Dell platform with a TGL-H CPU has no S0ix residency during suspend. The kernel log shows that the main PLL stays on.

kernel: intel_pmc_core INT33A1:00: CPU did not enter SLP_S0!!! (S0ix cnt=0)
......

kernel: intel_pmc_core INT33A1:00: USB2PLL_OFF_STS                1                             
kernel: intel_pmc_core INT33A1:00: PCIe/USB3.1_Gen2PLL_OFF_STS    1                             
kernel: intel_pmc_core INT33A1:00: PCIe_Gen3PLL_OFF_STS           1                             
kernel: intel_pmc_core INT33A1:00: OPIOPLL_OFF_STS                1                             
kernel: intel_pmc_core INT33A1:00: OCPLL_OFF_STS                  1                             
kernel: intel_pmc_core INT33A1:00: MainPLL_OFF_STS                0                             
kernel: intel_pmc_core INT33A1:00: MIPIPLL_OFF_STS                1                             
kernel: intel_pmc_core INT33A1:00: Fast_XTAL_Osc_OFF_STS          0
kernel: intel_pmc_core INT33A1:00: AC_Ring_Osc_OFF_STS            0
kernel: intel_pmc_core INT33A1:00: MC_Ring_Osc_OFF_STS            0                             
kernel: intel_pmc_core INT33A1:00: SATAPLL_OFF_STS                1                             
kernel: intel_pmc_core INT33A1:00: XTAL_USB2PLL_OFF_STS           1                             
kernel: intel_pmc_core INT33A1:00:
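
For a longer log, the status lines above can be filtered with a small script. This is a minimal sketch, assuming (as the description does for MainPLL_OFF_STS) that a value of 0 means the clock source was not powered off; the function name `clocks_still_on` is made up for illustration and is not part of any kernel tool.

```python
import re

# Matches entries such as "MainPLL_OFF_STS                0" in the log.
# findall() is used so that two entries fused onto one line are both found.
_STS_RE = re.compile(r"(\S+_OFF_STS)\s+([01])\b")


def clocks_still_on(log_text):
    """Return the *_OFF_STS names whose value is 0 (assumed: not powered off)."""
    return [name for name, value in _STS_RE.findall(log_text) if value == "0"]


sample = """\
kernel: intel_pmc_core INT33A1:00: OCPLL_OFF_STS                  1
kernel: intel_pmc_core INT33A1:00: MainPLL_OFF_STS                0
kernel: intel_pmc_core INT33A1:00: Fast_XTAL_Osc_OFF_STS          0
"""
print(clocks_still_on(sample))
```

Run against the full attachment, this would list every entry reported as 0, which may help narrow down what is keeping the platform out of SLP_S0.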
Comment 1 Zhang Rui 2021-06-01 05:25:25 UTC
In the BIOS, are you using VMD mode or AHCI mode? If you're using VMD mode, can you please check if there is any difference with AHCI mode?
Comment 2 KaiChuan-Hsieh 2021-06-01 05:46:51 UTC
The issue is solved after updating the BIOS and setting storage to AHCI mode.
Currently, it still fails in VMD mode.

I need to sort out the current status, since the bug has been stalled for a while.

Thanks,
Comment 3 Leho Kraav 2021-12-18 18:15:32 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=213717 seems to have VMD related work.

Discovered while debugging why my XPS 13 7390 2-in-1 can't get to s0ix anymore, when it did so just fine a few months ago.
Comment 4 KaiChuan-Hsieh 2021-12-19 03:54:40 UTC
My XPS 9310 was impacted by https://patchwork.kernel.org/project/intel-gfx/patch/20210325120947.11950-1-anshuman.gupta@intel.com/ for s0ix.

It has no s0ix while doing:
$ sudo turbostat -S --show CPU%LPI,SYS%LPI rtcwake -m freeze -s 15
CPU%LPI SYS%LPI
84.52 0.00

However, I can get the residency without it.
Could you check whether your kernel contains this commit?
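
To compare runs with and without the commit, the two-line turbostat summary can be checked programmatically. This is a rough sketch, assuming the captured text contains only the header row and one value row as shown above; real output may also include rtcwake messages that would need to be filtered out first. The helper name `parse_lpi` is hypothetical.

```python
def parse_lpi(output):
    """Parse a header row and a value row into a dict of column name -> float."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    header, values = rows[0], rows[1]
    return dict(zip(header, map(float, values)))


sample = "CPU%LPI SYS%LPI\n84.52 0.00\n"
lpi = parse_lpi(sample)
if lpi["SYS%LPI"] == 0.0:
    # CPU package reached its low-power idle state, but the system did not.
    print("no S0ix residency; CPU%LPI =", lpi["CPU%LPI"])
```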
Comment 5 Leho Kraav 2021-12-19 14:14:25 UTC
I just tested reverting this patch on top of 5.12.8 (a kernel I have previously recorded 80 days uptime with), but still no difference.

Starting to think DELL BIOS updates have messed something up.

Either way, I think I'll file another bug, but I'll test with an Ubuntu live kernel as well.

Currently https://github.com/intel/S0ixSelftestTool/ indicates NVMe suddenly doesn't want to go to sleep - PC10 and RC6 work, but SYS%LPI stays at 0.

```
The system OS Kernel version is:
Linux papaya 5.12.8-gentoo+ #57 SMP PREEMPT Sun Dec 19 15:00:53 EET 2021 x86_64 Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz GenuineIntel GNU/Linux

Checking PCI Devices tree diagram:
-[0000:00]-+-00.0  Intel Corporation Ice Lake-LP Processor Host Bridge/DRAM Registers
           +-02.0  Intel Corporation Iris Plus Graphics G7
           +-04.0  Intel Corporation Device 8a03
           +-05.0  Intel Corporation Image Signal Processor
           +-0d.0  Intel Corporation Ice Lake Thunderbolt 3 USB Controller
           +-12.0  Intel Corporation Ice Lake-LP Integrated Sensor Solution
           +-14.0  Intel Corporation Ice Lake-LP USB 3.1 xHCI Host Controller
           +-14.2  Intel Corporation Ice Lake-LP DRAM Controller
           +-14.3  Intel Corporation Ice Lake-LP PCH CNVi WiFi
           +-15.0  Intel Corporation Ice Lake-LP Serial IO I2C Controller #0
           +-15.1  Intel Corporation Ice Lake-LP Serial IO I2C Controller #1
           +-15.3  Intel Corporation Ice Lake-LP Serial IO I2C Controller #3
           +-16.0  Intel Corporation Ice Lake-LP Management Engine
           +-1d.0-[01]----00.0  KIOXIA Corporation Device 0001
           +-1f.0  Intel Corporation Ice Lake-LP LPC Controller
           +-1f.3  Intel Corporation Ice Lake-LP Smart Sound Technology Audio Controller
           +-1f.4  Intel Corporation Ice Lake-LP SMBus Controller
           \-1f.5  Intel Corporation Ice Lake-LP SPI Controller

Pcieport is not in D3cold:     
0000:00:1d.0


The PCIe bridge link power management state is:
0000:00:1d.0 Link is in L1

The link power management state of PCIe bridge: 0000:00:1d.0 is not expected. 
which is expected to be L1.1 or L1.2, or user would run this script again.

...

Checking slp_s0_debug_status:

Your system ModPHY lane Core domain has issue blocks S0ix.

Isolation suggestions:     
Check ModPHY related high speed I/O controller list:     
covering from XHCI, XDCI, SATA, PCIe (all instances), Gbe and SCC (UFS)

Your system Main PLL or Oscillator Crystal PLL has issue to power off during S2idle, which may block S0ix.     
Failed PLL:  OC_PLL_OFF
```
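
The link-state check reported in the output above can be reproduced on a saved log with a few lines of Python. This is only a sketch, not the tool's own logic: it assumes the "Link is in" line format shown above and takes the expected states (L1.1 or L1.2) from the tool's message; `link_state_ok` is a made-up name.

```python
import re

# L1 substates that the tool output names as expected for the bridge link.
EXPECTED_STATES = {"L1.1", "L1.2"}

_LINK_RE = re.compile(r"^(\S+) Link is in (\S+)$")


def link_state_ok(line):
    """Return (device, state, ok) for a 'Link is in' line, or None if no match."""
    m = _LINK_RE.match(line.strip())
    if m is None:
        return None
    device, state = m.groups()
    return device, state, state in EXPECTED_STATES


print(link_state_ok("0000:00:1d.0 Link is in L1"))
```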
Comment 6 Leho Kraav 2021-12-19 17:54:04 UTC
Filed https://bugzilla.kernel.org/show_bug.cgi?id=215367
Comment 7 Rajvi Jingar 2022-07-19 21:30:51 UTC
Hi @Kaichuan-Hsieh, do you still see this issue?