Bug 204867
Summary: | 5.3.0-rc8: Dell Latitude 7400 2-in-1 (i5-8365U Coffee Lake / CFL) unable to enter system S0ix / slp_s0_residency_usec, and low_power_idle_system_residency_us stay 0 | ||
---|---|---|---|
Product: | Power Management | Reporter: | Leho Kraav (leho) |
Component: | Run-Time-PM | Assignee: | wendy.wang |
Status: | CLOSED UNREPRODUCIBLE | ||
Severity: | normal | CC: | rui.zhang |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 5.3.0-rc8 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | dmesg.txt, config-5.3.0-rc8 | ||
Description
Leho Kraav
2019-09-15 16:43:32 UTC
Created attachment 284975
config-5.3.0-rc8
Kernel config
So the questions for this bug are:

1. Why can't you get s0ix residency after doing one s2idle cycle? Please show us this one:

echo 1 > /sys/kernel/debug/pmc_core/slp_s0_dbg_latch
rtcwake -m freeze -s 30
cat /sys/kernel/debug/pmc_core/slp_s0_debug_status

And can you please try disabling wifi, then double check the S2idle s0ix status?

2. Why can't your custom built 5.3.0-rc8 kernel get opportunistic PC10?

Did you run powertop --auto-tune before you checked the opportunistic PC10?

Does your display support PSR?
cat /sys/kernel/debug/dri/0/i915_dmc_info

Can you get pc10 after resuming from S2idle (rtcwake -m freeze -s 20)?
cat /sys/kernel/debug/pmc_core/package_cstate_show

The ltr_ignore script will help with PC10 in case your devices have an LTR issue, so it is worth a try for this PC10 failure case.

Thank you for responding.

> Did you run powertop --auto-tune before you checked the opportunistic PC10?

Yes, after I learned about it, I have it as a systemd service. Have repeatedly checked this, all Tunables are in Good state.

> Can you please try disabling wifi

I have tested both disabling wifi, and also with all system peripherals disabled at BIOS level. There is no apparent change in this behavior.

PS I don't want to confuse people, but I should note that *I think* I had one "Try Ubuntu" USB stick boot where both `low_power_idle_system_residency_us` and `low_power_idle_cpu_residency_us` started increasing immediately, without suspend. I think all system devices were disabled at BIOS level, as that was the specific test I was attempting. After suspending in this state, I rebooted back and forth between my kernel and Ubuntu, and I have not been able to increase the `low_power_idle_system_residency_us` counter since, with either kernel. I wish I had taken a photo, because later you start doubting whether you really saw what you saw. It feels like some piece of platform hardware now stays in some active state.

> Does your display support PSR?
> cat /sys/kernel/debug/dri/0/i915_dmc_info

Yes, I believe it supports PSR2 even, it's a modern CFL machine.

$ [-] sudo cat /sys/kernel/debug/dri/0/i915_dmc_info
fw loaded: yes
path: i915/kbl_dmc_ver1_04.bin
version: 1.4
DC3 -> DC5 count: 643885
DC5 -> DC6 count: 642912
program base: 0x09004040
ssp base: 0x00002fc0
htp: 0x00b40068

> 1. Why can't you get s0ix residency after doing one s2idle cycle? Please show us
> this one:

papaya ~ # echo 1 > /sys/kernel/debug/pmc_core/slp_s0_dbg_latch
papaya ~ # rtcwake -m freeze -s 30
rtcwake: wakeup from "freeze" using /dev/rtc0 at Mon Sep 16 05:32:43 2019
papaya ~ # cat /sys/kernel/debug/pmc_core/slp_s0_debug_status
SLP_S0_DBG: AUDIO_D3 State: Yes
SLP_S0_DBG: OTG_D3 State: Yes
SLP_S0_DBG: XHCI_D3 State: No
SLP_S0_DBG: LPIO_D3 State: Yes
SLP_S0_DBG: SDX_D3 State: Yes
SLP_S0_DBG: SATA_D3 State: Yes
SLP_S0_DBG: UFS0_D3 State: Yes
SLP_S0_DBG: UFS1_D3 State: Yes
SLP_S0_DBG: EMMC_D3 State: Yes
SLP_S0_DBG: SDIO_PLL_OFF State: Yes
SLP_S0_DBG: USB2_PLL_OFF State: Yes
SLP_S0_DBG: AUDIO_PLL_OFF State: Yes
SLP_S0_DBG: OC_PLL_OFF State: Yes
SLP_S0_DBG: MAIN_PLL_OFF State: No
SLP_S0_DBG: XOSC_OFF State: No
SLP_S0_DBG: LPC_CLKS_GATED State: Yes
SLP_S0_DBG: PCIE_CLKREQS_IDLE State: Yes
SLP_S0_DBG: AUDIO_ROSC_OFF State: Yes
SLP_S0_DBG: HPET_XOSC_CLK_REQ State: Yes
SLP_S0_DBG: PMC_ROSC_SLOW_CLK State: No
SLP_S0_DBG: AON2_ROSC_GATED State: No
SLP_S0_DBG: CLKACKS_DEASSERTED State: No
SLP_S0_DBG: MPHY_CORE_GATED State: No
SLP_S0_DBG: CSME_GATED State: Yes
SLP_S0_DBG: USB2_SUS_GATED State: No
SLP_S0_DBG: DYN_FLEX_IO_IDLE State: Yes
SLP_S0_DBG: GBE_NO_LINK State: Yes
SLP_S0_DBG: THERM_SEN_DISABLED State: No
SLP_S0_DBG: PCIE_LOW_POWER State: No
SLP_S0_DBG: ISH_VNNAON_REQ_ACT State: Yes
SLP_S0_DBG: ISH_VNN_REQ_ACT State: No
SLP_S0_DBG: CNV_VNNAON_REQ_ACT State: Yes
SLP_S0_DBG: CNV_VNN_REQ_ACT State: No
SLP_S0_DBG: NPK_VNNON_REQ_ACT State: No
SLP_S0_DBG: PMSYNC_STATE_IDLE State: No
SLP_S0_DBG: ALST_GT_THRES State: No
SLP_S0_DBG: PMC_ARC_PG_READY State: No

> Can you get pc10 after resuming from S2idle (rtcwake -m freeze -s 20)?

papaya ~ # cat /sys/kernel/debug/pmc_core/package_cstate_show
Package C2 : 1700042915
Package C3 : 1479401909
Package C6 : 961542094
Package C7 : 30824383
Package C8 : 2840699353
Package C9 : 190229330
Package C10 : 40357279406

> The ltr_ignore script will help with PC10 in case your devices have an LTR issue.

Should I increase the counter to 64 instead of 32? Or just first 32 matter?

In the papaya ~ # cat /sys/kernel/debug/pmc_core/slp_s0_debug_status log, we notice that the entries below show State: No. Based on my experience, we should check the high-speed devices first (maybe I'm wrong), so just give it a try.

SLP_S0_DBG: XHCI_D3 State: No --------- can be ignored
SLP_S0_DBG: MAIN_PLL_OFF State: No --------- need to check high-speed devices first
SLP_S0_DBG: XOSC_OFF State: No --------- not sure
SLP_S0_DBG: PMC_ROSC_SLOW_CLK State: No --------- can be ignored
SLP_S0_DBG: AON2_ROSC_GATED State: No --------- can be ignored temporarily
SLP_S0_DBG: CLKACKS_DEASSERTED State: No --------- not sure
SLP_S0_DBG: MPHY_CORE_GATED State: No --------- need to check high-speed devices first
SLP_S0_DBG: USB2_SUS_GATED State: No --------- not sure
SLP_S0_DBG: THERM_SEN_DISABLED State: No --------- not sure
SLP_S0_DBG: PCIE_LOW_POWER State: No --------- can be ignored
SLP_S0_DBG: ISH_VNN_REQ_ACT State: No --------- can be ignored
SLP_S0_DBG: CNV_VNN_REQ_ACT State: No --------- can be ignored
SLP_S0_DBG: NPK_VNNON_REQ_ACT State: No --------- can be ignored
SLP_S0_DBG: PMSYNC_STATE_IDLE State: No --------- can be ignored
SLP_S0_DBG: ALST_GT_THRES State: No --------- not sure
SLP_S0_DBG: PMC_ARC_PG_READY State: No --------- can be ignored

Then let's check which high-speed devices your CFL has. I suggest checking NVMe (I'm just guessing you are running NVMe) and GBe first.

XHCI
XDCI
SATA
NVMe
PCIe
SCC
GBe

The questions are:

For GBe: Did you enable the PCH LAN controller in the BIOS setup? How about disabling it and re-checking S2idle S0ix?
For NVMe: Linux kernel v5.3 has some changes for NVMe PCI D3. Also make sure the ASPM policy is "default" or "powersupersave" (via /sys/module/pcie_aspm/parameters/policy).

For this question:

> The ltr_ignore script will help with PC10 in case your devices have an LTR issue.
> Should I increase the counter to 64 instead of 32? Or just first 32 matter?

That depends on the output of cat /sys/kernel/debug/pmc_core/ltr_show, which shows how many devices support an LTR value. There is no need to raise the counter to 64.

I upgraded to 5.3.0 release mid-day, and surprisingly, now I'm suddenly seeing `low_power_idle_system_residency_us` counters increase, after exactly one suspend.

There was a small set of various reverts from 5.3.0-rc8 to release. Maybe one of these patches has an effect here..?

I'm noticing I may have Audio disabled in BIOS right now, no `snd-hda-intel` modules are loading.

Some logs below. Anything else interesting I can paste here, let me know.

$ [-] sudo ./sys-devices-system-cpu-cpuidle.sh
/sys/class/drm/card0/power/rc6_residency_ms:5398373
/sys/devices/system/cpu/cpuidle/current_driver:intel_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:menu
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us:3844954290
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us:77205427
/sys/kernel/debug/pmc_core/slp_s0_residency_usec:73962800

$ [-] sudo dmesg | grep suspend
[16308.032289] PM: suspend entry (s2idle)
[16308.161492] printk: Suspending console(s) (use no_console_suspend to debug)
[29868.207178] PM: suspend exit

$ [-] diff -urN slp-s0-debug-5.3.0-rc8.txt slp-s0-debug-5.3.0.txt
--- slp-s0-debug-5.3.0-rc8.txt 2019-09-16 19:39:40.771166077 +0300
+++ slp-s0-debug-5.3.0.txt 2019-09-16 19:40:34.436306186 +0300
@@ -1,6 +1,6 @@
 SLP_S0_DBG: AUDIO_D3 State: Yes
 SLP_S0_DBG: OTG_D3 State: Yes
-SLP_S0_DBG: XHCI_D3 State: No
+SLP_S0_DBG: XHCI_D3 State: Yes
 SLP_S0_DBG: LPIO_D3 State: Yes
 SLP_S0_DBG: SDX_D3 State: Yes
 SLP_S0_DBG: SATA_D3 State: Yes
@@ -11,7 +11,7 @@
 SLP_S0_DBG: USB2_PLL_OFF State: Yes
 SLP_S0_DBG: AUDIO_PLL_OFF State: Yes
 SLP_S0_DBG: OC_PLL_OFF State: Yes
-SLP_S0_DBG: MAIN_PLL_OFF State: No
+SLP_S0_DBG: MAIN_PLL_OFF State: Yes
 SLP_S0_DBG: XOSC_OFF State: No
 SLP_S0_DBG: LPC_CLKS_GATED State: Yes
 SLP_S0_DBG: PCIE_CLKREQS_IDLE State: Yes
@@ -25,13 +25,13 @@
 SLP_S0_DBG: USB2_SUS_GATED State: No
 SLP_S0_DBG: DYN_FLEX_IO_IDLE State: Yes
 SLP_S0_DBG: GBE_NO_LINK State: Yes
-SLP_S0_DBG: THERM_SEN_DISABLED State: No
+SLP_S0_DBG: THERM_SEN_DISABLED State: Yes
 SLP_S0_DBG: PCIE_LOW_POWER State: No
 SLP_S0_DBG: ISH_VNNAON_REQ_ACT State: Yes
 SLP_S0_DBG: ISH_VNN_REQ_ACT State: No
 SLP_S0_DBG: CNV_VNNAON_REQ_ACT State: Yes
 SLP_S0_DBG: CNV_VNN_REQ_ACT State: No
 SLP_S0_DBG: NPK_VNNON_REQ_ACT State: No
-SLP_S0_DBG: PMSYNC_STATE_IDLE State: No
-SLP_S0_DBG: ALST_GT_THRES State: No
+SLP_S0_DBG: PMSYNC_STATE_IDLE State: Yes
+SLP_S0_DBG: ALST_GT_THRES State: Yes
 SLP_S0_DBG: PMC_ARC_PG_READY State: No

It's interesting, I see:

-SLP_S0_DBG: MAIN_PLL_OFF State: No
+SLP_S0_DBG: MAIN_PLL_OFF State: Yes

Your v5.3 kernel can now turn the main PLL off, which lets you get S0ix residency, but it's hard to say which device's power state changed so that the main PLL can turn off now.

Maybe this one, but not sure:

-SLP_S0_DBG: XHCI_D3 State: No
+SLP_S0_DBG: XHCI_D3 State: Yes

This one is good; it is the communication between the CPU and the PCH:

-SLP_S0_DBG: PMSYNC_STATE_IDLE State: No
+SLP_S0_DBG: PMSYNC_STATE_IDLE State: Yes

I do not know what ALST is:

-SLP_S0_DBG: ALST_GT_THRES State: No
+SLP_S0_DBG: ALST_GT_THRES State: Yes

It looks like a fuller explanation needs an expert to comment. Anyway, glad to see you have s0ix residency right now.

> Anyway, glad to see you have s0ix residency right now.

Yep, but it should be noted that in this session I'm running with most devices disabled in BIOS, kind of "limp-mode". Most importantly, Audio is off. Thunderbolt, camera, fingerprint reader, touchscreen, also off.
My next step after the work week will be to reboot and enable Audio and Camera, as those are devices I'm actually likely to use.

> For GBe: Did you enabled PCH LAN controller in the BIOS setup? How about
> disable it and re-check S2idle S0ix?

I think I missed replying about GBe before: this 7400 2-in-1 model does *not* have a built-in Ethernet adapter.

```
$ [-] sudo lspci
00:00.0 Host bridge: Intel Corporation Device 3e34 (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (Whiskey Lake) (rev 02)
00:04.0 Signal processing controller: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem (rev 0c)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Point-LP Thermal Controller (rev 30)
00:13.0 Serial controller: Intel Corporation Cannon Point-LP Integrated Sensor Hub (rev 30)
00:14.0 USB controller: Intel Corporation Cannon Point-LP USB 3.1 xHCI Controller (rev 30)
00:14.2 RAM memory: Intel Corporation Cannon Point-LP Shared SRAM (rev 30)
00:14.3 Network controller: Intel Corporation Cannon Point-LP CNVi [Wireless-AC] (rev 30)
00:15.0 Serial bus controller [0c80]: Intel Corporation Cannon Point-LP Serial IO I2C Controller #0 (rev 30)
00:15.1 Serial bus controller [0c80]: Intel Corporation Cannon Point-LP Serial IO I2C Controller #1 (rev 30)
00:16.0 Communication controller: Intel Corporation Cannon Point-LP MEI Controller #1 (rev 30)
00:1c.0 PCI bridge: Intel Corporation Cannon Point-LP PCI Express Root Port #1 (rev f0)
00:1d.0 PCI bridge: Intel Corporation Cannon Point-LP PCI Express Root Port #9 (rev f0)
00:1d.4 PCI bridge: Intel Corporation Cannon Point-LP PCI Express Root Port #13 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Cannon Point-LP LPC Controller (rev 30)
00:1f.4 SMBus: Intel Corporation Cannon Point-LP SMBus Controller (rev 30)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Point-LP SPI Controller (rev 30)
01:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader (rev 01)
02:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
03:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
03:01.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
03:02.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
03:04.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
04:00.0 System peripheral: Intel Corporation JHL6540 Thunderbolt 3 NHI (C step) [Alpine Ridge 4C 2016] (rev 02)
38:00.0 USB controller: Intel Corporation JHL6540 Thunderbolt 3 USB Controller (C step) [Alpine Ridge 4C 2016] (rev 02)
6d:00.0 Non-Volatile memory controller: Toshiba America Info Systems XG4 NVMe SSD Controller (rev 01)
```

So I guess we can close this bug.

I've been discussing with Mario Limonciello in e-mail as well. Learning is that while I'm indeed now somehow getting some residency (uptime: 14 days):

$ [-] sudo ./sys-devices-system-cpu-cpuidle.sh
/sys/class/drm/card0/power/rc6_residency_ms:18328374
/sys/devices/system/cpu/cpuidle/current_driver:intel_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:menu
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us:7271198865
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us:69456920146
/sys/kernel/debug/pmc_core/slp_s0_residency_usec:66539729500
/sys/module/pcie_aspm/parameters/policy:[default] performance powersave powersupersave

...on this 7400 2-in-1 machine s2idle only lasts about max 48h before battery goes from full to completely depleted, and this is with some key peripherals disabled in BIOS. Some things are definitely still not low-powering themselves.
Mario advised me he's able to s2idle a healthy laptop configuration (granted, another hardware model) for up to 2 weeks, even.

There's more to uncover here, but it will take me some time to collect more data. We can leave the bug as closed, I'll add data and additional discoveries over time, and we'll see where it goes.

Bug 202519 pointed out additional S0ix patches merged into 5.4.0-rc. This seems to have made a significant difference on my Latitude 7400 2-in-1 machine. I now have all devices enabled in BIOS, and s2idle still consumed only 5% battery over 7 hrs today, whereas it would be ~5% / hr before.

Looking good \o/

If Wendy could help with diagnosing a s0ix regression with ICL 7390 at bug 215367, it would be much appreciated. I can't figure it out.

Hi Kraav, I've brought that issue to our internal power management forum meeting. David will look at that issue.
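
The checks used throughout this report (reading the residency counters before and after an s2idle cycle, and picking the State: No entries out of slp_s0_debug_status) can be sketched as a few shell helpers. This is only an illustration under stated assumptions: it is not the reporter's sys-devices-system-cpu-cpuidle.sh script, whose contents are not shown in this report, and the function names read_counter, residency_delta and list_blockers are made up for the example. Only the file paths and the rtcwake invocation come from the comments above.

```shell
#!/bin/sh
# Hypothetical helpers for the S0ix checks discussed in this report.
# The function names are illustrative; the paths are the ones quoted above.

S0_COUNTER=/sys/kernel/debug/pmc_core/slp_s0_residency_usec
S0_STATUS=/sys/kernel/debug/pmc_core/slp_s0_debug_status

# Print the counter stored in file $1, or 0 if the file cannot be read
# (for example when debugfs is not mounted or the caller is not root).
read_counter() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo 0
    fi
}

# Print how much a counter grew: $1 is the reading before suspend,
# $2 the reading after resume.
residency_delta() {
    echo $(( $2 - $1 ))
}

# Print the names of the SLP_S0_DBG conditions reported as "No" in file $1.
# Expected line format: "SLP_S0_DBG: XHCI_D3 State: No"
list_blockers() {
    awk '$1 == "SLP_S0_DBG:" && $NF == "No" { print $2 }' "$1"
}

# Example session (run as root; left commented out so that sourcing this
# file never suspends the machine):
#   before=$(read_counter "$S0_COUNTER")
#   echo 1 > /sys/kernel/debug/pmc_core/slp_s0_dbg_latch
#   rtcwake -m freeze -s 30
#   after=$(read_counter "$S0_COUNTER")
#   residency_delta "$before" "$after"
#   list_blockers "$S0_STATUS"
```

A delta of 0 after the suspend matches the failure described in the summary, and the list_blockers output is then the starting point for the kind of triage done above. Which of those conditions actually matter is platform-specific, so the list is a hint rather than a verdict.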