Bug 202519

Summary: S0ix: Low residency in S0ix while suspend - HP Elite x2 1013 G3 (KBL)
Product: Power Management Reporter: RussianNeuroMancer (russianneuromancer)
Component: Hibernation/Suspend Assignee: David Box (david.e.box)
Status: REJECTED INSUFFICIENT_DATA    
Severity: normal CC: david.e.box, IdaWallace89, kernel-NTEO, leho, rajneesh.bhardwaj, rajvi.jingar, rui.zhang
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 5.0rc4 Subsystem:
Regression: No Bisected commit-id:
Attachments: turbostat report on Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series
turbostat report on Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series (2)
dmesg of Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series and enabled pm_debug_messages
dmesg of Linux 5.2rc7 with enabled pm_debug_messages, after BIOS update to Q87 Ver. 01.07.00
lps0 dsm disable patch
dmesg of Linux 5.2rc7 with lps0 dsm disable patch and acpi.no_lps0=1 boot option

Description RussianNeuroMancer 2019-02-06 08:23:51 UTC
Hello!

After Bug 201579 was resolved, the HP Elite x2 1013 G3 can now successfully reach the S0ix suspend state. However, power consumption during S0ix is still much higher than with the preinstalled OS, as you can see in the following comments:

Windows 10: https://bugzilla.kernel.org/show_bug.cgi?id=201579#c14
Linux 5.0*: https://bugzilla.kernel.org/show_bug.cgi?id=201579#c39

* Linux 5.0rc4 with https://patchwork.kernel.org/patch/10714257/ and https://patchwork.kernel.org/project/platform-driver-x86/list/?series=74547

Additional information about hardware is available in Bug 201579.
Comment 1 RussianNeuroMancer 2019-02-06 08:46:46 UTC
Similar issue is reported in this comment: https://bugzilla.kernel.org/show_bug.cgi?id=199689#c75
Comment 2 RussianNeuroMancer 2019-02-06 10:53:08 UTC
Follow-up to: https://bugzilla.kernel.org/show_bug.cgi?id=201579#c40

> Have you tried enabling low power mode in BIOS, sometimes it signals EC to
> reduce power but its not a universal implementation.

There is no low power mode option in the BIOS, as you can see: https://yadi.sk/d/xEtS5lXk8kKawg/018.jpg

The "Deep sleep" option probably recommends that the OS use S3 instead of S0ix. However, unlike on the HP Elite Folio G1, enabling or disabling this option on the HP Elite x2 1013 G3 doesn't affect Linux behaviour in any way: S0ix is always used by default, and "mem_sleep_default=deep" works even if the "Deep sleep" checkbox is disabled.
"Power control", according to the description in http://h10032.www1.hp.com/ctg/Manual/c05166986 (page 43), "enables the notebook to support power management applications such as IPM+".
Comment 3 Rajneesh Bhardwaj 2019-02-06 19:02:22 UTC
Added David Box.
Comment 4 David Box 2019-02-11 23:54:44 UTC
The counter values are very low for what you say was 3 hours of suspend time:

~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
1066365868
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
139486638
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
133628200

Each increment is 100us. According to these numbers the platform spent less than 18 minutes in low_power_idle_cpu_residency and less than 3 minutes in s0ix. That's assuming the counters were 0 when the test started. Please try a shorter 10 minute test and capture the above counters immediately before and after. Run the test with turbostat using the following command:

turbostat -o /tmp/s2idle.txt -q -S echo freeze > /sys/power/state

Post the counter values along with the turbostat file. If numbers indicate decent residency in s0ix (90+% of the time) then you can try longer runs. Otherwise don't bother. The higher power consumption is a symptom of the low residency.
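
As a rough sketch of the arithmetic behind the figures above, the counters can be converted to minutes and seconds as follows. This assumes the values are in microseconds, as the `_us`/`_usec` suffixes and the durations quoted above suggest; `us_to_min_sec` is only an illustrative helper name, not part of any existing tool.

```shell
#!/bin/sh
# Convert a residency counter given in microseconds to "<min>m<sec>s".
# us_to_min_sec is a hypothetical helper used only for this illustration.
us_to_min_sec() {
    secs=$(( $1 / 1000000 ))                       # microseconds -> whole seconds
    printf '%dm%02ds\n' $(( secs / 60 )) $(( secs % 60 ))
}

us_to_min_sec 1066365868   # low_power_idle_cpu_residency_us    -> 17m46s
us_to_min_sec 139486638    # low_power_idle_system_residency_us -> 2m19s
us_to_min_sec 133628200    # slp_s0_residency_usec              -> 2m13s
```

Integer arithmetic is used on purpose so the snippet needs nothing beyond a POSIX shell; sub-second precision is dropped.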
Comment 5 RussianNeuroMancer 2019-02-12 04:45:44 UTC
I will try to build 5.0rc6 with linux-tools today or tomorrow, and then run the ten-minute test.
Comment 6 RussianNeuroMancer 2019-02-13 03:46:22 UTC
Created attachment 281119 [details]
turbostat report on Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series

> Please try a shorter 10 minute test and capture the above counters
> immediately before and after.

> Post the counter values along with the turbostat file. 

File is attached.

~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
0
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
0
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
598363147
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
86921085
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
83270400

> If numbers indicate decent residency in s0ix (90+% of the time) then you can
> try longer runs. Otherwise don't bother. 

It seems the system was in S0ix for only about one and a half minutes.
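
For comparison with the 90+% target, the residency percentage can be estimated from the before/after counter values. The sketch below assumes the suspend lasted exactly the requested 10 minutes (600 seconds), which was not measured here; `residency_pct` is a hypothetical helper name.

```shell
#!/bin/sh
# Share of the suspend period spent in S0ix, with one decimal place,
# using integer arithmetic only (the result is truncated, not rounded).
#   $1 = counter before suspend (us)
#   $2 = counter after suspend (us)
#   $3 = suspend duration in seconds (an assumption, not a measured value)
residency_pct() {
    tenths=$(( ($2 - $1) / ($3 * 1000) ))   # (delta_us / duration_us) * 1000
    printf '%d.%d%%\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

# slp_s0_residency_usec from the run above: 0 before, 83270400 after.
residency_pct 0 83270400 600   # prints 13.8%
```

Taking the difference of the two readings matters on later runs, because the counters are not reset between suspend cycles.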

> The higher power consumption is a symptom of the low residency.

Should I change bug title to "Low residency in S0ix while suspend on HP Elite x2 1013 G3"?
Comment 7 David Box 2019-02-21 21:41:43 UTC
Please do change the title. Looks like the system is waking out of s0ix during suspend. Hopefully this is something the kernel is catching. Try the following. You need CONFIG_PM_DEBUG enabled in your kernel.

echo 1 > /sys/power/pm_debug_messages

Do the same 10 minute test with turbostat. Send the same as before with the dmesg. Also send the output of /sys/kernel/debug/suspend_stats. This one continually increments so you'll also want to capture it before and after suspend.
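
One possible way to capture everything before and after the suspend is sketched below. The file paths are the ones already used in this report; `snapshot` and `COUNTERS` are illustrative names, and reading the debugfs files is assumed to require root.

```shell
#!/bin/sh
# Print the name and current contents of each given file, so that a
# "before" and an "after" capture can be compared later.
# snapshot is a hypothetical helper, not an existing tool.
snapshot() {
    for f in "$@"; do
        printf '== %s\n' "$f"
        cat "$f" 2>/dev/null || printf '(unreadable)\n'
    done
}

# Counter and statistics files referenced in this bug report.
COUNTERS="/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
/sys/kernel/debug/pmc_core/slp_s0_residency_usec
/sys/kernel/debug/suspend_stats"
```

Typical use would be `snapshot $COUNTERS > /tmp/before.txt`, then the turbostat suspend run, then `snapshot $COUNTERS > /tmp/after.txt`, and finally `diff /tmp/before.txt /tmp/after.txt`. `$COUNTERS` is left unquoted on purpose so the shell splits it into one path per line.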
Comment 8 Zhang Rui 2019-03-11 03:04:54 UTC
probably system does not enter s2idle_loop()
please attach the dmesg output after resume.
Comment 9 RussianNeuroMancer 2019-03-11 11:01:53 UTC
Hello
Comment 10 RussianNeuroMancer 2019-03-11 11:03:36 UTC
Created attachment 281697 [details]
turbostat report on Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series (2)

Due to two issues I have with Linux 5.0 (a Btrfs stability issue: https://www.phoronix.com/forums/forum/software/general-linux-open-source/1081905-the-most-interesting-highlights-to-the-linux-5-0-kernel?p=1081958#post1081958 and various USB stability issues), I had to roll back to Linux 4.20 and figure out how to continue testing without affecting the stability of my system. I decided to set up a separate partition specifically for testing S0ix and related issues, like this one: https://github.com/linrunner/TLP/issues/386, but doing so takes time, so unfortunately I answered much later than I had hoped. Sorry for that.
Comment 11 RussianNeuroMancer 2019-03-11 11:06:07 UTC
Created attachment 281699 [details]
dmesg of Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series and enabled pm_debug_messages

> Please do change the title.

Done.

> Hopefully this is something the kernel is catching. Try the following. You
> need CONFIG_PM_DEBUG enabled in your kernel.
> echo 1 > /sys/power/pm_debug_messages
> Do the same 10 minute test with turbostat. Send the same as before with the
> dmesg. 

The turbostat report is attached to the previous message; the debug dmesg is attached to this one.
Comment 12 RussianNeuroMancer 2019-03-11 11:13:29 UTC
> Also send the output of /sys/kernel/debug/suspend_stats. This one continually
> increments so you'll also want to capture it before and after suspend.

~# cat /sys/kernel/debug/suspend_stats
success: 0
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
  last_failed_dev:	
			
  last_failed_errno:	0
			0
  last_failed_step:	
			
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
  last_failed_dev:	
			
  last_failed_errno:	0
			0
  last_failed_step:	

> probably system does not enter s2idle_loop()
> please attach the dmesg output after resume.

Attached to previous message.
Comment 14 David Box 2019-07-02 14:42:27 UTC
Hello. Have you tried an updated BIOS from HP?
Comment 15 RussianNeuroMancer 2019-07-04 12:22:40 UTC
Created attachment 283541 [details]
dmesg of Linux 5.2rc7 with enabled pm_debug_messages, after BIOS update to Q87 Ver. 01.07.00

> Have you tried an updated BIOS from HP?

Re-tested with BIOS Q87 Ver. 01.07.00 04/17/2019 and Linux 5.2rc7. Results are below; the dmesg with pm_debug_messages enabled is attached.

~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
0
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
0
~# cat /sys/kernel/debug/suspend_stats
success: 0
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
  last_failed_dev:	
			
  last_failed_errno:	0
			0
  last_failed_step:	
			
~# cat /sys/kernel/debug/pmc_core/mphy_core_lanes_power_gating_status
MPHY CORE LANE 0                	State: Power gated
MPHY CORE LANE 1                	State: Power gated
MPHY CORE LANE 2                	State: Not power gated
MPHY CORE LANE 3                	State: Power gated
MPHY CORE LANE 4                	State: Power gated
MPHY CORE LANE 5                	State: Power gated
MPHY CORE LANE 6                	State: Power gated
MPHY CORE LANE 7                	State: Power gated
MPHY CORE LANE 8                	State: Power gated
MPHY CORE LANE 9                	State: Power gated
MPHY CORE LANE 10               	State: Power gated
MPHY CORE LANE 11               	State: Power gated
MPHY CORE LANE 12               	State: Power gated
MPHY CORE LANE 13               	State: Power gated
MPHY CORE LANE 14               	State: Power gated
MPHY CORE LANE 15               	State: Power gated
~# systemctl suspend
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
3877126185
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
43942588
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
42097000
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
  last_failed_dev:	
			
  last_failed_errno:	0
			0
  last_failed_step:	
			
~#
Comment 16 David Box 2019-07-11 21:18:50 UTC
Try setting "acpi.ec_no_wakeup=1". If this doesn't help to improve residency then try the attached patch. It should apply to Linux 5.2.
Comment 17 David Box 2019-07-11 21:24:26 UTC
Created attachment 283633 [details]
lps0 dsm disable patch

After building with this patch use "acpi.no_lps0=1" on the kernel command line at boot time (from grub menu) to enable.
Comment 18 RussianNeuroMancer 2019-07-16 11:32:29 UTC
> Try setting "acpi.ec_no_wakeup=1". 

After an hour of suspend:

~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
3647822984
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
3637572860
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
3484794800

Do I still need to try the patch from Comment #17?
Comment 19 David Box 2019-07-17 17:52:57 UTC
If the patch works, it is preferable to disabling EC wake events. So actually please do test it.

As for the root cause, the platform is continuing to get interrupt events from the EC. They are not wake events, so the kernel ignores them and goes back into idle. But this happens often enough that it lowers s0ix residency on your system and increases power consumption.

This is a known issue. We didn't expect it on this bug because usually the issue doesn't allow a system to get any s0ix residency at all. The workaround for now is to use this parameter to disable EC wake events (assuming the patch doesn't work). However, this will disable waking from lid open events and possibly from the keyboard. You'll need to use the power button to wake from s0ix in this case.
Comment 21 RussianNeuroMancer 2019-07-17 18:26:37 UTC
> So actually please do test it.

Ok, I will try to test it tomorrow.

Please clarify, as I understand now I should test patch without acpi.ec_no_wakeup=1, right?

> You'll need to use the power button to wake from s0ix in this case.

Wakeup by power button from s0ix is kind of problematic even right now, and by lid event too: see bug 201575 (also check comments 6 and 7).
Comment 22 David Box 2019-07-17 20:30:33 UTC
(In reply to RussianNeuroMancer from comment #21)
> > So actually please do test it.
> 
> Ok, I will try to test it tomorrow.
> 
> Please clarify, as I understand now I should test patch without
> acpi.ec_no_wakeup=1, right?

That's right.

> 
> > You'll need to use the power button to wake from s0ix in this case.
> 
> Wakeup by power button from s0ix is kind of problematic ever right now, and
> by lid event too: bug 201575 (also check comments 6 and 7).

Either of these workarounds may alter this behavior so you'll have to reevaluate it.
Comment 23 RussianNeuroMancer 2019-07-24 03:35:11 UTC
> That's right.

Sorry for the delay. I was able to build the patched kernel the next day, but didn't have a chance to properly test S0ix until today. Results of testing the patched kernel after one hour of suspend, without acpi.ec_no_wakeup=1:

~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
3645562431
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
15739561
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
15078500
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
  last_failed_dev:	
			
  last_failed_errno:	0
			0
  last_failed_step:	

> Either of these workarounds may alter this behavior so you'll have to
> reevaluate it.

Specifically, on the patched kernel without acpi.ec_no_wakeup=1 the behaviour didn't change.
Comment 24 David Box 2019-07-24 21:39:53 UTC
> Results of testing patched
> kernel after one hour of suspend, without acpi.ec_no_wakeup=1:

That kernel patch actually requires a different command line parameter to take effect, "acpi.no_lps0=1". Sorry I didn't just make it the default behavior. This has to be set at boot time. When enabled you'll see "Applying SLP_S0 BIOS quirk" in dmesg.
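
A quick way to confirm the parameter actually reached the kernel is sketched below. `has_param` is a hypothetical helper; the dmesg string is the one quoted above, and `acpi.no_lps0` only has an effect on a kernel built with the attached patch.

```shell
#!/bin/sh
# Return success if the exact parameter $1 appears as a whole word in the
# kernel command line text $2. has_param is an illustrative helper name.
has_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

if has_param acpi.no_lps0=1 "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "acpi.no_lps0=1 is on the kernel command line"
else
    echo "acpi.no_lps0=1 is NOT on the kernel command line"
fi
```

If the parameter is present, `dmesg | grep -F "Applying SLP_S0 BIOS quirk"` should then show whether the quirk was applied.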
Comment 25 RussianNeuroMancer 2019-07-25 04:51:12 UTC
Created attachment 283947 [details]
dmesg of Linux 5.2rc7 with lps0 dsm disable patch and acpi.no_lps0=1 boot option

Below is the result of a 74-minute suspend. In the attached dmesg you can verify that acpi.no_lps0 is enabled and the quirk message is present.

~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
4422257559
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
0
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 1
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
  last_failed_dev:	
			
  last_failed_errno:	0
			0
  last_failed_step:	suspend_noirq

I also want to note that with acpi.no_lps0=1 it's easy to wake the tablet an hour after suspend (with the stylus eraser button, for example), but it's actually difficult to wake it a few minutes after suspend. In the latter case opening the lid, pressing the power button, pressing any key on the keyboard, or pressing the eraser button on the stylus does not wake the tablet immediately. Instead, the various wakeup attempts take effect minutes later. suspend_stats contains this after two such wakeup attempts (I'm not sure why failed_suspend is 4):

~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 4
failed_freeze: 0
failed_prepare: 0
failed_suspend: 4
failed_suspend_late: 0
failed_suspend_noirq: 1
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
  last_failed_dev:	
			
  last_failed_errno:	-16
			-16
  last_failed_step:	suspend
			suspend
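
To make dumps like the one above easier to read, the non-zero failure counters can be filtered out with a small awk script. This is only a sketch: `nonzero_failures` is a hypothetical helper, and the sample input is copied from the output above. For reference, errno -16 corresponds to EBUSY ("Device or resource busy") on Linux.

```shell
#!/bin/sh
# Print every top-level "fail...: N" counter from suspend_stats text on
# stdin whose value N is non-zero. nonzero_failures is an illustrative name.
nonzero_failures() {
    awk -F': *' '/^fail[a-z_]*: *[0-9]+$/ && $2 != 0 { print $1 "=" $2 }'
}

# Sample input taken from the suspend_stats output in this comment.
nonzero_failures <<'EOF'
success: 1
fail: 4
failed_freeze: 0
failed_prepare: 0
failed_suspend: 4
failed_suspend_late: 0
failed_suspend_noirq: 1
EOF
```

On a live system the same filter could be fed with `nonzero_failures < /sys/kernel/debug/suspend_stats` (as root).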
 
Another interesting observation is that the power button LED usually glows when the tablet is suspended; however, in the cases when it was difficult to wake the tablet, the LED was simply lit, as if the tablet were powered on and not sleeping (so it probably failed to suspend in these two cases).

Please let me know what additional information I can provide.
Comment 26 RussianNeuroMancer 2019-08-27 15:27:52 UTC
There are a couple of new issues with suspend freeze on this tablet:

Bug 204717

https://bugzilla.kernel.org/show_bug.cgi?id=201575#c9
Comment 27 RussianNeuroMancer 2019-09-09 11:34:59 UTC
I found that Bug 204717 is reproducible only if the tablet was suspended via the power button. If it was suspended via a lid event, there is no spontaneous wakeup. So I can continue testing further patches.
Comment 28 David Box 2019-10-23 20:23:16 UTC
Hello,

Patches from Rafael were merged during 5.4rc1 that improve s0ix residency on some systems. Can you try the latest kernel?
Comment 29 RussianNeuroMancer 2019-10-25 04:58:25 UTC
Okay, I will try to test Linux 5.4 as soon as I get access to hardware again, which could be next month.
Comment 30 Leho Kraav 2019-10-25 18:14:52 UTC
> Patches from Rafael were merged during 5.4rc1 that improve s0ix residency on
> some systems. Can you try the latest kernel.

Hi David. I may be seeing significant S0ix residency improvement with 5.4.0-rc4 on Dell Latitude 7400, bug 204867.

Is it possible to point out some specific 5.4 patches to look into for more details and understand what might be the underlying cause?

I'm not sure I can spot all relevant ones on my own.
Comment 31 Rajneesh Bhardwaj 2019-10-29 13:46:52 UTC
This one from Rafael simplified the overall S2Idle flow and laid the foundation for better S2Idle S0ix residency.

https://lore.kernel.org/lkml/71085220.z6FKkvYQPX@kreacher/
Comment 32 Zhang Rui 2020-06-29 07:47:44 UTC
@RussianNeuroMancer 
what is the status of this issue on the latest upstream kernel?
Comment 33 RussianNeuroMancer 2020-06-29 11:14:29 UTC
Unfortunately, I only got my hands on this hardware again a couple of weeks ago (not in November, as I had hoped), so I wasn't able to answer earlier.

If the issue described in Comment 9 here https://bugzilla.kernel.org/show_bug.cgi?id=201575#c9 is fixed by now, I will be able to check this issue again next month.
Comment 34 RussianNeuroMancer 2020-09-26 20:49:46 UTC
Tested Linux 5.9rc6 - no positive changes.
Comment 35 RussianNeuroMancer 2021-04-02 11:39:28 UTC
What info is needed?
Comment 36 David Box 2021-04-02 20:28:38 UTC
Is the status the same on 5.12?
Comment 37 RussianNeuroMancer 2021-05-22 10:00:31 UTC
Can't check yet, and probably won't be able to do so at least for a few months.
Comment 38 Rajvi Jingar 2022-07-19 22:21:41 UTC
Hi @russianneuromancer, can you please provide the latest status of this issue?
Comment 39 RussianNeuroMancer 2022-08-17 16:20:08 UTC
Unfortunately I don't have access to this hardware anymore.
Comment 40 David Box 2022-10-21 16:04:00 UTC
Since there's no longer access to this hardware we will close this as rejected. Feel free to reopen if you get the hardware again and still see the issue with the latest kernel.