Bug 202519
Description
RussianNeuroMancer
2019-02-06 08:23:51 UTC
Similar issue is reported in this comment: https://bugzilla.kernel.org/show_bug.cgi?id=199689#c75
Follow-up to: https://bugzilla.kernel.org/show_bug.cgi?id=201579#c40
> Have you tried enabling low power mode in BIOS, sometimes it signals EC to
> reduce power but its not a universal implementation.
There is no low power mode option in BIOS, as you can see: https://yadi.sk/d/xEtS5lXk8kKawg/018.jpg
"Deep sleep" should probably recommend that the OS use S3 instead of S0ix (however, unlike on the HP Elite Folio G1, on the HP Elite x2 1013 G3 enabling or disabling this option doesn't affect Linux behaviour in any way: S0ix is always used by default, and "mem_sleep_default=deep" works even if the "Deep sleep" checkbox is disabled).
"Power control" "enables the notebook to support power management applications such as IPM+" according to the description in http://h10032.www1.hp.com/ctg/Manual/c05166986 (page 43).
Added David Box.
The counter values are very low for what you say was 3 hours of suspend time:
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
1066365868
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
139486638
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
133628200
Each increment is 100us. According to these numbers the platform spent less than 18 minutes in low_power_idle_cpu_residency and less than 3 minutes in s0ix. That's assuming the counters were 0 when the test started.
Please try a shorter 10 minute test and capture the above counters immediately before and after. Run the test with turbostat using the following command:
turbostat -o /tmp/s2idle.txt -q -S echo freeze > /sys/power/state
Post the counter values along with the turbostat file. If the numbers indicate decent residency in s0ix (90+% of the time) then you can try longer runs. Otherwise don't bother. The higher power consumption is a symptom of the low residency.
I will try to build 5.0rc6 with linux-tools today or tomorrow, and will then run a ten-minute test.
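The arithmetic behind the "less than 18 minutes" and "less than 3 minutes" figures can be reproduced with a short sketch. It assumes, as the file names suggest, that the counters are reported in microseconds, and it takes the "3 hours of suspend time" as a nominal 180 minutes. The function and variable names are illustrative only.

```python
# Sketch: convert the residency counters quoted above to minutes.
# Assumption: the values are microseconds (per the _us/_usec suffixes).
US_PER_MINUTE = 60 * 1_000_000


def us_to_minutes(counter_us: int) -> float:
    """Convert a microsecond counter value to minutes."""
    return counter_us / US_PER_MINUTE


# Values copied from the comment above.
counters = {
    "low_power_idle_cpu_residency_us": 1066365868,
    "low_power_idle_system_residency_us": 139486638,
    "slp_s0_residency_usec": 133628200,
}

# Nominal suspend duration taken from "3 hours of suspend time".
SUSPEND_MINUTES = 180

for name, value in counters.items():
    minutes = us_to_minutes(value)
    share = 100 * minutes / SUSPEND_MINUTES
    print(f"{name}: {minutes:.1f} min ({share:.1f}% of {SUSPEND_MINUTES} min)")
```

As in the original comment, this only holds if the counters were 0 when the test started.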
Created attachment 281119 [details]
turbostat report on Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series
> Please try a shorter 10 minute test and capture the above counters
> immediately before and after.
> Post the counter values along with the turbostat file.
File is attached.
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
0
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
0
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
598363147
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
86921085
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
83270400
> If numbers indicate decent residency in s0ix (90+% of the time) then you can
> try longer runs. Otherwise don't bother.
It seems the system was in S0ix for about one and a half minutes.
> The higher power consumption is a symptom of the low residency.
Should I change the bug title to "Low residency in S0ix while suspend on HP Elite x2 1013 G3"?
Please do change the title.
Looks like the system is waking out of s0ix during suspend. Hopefully this is something the kernel is catching. Try the following. You need CONFIG_PM_DEBUG enabled in your kernel.
echo 1 > /sys/power/pm_debug_messages
Do the same 10 minute test with turbostat. Send the same as before with the dmesg.
Also send the output of /sys/kernel/debug/suspend_stats. This one continually increments so you'll also want to capture it before and after suspend.
Probably the system does not enter s2idle_loop(). Please attach the dmesg output after resume.
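The before/after comparison requested above can be expressed as a small sketch that turns two counter snapshots into residency percentages and checks them against the "90+% of the time" guideline. The elapsed time is an assumption: the thread only says "10 minute test", so a nominal 600 seconds is used. `Snapshot`, `residency_percent`, and `meets_target` are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One reading of the three residency counters, in microseconds."""
    cpu_us: int
    system_us: int
    slp_s0_us: int


def residency_percent(before: Snapshot, after: Snapshot,
                      elapsed_seconds: float) -> dict:
    """Percent of the elapsed time spent in each state between two snapshots."""
    elapsed_us = elapsed_seconds * 1_000_000
    return {
        "cpu": 100 * (after.cpu_us - before.cpu_us) / elapsed_us,
        "system": 100 * (after.system_us - before.system_us) / elapsed_us,
        "slp_s0": 100 * (after.slp_s0_us - before.slp_s0_us) / elapsed_us,
    }


def meets_target(percentages: dict, target: float = 90.0) -> bool:
    """Apply the 90% s0ix residency guideline from the thread."""
    return percentages["slp_s0"] >= target


# Counter values copied from the comment above.
before = Snapshot(cpu_us=0, system_us=0, slp_s0_us=0)
after = Snapshot(cpu_us=598363147, system_us=86921085, slp_s0_us=83270400)

# Assumption: the test lasted a nominal 10 minutes (600 s).
result = residency_percent(before, after, elapsed_seconds=600)
for state, pct in result.items():
    print(f"{state}: {pct:.1f}%")
print("meets 90% target:", meets_target(result))
```

Taking deltas rather than absolute values avoids relying on the counters having started at 0.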
Hello
Created attachment 281697 [details]
turbostat report on Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series (2)
Due to two issues I have with Linux 5.0 (a Btrfs stability issue: https://www.phoronix.com/forums/forum/software/general-linux-open-source/1081905-the-most-interesting-highlights-to-the-linux-5-0-kernel?p=1081958#post1081958 and various USB stability issues) I had to roll back to Linux 4.20 and figure out how to continue testing without impacting the stability of my system. I decided to set up a separate partition specifically for testing S0ix and related issues like this one https://github.com/linrunner/TLP/issues/386 but doing so takes time, so unfortunately I answered much later than I had hoped. Sorry for that.
Created attachment 281699 [details]
dmesg of Linux 5.0rc6 with "ICL support and other enhancements for PMC Core" patch series and enabled pm_debug_messages
> Please do change the title.
Done.
> Hopefully this is something the kernel is catching. Try the following. You
> need CONFIG_PM_DEBUG enabled in your kernel.
> echo 1 > /sys/power/pm_debug_messages
> Do the same 10 minute test with turbostat. Send the same as before with the
> dmesg.
The turbostat report is attached to the previous message. The debug dmesg is attached to this message.
> Also send the output of /sys/kernel/debug/suspend_stats. This one continually
> increments so you'll also want to capture it before and after suspend.
~# cat /sys/kernel/debug/suspend_stats
success: 0
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
last_failed_dev:
last_failed_errno: 0
0
last_failed_step:
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
last_failed_dev:
last_failed_errno: 0
0
last_failed_step:
> probably system does not enter s2idle_loop()
> please attach the dmesg output after resume.
Attached to previous message.
Hello. Have you tried an updated BIOS from HP?
Created attachment 283541 [details]
dmesg of Linux 5.2rc7 with enabled pm_debug_messages, after BIOS update to Q87 Ver. 01.07.00
> Have you tried an updated BIOS from HP?
Re-tested with BIOS Q87 Ver. 01.07.00 04/17/2019 and Linux 5.2rc7. Result is below, dmesg with enabled pm_debug_messages is attached.
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
0
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
0
~# cat /sys/kernel/debug/suspend_stats
success: 0
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
last_failed_dev:
last_failed_errno: 0
0
last_failed_step:
~# cat /sys/kernel/debug/pmc_core/mphy_core_lanes_power_gating_status
MPHY CORE LANE 0 State: Power gated
MPHY CORE LANE 1 State: Power gated
MPHY CORE LANE 2 State: Not power gated
MPHY CORE LANE 3 State: Power gated
MPHY CORE LANE 4 State: Power gated
MPHY CORE LANE 5 State: Power gated
MPHY CORE LANE 6 State: Power gated
MPHY CORE LANE 7 State: Power gated
MPHY CORE LANE 8 State: Power gated
MPHY CORE LANE 9 State: Power gated
MPHY CORE LANE 10 State: Power gated
MPHY CORE LANE 11 State: Power gated
MPHY CORE LANE 12 State: Power gated
MPHY CORE LANE 13 State: Power gated
MPHY CORE LANE 14 State: Power gated
MPHY CORE LANE 15 State: Power gated
~# systemctl suspend
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
3877126185
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
43942588
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
42097000
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
last_failed_dev:
last_failed_errno: 0
0
last_failed_step:
Try setting "acpi.ec_no_wakeup=1". If this doesn't help to improve residency then try the attached patch. It should apply to Linux 5.2.
Created attachment 283633 [details]
lps0 dsm disable patch
After building with this patch use "acpi.no_lps0=1" on the kernel command line at boot time (from grub menu) to enable.
> Try setting "acpi.ec_no_wakeup=1".
After an hour of suspend:
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
3647822984
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
3637572860
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
3484794800
Do I still need to try the patch from Comment #17?
If the patch works it's a more preferable solution to disabling EC wake events. So actually please do test it.
As for the root cause, the platform is continuing to get interrupt events from the EC. They are not wake events so the kernel ignores them and goes back into idle. But it is happening often enough that it's lowering s0ix residency on your system and increasing power consumption. This is a known issue. We didn't expect it on this bug because usually the issue doesn't allow a system to get any s0ix residency at all.
The workaround for now is to use this parameter to disable EC wake events (assuming the patch doesn't work). However, this will disable being able to wake from lid open events and possibly the keyboard. You'll need to use the power button to wake from s0ix in this case.
> So actually please do test it.
Ok, I will try to test it tomorrow.
Please clarify: as I understand it, I should now test the patch without acpi.ec_no_wakeup=1, right?
> You'll need to use the power button to wake from s0ix in this case.
Wakeup by power button from s0ix is kind of problematic even right now, and by lid event too: bug 201575 (also check comments 6 and 7).
(In reply to RussianNeuroMancer from comment #21)
> > So actually please do test it.
>
> Ok, I will try to test it tomorrow.
>
> Please clarify, as I understand now I should test patch without
> acpi.ec_no_wakeup=1, right?
That's right.
>
> > You'll need to use the power button to wake from s0ix in this case.
>
> Wakeup by power button from s0ix is kind of problematic even right now, and
> by lid event too: bug 201575 (also check comments 6 and 7).
Either of these workarounds may alter this behavior so you'll have to reevaluate it.
> That's right.
Sorry for the delay. I was able to build the patched kernel the next day, but didn't have a chance to properly test S0ix until today. Results of testing the patched kernel after one hour of suspend, without acpi.ec_no_wakeup=1:
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
3645562431
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
15739561
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
15078500
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
last_failed_dev:
last_failed_errno: 0
0
last_failed_step:
> Either of these workarounds may alter this behavior so you'll have to
> reevaluate it.
Specifically, on the patched kernel without acpi.ec_no_wakeup=1 the behaviour didn't change.
> Results of testing patched
> kernel after one hour of suspend, without acpi.ec_no_wakeup=1:
That kernel patch actually requires a different command line parameter to take effect, "acpi.no_lps0=1". Sorry I didn't just make it the default behavior. This has to be set at boot time. When enabled you'll see "Applying SLP_S0 BIOS quirk" in dmesg.
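Since the patch only takes effect with the boot parameter set, a quick check of both conditions (parameter present, quirk message logged) might look like the sketch below. The helper names are hypothetical, and the sample command line and dmesg line are made-up placeholders; only the parameter "acpi.no_lps0=1" and the message text "Applying SLP_S0 BIOS quirk" come from the comment above.

```python
def has_boot_param(cmdline: str, param: str) -> bool:
    """True if the exact token (e.g. 'acpi.no_lps0=1') is on the command line."""
    return param in cmdline.split()


def quirk_logged(dmesg_text: str) -> bool:
    """True if the message named in the comment above appears in dmesg output."""
    return "Applying SLP_S0 BIOS quirk" in dmesg_text


# Hypothetical samples for illustration; not taken from the attached logs.
sample_cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/sda2 ro quiet acpi.no_lps0=1"
sample_dmesg = "[    0.500000] Applying SLP_S0 BIOS quirk\n"

print("parameter set:", has_boot_param(sample_cmdline, "acpi.no_lps0=1"))
print("quirk message present:", quirk_logged(sample_dmesg))
```

On a running system the command line string would typically be read from /proc/cmdline and the log text from the saved dmesg output.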
Created attachment 283947 [details]
dmesg of Linux 5.2rc7 with lps0 dsm disable patch and acpi.no_lps0=1 boot option
Below is the result of a 74-minute suspend. In the attached dmesg you can verify that acpi.no_lps0 is enabled and the quirk message is present.
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us
4422257559
~# cat /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us
0
~# cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
0
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 1
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
last_failed_dev:
last_failed_errno: 0
0
last_failed_step: suspend_noirq
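To see at a glance which suspend_stats counters are nonzero, the text can be parsed with a small sketch like the one below. It assumes the plain "name: value" layout shown in this thread and skips the last_failed_* records, which are not counters. `parse_counters` and `nonzero` are illustrative names.

```python
import re

# Output copied from the listing above.
SAMPLE = """\
success: 1
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 1
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
last_failed_dev:
last_failed_errno: 0
0
last_failed_step: suspend_noirq
"""


def parse_counters(text: str) -> dict:
    """Collect 'name: integer' lines, ignoring the last_failed_* records."""
    counters = {}
    for line in text.splitlines():
        match = re.fullmatch(r"\s*(\w+):\s*(-?\d+)\s*", line)
        if match and not match.group(1).startswith("last_failed"):
            counters[match.group(1)] = int(match.group(2))
    return counters


def nonzero(counters: dict) -> dict:
    """Keep only the counters that have moved away from zero."""
    return {name: value for name, value in counters.items() if value != 0}


print(nonzero(parse_counters(SAMPLE)))
```

Running the same parser on a capture taken before suspend and subtracting the values gives the per-test change, which matters because these counters keep incrementing across suspend cycles.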
I also want to note that with acpi.no_lps0=1 it's easy to wake the tablet an hour after suspend (with the stylus eraser button, for example), but it's difficult to wake it a few minutes after suspend. In the latter case opening the lid, pressing the power button, pressing any key on the keyboard, or pressing the eraser button on the stylus does not wake the tablet immediately. Instead, the wakeup attempts take effect minutes later. suspend_stats contains this after two such wakeup attempts (I'm not sure why failed_suspend is 4):
~# cat /sys/kernel/debug/suspend_stats
success: 1
fail: 4
failed_freeze: 0
failed_prepare: 0
failed_suspend: 4
failed_suspend_late: 0
failed_suspend_noirq: 1
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
last_failed_dev:
last_failed_errno: -16
-16
last_failed_step: suspend
suspend
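The last_failed_errno value of -16 can be translated into a symbolic name with Python's standard errno module. This is only a lookup sketch, assuming Linux errno numbering (where 16 is EBUSY); it does not say which device or subsystem returned the error. `describe_errno` is an illustrative name.

```python
import errno
import os


def describe_errno(value: int) -> str:
    """Map a kernel-style negative errno to its symbolic name and message."""
    code = abs(value)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Value copied from last_failed_errno in the listing above.
print(describe_errno(-16))
```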
Another interesting observation is that the power button LED usually glows when the tablet is suspended; however, in the cases where it was difficult to wake the tablet, the LED just lights up, as if the tablet is powered on and not sleeping (so it probably failed to suspend in these two cases).
Please let me know what additional information I can provide.
There are a couple of new issues with suspend freeze on this tablet:
Bug 204717
https://bugzilla.kernel.org/show_bug.cgi?id=201575#c9
I find that Bug 204717 is reproducible only if the tablet was suspended via the power button. If it was suspended via a lid event there is no spontaneous wakeup. So I can continue with testing of further patches.
Hello,
Patches from Rafael were merged during 5.4rc1 that improve s0ix residency on some systems. Can you try the latest kernel?
Okay, I will try to test Linux 5.4 as soon as I get access to the hardware again, which could be next month.
> Patches from Rafael were merged during 5.4rc1 that improve s0ix residency on
> some systems. Can you try the latest kernel.
Hi David. I may be seeing significant S0ix residency improvement with 5.4.0-rc4 on Dell Latitude 7400, bug 204867. Is it possible to point out some specific 5.4 patches to look into for more details and understand what might be the underlying cause? I'm not sure I can spot all relevant ones on my own.
This one from Rafael simplified the overall S2Idle flow and laid the foundation for better S2Idle S0ix residency.
https://lore.kernel.org/lkml/71085220.z6FKkvYQPX@kreacher/
@RussianNeuroMancer what is the status of this issue on the latest upstream kernel?
Unfortunately, I was only able to get my hands on this hardware again a couple of weeks ago (not in November, as I had hoped), so I wasn't able to answer earlier. If the issue described in Comment 9 here https://bugzilla.kernel.org/show_bug.cgi?id=201575#c9 is fixed by now, I will be able to check this issue again next month.
Tested Linux 5.9rc6 - no positive changes. What info is needed?
Is the status the same on 5.12?
Can't check yet, and probably won't be able to do so for at least a few months.
Hi @russianneuromancer,
Can you please provide the latest status for this issue?
Unfortunately I don't have access to this hardware anymore.
Since there's no longer access to this hardware we will close this as rejected.
Feel free to reopen if you get the hardware again and still see the issue with the latest kernel.