If, for whatever reason, some CPUs are offline during a suspend/resume cycle, they end up spinning and consuming a lot of power after the resume. This issue has existed for at least a couple of years now. It is as though they were forgotten about and nothing told them to stay off-line during the resume.

This bug report replaces bug 80651, because the Original Poster keeps setting that one to closed, and we don't want to lose track of the issue. See that bug report for some of the history.

Example:

1.) Before (edited for readability):

$ sudo turbostat -S --debug sleep 10
 Avg_MHz   Busy% Bzy_MHz TSC_MHz  CPU%c6 PkgWatt
       1    0.08    1615    3411   99.66    3.81

2.) Take some cores off-line (this is an i7-2700K):

# echo -n 0 > /sys/devices/system/cpu/cpu1/online
# echo -n 0 > /sys/devices/system/cpu/cpu2/online
# echo -n 0 > /sys/devices/system/cpu/cpu3/online
# echo -n 0 > /sys/devices/system/cpu/cpu5/online
# echo -n 0 > /sys/devices/system/cpu/cpu6/online
# echo -n 0 > /sys/devices/system/cpu/cpu7/online
# cat /sys/devices/system/cpu/cpu*/online
0
0
0
1
0
0
0

3.) Do a suspend / resume cycle:

# echo mem > /sys/power/state

4.) Check now (edited for readability):

$ sudo turbostat -S --debug sleep 10
 Avg_MHz   Busy% Bzy_MHz TSC_MHz  CPU%c6 PkgWatt
      23    0.67    3403    3411   98.61   35.97

5.) Bring the CPUs back online:

# echo mem > /sys/power/state
# echo -n 1 > /sys/devices/system/cpu/cpu1/online
# echo -n 1 > /sys/devices/system/cpu/cpu2/online
# echo -n 1 > /sys/devices/system/cpu/cpu3/online
# echo -n 1 > /sys/devices/system/cpu/cpu5/online
# echo -n 1 > /sys/devices/system/cpu/cpu6/online
# echo -n 1 > /sys/devices/system/cpu/cpu7/online
# cat /sys/devices/system/cpu/cpu*/online
1
1
1
1
1
1
1

6.) Check now (edited for readability):

 Avg_MHz   Busy% Bzy_MHz TSC_MHz  CPU%c6 PkgWatt
       0    0.03    1651    3410   99.86    4.03
By the way, this same issue exists if I use the acpi-cpufreq CPU frequency scaling driver instead of the intel_pstate CPU frequency scaling driver. So while I copied the component settings from the other bug report, this actually doesn't appear to be intel_pstate specific. @Chen Yu: From your comment 38 in the old bug report (https://bugzilla.kernel.org/show_bug.cgi?id=80651#c38), am I to assume the problem does not exist on your computer? i.e. is the issue perhaps hardware and/or distro specific?
(In reply to Doug Smythies from comment #1)
> By the way, this same issue exists if I use the acpi-cpufreq CPU frequency
> scaling driver instead of the intel_pstate CPU frequency scaling driver. So
> while I copied the component settings from the other bug report, this
> actually doesn't appear to be intel_pstate specific.
>
> @Chen Yu: From your comment 38 in the old bug report
> (https://bugzilla.kernel.org/show_bug.cgi?id=80651#c38), am I to assume the
> problem does not exist on your computer? i.e. is the issue perhaps hardware
> and/or distro specific?

@Doug: Previously I didn't notice the problem on my laptop. This is interesting; is the problem still reproducible? I think we can use intel_pstate trace data from both before and after suspend to figure out whether it is a hw issue or a software bug.
(In reply to Chen Yu from comment #2)
> @Doug: Previously I didn't notice the problem on my laptop. This is
> interesting; is the problem still reproducible? I think we can use
> intel_pstate trace data from both before and after suspend to figure out
> whether it is a hw issue or a software bug.

@Chen Yu: Yes, the problem is still reproducible with Kernel 4.7. Myself, I don't think we can learn anything new from intel_pstate trace data. However, I did a before and after suspend trace with 6 of 8 CPUs offline (3 of 4 cores). The trace data was consistent with what we already know: after the suspend, the CPUs that are supposed to be offline are spinning away. I assume their vote into the PLL is what is holding the CPU frequency high, although the high frequency also messes up the intel_pstate driver math for the 2 CPUs that are still online.

PState being given:

$ sudo rdmsr --bitfield 15:8 -d -a 0x198
35
35

PState being asked for:

$ sudo rdmsr --bitfield 15:8 -d -a 0x199
38
16

For reference (excerpt from turbostat):

cpu4: MSR_TURBO_RATIO_LIMIT: 0x23242526
35 * 100 = 3500 MHz max turbo 4 active cores
36 * 100 = 3600 MHz max turbo 3 active cores
37 * 100 = 3700 MHz max turbo 2 active cores
38 * 100 = 3800 MHz max turbo 1 active cores

i.e. the processor thinks all 4 cores are online, but the kernel thinks only 1 core is online.
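For readers following the arithmetic, here is a minimal sketch of how the turbostat excerpt above falls out of the raw MSR_TURBO_RATIO_LIMIT value. It assumes the layout implied by that excerpt (one ratio byte per active-core count, least significant byte first, 100 MHz bus clock); the function name `decode_turbo_ratio_limit` is made up for illustration.

```shell
#!/bin/sh
# Decode a MSR_TURBO_RATIO_LIMIT value into per-active-core-count turbo limits.
# Assumption: byte i (counting from the least significant byte) holds the
# ratio for i+1 active cores, and the bus clock is 100 MHz.
decode_turbo_ratio_limit() {
    msr=$1          # raw value with 0x prefix, e.g. 0x23242526
    cores=${2:-4}   # how many ratio bytes to decode
    i=0
    while [ "$i" -lt "$cores" ]; do
        # Shift the wanted byte down and mask off everything else.
        ratio=$(( ($msr >> (8 * $i)) & 0xFF ))
        echo "$(($i + 1)) active core(s): $(($ratio * 100)) MHz"
        i=$(($i + 1))
    done
}
```

Calling `decode_turbo_ratio_limit 0x23242526` gives 3800, 3700, 3600 and 3500 MHz for 1 to 4 active cores. The ratio of 35 read back from MSR 0x198 is the 4-active-core limit, which is why the processor appears to believe all 4 cores are active.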
This issue is still reproducible with Kernel 4.10-rc2. The status is "NEEDINFO", but I don't know what further info is required.
This issue is still reproducible with Kernel 4.11-rc2.
> # cat /sys/devices/system/cpu/cpu*/online > 0 > 0 > 0 > 1 > 0 > 0 > 0 Is cpu0 offline in this experiment? Why are there 7 rows above, instead of 8?
(ignore comment #6, didn't realize that cpu0 had no /online file)
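To avoid this kind of confusion, the state can be listed together with the CPU name. A minimal sketch, assuming the usual sysfs layout; the function name `list_cpu_state` and the "always-online" label are made up for illustration:

```shell
#!/bin/sh
# Print "cpuN <state>" for every CPU directory under a sysfs CPU root.
# A CPU without an "online" file (cpu0 in this thread) cannot be taken
# offline through sysfs, so it is reported as "always-online".
list_cpu_state() {
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*; do
        [ -d "$dir" ] || continue
        name=${dir##*/}                    # strip the leading path
        if [ -r "$dir/online" ]; then
            echo "$name $(cat "$dir/online")"
        else
            echo "$name always-online"
        fi
    done
}
```

Note that the shell glob sorts names lexically, so on machines with ten or more CPUs, cpu10 is listed before cpu2.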
I see this issue also, on my Skylake desktop, running upstream 4.11-rc5.

To repeat: offline some processors, suspend, resume, and with turbostat, observe that:

1. there is no longer any package C-state residency
2. PkgWatt of the active idle system is over 6W, compared to 1W before suspend
3. active idle Bzy_MHz of the available CPU(s) is over 3 GHz, as compared to 800 MHz before the suspend.
4. as reported above, the max turbo frequency is impacted. e.g. on my machine the max turbo for 4,3,2,1 cores is 3600,3700,3800,3900, so a spinloop with 1 cpu online before the suspend runs at 3900, but after the resume, it runs at 3600.

This looks like a BIOS bug. Upon resume from S3, the BIOS should put offline processors in c6, just like it (correctly) does on boot.

For Linux to work around this bug, it will have to get them out of the BIOS with a Linux online, and then re-offline them to get back into the user-requested configuration. Unclear if we'd have to do that always on all systems, or if there is a way to detect when such a workaround is needed.

That said...
In testing this issue I noticed that Ubuntu 15.10 ships with a systemd configuration that uses cpusets. cpusets are broken by manual offline/online, and so if you offline/online as above, you'll find that no tasks will run on the newly online'd processors.
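The online-then-re-offline workaround described above could be done by hand from user space after a resume. The following is only a sketch of that idea, not something tested in this thread; the function name `kick_offline_cpus` is hypothetical, and it must run as root against the real sysfs tree.

```shell
#!/bin/sh
# Sketch: briefly online every CPU that is currently offline, then offline
# it again, so that the kernel (rather than the BIOS) parks it.
# The optional argument exists only so the function can be tried on a
# scratch directory instead of the real /sys tree.
kick_offline_cpus() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/online; do
        [ -w "$f" ] || continue            # skip CPUs without a writable online file
        if [ "$(cat "$f")" = "0" ]; then
            echo 1 > "$f" || continue      # take the CPU back from the BIOS
            echo 0 > "$f"                  # restore the user-requested offline state
            cpu=${f%/online}
            echo "kicked ${cpu##*/}"
        fi
    done
}
```

Whether the power draw stays low after the second offline is an assumption here. The original report only shows that bringing the CPUs back online restores normal power, and the cpuset caveat mentioned above may apply on some distributions.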
(In reply to Len Brown from comment #8)
> That said...
> In testing this issue I noticed that Ubuntu 15.10

Ubuntu 15.10 is past End of Life.

> ships with a systemd
> configuration that uses cpusets. cpusets are broken by manual offline/online,
> and so if you offline/online as above, you'll find that no tasks
> will run on the newly online'd processors.

That is not consistent with my findings. Once I online the CPUs, and as far as I have ever been able to determine, everything is normal, including scheduling. (I use Ubuntu 16.04 LTS on my test server.)
This issue is still reproducible with Kernel 4.13-rc1.
Reproduced on my HP KBL platform using a 4.14 kernel. HWP is enabled on this platform. For comparison, I've tested with pm_test set to core, and this issue does not appear. That is to say, the BIOS might have done something strange across S3.

I noticed the following description of the IA32_HWP_REQUEST register in the SDM:

Maximum_Performance (bits 15:8, RW): Excursions above the limit requested by OS are possible due to hardware coordination between the processor cores and other components in the package.

although I don't have a clue what the coordination means here.

@Doug, I think you are using non-HWP mode, right?
(In reply to Chen Yu from comment #11)
> @Doug, I think you are using non-HWP mode, right?

Correct. My older i7-2600K processor does not have HWP.
Chen Yu:
- First check HWP_REQUEST_MSR values on all the cpus.
- cat /sys/class/drm/card0/gt*

It is possible that graphics is forcing higher frequencies.
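As a rough aid for the first check, here is a sketch that splits the low 32 bits of an IA32_HWP_REQUEST value into its fields. It assumes the SDM layout (Minimum_Performance in bits 7:0, Maximum_Performance in 15:8, Desired_Performance in 23:16, Energy_Performance_Preference in 31:24); the function name `decode_hwp_request` and the sample value below are made up for illustration.

```shell
#!/bin/sh
# Split the low 32 bits of an IA32_HWP_REQUEST value into its four fields.
# Assumed layout: min 7:0, max 15:8, desired 23:16, EPP 31:24.
decode_hwp_request() {
    v=$1   # raw value with 0x prefix
    echo "min=$(( $v & 0xFF )) max=$(( ($v >> 8) & 0xFF )) desired=$(( ($v >> 16) & 0xFF )) epp=$(( ($v >> 24) & 0xFF ))"
}
```

For an illustrative value, `decode_hwp_request 0x80002701` prints `min=1 max=39 desired=0 epp=128`. On a live system the raw value would come from something like `sudo rdmsr -p 0 0x774`, prefixed with `0x`, since rdmsr prints hex without the prefix; 0x774 is the IA32_HWP_REQUEST address.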
(In reply to Srinivas Pandruvada from comment #13)
> Chen Yu:
> - First check HWP_REQUEST_MSR values on all the cpus.

How? The kernel thinks the CPUs are offline (but actually they are not), and so we cannot inquire as to the state of any MSRs; at least I have not been able to figure out how.

> - cat /sys/class/drm/card0/gt*
>
> It is possible that graphics is forcing higher frequencies.

At least in my case, it isn't. I get the exact same numbers before and after creating this problem:

doug@s15:~/temp-k-git/linux$ grep . /sys/class/drm/card0/gt*
/sys/class/drm/card0/gt_act_freq_mhz:850
/sys/class/drm/card0/gt_boost_freq_mhz:1650
/sys/class/drm/card0/gt_cur_freq_mhz:850
/sys/class/drm/card0/gt_max_freq_mhz:1350
/sys/class/drm/card0/gt_min_freq_mhz:850
/sys/class/drm/card0/gt_RP0_freq_mhz:1350
/sys/class/drm/card0/gt_RP1_freq_mhz:850
/sys/class/drm/card0/gt_RPn_freq_mhz:850

Kernel = 4.15-rc9.
Actually there's a problem when dealing with HWP after resume. I'll test with the fix patch applied and check the drm freq. @Doug I think we encountered different issues.
(In reply to Chen Yu from comment #15)
> Actually there's a problem when dealing with HWP after resume. I'll test
> with the fix patch applied and check the drm freq. @Doug I think we
> encountered different issues.

Sent to https://patchwork.kernel.org/patch/10183901/
Humm, with this patch applied, the high freq issue disappeared, because HWP is working well after resume...
(In reply to Chen Yu from comment #15)
> @Doug I think we encountered different issues.

Yes, I agree. See also my comment 1 above, where the issue is also present when using the acpi-cpufreq driver.
(In reply to Doug Smythies from comment #17)
> (In reply to Chen Yu from comment #15)
> > @Doug I think we encountered different issues.
>
> Yes, I agree. See also my comment 1 above, where the issue is also present
> when using the acpi-cpufreq driver.

So I come back now. Per Len's information, there's no pc6% after resume. Since this is more likely a BIOS issue, how about the following "solution":

1. Try to detect this symptom in turbostat.
2. When all the following conditions are met, print a warning and request the user to do a manual online->offline sequence for the CPUs:
   2.1 Periodically record the Busy% and pc6 residency when all the CPUs are online.
   2.2 Later, once some CPUs are offline, if the CPU utilization Busy% is lower than a threshold (say, 5%), compare the pc6 residency with the values recorded in step 2.1.
   2.3 If the pc6 residency with CPUs offline is much lower than the one with all CPUs online, then this symptom is detected.
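The comparison in step 2 can be sketched as a small check. This is not turbostat code, only an illustration; it assumes, per Len's observation that package C-state residency disappears after resume, that the symptom is pc6 residency falling well below the all-CPUs-online baseline while the system is near idle. The function name `pc6_symptom` and the "below half of the baseline" cutoff are hypothetical choices.

```shell
#!/bin/sh
# Return 0 (symptom suspected) when the system is near idle but package C6
# residency has dropped far below the baseline taken with all CPUs online.
pc6_symptom() {
    baseline_pc6=$1    # pc6 residency (%) recorded with all CPUs online
    current_pc6=$2     # pc6 residency (%) now, with some CPUs offline
    busy=$3            # current Busy% (%)
    busy_max=${4:-5}   # idle threshold from the proposal (5%)
    ratio=${5:-0.5}    # hypothetical cutoff: below half of the baseline
    # awk handles the floating-point comparison; exit status 0 means "detected".
    awk -v b="$baseline_pc6" -v c="$current_pc6" -v u="$busy" \
        -v t="$busy_max" -v r="$ratio" \
        'BEGIN { exit !(u < t && c < b * r) }'
}
```

Used as `pc6_symptom 60 0 0.67 && echo "offline CPUs may be spinning; try online then offline"`, where 60 and 0 are made-up residency figures. The Busy% gate matters because a genuinely busy system also shows low pc6 residency.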
So this issue seems to be platform specific. I'm wondering if this could be marked as a known issue/documented, as I could not find a proper workaround in the kernel (not sure if the community would accept the solution to kick the offline cpus after resume). Or, does it make sense to add a hook in .suspend()/.resume() to compare the pc6 before/after resume and, if there is too much drift, print a warning that there might be something wrong with the system?
(In reply to Chen Yu from comment #19)
> So this issue seems to be platform specific. I'm wondering if this could be
> marked as a known issue/documented, as I could not find a proper workaround
> in the kernel (not sure if the community would accept the solution to kick
> the offline cpus after resume).

That would be O.K. with me. As mentioned in comment #0, I only made this bug report because the original one got closed, and I thought the issue shouldn't be forgotten. Personally, I would never take CPUs off-line anyway.
Thanks. Marked as Documented as we could not find a proper solution to fix the BIOS issue in the kernel.