Bug 90301 - [haswell] Unable to obtain Package Cstates C3 or lower
Summary: [haswell] Unable to obtain Package Cstates C3 or lower
Status: RESOLVED INVALID
Alias: None
Product: Power Management
Classification: Unclassified
Component: intel_pstate
Hardware: All Linux
Importance: P1 normal
Assignee: Len Brown
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-25 04:31 UTC by Shawn Starr
Modified: 2016-11-15 07:39 UTC

See Also:
Kernel Version: 4.9-rc4+
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
turbostat debug dump (18.59 KB, text/plain)
2016-11-15 07:19 UTC, Shawn Starr
Powertop report (78.97 KB, text/html)
2016-11-15 07:22 UTC, Shawn Starr

Description Shawn Starr 2014-12-25 04:31:45 UTC
Something seems very wrong with intel_pstate. With acpi-cpufreq, the laptop fans do not go full throttle when one logical core is busy.

This is an Intel i7-4910MQ CPU @ 2.90GHz (Turbo 3.9GHz) in a Dell Precision M6800 with the A11 BIOS (latest available). C-states are enabled in the BIOS.

With intel_pstate, if one logical CPU is busy, the average frequency across all the cores is much higher, wasting power on cores that are otherwise idle.

If this is by design, it seems like a design flaw.

sensors info (pstates disabled)
radeon-pci-0100
Adapter: PCI adapter
temp1:        +58.0°C  (crit = +120.0°C, hyst = +90.0°C)

Current with acpi-cpufreq and conservative gov
i8k-virtual-0
Adapter: Virtual device
fan1:        59190 RPM
fan2:        69180 RPM
temp1:        +69.0°C  
temp2:        +51.0°C  
temp3:        +57.0°C  
temp4:        +47.0°C  

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +69.0°C  (high = +84.0°C, crit = +100.0°C)
Core 0:         +69.0°C  (high = +84.0°C, crit = +100.0°C)
Core 1:         +69.0°C  (high = +84.0°C, crit = +100.0°C)
Core 2:         +68.0°C  (high = +84.0°C, crit = +100.0°C)
Core 3:         +64.0°C  (high = +84.0°C, crit = +100.0°C)

Turbostat (pstates disabled)

RAPL: 5578 sec. Joule Counter Range, at 47 Watts
cpu0: MSR_NHM_PLATFORM_INFO: 0x80838f3011d00
8 * 100 = 800 MHz max efficiency
29 * 100 = 2900 MHz TSC frequency
cpu0: MSR_IA32_POWER_CTL: 0x0004005d (C1E auto-promotion: DISabled)
cpu0: MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x1e008405 (UNdemote-C3, UNdemote-C1, demote-C3, demote-C1, locked: pkg-cstate-limit=5: pc7s)
cpu0: MSR_NHM_TURBO_RATIO_LIMIT: 0x25252627
37 * 100 = 3700 MHz max turbo 4 active cores
37 * 100 = 3700 MHz max turbo 3 active cores
38 * 100 = 3800 MHz max turbo 2 active cores
39 * 100 = 3900 MHz max turbo 1 active cores
cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x00000006 (balanced)
cpu0: MSR_RAPL_POWER_UNIT: 0x000a0e03 (0.125000 Watts, 0.000061 Joules, 0.000977 sec.)
cpu0: MSR_PKG_POWER_INFO: 0x00000178 (47 W TDP, RAPL 0 - 0 W, 0.000000 sec.)
cpu0: MSR_PKG_POWER_LIMIT: 0x804281d600dc8178 (locked)
cpu0: PKG Limit #1: ENabled (47.000000 Watts, 28.000000 sec, clamp DISabled)
cpu0: PKG Limit #2: ENabled (58.750000 Watts, 0.002441* sec, clamp DISabled)
cpu0: MSR_PP0_POLICY: 0
cpu0: MSR_PP0_POWER_LIMIT: 0x00000000 (UNlocked)
cpu0: Cores Limit: DISabled (0.000000 Watts, 0.000977 sec, clamp DISabled)
cpu0: MSR_PP1_POLICY: 0
cpu0: MSR_PP1_POWER_LIMIT: 0x00000000 (UNlocked)
cpu0: GFX Limit: DISabled (0.000000 Watts, 0.000977 sec, clamp DISabled)
cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x03641000 (100 C)
cpu0: MSR_IA32_PACKAGE_THERM_STATUS: 0x88230882 (65 C)
cpu0: MSR_IA32_THERM_STATUS: 0x88230800 (65 C +/- 1)
cpu1: MSR_IA32_THERM_STATUS: 0x88230802 (65 C +/- 1)
cpu2: MSR_IA32_THERM_STATUS: 0x88230802 (65 C +/- 1)
cpu3: MSR_IA32_THERM_STATUS: 0x88250800 (63 C +/- 1)

   Core     CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz     SMI  CPU%c1  CPU%c3  CPU%c6  CPU%c7 CoreTmp  PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 PkgWatt CorWatt GFXWatt
       -       -     383   11.35    3377    2893       0   14.58    4.01    3.64   66.41      69      69   23.09    0.00    0.00    0.00   17.23    8.76    0.20
       0       0     245    7.51    3260    2893      61    8.64    4.31    3.51   76.03      65      69   23.09    0.00    0.00    0.00   17.23    8.76    0.20
       0       4     107    3.32    3215    2893      61   12.83
       1       1     193    6.01    3216    2893      61   54.62    2.30    2.11   34.96      69
       1       5    1892   54.58    3466    2893      61    6.05
       2       2     229    7.05    3242    2893      61    6.06    5.22    5.01   76.66      64
       2       6      66    1.98    3308    2893      61   11.13
       3       3     233    7.18    3247    2893      61    6.67    4.22    3.95   77.97      64
       3       7     104    3.20    3239    2893      61   10.65
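
For anyone reading the raw MSR values in the dump above, here is a minimal sketch of how turbostat's decoded lines can be reproduced. The function names are made up for illustration, the 100 MHz bus clock is taken from the "29 * 100" lines in the dump, and the bit positions are assumed to follow the usual Intel layout (one ratio byte per active-core count; power, energy and time units as negative powers of two).

```python
def decode_turbo_ratio_limit(msr, bclk_mhz=100):
    """Map active-core count -> max turbo MHz from the low four ratio bytes."""
    return {n + 1: ((msr >> (8 * n)) & 0xFF) * bclk_mhz for n in range(4)}


def decode_rapl_power_unit(msr):
    """Return (watts, joules, seconds) per RAPL unit; each is 1 / 2**field."""
    power_w = 1.0 / (1 << (msr & 0xF))             # bits 3:0
    energy_j = 1.0 / (1 << ((msr >> 8) & 0x1F))    # bits 12:8
    time_s = 1.0 / (1 << ((msr >> 16) & 0xF))      # bits 19:16
    return power_w, energy_j, time_s


# Values copied from the turbostat dump in this report.
turbo = decode_turbo_ratio_limit(0x25252627)
power_w, energy_j, time_s = decode_rapl_power_unit(0x000A0E03)
tdp_w = (0x00000178 & 0x7FFF) * power_w  # MSR_PKG_POWER_INFO, bits 14:0

print(turbo)   # {1: 3900, 2: 3800, 3: 3700, 4: 3700}
print(tdp_w)   # 47.0
```

The results match the "max turbo N active cores" and "47 W TDP" lines that turbostat printed.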

I notice the package never reaches the PC7 state (Pkg%pc7 stays at 0.00), so I don't know whether it can.

Are there still issues with Haswell and pstates right now?

Thanks,
Shawn
Comment 1 Shawn Starr 2014-12-27 00:18:42 UTC
Running similar tests in the Windows 8.1 partition, I can get the package to the C3 state. So something seems to be preventing this in Linux.
Comment 2 Shawn Starr 2014-12-27 01:27:10 UTC
The issue isn't pstates, it's idle... closing this.
Comment 3 Len Brown 2015-07-22 00:36:06 UTC
To run an apples/apples comparison of Windows and Linux,
you'll need to disable the Linux intel_idle so that Linux
uses acpi_idle like Windows does.  Only then can you compare
"C3 residency" -- since then they are both talking about ACPI C3,
not hardware C3.

eg.
boot with "intel_idle.max_cstate=0"
dmesg | grep idle
grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
to see what states ACPI is actually exporting
and then run turbostat.
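
The grep step above can also be done programmatically. This is only a sketch: the helper name is hypothetical, and it assumes the usual cpuidle sysfs layout of state0..stateN directories containing attribute files such as name, desc, latency, usage and time. Missing attributes are skipped.

```python
from pathlib import Path


def read_cpuidle_states(base="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return one dict per stateN directory, ordered by state number."""
    state_dirs = sorted(Path(base).glob("state[0-9]*"),
                        key=lambda p: int(p.name[len("state"):]))
    states = []
    for state_dir in state_dirs:
        entry = {"state": state_dir.name}
        for attr in ("name", "desc", "latency", "usage", "time"):
            attr_file = state_dir / attr
            if attr_file.is_file():
                entry[attr] = attr_file.read_text().strip()
        states.append(entry)
    return states


# Example: list what the active idle driver exports for cpu0.
for state in read_cpuidle_states():
    print(state)
```

Comparing the output with and without intel_idle.max_cstate=0 should show whether the exported states are the ACPI ones.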
Comment 4 Shawn Starr 2015-07-22 20:09:53 UTC
The problem isn't so much intel_idle. The issue turned out to be that I had to disable the IOMMU for GFX (intel_iommu=igfx_off). With this set, I now get to Package Cstate 6, and will get to 7 if the display is shut off.

Right now, I disable the IGP and drive the laptop display using only the AMD GPU:

Boot options:

IMAGE=/boot/vmlinuz-4.2.0-0.rc3.git1.2.fc24.x86_64 root=UUID=f834b05c-55f3-407b-9a63-fc6b04d4c845 ro rhgb slub_debug=- cgroup_disable=memory console=tty0 console=ttyS0,115200n8 nmi_watchdog=0 i915.powersave=1 intel_iommu=igfx_off vfio_iommu_type1.allow_unsafe_interrupts=1 zswap.zpool=zsmalloc zswap.enabled=1 video=LVDS-1:d video=VGA-0:e pcie_aspm=off i915.enable_rc6=1 i915.enable_fbc=1 intel_iommu=on audit=0 radeon.gartsize=2048 LANG=en_US.UTF-8
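
Note that the command line above passes some parameters more than once (for example intel_iommu=igfx_off and intel_iommu=on). A small sketch for spotting such repeats is below; the helper name is hypothetical and the sample string is a shortened copy of the options above. Whether repeated values are combined or overridden depends on each parameter's own parser, so this only lists them.

```python
from collections import defaultdict


def repeated_params(cmdline):
    """Return {param: [values...]} for parameters given more than once."""
    seen = defaultdict(list)
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Flags without '=' (e.g. 'ro') are recorded with value None.
        seen[key].append(value if sep else None)
    return {key: values for key, values in seen.items() if len(values) > 1}


sample = ("ro rhgb nmi_watchdog=0 intel_iommu=igfx_off "
          "video=LVDS-1:d video=VGA-0:e pcie_aspm=off intel_iommu=on")
print(repeated_params(sample))
```

On a running system the same function could be fed the contents of /proc/cmdline.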

Current run:

    Core     CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz     SMI  CPU%c1  CPU%c3  CPU%c6  CPU%c7 CoreTmp  PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 PkgWatt CorWatt GFXWatt
       -       -       3    0.35     963    2893       0    6.67    0.09    0.28   92.61      48      51    0.70    0.30   80.30    0.00    0.92    0.04    0.00
       0       0       3    0.36     963    2893       0    0.44    0.18    0.00   99.02      46      51    0.70    0.30   80.30    0.00    0.92    0.04    0.00
       0       4       0    0.02     940    2893       0    0.77
       1       1       7    0.77     957    2893       0    0.95    0.12    0.00   98.16      48
       1       5       0    0.02    1182    2893       0    1.70
       2       2       5    0.49     964    2893       0    8.47    0.01    1.11   89.92      47
       2       6       1    0.08     888    2893       0    8.88
       3       3      10    1.08     969    2893       0   15.54    0.06    0.00   83.32      48
       3       7       0    0.01     950    2893       0   16.61
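
To compare runs without eyeballing the columns, the summary row can be paired with the header. This is a sketch only: the helper names are made up, and the two rows are the summary ("-") lines copied from the turbostat tables in this report, with whitespace collapsed.

```python
HEADER = ("Core CPU Avg_MHz %Busy Bzy_MHz TSC_MHz SMI CPU%c1 CPU%c3 CPU%c6 "
          "CPU%c7 CoreTmp PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 "
          "PkgWatt CorWatt GFXWatt")
BEFORE = "- - 383 11.35 3377 2893 0 14.58 4.01 3.64 66.41 69 69 23.09 0.00 0.00 0.00 17.23 8.76 0.20"
AFTER = "- - 3 0.35 963 2893 0 6.67 0.09 0.28 92.61 48 51 0.70 0.30 80.30 0.00 0.92 0.04 0.00"


def package_residency(header, row):
    """Return {column: percent} for the Pkg%pcN columns of a summary row."""
    fields = dict(zip(header.split(), row.split()))
    return {k: float(v) for k, v in fields.items() if k.startswith("Pkg%pc")}


def deepest_package_state(residency):
    """Return the deepest Pkg%pcN column with non-zero residency, or None."""
    reached = [k for k, v in residency.items() if v > 0]
    return max(reached, key=lambda k: int(k[len("Pkg%pc"):]), default=None)


print(deepest_package_state(package_residency(HEADER, BEFORE)))  # Pkg%pc2
print(deepest_package_state(package_residency(HEADER, AFTER)))   # Pkg%pc6
```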
Comment 5 Zhang Rui 2015-12-28 05:23:06 UTC
As the problem is that the system is unable to enter PC3 and deeper package states when the gfx DMAR unit is enabled, please file a bug at bugs.freedesktop.org instead.
Comment 6 Shawn Starr 2015-12-28 08:21:37 UTC
Which component? i915 for kernel DRM side?
Comment 7 Zhang Rui 2015-12-29 00:27:48 UTC
Well, I don't know.
Please feel free to try any component. They will redirect you to the right component if you file it wrongly. :p
Comment 8 Shawn Starr 2015-12-29 00:49:46 UTC
OK, I'm still not fully sure, as enabling gfx DMAR while the HSW GPU is active still keeps the package state from going past PC3, but either way, I'll file with the fdo folks.
Comment 9 Shawn Starr 2016-11-15 07:18:04 UTC
So, this problem still exists when using the iGPU + dGPU on the laptop:

Using these boot options:

root=UUID=f834b05c-55f3-407b-9a63-fc6b04d4c845 ro rhgb slub_debug=- cgroup_disable=memory console=tty0 console=ttyUSB0,9600n8 nmi_watchdog=0 audit=0 amdgpu.gartsize=8192 amdgpu.vm_size=16 amdgpu.powerplay=1 amdgpu.exp_hw_support=1 rd.driver.blacklist=radeon amdgpu.dal=1 video=eDP-1:d video=VGA-1:e resume=/dev/sda7 intel_iommu=igfx_off i915.enable_rc6=1 i915.enable_fbc=1 pcie_aspm=off

I can only reach PC2, with pretty much 99% residency in PC2.

I've set the SATA controller ports to low power:

echo 'min_power' > '/sys/class/scsi_host/host1/link_power_management_policy';
echo 'min_power' > '/sys/class/scsi_host/host0/link_power_management_policy';
echo 'min_power' > '/sys/class/scsi_host/host2/link_power_management_policy';
echo 'min_power' > '/sys/class/scsi_host/host3/link_power_management_policy';
echo 'min_power' > '/sys/class/scsi_host/host4/link_power_management_policy';
echo 'min_power' > '/sys/class/scsi_host/host5/link_power_management_policy';
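
The same setting can be applied to every host in one pass rather than one echo per port. A minimal sketch, assuming the sysfs paths shown above and root privileges; the function name is hypothetical.

```python
from pathlib import Path


def set_link_pm_policy(policy="min_power", base="/sys/class/scsi_host"):
    """Write `policy` to each host's link_power_management_policy file.

    Returns the list of files that were written. Needs root on a real system.
    """
    written = []
    for policy_file in sorted(Path(base).glob("host*/link_power_management_policy")):
        policy_file.write_text(policy + "\n")
        written.append(str(policy_file))
    return written


# Example (as root):
# print(set_link_pm_policy("min_power"))
```

Hosts that do not expose the file are skipped, since the glob only matches existing paths.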

This is still not good on 4.9-rc4+
Comment 10 Shawn Starr 2016-11-15 07:19:13 UTC
Created attachment 244551
turbostat debug dump
Comment 11 Shawn Starr 2016-11-15 07:22:43 UTC
Created attachment 244561
Powertop report
Comment 12 Shawn Starr 2016-11-15 07:26:07 UTC
If I disable the iGPU, I have no problem reaching the PC6 state, so is it something the iGPU is doing?
Comment 13 Shawn Starr 2016-11-15 07:39:22 UTC
Seems it just takes a lot longer to get to PC6, but it finally does. Closing.