model name : Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz When sitting idle, the cpu frequency does not decrease. This seems to lead to increased heat, battery usage, and a constantly running fan. $ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 3461159 3446653 3515235 3496802 $ cd /sys/devices/system/cpu/cpufreq/policy0/ $ for i in *; do echo $i: $(cat $i); done affected_cpus: 0 cpuinfo_max_freq: 3600000 cpuinfo_min_freq: 1200000 cpuinfo_transition_latency: 4294967295 related_cpus: 0 scaling_available_governors: performance powersave scaling_cur_freq: 3511442 scaling_driver: intel_pstate scaling_governor: powersave scaling_max_freq: 3600000 scaling_min_freq: 1200000 scaling_setspeed: <unsupported> Am happy to try and provide any further details.
output of turbostat: Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ SMI C1 C1E C3 C6 C7 C1% C1E% C3% C6% C7% CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp GFX%rc6 GFXMHz Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 PkgWatt CorWatt GFXWatt - - 215 6.24 3466 2891 4746 0 252 2034 481 0 10186 1.63 18.54 0.79 0.00 72.69 44.65 1.11 0.00 48.00 56 57 0.00 1250 0.00 0.00 0.00 0.0018.11 8.33 6.04 0 0 227 6.54 3473 2891 1010 0 58 673 93 0 2164 1.09 28.04 0.60 0.00 63.59 49.26 1.21 0.00 42.98 56 57 0.00 1250 0.00 0.00 0.00 0.0018.11 8.33 6.04 0 2 207 5.99 3470 2891 1206 0 51 465 154 0 2875 1.27 15.64 0.94 0.00 76.05 49.82 1 1 229 6.65 3460 2891 1566 0 103 407 125 0 2787 2.90 13.46 1.05 0.00 75.82 39.32 1.00 0.00 53.03 53 1 3 199 5.76 3462 2891 964 0 40 489 109 0 2360 1.28 16.99 0.57 0.00 75.30 40.20
cpuinfo_transition_latency is bogus, but that shouldn't matter. Something's odd. Have you seen this with plain 4.13? Have you tried 4.14? If not, please do.
Please also check what's in there in /sys/devices/system/cpu/intel_pstate/* and attach the output of dmesg (after a fresh boot) from a failing kernel.
While I gather info, I have another laptop (none work issue) that does seem to work correctly: From dmidecode: Handle 0x0001, DMI type 1, 27 bytes System Information Manufacturer: Dell Inc. Product Name: Inspiron 3537 Version: A08 Serial Number: 3ZW3SX1 UUID: 4C4C4544-005A-5710-8033-B3C04F535831 Wake-up Type: Power Switch SKU Number: Inspiron 3537 Family: 00 model name : Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz $ cat /proc/version Linux version 4.13.13-300.fc27.x86_64 (mockbuild@bkernel02.phx2.fedoraproject.org) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Wed Nov 15 15:47:50 UTC 2017 $ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 885425 982535 886804 798451 $ cd /sys/devices/system/cpu/cpufreq/policy0/ $ for i in *; do echo $i: $(cat $i); done affected_cpus: 0 cpuinfo_max_freq: 3000000 cpuinfo_min_freq: 800000 cpuinfo_transition_latency: 4294967295 related_cpus: 0 scaling_available_governors: performance powersave scaling_cur_freq: 2127263 scaling_driver: intel_pstate scaling_governor: powersave scaling_max_freq: 3000000 scaling_min_freq: 800000 scaling_setspeed: <unsupported> $ cd /sys/devices/system/cpu/intel_pstate/ $ for i in *; do echo $i: $(cat $i); done max_perf_pct: 100 min_perf_pct: 26 no_turbo: 0 num_pstates: 23 status: active turbo_pct: 57 Give me a few minutes and I'll fire up the work laptop and gather exactly the same details again - plus the dmesg.
From the work laptop that I originally lodged this issue about: Handle 0x0001, DMI type 1, 27 bytes System Information Manufacturer: Dell Inc. Product Name: Latitude E5530 non-vPro Version: 01 Serial Number: BNZDWW1 UUID: 4C4C4544-004E-5A10-8044-C2C04F575731 Wake-up Type: Power Switch SKU Number: Latitude E5530 non-vPro Family: Not Specified $ cat /proc/version Linux version 4.13.13-300.fc27.x86_64 (mockbuild@bkernel02.phx2.fedoraproject.org) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Wed Nov 15 15:47:50 UTC 2017 $ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 3410222 3114373 3138282 3263202 $ cd /sys/devices/system/cpu/cpufreq/policy0/ $ for i in *; do echo $i: $(cat $i); done affected_cpus: 0 cpuinfo_max_freq: 3600000 cpuinfo_min_freq: 1200000 cpuinfo_transition_latency: 4294967295 related_cpus: 0 scaling_available_governors: performance powersave scaling_cur_freq: 3093358 scaling_driver: intel_pstate scaling_governor: performance scaling_max_freq: 3600000 scaling_min_freq: 1200000 scaling_setspeed: <unsupported> $ cd /sys/devices/system/cpu/intel_pstate/ $ for i in *; do echo $i: $(cat $i); done max_perf_pct: 100 min_perf_pct: 33 no_turbo: 0 num_pstates: 25 status: active turbo_pct: 29
Created attachment 260761 [details] dmesg output of Dell Lattitude E5530
On the same machine, I also note this is interesting: $ while true; do date; cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq; sleep 1; done Wed Nov 22 05:02:21 AEDT 2017 3439173 3545192 3461566 3524594 Wed Nov 22 05:02:23 AEDT 2017 2080108 3490290 3531344 3451074 Wed Nov 22 05:02:24 AEDT 2017 1968543 2677264 1311234 3251583 Wed Nov 22 05:02:25 AEDT 2017 1855877 2264437 3107480 2258118 Wed Nov 22 05:02:26 AEDT 2017 1524171 2365220 1850871 2442407 Wed Nov 22 05:02:27 AEDT 2017 2077300 3090501 3563860 3480842 Wed Nov 22 05:02:28 AEDT 2017 3480785 3571819 3548358 3452782 Wed Nov 22 05:02:29 AEDT 2017 3403664 3126065 1941803 3404770 Wed Nov 22 05:02:30 AEDT 2017 2518917 3076549 1624617 3368908 Wed Nov 22 05:02:31 AEDT 2017 3077208 3345059 3514819 3426478 Wed Nov 22 05:02:32 AEDT 2017 1650530 3212238 3574962 3553168 Wed Nov 22 05:02:33 AEDT 2017 3507200 3540436 3441952 3564402 Wed Nov 22 05:02:35 AEDT 2017 2939500 3565573 3539760 3570695 Wed Nov 22 05:02:36 AEDT 2017 3361258 3412267 3425142 3363249 Wed Nov 22 05:02:37 AEDT 2017 1433039 3397464 3269907 3052990 Wed Nov 22 05:02:38 AEDT 2017 2866227 3021738 3380710 3566913 Wed Nov 22 05:02:39 AEDT 2017 1807272 3059071 2660091 3375225 Wed Nov 22 05:02:40 AEDT 2017 3085166 3569317 3567716 3473496 Wed Nov 22 05:02:41 AEDT 2017 1871541 3326360 1656168 3357178 Wed Nov 22 05:02:42 AEDT 2017 2424710 2968791 3399576 3092302 However, looking at the power states in 'powertop', I see the package stays mostly in C7. Package | Core | CPU 0 CPU 2 | | C0 active 0.7% 0.7% | | POLL 0.0% 0.0 ms 0.0% 0.0 ms | | C1E 0.3% 0.6 ms 0.1% 0.7 ms C2 (pc2) 1.7% | | C3 (pc3) 0.0% | C3 (cc3) 0.2% | C3 0.1% 0.3 ms 0.0% 0.2 ms C6 (pc6) 2.2% | C6 (cc6) 0.0% | C6 0.0% 0.7 ms 0.0% 0.0 ms C7 (pc7) 72.7% | C7 (cc7) 96.8% | C7 98.4% 10.1 ms 98.8% 9.2 ms | Core | CPU 1 CPU 3 | | C0 active 0.8% 0.6% | | POLL 0.0% 0.1 ms 0.0% 0.0 ms | | C1E 0.2% 0.6 ms 0.3% 1.0 ms | | | C3 (cc3) 0.1% | C3 0.1% 0.2 ms 0.1% 0.3 ms | C6 (cc6) 0.0% | C6 0.0% 0.0 ms 0.0% 0.0 ms | C7 (cc7) 97.1% | C7 98.7% 9.7 ms 98.7% 13.9 ms | GPU | | | | Powered On 18.8% | | RC6 22.8% | | RC6p 58.4% | | RC6pp 0.0% | | | | |
On the affected machine, please do (as root) # echo passive > /sys/devices/system/cpu/intel_pstate/status then try to change the governor to "powersave" for all CPUs: # for j in 0 1 2 3; do \ > echo powersave > /sys/devices/system/cpu/cpufreq/policy$j/scaling_governor > done and see what you get in scaling_cur_freq after that. If that's 1200 MHz (as expected), try to change the governor to "schedutil" and see if the scaling works.
Ok, as root: # echo passive > /sys/devices/system/cpu/intel_pstate/status # for j in 0 1 2 3; do echo powersave > /sys/devices/system/cpu/cpufreq/policy$j/scaling_governor; done # cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 1196490 1196197 1196154 1196139 Then: # for j in 0 1 2 3; do echo schedutil > /sys/devices/system/cpu/cpufreq/policy$j/scaling_governor; done # cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 1196211 1196593 1497467 1196529
OK, it looks like the problem is limited to intel_pstate in the active mode then. Please reboot, check that /sys/devices/system/cpu/intel_pstate/status is "active" and then see if there is a file called "pstate_snb" in /sys/kernel/debug/ (as root).
# cat /sys/devices/system/cpu/intel_pstate/status active # ls -l /sys/kernel/debug/pstate_snb ls: cannot access '/sys/kernel/debug/pstate_snb': No such file or directory
Run a tool which is present in the upstream Linux and attach output tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py It has some prerequisite packages which are described in the file header. I think there is some constant IO activity.
So I've just been observing the output of a 'watch cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq'. I do notice that the frequency *does* actually drop - however it seems to raise again very quickly - even when the system is mostly idle. Comparing to when using passive mode and a different governor - which seems to keep the lower clock frequency for longer - with the occasional increase, then back down. Is this indicative of an issue? or is this how the pstate changing is supposed to work? Could it be that the pstate frequency changing is just a little ..... hyperactive?
The output of the above tool will help us to find root cause.
Trying to not turn this into a support session, but right now, I'm seeing: # ./intel_pstate_tracer.py -i 10 -n test python version (should be >= 2.7): sys.version_info(major=2, minor=7, micro=14, releaselevel='final', serial=0) Sleeping for 10 seconds No valid data to plot I've tried an interval of 1, 10, 60 - all with 'no valid data' errors. What specifics are required kernel side for this? The Fedora kernel has: # grep -i trace /boot/config-4.13.14-300.fc27.x86_64 CONFIG_STACKTRACE_SUPPORT=y CONFIG_SYSCTL_EXCEPTION_TRACE=y CONFIG_TRACEPOINTS=y CONFIG_KPROBES_ON_FTRACE=y CONFIG_HAVE_KPROBES_ON_FTRACE=y CONFIG_HAVE_ARCH_TRACEHOOK=y CONFIG_HAVE_RELIABLE_STACKTRACE=y CONFIG_PM_TRACE=y CONFIG_PM_TRACE_RTC=y CONFIG_NETFILTER_XT_TARGET_TRACE=m # CONFIG_DMA_FENCE_TRACE is not set CONFIG_SCSI_IPR_TRACE=y # CONFIG_VXGE_DEBUG_TRACE_ALL is not set # CONFIG_ATH5K_TRACER is not set # CONFIG_CAPI_TRACE is not set # CONFIG_TRACE_SINK is not set # CONFIG_F2FS_IO_TRACE is not set # CONFIG_PSTORE_FTRACE is not set CONFIG_TRACE_IRQFLAGS_SUPPORT=y CONFIG_STACKTRACE=y # CONFIG_RCU_TRACE is not set CONFIG_USER_STACKTRACE_SUPPORT=y CONFIG_NOP_TRACER=y CONFIG_HAVE_FUNCTION_TRACER=y CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y CONFIG_HAVE_DYNAMIC_FTRACE=y CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y CONFIG_HAVE_SYSCALL_TRACEPOINTS=y CONFIG_TRACER_MAX_TRACE=y CONFIG_TRACE_CLOCK=y CONFIG_CONTEXT_SWITCH_TRACER=y CONFIG_GENERIC_TRACER=y CONFIG_FTRACE=y CONFIG_FUNCTION_TRACER=y CONFIG_FUNCTION_GRAPH_TRACER=y # CONFIG_IRQSOFF_TRACER is not set CONFIG_SCHED_TRACER=y CONFIG_HWLAT_TRACER=y CONFIG_FTRACE_SYSCALLS=y CONFIG_TRACER_SNAPSHOT=y # CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set CONFIG_STACK_TRACER=y CONFIG_BLK_DEV_IO_TRACE=y CONFIG_DYNAMIC_FTRACE=y CONFIG_DYNAMIC_FTRACE_WITH_REGS=y CONFIG_FTRACE_MCOUNT_RECORD=y # CONFIG_FTRACE_STARTUP_TEST is not set CONFIG_MMIOTRACE=y # CONFIG_MMIOTRACE_TEST is not set # CONFIG_TRACEPOINT_BENCHMARK is not set CONFIG_TRACE_EVAL_MAP_FILE=y # CONFIG_BACKTRACE_SELF_TEST is not set CONFIG_HAVE_MMIOTRACE_SUPPORT=y
It is not finding trace file with contents # echo 1 > sys/kernel/debug/tracing/events/power/pstate_sample/enable let system be idle for 10 seconds and then attach the file # cat /sys/kernel/debug/tracing/trace
I get the feeling I have to turn something else on: $ echo 1 > /sys/kernel/debug/tracing/events/power/pstate_sample/enable $ cat /sys/kernel/debug/tracing/trace # tracer: nop # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | |
I guess you are using default Fedora kernel. You don't have your own kernel config. You seems to have all CONFIG for tracing and you don't see any sample. This is a problem and may be function doing scaling is not called. Try this # cd /sys/kernel/debug/tracing # cat available_tracers You should see "function" tracer # echo function > current_tracer # cat current_tracer You should see "function" # cat trace | grep -i intel_pstate You should see something like.. gnome-terminal--2304 [002] d... 220.566747: intel_pstate_update_util <-update_load_avg bash-2349 [003] d.h. 220.567031: intel_pstate_update_util <-task_tick_fair <idle>-0 [002] d.h. 220.567348: intel_pstate_update_util <-update_load_avg gnome-shell-1784 [000] d... 220.567373: intel_pstate_update_util <-update_load_avg gnome-shell-1784 [000] d... 220.567374: intel_pstate_adjust_pstate <-intel_pstate_update_util
# cd /sys/kernel/debug/tracing # # cat available_tracers hwlat blk mmiotrace function_graph wakeup_dl wakeup_rt wakeup function nop # echo function > current_tracer # cat current_tracer function # cat trace | grep -i intel_pstate # cat trace | grep -i intel_pstate # cat trace | grep -i intel_pstate Ah! Forgot a step: # echo 1 > /sys/kernel/debug/tracing/events/power/pstate_sample/enable However, still no output in the trace regarding intel_pstate
Interestingly, if I back the grep down to 'intel', I see lots of: <idle>-0 [000] d... 16224.904736: leave_mm <-intel_idle <idle>-0 [001] d... 16224.904800: intel_idle <-cpuidle_enter_state <idle>-0 [000] d... 16224.904839: intel_idle <-cpuidle_enter_state <idle>-0 [000] d... 16224.904839: leave_mm <-intel_idle <idle>-0 [001] d... 16224.904864: intel_idle <-cpuidle_enter_state <idle>-0 [000] d... 16224.905039: intel_idle <-cpuidle_enter_state <idle>-0 [001] d... 16224.905158: intel_idle <-cpuidle_enter_state <idle>-0 [000] d... 16224.905193: intel_idle <-cpuidle_enter_state <idle>-0 [000] d... 16224.905193: leave_mm <-intel_idle <idle>-0 [001] d... 16224.905215: intel_idle <-cpuidle_enter_state <idle>-0 [001] d... 16224.905253: intel_idle <-cpuidle_enter_state <idle>-0 [000] d... 16224.905379: intel_idle <-cpuidle_enter_state <idle>-0 [000] d... 16224.905379: leave_mm <-intel_idle <idle>-0 [001] d... 16224.905425: intel_idle <-cpuidle_enter_state <idle>-0 [000] d... 16224.905465: intel_idle <-cpuidle_enter_state <idle>-0 [001] d... 16224.905485: intel_idle <-cpuidle_enter_state
Did you let it run for few seconds? Can you also increase the trace buffer size (You can increase buffer_size_in_kb in the same folder as trace file). If you still don't see then we are not getting periodic callback. So the driver will not scale frequency. We need to create a debug patch to check further.
Hmmm ok. I did: # cat trace > ~/trace.txt # ls -l ~/trace.txt -rw-r--r-- 1 root root 17219952 Nov 22 19:03 /root/trace.txt If it was all behaving, surely I'd see at least one hit in 17Mb of logs....
Can you add this to kernel command line during boot and collect dmesg? dyndbg="file intel_pstate.c +p"
Created attachment 260779 [details] dmesg with added dyndbg=file intel_pstate.c +p $ cat /proc/cmdline BOOT_IMAGE=/vmlinuz-4.13.14-300.fc27.x86_64 root=UUID=8b90bed8-43b9-4283-a5cf-0169acd438c7 ro rootflags=subvol=root rd.luks.uuid=luks-94704f71-460b-456d-9473-145077f9f772 rd.luks.uuid=luks-60898f8f-1e33-47fd-8a0a-b5add65027b0 rhgb quiet audit=0 LANG=en_AU.UTF-8 "dyndbg=file intel_pstate.c +p"
@Srinivas: It sort of behaves as though the governor is "performance" even though it says "powersave". @Steven: Would it be a problem to switch over to 4.14? It may be somewhat easier to debug then.
@Rafael: Correct. There is no trace so I guess cpufreq set_policy is called with PERFORMANCE. Can you add a printk to check cpu->policy value static int intel_pstate_set_policy(struct cpufreq_policy *policy) { ... ... if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) { printk(KERN_ERR "intel_pstate policy is performance !!\n"); ....
Created attachment 260789 [details] intel_pstate: A couple of debug printk()s Please apply this debug patch to the kernel (should apply to 4.13.y too), boot with it and attach dmesg output.
It seems there's an unofficial build of 4.14.0 for Fedora 28 - which should still run ok on Fedora 27... Grabbed the packages from here: https://koji.fedoraproject.org/koji/buildinfo?buildID=998399 # cat /proc/version Linux version 4.14.0-1.fc28.x86_64 (mockbuild@bkernel02.phx2.fedoraproject.org) (gcc version 7.2.1 20170829 (Red Hat 7.2.1-1) (GCC)) #1 SMP Mon Nov 13 14:09:27 UTC 2017 # cat /sys/devices/system/cpu/intel_pstate/status active # cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 1962729 3390637 1351553 3507481 # cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 1291050 1237406 1598442 1540077 # cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 1293710 1240327 1774801 3129251 # cd /sys/devices/system/cpu/cpufreq/policy0/ # for i in *; do echo $i: $(cat $i); done affected_cpus: 0 cpuinfo_max_freq: 3600000 cpuinfo_min_freq: 1200000 cpuinfo_transition_latency: 0 related_cpus: 0 scaling_available_governors: performance powersave scaling_cur_freq: 2121493 scaling_driver: intel_pstate scaling_governor: performance scaling_max_freq: 3600000 scaling_min_freq: 1200000 scaling_setspeed: <unsupported> # cd /sys/devices/system/cpu/intel_pstate/ # for i in *; do echo $i: $(cat $i); done max_perf_pct: 100 min_perf_pct: 33 no_turbo: 0 num_pstates: 25 status: active turbo_pct: 29 Switching to powersave for testing: # for j in 0 1 2 3; do echo powersave > /sys/devices/system/cpu/cpufreq/policy$j/scaling_governor; done; Confirm change: # cd /sys/devices/system/cpu/cpufreq/policy0/ # for i in *; do echo $i: $(cat $i); done affected_cpus: 0 cpuinfo_max_freq: 3600000 cpuinfo_min_freq: 1200000 cpuinfo_transition_latency: 0 related_cpus: 0 scaling_available_governors: performance powersave scaling_cur_freq: 1294849 scaling_driver: intel_pstate scaling_governor: powersave scaling_max_freq: 3600000 scaling_min_freq: 1200000 scaling_setspeed: <unsupported> This looks like expected behaviour: # cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 1196386 1196213 1196168 1196264 # cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 1196599 1196222 1196697 1196737 # cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq 1196215 1196099 1196179 1196807
OK, so it looks like 4.14 behaves correctly. If that's the case, no need to test the patch I've attached, but please verify that the problem is really not there.
Yeah - its just hit 1:30am now - so I'll check it out tomorrow at the office. Usually I notice if its behaving properly as the laptop fan turns off :) At some point, the fan just started running constantly - which is what lead me down this path. I'll observe for a while and see how we go with 4.14.0. If that is actually resolved fully, then maybe it indicates a change responsible.... but we'll see what the results are first :)
The observed symptoms can be explained by issues with the tracking of time, which also would explain why some systems are affected and the others aren't.
Ok - so after a bit of tinkering and trial / error, I booted today into 4.13.15 kernel. I noticed much more CPU fan activity (the thing is loud!). This is when I noticed I booted into 4.13.15 - and not the expected 4.14.0 kernel. After validating this and rebooting into the 4.14.0 kernel, there is certainly lower CPU fan activity - and looking at the lmsensors output, a reduction of ~6-10 degrees C on the same workload while booted into 4.14.0. Reported temps on both cores hover around 50-56c using kernel 4.13.15 - but reduce to 36-45c on 4.14.0. The general workloads between measurements are roughly the same - browsing, email, ssh etc.
OK Just to be sure, please try to run intel_pstate_tracer.py as per comment #12, but this time with 4.14.0(+).
I still don't seem to get any output using the tracer. # echo 1 > /sys/kernel/debug/tracing/tracing_on # echo function > /sys/kernel/debug/tracing/current_tracer # echo 1 > /sys/kernel/debug/tracing/events/power/pstate_sample/enable # ./intel_pstate_tracer.py -i 10 -n test python version (should be >= 2.7): sys.version_info(major=2, minor=7, micro=14, releaselevel='final', serial=0) Sleeping for 10 seconds No valid data to plot It seems that the script disables the pstate_sample/enable correctly after running - so I'm not quite sure what's going on.
OK Please run # turbostat --quiet --show CPU,Avg_MHz,Busy%,Bzy_MHz \ --add msr0x199,u32,raw,PERF_CTL --interval 3 with 4.14 for 30 s or so (~ 10 iterations) and attach the output.
Fedora 27 - using kernel 4.14.0 from f28): # turbostat --quiet --show CPU,Avg_MHz,Busy%,Bzy_MHz --add msr0x199,u32,raw,PERF_CTL --interval 3 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 187 5.74 3275 0x00000000 0 170 5.12 3321 0x00002400 2 172 5.24 3297 0x00002400 1 195 6.19 3164 0x00002400 3 213 6.42 3326 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 174 5.39 3236 0x00000000 0 156 4.72 3314 0x00002400 2 161 4.99 3235 0x00002400 1 187 6.10 3082 0x00002400 3 191 5.74 3337 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 173 5.33 3257 0x00000000 0 155 4.66 3339 0x00002400 2 169 5.20 3263 0x00002400 1 196 6.25 3143 0x00002400 3 173 5.23 3314 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 188 5.73 3301 0x00000000 0 209 6.19 3390 0x00002400 2 149 4.60 3258 0x00002400 1 213 6.66 3204 0x00002400 3 182 5.46 3353 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 168 5.17 3267 0x00000000 0 203 6.05 3375 0x00002400 2 163 5.01 3267 0x00002400 1 159 5.11 3119 0x00002400 3 148 4.51 3292 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 161 4.98 3249 0x00000000 0 197 5.93 3341 0x00002400 2 151 4.59 3303 0x00002400 1 170 5.45 3124 0x00002400 3 127 3.96 3218 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 168 5.20 3237 0x00000000 0 130 3.94 3301 0x00002400 2 194 5.93 3285 0x00002400 1 196 6.32 3118 0x00002400 3 151 4.63 3282 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 166 5.12 3254 0x00000000 0 140 4.22 3326 0x00002400 2 169 5.16 3290 0x00002400 1 191 6.13 3124 0x00002400 3 164 4.96 3315 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 178 5.45 3277 0x00000000 0 157 4.69 3357 0x00002400 2 170 5.22 3277 0x00002400 1 190 6.08 3136 0x00002400 3 195 5.82 3361 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 164 5.00 3292 0x00000000 0 173 5.12 3397 0x00002400 2 159 4.83 3307 0x00002400 1 168 5.32 3168 0x00002400 3 156 4.72 3305 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 167 5.08 3288 0x00000000 0 166 4.95 3370 0x00002400 2 163 4.92 3321 0x00002400 1 191 6.03 3183 0x00002400 3 146 4.43 3304 0x00002400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 170 5.15 3314 0x00000000 0 168 4.99 3374 0x00002400 2 169 5.12 3315 0x00002400 1 179 5.58 3211 0x00002400 3 165 4.91 3370 0x00002400
Since there is no callback and limits are set to max, this is performance mode. Is it possible that you have some script which is changing in policy on a specific CPU? Can you try # cpupower frequency-set -g powersave Also output with debug patch by Rafael and suggested printk will be great. This is in addition to dyndbg in kernel command line to enable debug logs for intel_pstate.
# cpupower frequency-set -g powersave Setting cpu: 0 Setting cpu: 1 Setting cpu: 2 Setting cpu: 3 # turbostat --quiet --show CPU,Avg_MHz,Busy%,Bzy_MHz --add msr0x199,u32,raw,PERF_CTL --interval 3 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 322 12.20 2648 0x00000000 0 242 10.03 2423 0x00001400 2 518 18.56 2799 0x00001500 1 342 12.57 2727 0x00001500 3 186 7.65 2444 0x00001500 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 171 8.45 2031 0x00000000 0 159 7.81 2041 0x00001500 2 196 9.51 2062 0x00001500 1 167 8.45 1984 0x00001600 3 163 8.04 2032 0x00001600 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 171 8.09 2127 0x00000000 0 184 8.44 2182 0x00001400 2 171 7.71 2226 0x00001500 1 206 9.62 2151 0x00001400 3 125 6.59 1905 0x00001500 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 162 8.19 1985 0x00000000 0 155 7.77 1998 0x00001500 2 183 9.49 1939 0x00001600 1 151 7.64 1985 0x00001600 3 159 7.86 2027 0x00001500 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 160 7.97 2013 0x00000000 0 150 7.53 2002 0x00001500 2 161 8.23 1961 0x00001500 1 165 7.90 2094 0x00001500 3 163 8.21 1997 0x00001500 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 169 8.45 2002 0x00000000 0 148 7.90 1880 0x00001400 2 188 9.23 2048 0x00001400 1 163 8.26 1976 0x00001400 3 176 8.43 2091 0x00001400 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 168 8.34 2024 0x00000000 0 130 6.48 2011 0x00001600 2 173 8.66 2005 0x00001400 1 166 8.20 2027 0x00001500 3 205 10.04 2047 0x00001500 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 175 8.49 2068 0x00000000 0 187 9.28 2019 0x00001500 2 205 9.36 2203 0x00001400 1 167 8.27 2025 0x00001500 3 141 7.07 2004 0x00001500 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 353 13.47 2631 0x00000000 0 237 10.17 2334 0x00001600 2 246 10.63 2321 0x00001500 1 179 7.92 2264 0x00001600 3 751 25.13 2998 0x00001500 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 187 8.92 2109 0x00000000 0 205 9.87 2083 0x00001500 2 183 8.68 2116 0x00001400 1 162 7.49 2168 0x00001500 3 200 9.64 2083 0x00001500 CPU Avg_MHz Busy% Bzy_MHz PERF_CTL - 171 8.31 2060 0x00000000 0 190 8.80 2161 0x00001600 2 167 8.07 2071 0x00001500 1 146 7.17 2043 0x00001500 3 181 9.21 1967 0x00001600 ^C I don't have a buildroot handy for Fedora 27 to build the kernel - so I can't quite promise anything here at the moment...
Well, it appears to be working. You still can try to run intel_pstate_tracer.py with scaling_governor set to "powersave" (like in the previous comment).
As Rafael noted, the scaling seems to work. I am thinking that you have some CPU policy is performance and some powersave. cpupower will keep all same. So your work computer may have some scripts.
I believe the thing is that the 4.14.0 kernel works as it should - but the 4.13.x kernel does not. I certainly have no heat / fan issues under 4.14.0.
But the log you added to comment 36 shows that 4.14 kernel is requesting performance mode always. Your log in comment 38 shows different dump (which is correct). So it is confusing.
Anyway, since the problem is not reproducible with 4.14, I'm regarding it as fixed in the mainline. 4.13.y will be discontinued at one point, so you will need to switch over to 4.14.y anyway.