Created attachment 107170 [details] Kernel config (3.11-rc4) The kernel used for this bug report is not a vanilla 3.11-rc4, but the one patched with https://bugzilla.kernel.org/attachment.cgi?id=106891&action=diff from bug #58971. It is not possible to know whether this is a regression, because the bug (if it existed before) was effectively masked by bug #48721. To reproduce, compile a simple package (like media-libs/tiff-4.0.3-r2) on Gentoo using portage. Sometimes (but not always), during the whole process, the CPU frequency (both as reported by cpufreq-info and as reported by cat /proc/cpuinfo) doesn't raise as high as I would expect. I.e. it raises only to 1176 MHz, while I expect it to become 3.50 GHz on at least one CPU core, because the compilation process is clearly CPU bound. The driver is intel_pstate and the governor is powersave. It may be relevant that the load created by portage is nice (but there is no ignore_nice_load file in my sysfs tree), and that all of the configuration tests and gcc calls exit quickly. If the setting to ignore the nice load exists with intel_pstate but is not named ignore_nice_load, then please tell me about it - the bug may be invalid after all due to it. # cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 42 model name : Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz stepping : 7 microcode : 0x28 cpu MHz : 1176.000 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid bogomips : 5587.21 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 42 model name : Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz stepping : 7 microcode : 0x28 cpu MHz : 1176.000 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 1 initial apicid : 1 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid bogomips : 5587.21 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 42 model name : Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz stepping : 7 microcode : 0x28 cpu MHz : 1176.000 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid bogomips : 5587.21 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 42 model name : Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz stepping : 7 microcode : 0x28 cpu MHz : 1176.000 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 3 initial apicid : 3 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid bogomips : 5587.21 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management:
Could you provide the output of running "grep . /sys/bus/cpu/devices/cpu*/cpufreq/*" when the issue happen?
Created attachment 107187 [details] The requested command output, repeated many times I slightly deviated from your instructons, and instead attached the output files created by the following command why compiling bison with -j3: i=20 ; while true ; do ( top -n 1 ; grep . /sys/bus/cpu/devices/cpu*/cpufreq/* ) > cpufreq.out.$i ; i=$(( $i + 1 )) ; sleep 2 ; done The problem can be seen clearly e.g. in cpufreq.out.54. The kernel seems to have a very wrong idea about the CPU usage. Even though there are three cc1 processes each eating a lot of CPU time, the kernel thinks it is almost idle. top - 21:14:06 up 25 min, 3 users, load average: 1,48, 0,75, 0,42 Tasks: 162 total, 4 running, 158 sleeping, 0 stopped, 0 zombie %Cpu(s): 0,5 us, 3,6 sy, 4,2 ni, 91,7 id, 0,1 wa, 0,0 hi, 0,0 si, 0,0 st KiB Mem: 8083340 total, 928732 used, 7154608 free, 138692 buffers KiB Swap: 6291440 total, 0 used, 6291440 free, 322384 cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 25928 portage 23 3 59860 32556 7380 R 96,7 0,4 0:02.83 cc1 25962 portage 23 3 42432 13044 4836 R 96,7 0,2 0:00.33 cc1 25968 portage 23 3 37644 7244 3428 R 39,8 0,1 0:00.07 cc1 25965 root 20 0 123356 1656 1104 R 5,7 0,0 0:00.04 top 25969 portage 23 3 21500 6328 1328 S 5,7 0,1 0:00.01 as 1 root 20 0 4240 772 676 S 0,0 0,0 0:00.95 init ... As for the cpugreq sysfs files, /sys/bus/cpu/devices/cpu0/cpufreq/affected_cpus:0 /sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_cur_freq:980000 /sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_max_freq:3500000 /sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_min_freq:800000 /sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_transition_latency:4294967295 /sys/bus/cpu/devices/cpu0/cpufreq/related_cpus:0 /sys/bus/cpu/devices/cpu0/cpufreq/scaling_available_governors:performance powersave /sys/bus/cpu/devices/cpu0/cpufreq/scaling_driver:intel_pstate /sys/bus/cpu/devices/cpu0/cpufreq/scaling_governor:powersave /sys/bus/cpu/devices/cpu0/cpufreq/scaling_max_freq:3500000 /sys/bus/cpu/devices/cpu0/cpufreq/scaling_min_freq:800000 /sys/bus/cpu/devices/cpu0/cpufreq/scaling_setspeed:<unsupported> and exactly the same for cpu1, cpu2 and cpu3. See more details in the attached tarball. You may want to filter the files there through the following sed substitution to get rid of color escapes and extra spaces: sed -r -e 's/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[a-zA-Z]//g' -e 's/ *$//'
The patch I submitted on the mailing list (http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg514089.html) fixes the problem for me. Here is an example of the values which result in 100% usage being truncated down to the setpoint, causing the frequency to effectively be stuck at 980 MHz. aperf: 9977648 mperf: 27937414 max_pstate: 28 current_pstate: 10
Hi Alexander: Could you try Brennan's patch?
I have tried it on my laptop, and it works against this particular problem.
Ok. Thanks for test. Mark this bug as resolved by Code-FIXED. The patch has sent to cpufreq maillist. http://www.spinics.net/lists/cpufreq/msg07960.html
commit d253d2a52676cfa3d89b8f0737a08ce7db665207 Author: Brennan Shacklett <brennan@genyes.org> Date: Mon Oct 21 09:20:32 2013 -0700 intel_pstate: Improve accuracy by not truncating until final result entered upstream.