Bug 60727

Summary: The CPU frequency does not raise when compiling packages on Gentoo
Product: Power Management Reporter: Alexander E. Patrakov (patrakov)
Component: cpufreqAssignee: Dirk Brandewie (dirk.brandewie)
Status: CLOSED CODE_FIX    
Severity: normal CC: aaron.lu, bpshacklett, rui.zhang, tianyu.lan
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.11-rc4 Subsystem:
Regression: No Bisected commit-id:
Attachments: Kernel config (3.11-rc4)
The requested command output, repeated many times

Description Alexander E. Patrakov 2013-08-10 15:02:34 UTC
Created attachment 107170 [details]
Kernel config (3.11-rc4)

The kernel used for this bug report is not a vanilla 3.11-rc4, but the one patched with https://bugzilla.kernel.org/attachment.cgi?id=106891&action=diff from bug #58971. It is not possible to know whether this is a regression, because the bug (if it existed before) was effectively masked by bug #48721.

To reproduce, compile a simple package (like media-libs/tiff-4.0.3-r2) on Gentoo using portage. Sometimes (but not always), during the whole process, the CPU frequency (both as reported by cpufreq-info and as reported by cat /proc/cpuinfo) doesn't raise as high as I would expect. I.e. it raises only to 1176 MHz, while I expect it to become 3.50 GHz on at least one CPU core, because the compilation process is clearly CPU bound. The driver is intel_pstate and the governor is powersave.

It may be relevant that the load created by portage is nice (but there is no ignore_nice_load file in my sysfs tree), and that all of the configuration tests and gcc calls exit quickly. If the setting to ignore the nice load exists with intel_pstate but is not named ignore_nice_load, then please tell me about it - the bug may be invalid after all due to it.

# cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz
stepping	: 7
microcode	: 0x28
cpu MHz		: 1176.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5587.21
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz
stepping	: 7
microcode	: 0x28
cpu MHz		: 1176.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5587.21
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz
stepping	: 7
microcode	: 0x28
cpu MHz		: 1176.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5587.21
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz
stepping	: 7
microcode	: 0x28
cpu MHz		: 1176.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5587.21
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:
Comment 1 Lan Tianyu 2013-08-12 02:56:49 UTC
Could you provide the output of running
"grep . /sys/bus/cpu/devices/cpu*/cpufreq/*" when the issue happen?
Comment 2 Alexander E. Patrakov 2013-08-12 15:40:51 UTC
Created attachment 107187 [details]
The requested command output, repeated many times

I slightly deviated from your instructons, and instead attached the output files created by the following command why compiling bison with -j3:

i=20 ; while true ; do ( top -n 1 ; grep . /sys/bus/cpu/devices/cpu*/cpufreq/* ) > cpufreq.out.$i ; i=$(( $i + 1 )) ; sleep 2 ; done

The problem can be seen clearly e.g. in cpufreq.out.54.

The kernel seems to have a very wrong idea about the CPU usage. Even though there are three cc1 processes each eating a lot of CPU time, the kernel thinks it is almost idle.

top - 21:14:06 up 25 min,  3 users,  load average: 1,48, 0,75, 0,42
Tasks: 162 total,   4 running, 158 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,5 us,  3,6 sy,  4,2 ni, 91,7 id,  0,1 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem:   8083340 total,   928732 used,  7154608 free,   138692 buffers
KiB Swap:  6291440 total,        0 used,  6291440 free,   322384 cached

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
25928 portage   23   3   59860  32556   7380 R  96,7  0,4   0:02.83 cc1
25962 portage   23   3   42432  13044   4836 R  96,7  0,2   0:00.33 cc1
25968 portage   23   3   37644   7244   3428 R  39,8  0,1   0:00.07 cc1
25965 root      20   0  123356   1656   1104 R   5,7  0,0   0:00.04 top
25969 portage   23   3   21500   6328   1328 S   5,7  0,1   0:00.01 as
    1 root      20   0    4240    772    676 S   0,0  0,0   0:00.95 init
...

As for the cpugreq sysfs files,

/sys/bus/cpu/devices/cpu0/cpufreq/affected_cpus:0
/sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_cur_freq:980000
/sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_max_freq:3500000
/sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_min_freq:800000
/sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_transition_latency:4294967295
/sys/bus/cpu/devices/cpu0/cpufreq/related_cpus:0
/sys/bus/cpu/devices/cpu0/cpufreq/scaling_available_governors:performance powersave
/sys/bus/cpu/devices/cpu0/cpufreq/scaling_driver:intel_pstate
/sys/bus/cpu/devices/cpu0/cpufreq/scaling_governor:powersave
/sys/bus/cpu/devices/cpu0/cpufreq/scaling_max_freq:3500000
/sys/bus/cpu/devices/cpu0/cpufreq/scaling_min_freq:800000
/sys/bus/cpu/devices/cpu0/cpufreq/scaling_setspeed:<unsupported>

and exactly the same for cpu1, cpu2 and cpu3. See more details in the attached tarball. You may want to filter the files there through the following sed substitution to get rid of color escapes and extra spaces:

sed -r -e 's/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[a-zA-Z]//g' -e 's/ *$//'
Comment 3 Brennan Shacklett 2013-10-04 14:19:28 UTC
The patch I submitted on the mailing list (http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg514089.html) fixes the problem for me.
Here is an example of the values which result in 100% usage being truncated down to the setpoint, causing the frequency to effectively be stuck at 980 MHz.
aperf: 9977648
mperf: 27937414
max_pstate: 28
current_pstate: 10
Comment 4 Lan Tianyu 2013-10-14 01:32:08 UTC
Hi Alexander:
            Could you try Brennan's patch?
Comment 5 Alexander E. Patrakov 2013-10-14 01:59:53 UTC
I have tried it on my laptop, and it works against this particular problem.
Comment 6 Lan Tianyu 2013-10-14 02:03:52 UTC
Ok. Thanks for test. Mark this bug as resolved by Code-FIXED.
The patch has sent to cpufreq maillist.
http://www.spinics.net/lists/cpufreq/msg07960.html
Comment 7 Aaron Lu 2013-12-09 08:23:09 UTC
commit d253d2a52676cfa3d89b8f0737a08ce7db665207
Author: Brennan Shacklett <brennan@genyes.org>
Date:   Mon Oct 21 09:20:32 2013 -0700

    intel_pstate: Improve accuracy by not truncating until final result

entered upstream.