Bug 17001 - ondemand governor non-functional / ACPI P-states driver
Status: CLOSED DUPLICATE of bug 19702
Alias: None
Product: Power Management
Classification: Unclassified
Component: cpufreq
Hardware: All Linux
Importance: P1 normal
Assignee: Thomas Renninger
Reported: 2010-08-25 11:52 UTC by Peter Ganzhorn
Modified: 2010-10-15 10:11 UTC
Kernel Version: 2.6.35.3
Regression: No


Attachments
Output of 'cat /proc/cpuinfo' (1.47 KB, text/plain)
2010-08-25 20:57 UTC, Peter Ganzhorn
Requested information from sysfs (1.05 KB, text/plain)
2010-08-26 14:40 UTC, Peter Ganzhorn
cat /sys/devices/system/cpu/cpu0/cpufreq/ondemand/* (401 bytes, text/plain)
2010-08-27 12:21 UTC, Peter Ganzhorn
dmesg with cpufreq.debug=7 (44.29 KB, text/plain)
2010-08-27 12:24 UTC, Peter Ganzhorn
dmesg (load test with lowered up_threshold) (48.71 KB, text/plain)
2010-08-27 12:27 UTC, Peter Ganzhorn

Description Peter Ganzhorn 2010-08-25 11:52:08 UTC
Hardware:
Toshiba Tecra A9 notebook
Intel Core2 Duo T7300 2.0 GHz processor

Software:
Debian testing "Squeeze"
Kernel 2.6.35.3

Bug description:
With the cpufreq-ondemand governor enabled, the CPU frequency does not increase at all with any workload.
Even with both cores running at 100% load the CPU frequency will never increase above the minimum frequency of 800MHz.
Using the performance governor makes the CPU switch to 2GHz immediately.

Running on battery or AC power does not have any effect on this issue.

I am using the exact same software configuration (same distro; the kernel .config is identical except for different NIC drivers) on a workstation with an Intel Core2 Quad Q9550. The workstation is not affected by the issue: when the load changes, the CPU frequency changes almost instantaneously to a frequency that fits the current load.

Here's some sysfs info about the cpufreq configuration (looks fine to me):

centauri:/sys/devices/system/cpu/cpu0/cpufreq# cat ./cpuinfo_max_freq 
2001000
centauri:/sys/devices/system/cpu/cpu0/cpufreq# cat ./cpuinfo_min_freq 
800000
centauri:/sys/devices/system/cpu/cpu0/cpufreq# cat ./scaling_max_freq 
2001000
centauri:/sys/devices/system/cpu/cpu0/cpufreq# cat ./scaling_min_freq
800000
centauri:/sys/devices/system/cpu/cpu0/cpufreq# cat ./scaling_setspeed 
<unsupported>
centauri:/sys/devices/system/cpu/cpu0/cpufreq# cat ./scaling_governor 
performance
centauri:/sys/devices/system/cpu/cpu0/cpufreq# cat ./scaling_driver 
acpi-cpufreq
centauri:/sys/devices/system/cpu/cpu0/cpufreq# cat ./scaling_available_frequencies 
2001000 2000000 1600000 1200000 800000

Please tell me what other information I need to submit to you.

As far as I can tell, the problem exists at least since kernel version 2.6.30; I did not test any earlier versions.
The Debian distribution stock kernel is affected by the issue as well.

Thanks for looking into this issue.

best regards,
Peter Ganzhorn
Comment 1 Peter Ganzhorn 2010-08-25 20:57:26 UTC
Created attachment 27961 [details]
Output of 'cat /proc/cpuinfo'
Comment 2 Thomas Renninger 2010-08-26 11:14:39 UTC
> Here's some sysfs info
But this was captured with the working performance governor; it would be better to show the sysfs files in the broken case.

Did you run the performance vs. ondemand governor check at different times, or did you simply echo performance >scaling_governor and things started to work?
Background: the frequency may be limited on purpose for some reason. If you rebooted in between, for example, the temperature (or whatever triggers the limit) may have been different, and this would then be a general frequency limiting issue rather than a governor issue.

First check whether you are running the latest BIOS and update it if not.
If the problem persists, go through the BIOS options and look for SpeedStep, CPU frequency, P-state or power management related settings. You may find a knob that fixes the issue.

If you still see this, please use a recent kernel (2.6.34 or above) and, while the issue is occurring, provide the output of:
for x in /sys/devices/system/cpu/cpu0/cpufreq/*;do echo $x;cat $x;done
Also check whether it really is an ondemand issue by switching governors manually several times:
echo performance >scaling_governor
echo ondemand >scaling_governor
and make sure this really only happens with ondemand.
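The one-liner above prints errors for the stats and ondemand subdirectories, because cat cannot read a directory. A minimal sketch of a recursive variant follows; dump_cpufreq is a hypothetical helper name, and the default path is simply the cpu0 directory used throughout this report.

```shell
#!/bin/sh
# dump_cpufreq is a hypothetical helper, not part of any existing tool.
# It prints every regular file below a cpufreq sysfs directory:
# one line with the path, followed by the file's contents.
dump_cpufreq() (
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    # find descends into subdirectories such as stats/ and ondemand/
    find "$dir" -type f | sort | while read -r f; do
        echo "$f"
        # Some files may be root-only or write-only; mark them instead of aborting
        cat "$f" 2>/dev/null || echo "<unreadable>"
    done
)
```

Running `dump_cpufreq > cpufreq-ondemand.txt` while the ondemand governor is active and the CPU is under load would capture the broken case in a single file.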
Comment 3 Peter Ganzhorn 2010-08-26 14:38:45 UTC
Sorry, using the performance governor was my bad - I almost always switch to performance manually because working on that computer at 800MHz is just painful.

I rebooted the machine to get some clean results and ran some CPU intensive calculations with MATLAB; the load was at 100% for quite some time and the frequency did not change at all.
Temperature can't be the problem, since the CPU temperature was reported as 35°C-40°C (I checked). With the performance governor, the temperature goes above 70°C before the CPU slows down because of overheating.

I'll append the information you requested now.
Please also note the following:

# cat /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
2001000 0
2000000 0
1600000 0
1200000 0
800000 84859

Was done with ondemand governor enabled this time ;)
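To make the time_in_state numbers easier to compare between runs, they can be turned into percentages. This is only a sketch: residency is a hypothetical helper name, and it assumes the two-column "frequency count" layout shown above.

```shell
#!/bin/sh
# residency is a hypothetical helper: it converts the per-frequency counters
# from time_in_state into a percentage of the total recorded time.
residency() (
    file=${1:-/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state}
    awk '
        { freq[NR] = $1; ticks[NR] = $2; total += $2 }
        END {
            for (i = 1; i <= NR; i++) {
                # Guard against an all-zero table (e.g. right after a stats reset)
                pct = (total > 0) ? 100 * ticks[i] / total : 0
                printf "%s kHz %.1f%%\n", freq[i], pct
            }
        }
    ' "$file"
)
```

For the table above, this would report 100.0% for 800000 kHz and 0.0% for every other frequency.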
Comment 4 Peter Ganzhorn 2010-08-26 14:40:34 UTC
Created attachment 28021 [details]
Requested information from sysfs
Comment 5 Thomas Renninger 2010-08-26 15:13:54 UTC
Can you also post the ondemand settings, that would be:
for x in /sys/devices/system/cpu/cpu0/cpufreq/ondemand/*;do echo $x;cat $x;done

Does it help if you increase the sampling_rate value, e.g. to 100000 or lower the up_threshold or set ignore_nice_load to 0?

You could run this twice:
cat /dev/zero >/dev/null &
to fully utilize both CPUs; this should ramp the frequency up.

Check whether CONFIG_CPU_FREQ_DEBUG is set in your config:
zcat /proc/config.gz |grep CONFIG_CPU_FREQ_DEBUG
If it is, you could boot with cpufreq.debug=7, then quickly (so that there is not too much output and the boot messages are still in dmesg) repeat the steps above, fully utilize the CPUs, and attach the full dmesg output. Hopefully it contains a hint about what goes wrong...
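/proc/config.gz only exists when the kernel was built with CONFIG_IKCONFIG_PROC. The sketch below falls back to /boot/config-$(uname -r), which is where Debian-style installs usually keep the config; kconfig_lookup is a hypothetical helper name and the fallback path is an assumption about the installation.

```shell
#!/bin/sh
# kconfig_lookup is a hypothetical helper: it prints the line that sets
# (or explicitly unsets) one kernel config option.
kconfig_lookup() (
    option=$1
    file=${2:-}
    if [ -z "$file" ]; then
        if [ -r /proc/config.gz ]; then
            file=/proc/config.gz
        else
            # Assumed fallback location; adjust for self-installed kernels
            file=/boot/config-$(uname -r)
        fi
    fi
    case $file in
        *.gz) zcat "$file" ;;
        *)    cat "$file" ;;
    esac | grep -E "^(# )?${option}[= ]"
)
```

`kconfig_lookup CONFIG_CPU_FREQ_DEBUG` prints either `CONFIG_CPU_FREQ_DEBUG=y` or `# CONFIG_CPU_FREQ_DEBUG is not set`; the trailing `[= ]` keeps it from also matching longer option names with the same prefix.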
Comment 6 Peter Ganzhorn 2010-08-27 12:21:12 UTC
Created attachment 28111 [details]
cat /sys/devices/system/cpu/cpu0/cpufreq/ondemand/*

Increasing the sampling_rate does not affect the issue at all.

Lowering up_threshold to 53 or below makes the ondemand governor work: even short load spikes (like starting Firefox) make the CPU frequency jump up to 2GHz and then back down to 800MHz.
Any value greater than 53 changes nothing; even 54 has no effect at all.
The behavior is the same with the increased and the default sampling_rate.

ignore_nice_load was already set to 0 by default.
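The sharp edge between 53 and 54 can be illustrated with a simplified model of the governor's decision. The assumptions here are loud ones: the model takes the load to be an integer percentage of non-idle time in one sampling window and ramps up only when that load is strictly greater than up_threshold. The real dbs_check_cpu() also weights by frequency, and the wall and idle numbers in the usage note are invented for illustration.

```shell
#!/bin/sh
# Simplified, hypothetical model of the ondemand ramp-up decision.
# It is NOT the kernel code; names and numbers are for illustration only.

# Integer load percentage for one sampling window (times in any common unit)
ondemand_load() (
    wall=$1
    idle=$2
    echo $(( 100 * (wall - idle) / wall ))
)

# Ramp up only if the load is strictly above the threshold
would_ramp_up() (
    load=$1
    threshold=$2
    if [ "$load" -gt "$threshold" ]; then echo yes; else echo no; fi
)
```

Under this model, `ondemand_load 100000 46000` gives 54, `would_ramp_up 54 53` prints yes and `would_ramp_up 54 54` prints no. That matches the observed behavior if the governor computes a load of about 54% while the system reports 100%, which would point at the idle time accounting rather than at the threshold logic.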
Comment 7 Peter Ganzhorn 2010-08-27 12:24:17 UTC
Created attachment 28121 [details]
dmesg with cpufreq.debug=7

cat /dev/zero >/dev/null &
cat /dev/zero >/dev/null &
sleep 10
dmesg > /data/dmesg0.txt
killall cat

dmesg output after reboot and running above commands. Frequency did not increase while CPU was at reported 100% load.
Comment 8 Peter Ganzhorn 2010-08-27 12:27:49 UTC
Created attachment 28131 [details]
dmesg (load test with lowered up_threshold)

I ran the same commands as before with up_threshold=53.

CPU frequency did immediately change to 2GHz when under load.

Is this still a cpufreq bug? I can't shake the feeling that the load calculation is off in some odd way... Maybe you'll find something in the log that hints at the problem.
Thanks for your quick responses by the way!
Comment 9 Gerhard Killesreiter 2010-09-06 01:47:25 UTC
I have an HP Compaq 6910p laptop with the same type of processor as the original submitter.

I've built myself a 2.6.35 kernel from Debian sources. Every time I boot using this kernel, the processor speed is stuck at 800 MHz (the lowest speed available for this processor). I can change the governor (the standard governor is "ondemand"), but that doesn't allow the processor speed to increase.

The reason is probably that the bios_limit at e.g.

/sys/devices/system/cpu/cpu0/cpufreq/bios_limit

is stuck at 800 MHz and cannot be changed.
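A quick way to tell this case apart from the governor problem is to compare bios_limit with cpuinfo_max_freq. This is a sketch only; check_bios_limit is a hypothetical helper name, and bios_limit is not present on every kernel or driver, so its absence is handled explicitly.

```shell
#!/bin/sh
# check_bios_limit is a hypothetical helper: it reports whether the BIOS
# limit exposed in sysfs is below the hardware maximum frequency.
check_bios_limit() (
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    if [ ! -r "$dir/bios_limit" ]; then
        echo "no bios_limit file"
    else
        limit=$(cat "$dir/bios_limit")
        max=$(cat "$dir/cpuinfo_max_freq")
        if [ "$limit" -lt "$max" ]; then
            echo "limited by BIOS: $limit kHz of $max kHz"
        else
            echo "not limited: $limit kHz"
        fi
    fi
)
```

If this reports a limit, switching governors cannot help, because the cap is applied below the governor.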

Earlier Debian kernels (2.6.32) did not exhibit this problem.
Comment 10 Gerhard Killesreiter 2010-09-06 02:30:52 UTC
Shortly after I wrote the above, the current kernel (Linux Kaktus 2.6.32-bpo.3-686 #1 SMP Wed Mar 17 14:31:18 UTC 2010 i686 GNU/Linux) decided to prove me wrong and got stuck at 800 MHz while running.
Comment 11 Thomas Renninger 2010-09-06 09:20:23 UTC
Gerhard: You probably have a different problem. Peter's problem seems to point to a real kernel bug, whereas you need to find out why the BIOS is limiting the frequency (a limit the kernel is probably honoring correctly).
You can find hints on how to track that down here:
https://bugzilla.kernel.org/show_bug.cgi?id=16362
It would be best to open a separate bug, though, since the reason the BIOS limits the frequency is probably a different one. Update your BIOS first and double check whether the problem persists. Mention the HP Compaq 6910p again (in the title?) so that others can find the issue quickly.

Peter: Could you try another kernel version? Either the latest 2.6.36-rcX or, easier and even better, something like 2.6.27-32. I can help you try that out easily if you use openSUSE; otherwise you will have to work it out yourself, but it shouldn't be that hard to install and try an older kernel alongside the current one. Also check the "bias" ondemand governor setting (same dir(s) as up_threshold); it must be zero. If it's not, that is probably the cause.
Unfortunately the ondemand governor is not very verbose about its load statistics, even with cpufreq.debug on.
Comment 12 Peter Ganzhorn 2010-09-06 09:37:26 UTC
Thomas, do you need me to enable cpufreq.debug=7 and upload the results or do I just need to check if the ondemand governor works "out of the box" with the other kernels?
I'm going to download 2.6.27.32 and the latest 2.6.36-rc now (Vanilla from kernel.org, the 2.6.35.4 I'm running is plain Vanilla as well!) and use my current .config .
If you want me to check if specific kernel options are enabled, please let me know.
My usual CPUfreq .config section looks like this:

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
CONFIG_CPU_FREQ_DEBUG=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y

#
# CPUFreq processor drivers
#
# CONFIG_X86_PCC_CPUFREQ is not set
CONFIG_X86_ACPI_CPUFREQ=y
# CONFIG_X86_POWERNOW_K8 is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
# CONFIG_X86_P4_CLOCKMOD is not set

#
# shared options
#
# CONFIG_X86_SPEEDSTEP_LIB is not set
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y
CONFIG_INTEL_IDLE=y
Comment 13 Gerhard Killesreiter 2010-09-06 10:13:08 UTC
Thomas: I've been running this laptop on Linux for close to two years and never had this problem before. I don't think it is really related to the BIOS.
Comment 14 Thomas Renninger 2010-09-06 10:22:10 UTC
> Thomas, do you need me to enable cpufreq.debug=7 and upload the results
I fear there won't be much usable info for your problem with this debug setting.

> My usual CPUfreq .config section looks like this:
Interesting, you compile everything in. I have no idea yet why that should make a difference, but it does differ from most distro .config setups.

> I'm going to download 2.6.27.32 and the latest 2.6.36-rc now
Great. Do you compile your kernels yourself, including the affected one?
If so, it's not much work to add some printks to the ondemand governor. The relevant code is in drivers/cpufreq/cpufreq_ondemand.c, in
dbs_check_cpu(..):
Print all the idle/load/wall_time values there; printing too much is better than missing something.
You'll get a lot of output while ondemand is running, but you can limit it by switching governors, e.g. this should limit the output to one second:
---
#!/bin/bash

# Load both cores, then let ondemand run for about one second between two log markers.
cat /dev/zero >/dev/null &
cat /dev/zero >/dev/null &
for x in /sys/devices/system/cpu/cpu[0-9]*/cpufreq;do echo performance >$x/scaling_governor;done
logger "XXXX: Here starts my ondemand test in /var/log/messages"
for x in /sys/devices/system/cpu/cpu[0-9]*/cpufreq;do echo ondemand >$x/scaling_governor;done
sleep 1;
for x in /sys/devices/system/cpu/cpu[0-9]*/cpufreq;do echo performance >$x/scaling_governor;done
logger "YYYY: Here ends my ondemand test in /var/log/messages"
killall cat
---
(could have typos, did not run above)
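Once the test has run, only the lines between the two logger markers are of interest. A small sketch for cutting that window out of the log follows; extract_test_window is a hypothetical helper name, and /var/log/messages is assumed to be where syslog writes on the system (on some setups kernel messages go to a different file).

```shell
#!/bin/sh
# extract_test_window is a hypothetical helper: it prints the log lines from
# the XXXX start marker through the YYYY end marker, both inclusive.
extract_test_window() (
    log=${1:-/var/log/messages}
    sed -n '/XXXX: Here starts my ondemand test/,/YYYY: Here ends my ondemand test/p' "$log"
)
```

`extract_test_window > ondemand-window.txt` would then give a short file that is easy to attach to the bug.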

The problem does not sound HW/BIOS related. It looks more like a kernel bug that slipped in at a specific version, possibly in combination with some of your .config specifics; otherwise there should be more reports.

> I just need to check if the ondemand governor works "out of the box" with the
> other kernels?
Yep, if we have working and non-working versions to compare, that would make things a lot easier.

Another idea: you could adopt some of our CONFIG_CPU_FREQ* settings, which are more modular; possibly this gives us a hint why your system has issues:
#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_STAT=m
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m

CONFIG_X86_ACPI_CPUFREQ=m
==============

CONFIG_CPU_FREQ_STAT=m
and not loading the module sounds like a good idea...
Comment 15 Thomas Renninger 2010-09-06 10:29:51 UTC
About comment #13:
A *correct* patch was added recently that reads out the BIOS frequency limitations at cpufreq init time (before that, we only took the limitations into account when reacting to ACPI processor events). That is probably why you see this. Please at least comment on the other bug, as this one is totally unrelated. The processor.ignore_ppc=1 boot param will be your workaround. If you would like to hunt down the limitation initiated by your BIOS (there might be a reason for it: it may only happen on battery, on AC, at a specific temperature, with a specific AC adapter, ...), please look up:
https://bugzilla.kernel.org/show_bug.cgi?id=16362
Comment 16 Peter Ganzhorn 2010-09-06 10:46:19 UTC
Yes, I do compile kernels myself. The distro kernels usually are patched and contain back-ports of all kinds, so filing a bug report on bugzilla.kernel.org against a distro kernel is somewhat pointless, I figure...
On top of that I only include drivers and features I need and usually compile the stuff in - the Linux kernel was designed to be a monolithic kernel, not a microkernel. ;)

If you want to add some more debugging printks to the ondemand governor sources, just send me a patch or tell me where I should put them. Although I guess this will be tricky for me to do right, since you know what you'd like to see and I'm not a kernel hacker... (I know how to read C and can write small C programs, but I'm not a professional programmer or something like that)

And here's a bit of a problem: 2.6.27.* still has EXT4 marked as experimental. Will this work with current EXT4 file systems, or am I going to run into serious trouble when running this kernel? (My root partition is EXT4, and obviously I didn't pay attention when I installed the system... it seems Debian Squeeze already uses EXT4 as the default FS?!)
Second problem: 2.6.36-rc3 won't compile; I'm getting an implicit declaration error in the i915 DRM driver...

Do you think trying 2.6.28 instead of .27 could help, since EXT4 was marked stable in .28 if I recall correctly?
Or have there been major changes in the CPUFREQ code after 2.6.27?

I'll tweak my .config to include CPUFREQ as modules as you suggested for now.
Comment 17 Thomas Renninger 2010-10-15 10:11:52 UTC
> As far as I can tell, the problem exists at least since kernel version 2.6.30
Strange, that would mean this is somehow HW specific; I doubt that every machine, or even many machines, show this behavior... Is it possible that only 2.6.35 is affected and something went wrong or got mixed up when testing 2.6.30?

*** This bug has been marked as a duplicate of bug 19702 ***
