Bug 9708 - powersave governor does not scale down when power gets connected
Summary: powersave governor does not scale down when power gets connected
Status: CLOSED CODE_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: cpufreq
Hardware: All Linux
Importance: P1 normal
Assignee: Dave Jones
URL:
Keywords:
Duplicates: 9151
Depends on:
Blocks:
 
Reported: 2008-01-07 13:11 UTC by Helge Deller
Modified: 2008-06-04 23:41 UTC

See Also:
Kernel Version: 2.6.24-rc6 (at least) to 2.6.25-rc6
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Full syslog from bootup to shutdown. Problem happened shortly before timestamp "Jan 9 21:14:59" (23.67 KB, application/x-gzip)
2008-01-09 12:40 UTC, Helge Deller
Might fix your problem - not the final patch (625 bytes, patch)
2008-02-01 05:35 UTC, Thomas Renninger
full system boot up with debugging enabled. (34.21 KB, text/plain)
2008-03-22 07:26 UTC, Helge Deller
full log, directly after having power cable attached (36.01 KB, text/plain)
2008-03-22 07:30 UTC, Helge Deller
syslog, describing what happened when inserting power cable. (1.88 KB, text/x-diff)
2008-03-22 07:32 UTC, Helge Deller
Hack-ish patch to 2.6.25-rc6 which fixes the problem (437 bytes, patch)
2008-03-22 08:17 UTC, Helge Deller
Make acpi-cpufreq more robust against BIOS frequency changes (1.16 KB, patch)
2008-03-22 12:08 UTC, Thomas Renninger
Make acpi-cpufreq more robust - take 2 (1.35 KB, patch)
2008-03-26 11:57 UTC, Venkatesh Pallipadi
Make acpi-cpufreq more robust - take 2 (1.35 KB, patch)
2008-03-26 11:57 UTC, Venkatesh Pallipadi

Description Helge Deller 2008-01-07 13:11:38 UTC
Hardware Environment:
HP NC6000 Laptop (Intel(R) Pentium(R) M processor 1.60GHz)

Problem Description:
cpufreq power management works fairly well in general.
Nevertheless, I run into a problem when I plug in the power cable while the system is running.

The machine is idle, yet even minutes after plugging in the power cable I see:

[root@halden cpufreq]# uptime &&  grep "" /sys/devices/system/cpu/cpu0/cpufreq/*

 21:57:23 up  1:04,  3 users,  load average: 0.09, 0.08, 0.09

/sys/devices/system/cpu/cpu0/cpufreq/affected_cpus:0
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq:1600000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq:1600000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq:600000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies:1600000 1400000 1200000 1000000 800000 600000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors:conservative ondemand powersave userspace performance
/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq:1600000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:acpi-cpufreq
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq:1600000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq:600000

As you can see, the CPU can scale between 600 MHz and 1600 MHz.
The governor is "powersave", but scaling_cur_freq stays at 1600 MHz. The powersave governor should scale the CPU down to 600 MHz, but it does not.

Interestingly, when I switch the scaling governor to "performance" (or any value other than powersave) and then straight back to powersave, it works as designed and scales the CPU down to 600 MHz. It seems the brief governor change triggers the scale-down.
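
The check and the workaround described above could be scripted roughly as follows. This is only a minimal sketch: the attribute names are the sysfs files shown in the listing, while the helper names (read_attr, powersave_stuck, bounce_governor) are made up for illustration, and writing to scaling_governor on a real system requires root.

```python
from pathlib import Path

# Directory from the listing above; pass a different one for other CPUs.
CPU0_CPUFREQ = "/sys/devices/system/cpu/cpu0/cpufreq"


def read_attr(cpufreq_dir, name):
    """Return the stripped contents of one cpufreq sysfs attribute."""
    return (Path(cpufreq_dir) / name).read_text().strip()


def powersave_stuck(cpufreq_dir=CPU0_CPUFREQ):
    """True if the governor is powersave but the frequency is above the minimum."""
    governor = read_attr(cpufreq_dir, "scaling_governor")
    cur_khz = int(read_attr(cpufreq_dir, "scaling_cur_freq"))
    min_khz = int(read_attr(cpufreq_dir, "scaling_min_freq"))
    return governor == "powersave" and cur_khz > min_khz


def bounce_governor(cpufreq_dir=CPU0_CPUFREQ, other="performance"):
    """Workaround from the report: switch away from powersave and straight back."""
    governor_file = Path(cpufreq_dir) / "scaling_governor"
    governor_file.write_text(other)
    governor_file.write_text("powersave")
```

On an affected machine one would expect powersave_stuck() to return True after plugging in the cable, and False again after bounce_governor().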
Comment 1 Thomas Renninger 2008-01-08 03:46:56 UTC
Do you use the latest BIOS for this machine?
Do you have a related option in BIOS configuration set, like:
"Full performance on AC"
or similar?

Do you see ACPI PACKAGE_LIMIT errors in dmesg? (-> then this is probably related to: http://bugzilla.kernel.org/show_bug.cgi?id=9558)
Comment 2 Helge Deller 2008-01-08 10:51:54 UTC
Yes, I'm using the latest BIOS.

The BIOS offers no specific performance/ACPI settings. The only option I found was something like "Use Intel SpeedStep Technology" with the values Yes, No and Automatic. I chose Automatic.

I don't see any ACPI error messages in dmesg. The problem in http://bugzilla.kernel.org/show_bug.cgi?id=9558 does not seem related to my bug.

Any other ideas or any other information needed?
Comment 3 Thomas Renninger 2008-01-09 01:41:29 UTC
I expect you are seeing the same bug:
https://bugzilla.novell.com/show_bug.cgi?id=334378

Please compile the kernel with CONFIG_CPU_FREQ_DEBUG=y and boot with the cpufreq.debug=7 parameter. Also enable CONFIG_ACPI_DEBUG=y and add the acpi.debug_level=0x1f boot parameter (maybe an ACPI event happens at a certain time that leads to this condition?).
It may also help if you switch the governor to performance and back to powersave, and mark the positions in the dmesg log where you did so.
Comment 4 Helge Deller 2008-01-09 12:40:39 UTC
Created attachment 14385
Full syslog from bootup to shutdown. Problem happened shortly before timestamp "Jan  9 21:14:59"

Good news! Attached is the full syslog from system startup to shutdown with ACPI debug enabled.
The "MARKER:" lines show what I was about to do or what I had just observed at that point. The real problem happens between timestamps 21:14:35 and 21:14:59.
Hope that helps to find the bug.

PS: It seems cpufreq debugging output is not visible. Maybe the boot option you mentioned was wrong? (I'll check if necessary.)
Comment 5 Helge Deller 2008-01-09 12:47:49 UTC
Hi Thomas,
yep, this bug report here is exactly the same symptom/problem as in Novell's bugzilla https://bugzilla.novell.com/show_bug.cgi?id=334378.

Differences are:
- different Laptops (HP & Dell)
- different Linux kernel versions (incl. mainline 2.6.24 which I run)
- different distributions (Fedora & SUSE)
-> IMHO clearly a Linux kernel problem in all kernels (2.6.22 -> 2.6.24).
Comment 6 Thomas Renninger 2008-01-22 16:43:58 UTC
*** Bug 9151 has been marked as a duplicate of this bug. ***
Comment 7 Thomas Renninger 2008-02-01 05:32:14 UTC
> PS: It seems cpufreq debugging is not visible. maybe the boot option you
> mentioned was wrong? (I'll check if necessary).
The parameter should be right. Are you sure you compiled CPU_FREQ_DEBUG into the kernel? You can double check with:
zcat /proc/config.gz|less
Maybe it is wrong, you should find help with google...
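
The same check can be done programmatically. The following is a rough sketch rather than an official tool: it assumes the kernel exposes /proc/config.gz (which requires CONFIG_IKCONFIG_PROC), and the function names are invented for this example.

```python
import gzip


def option_enabled(config_text, option):
    """True if the option is built in (=y) according to kernel config text."""
    wanted = option if option.startswith("CONFIG_") else "CONFIG_" + option
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            # Disabled options appear as "# CONFIG_FOO is not set".
            continue
        key, sep, value = line.partition("=")
        if sep and key == wanted:
            return value == "y"
    return False


def running_kernel_has(option, path="/proc/config.gz"):
    """Check the running kernel's config; the path is an assumption, see above."""
    with gzip.open(path, "rt") as fh:
        return option_enabled(fh.read(), option)
```

For example, running_kernel_has("CPU_FREQ_DEBUG") should return True on a kernel built with the debug option.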

Can you also add a debug patch (which hopefully will be mainline soon..., Dave?), I posted it to the cpufreq list a while ago:
http://article.gmane.org/gmane.linux.kernel.cpufreq/5642

Also add another patch (I will attach it)
Comment 8 Thomas Renninger 2008-02-01 05:35:44 UTC
Created attachment 14673
Might fix your problem - not the final patch

If this one fixes your problem and you see an "out-of-sync" message with cpufreq debug enabled (exactly at the time you run into the problem), then this is nearly the fix.
The governor limits check should be moved up into the else branch, shortly after the cpufreq_out_of_sync call.
Comment 9 Thomas Renninger 2008-02-13 06:18:55 UTC
The patch from comment above is wrong, you need:
&policy
instead of:
policy
Comment 10 Helge Deller 2008-02-15 13:42:39 UTC
Thomas, I've just tried your patch from comment #8 (with the "&" fix) on 2.6.24-final, but the problem still exists.
Comment 11 Thomas Renninger 2008-02-16 15:45:01 UTC
A boot parameter cpufreq.ignore_ppc (or similar) will appear in 2.6.25.
I can double-check in a few days when exactly it goes in. This should help as a workaround.

If you have some time, it would be great if you could boot a kernel with CONFIG_CPU_FREQ_DEBUG=y compiled in and the cpufreq.debug=7 boot parameter (all of this from memory...).
Comment 12 Helge Deller 2008-03-21 02:40:27 UTC
Bug still exists in 2.6.25-rc6.
I haven't checked the cpufreq.ignore_ppc parameter or tried the debug options yet.
Comment 13 Helge Deller 2008-03-22 07:26:30 UTC
Created attachment 15392
full system boot up with debugging enabled.

This is the full syslog, debug messages are enabled.
Laptop was booted, power cable not attached, CPUfreq scales correctly down to 600 MHz.
Comment 14 Helge Deller 2008-03-22 07:30:16 UTC
Created attachment 15393
full log, directly after having power cable attached

Same log as the previous one, but it now includes the syslog messages that were created after I plugged in the power cable.
Running a diff between this log and the previous one shows what happened.
The log shows the CPU frequency going up to 1600 MHz, although with the powersave governor it should stay at 600 MHz.
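
The comparison of the two logs can also be reproduced without the diff tool. Below is a small sketch using Python's difflib; the function name new_log_lines is made up here, and the logs are assumed to be available as plain strings.

```python
import difflib


def new_log_lines(before_log, after_log):
    """Return the lines that appear in after_log but not in before_log, in order."""
    before = before_log.splitlines()
    after = after_log.splitlines()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both contribute lines only the newer log has.
        if tag in ("insert", "replace"):
            added.extend(after[j1:j2])
    return added
```

Since the second log is the first one plus the messages logged after the cable was attached, the result should be exactly those trailing messages.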
Comment 15 Helge Deller 2008-03-22 07:32:17 UTC
Created attachment 15394
syslog, describing what happened when inserting power cable.

This is just the diff-file between the previous two syslogs.
Comment 16 Helge Deller 2008-03-22 08:17:51 UTC
Created attachment 15395
Hack-ish patch to 2.6.25-rc6 which fixes the problem

Hi Thomas,

I just tested this patch/hack and it fixes the problem. 

Main problem is apparently visible in the message "acpi-cpufreq: Already at target state (P5)".
Since the acpi-cpufreq driver seems to think that it's already running at the given target state (600MHz), it won't change the cpufreq level again (although it should, since in reality it's running with 1600 MHz).

The real fix is probably somewhere else, e.g. setting specific variables so that the acpi-cpufreq driver changes the speed.

Could you take a look at it again? Esp. the last "diff-file" which I attached may help a lot.

THX, Helge
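
The failure mode described in this comment can be illustrated with a toy model. This is not the acpi-cpufreq code, only a sketch under the assumption that the driver remembers the last P-state it programmed and returns early when asked for the same state again; the class and method names are invented, and the frequency table is the one from the original report.

```python
# P0..P5 as listed in scaling_available_frequencies in the report.
FREQS_KHZ = [1600000, 1400000, 1200000, 1000000, 800000, 600000]


class StaleStateDriver:
    """Toy driver that trusts its cached P-state instead of the hardware."""

    def __init__(self):
        self.cached_state = 0  # what the driver believes
        self.hw_state = 0      # what the CPU really runs at
        self.log = []

    def target(self, state):
        if state == self.cached_state:
            # Early return: no write to the hardware happens.
            self.log.append(f"Already at target state (P{state})")
            return
        self.hw_state = state
        self.cached_state = state

    def bios_sets(self, state):
        """BIOS changes the frequency; the cached state is left untouched."""
        self.hw_state = state

    def hw_freq(self):
        return FREQS_KHZ[self.hw_state]
```

In this model, target(5) followed by bios_sets(0) and another target(5) leaves the hardware at 1600000 kHz, whereas target(0) followed by target(5) brings it back to 600000 kHz. That would also explain why briefly switching to another governor works around the problem.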
Comment 17 Thomas Renninger 2008-03-22 12:08:16 UTC
Created attachment 15401
Make acpi-cpufreq more robust against BIOS frequency changes

It does seem to be true that the BIOS changes the frequency behind the back of the OS.
The problem here seems to be that out_of_sync adjusts the current frequency in the cpufreq core, but acpi-cpufreq keeps its own low-level variable recording which frequency the system is currently at.
This could IMO be a final solution.

You may want to wait with a test until Venkatesh has reviewed this.
Comment 18 Thomas Renninger 2008-03-22 12:13:12 UTC
Compile tested only.
Venkatesh, pls review.
Dave, can you add this one if Venkatesh gives his ok and Helge confirmed it working, pls.
Comment 19 Helge Deller 2008-03-22 13:35:01 UTC
Thanks Thomas !
The patch works. 
CPU frequency stays at 600 MHz, even if I plug in/out the power cable.
Everything else, including the other frequency governors, works as well.
So, this is:

Tested-by: Helge Deller <deller@gmx.de>


Nevertheless, I'd have expected a "Frequency changed unexpectedly (by BIOS?)" message each time I plug in/out the power cable. This doesn't happen. Instead I see this message only once during the boot phase. I assume this is correct, so it would be great, if this patch is pushed upstream soon.
Comment 20 Venkatesh Pallipadi 2008-03-26 11:13:59 UTC
I guess I am missing something in the patch from comment #17.

I don't see the patch changing anything in the return value of get_cur_freq_on_cpu(). It returned freq before and still does, with no change to the value of freq. And assumed_freq is not really used anywhere....
Comment 21 Venkatesh Pallipadi 2008-03-26 11:25:28 UTC
Hmm.. I got what I was missing. But changing something like
data->freq_table[data->acpi_data->state].frequency = actual freq
may not be the right thing to do. I think we should instead set data->acpi_data->state = actual_state
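
A rough sketch of the alternative, assuming a toy model rather than the real driver: instead of overwriting the frequency table entry, the driver notices that the hardware frequency no longer matches its cached state and makes sure the next target call really programs the hardware. The force flag below is only a stand-in for a mechanism like the data->resume flag mentioned later in this thread; all names are hypothetical.

```python
# P0..P5 as listed in the original report; the table itself is never modified.
FREQS_KHZ = [1600000, 1400000, 1200000, 1000000, 800000, 600000]


class ResyncingDriver:
    """Toy driver that detects a stale cached state and forces a re-program."""

    def __init__(self):
        self.state = 0      # P-state the driver believes is active
        self.hw_state = 0   # P-state the hardware really runs at
        self.force = False  # hypothetical stand-in for a resume-style flag

    def get_cur_freq(self):
        actual = FREQS_KHZ[self.hw_state]
        if actual != FREQS_KHZ[self.state]:
            # Frequency changed behind our back: do not trust the cache.
            self.force = True
        return actual

    def target(self, state):
        """Return True if the hardware was actually programmed."""
        if state == self.state and not self.force:
            return False
        self.hw_state = state
        self.state = state
        self.force = False
        return True
```

The frequency table keeps its meaning (P5 is still 600000 kHz), which is the concern raised above about overwriting the table entry.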
Comment 22 Venkatesh Pallipadi 2008-03-26 11:57:53 UTC
Created attachment 15452
Make acpi-cpufreq more robust - take 2


Made a change to Thomas's patch.

Helge: Can you check whether this patch works?
Thomas: OK with the change?

If the answer is yes to both of those questions, we can push the patch towards Dave/Len.
Comment 23 Venkatesh Pallipadi 2008-03-26 11:57:54 UTC
Created attachment 15453
Make acpi-cpufreq more robust - take 2


Made a change to Thomas's patch.

Helge: Can you check whether this patch works?
Thomas: OK with the change?

If the answer is yes to both of those questions, we can push the patch towards Dave/Len.
Comment 24 Thomas Renninger 2008-03-27 03:03:07 UTC
Looks fine.

I wonder whether someone (at Intel?) has contacts at HP and Dell and can tell them not to do such nasty things.
They:
  - probably also have problems on M$ as Dell (and now HP) was the only one
    doing this

  - do not understand the concept of ACPI and introduce unneeded SMM code, while
    ACPI should avoid exactly that here.

I'll try, but in the laptop area we do not have many contacts with them...
Comment 25 Thomas Renninger 2008-03-27 03:09:02 UTC
> Nevertheless, I'd have expected a "Frequency changed unexpectedly (by BIOS?)"
> message each time I plug in/out the power cable. This doesn't happen. Instead
> I see this message only once during the boot phase.
This is indeed strange, I would also have expected this message to appear on each cable plug-in. Maybe you can run the cpufreq debug kernel for a while and check now and then whether this message pops up unexpectedly.
Maybe you can modify a cpufreq setting in the BIOS configuration?
I hope your BIOS is not that old and you mainly run with the default settings?
Comment 26 Helge Deller 2008-03-27 12:56:29 UTC
Hello Venkatesh and Thomas,

patch in comment #22/#23 works fine as well.

Venkatesh, maybe adding a short debug comment to where you set "data->resume = 1;" makes sense? I added the string below (see line marked with XXXXXXXX).

Here is the relevant output:

acpi-cpufreq: get_cur_freq_on_cpu (0)
acpi-cpufreq: get_cur_val = 100667432
acpi-cpufreq: Frequency changed unexpectedly (by BIOS?) (XXXXXXXXX)
acpi-cpufreq: cur freq = 1600000
cpufreq-core: Warning: CPU frequency out of sync: cpufreq and timing core thinks of 600000, is 1600000 kHz.
cpufreq-core: notification 0 of frequency transition to 1600000 kHz
cpufreq-core: scaling loops_per_jiffy to 5319872 for frequency 1600000 kHz
cpufreq-core: notification 1 of frequency transition to 1600000 kHz
cpufreq-core: setting new policy for CPU 0: 600000 - 1600000 kHz
acpi-cpufreq: acpi_cpufreq_verify
freq-table: request for verification of policy (600000 - 1600000 kHz) for cpu 0
freq-table: verification lead to (600000 - 1600000 kHz) for cpu 0
acpi-cpufreq: acpi_cpufreq_verify
freq-table: request for verification of policy (600000 - 1600000 kHz) for cpu 0
freq-table: verification lead to (600000 - 1600000 kHz) for cpu 0
cpufreq-core: new min and max freqs are 600000 - 1600000 kHz
cpufreq-core: governor: change or update limits
cpufreq-core: __cpufreq_governor for CPU 0, event 3
powersave: setting to 600000 kHz because of event 3
cpufreq-core: target for CPU 0: 600000 kHz, relation 0
acpi-cpufreq: acpi_cpufreq_target 600000 (0)
freq-table: request for target 600000 kHz (relation: 0) for cpu 0
freq-table: target is 5 (600000 kHz, 5)
acpi-cpufreq: Called after resume, resetting to P5
cpufreq-core: notification 0 of frequency transition to 600000 kHz
cpufreq-core: Warning: CPU frequency is 600000, cpufreq assumed 1600000 kHz.
cpufreq-core: notification 1 of frequency transition to 600000 kHz
cpufreq-core: scaling loops_per_jiffy to 1994952 for frequency 600000 kHz

Interestingly I see this message again only once, even if I plug in/out the cable multiple times.

@Thomas (regarding your comment #25): My BIOS configuration does not offer any possibility to change the cpufreq/ACPI behaviour. See also my notes in comment #2.
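
The two loops_per_jiffy values in the trace above are consistent with simple linear scaling by frequency. A worked check, assuming plain integer arithmetic (the function name is made up, and this does not claim to mirror the kernel's exact implementation):

```python
def scale_loops_per_jiffy(old_lpj, old_khz, new_khz):
    """Scale loops_per_jiffy linearly with the CPU frequency (integer math)."""
    return old_lpj * new_khz // old_khz


# Values taken from the trace: 5319872 at 1600000 kHz, then a switch to 600000 kHz.
lpj_at_600mhz = scale_loops_per_jiffy(5319872, 1600000, 600000)
print(lpj_at_600mhz)  # 1994952, matching the last line of the trace
```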
Comment 27 Helge Deller 2008-04-25 14:10:23 UTC
Patch is OK. Could this patch (see comment #22 or #23, which are identical) be included in the upcoming 2.6.26 kernel?

tested-by: Helge Deller <deller@gmx.de>
Comment 28 Dave Jones 2008-04-28 12:17:33 UTC
Added to cpufreq.git, will go to Linus soon.
