Bug 10982

Summary: scaling_max_freq locked - regression 2.6.18 worked, 2.6.25 fails - Thinkpad T42
Product: ACPI
Reporter: Nicolas Vinot (nicolas.vinot)
Component: Power-Processor
Assignee: Venkatesh Pallipadi (venki)
Status: REJECTED UNREPRODUCIBLE
Severity: normal
CC: acpi-bugzilla, akpm, auke-jan.h.kok, trenn, yakui.zhao
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.25
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: dmesg output, acpidump output

Description Nicolas Vinot 2008-06-25 14:22:51 UTC
Distribution: Debian testing
Hardware Environment: IBM Thinkpad T42
Software Environment: 2.6.25 kernel
Problem Description:

A few minutes after boot, scaling_max_freq gets locked to the lowest CPU speed.
It is impossible to change it, either by hand or with cpufreq-set.

root:cpufreq# pwd
/sys/devices/system/cpu/cpu0/cpufreq

root:cpufreq# cat scaling_driver
acpi-cpufreq

root:cpufreq# cat scaling_available_frequencies
1800000 1600000 1400000 1200000 1000000 800000 600000

root:cpufreq# cat scaling_available_governors
powersave userspace ondemand conservative performance

root:cpufreq# cat scaling_governor
ondemand

root:cpufreq# cat scaling_max_freq
600000

root:cpufreq# echo 1800000 > scaling_max_freq
root:cpufreq# cat scaling_max_freq
600000

root:cpufreq# cat cpuinfo_max_freq
1800000
root:cpufreq# cat cpuinfo_max_freq > scaling_max_freq
root:cpufreq# cat scaling_max_freq
600000

root:cpufreq# cpufreq-set -u 1800000
root:cpufreq# cat scaling_max_freq
600000

Before the value gets locked, cpufreq works exactly as it should.

This problem often occurs when the processor heats up (high load, compilation...), but it also occurs randomly (a few minutes after boot, even when the machine is idle).
Sometimes, after the CPU cools down or cpufreqd is restarted, I can set the speed by hand again.

My old kernel (2.6.18 with the speedstep-centrino driver) does not have this problem, and cpufreqd works perfectly on it.
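
The manual checks above can be condensed into one helper. This is only a sketch: the function name check_freq_cap is made up for illustration, and it assumes the sysfs layout shown in the transcript (cpuinfo_max_freq and scaling_max_freq in the same directory).

```shell
# Report whether scaling_max_freq is capped below cpuinfo_max_freq.
# check_freq_cap is a hypothetical helper name, not an existing tool.
# The directory argument defaults to the cpu0 path used in this report.
check_freq_cap() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    hw_max=$(cat "$dir/cpuinfo_max_freq") || return 2
    cur_max=$(cat "$dir/scaling_max_freq") || return 2
    if [ "$cur_max" -lt "$hw_max" ]; then
        echo "capped: scaling_max_freq=$cur_max cpuinfo_max_freq=$hw_max"
        return 1
    fi
    echo "ok: scaling_max_freq=$cur_max"
    return 0
}
```

Calling check_freq_cap with no argument inspects cpu0. It returns 1 when the limit is below the hardware maximum, and 2 when the files cannot be read.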
Comment 1 Len Brown 2008-06-25 15:25:56 UTC
Please boot with processor.ignore_ppc=1
or
# echo 1 > /sys/module/processor/parameters/ignore_ppc 

and report if scaling_max_freq stops changing.
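
For reference, the runtime variant can be wrapped so that the value is read back after writing. This is a rough sketch: the helper names show_ignore_ppc and set_ignore_ppc are invented for illustration, and only the parameter path given above is assumed to exist.

```shell
# Print the current ignore_ppc setting, or a notice if the file is missing.
# Both function names are hypothetical; the path is the one from this comment.
show_ignore_ppc() {
    param="${1:-/sys/module/processor/parameters/ignore_ppc}"
    if [ -r "$param" ]; then
        echo "ignore_ppc=$(cat "$param")"
    else
        echo "ignore_ppc unavailable at $param"
        return 1
    fi
}

# Write 1 to the parameter (needs root on a real system) and read it back.
set_ignore_ppc() {
    param="${1:-/sys/module/processor/parameters/ignore_ppc}"
    echo 1 > "$param" || return 1
    show_ignore_ppc "$param"
}
```

Running set_ignore_ppc as root and then watching scaling_max_freq covers the runtime option. The boot parameter processor.ignore_ppc=1 has to be added to the kernel command line instead.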
Comment 2 Andrew Morton 2008-06-25 15:32:19 UTC
I marked this as a regression.
Comment 3 Thomas Renninger 2008-06-26 04:37:38 UTC
The T61 on which this was recently reported on the cpufreq list always showed a _PPC value of 0, which is correct. If ignore_ppc=1 still helps, it could also be a more complex kernel/cpufreq problem rather than a BIOS/_PPC problem.

Can you reproduce this easily (e.g. always after a short time, or can you trigger it)? The report on the T61 was that it only happens on some boots, which makes it very hard to debug there or to verify whether a patch/change helps.

If this is ACPI thermal related and is a regression it might be this one:
commit d9460fd227ed2ce52941b6a12ad4de05c195f6aa
Date:   Thu Jan 17 15:51:23 2008 +0800
    ACPI: register ACPI Processor as generic thermal cooling device

If you can reproduce the problem easily, you may want to test with and without that patch (go back in the git history if it no longer applies). If it turns out to be something else, git-bisect is also an option, provided the problem is easy to reproduce.
Comment 4 Nicolas Vinot 2008-06-26 05:30:27 UTC
With ignore_ppc=1, I have had no problem at all since my last boot.

But this problem occurs very randomly and in very different situations (with a hot or cold CPU, under heavy load or not, when the screensaver starts...).

So I am not yet sure that ignore_ppc=1 resolves the problem completely, but it looks promising.
Comment 5 Len Brown 2008-06-27 11:43:28 UTC
Since it seems to be PPC related, moving to the ACPI category...

re: comment #3, good idea Thomas.
Nicolas, does the problem also go away with CONFIG_ACPI_THERMAL=n?
Comment 6 Len Brown 2008-06-27 11:45:05 UTC
re: regression
I guess so, since it was not a problem in 2.6.18
and fails in 2.6.25.
Nicolas,
assuming the thermal issue isn't it, can you narrow
down which release this broke in?
Comment 7 Nicolas Vinot 2008-06-27 11:57:17 UTC
No problem today either. It seems to be PPC.
As for CONFIG_ACPI_THERMAL, it is currently set to "yes" in my kernel config.
I will try with "no" tomorrow.

As for the release that broke it: no problem with kernels 2.6.18 and 2.6.21.
I have not tried 2.6.22 or 2.6.24, but I can.
Comment 8 Nicolas Vinot 2008-06-27 15:05:43 UTC
Latest news.
I reinstalled a 2.6.18 kernel (Debian package) to test with it.
After a while, the same problem appeared: the cpufreq maximum got locked.
(ignore_ppc was not set on the boot command line, which supports the idea that PPC is the culprit.)

I will try to find my old 2.6.18 kernel config to compare the current version (Debian based), which fails, with the old version (hand-compiled), which does not have this problem.
Comment 9 Nicolas Vinot 2008-06-28 09:28:22 UTC
I recompiled my 2.6.25 kernel with CONFIG_ACPI_THERMAL=n.
It does not resolve my problem.

After recompiling again to re-enable ACPI thermal and booting with processor.ignore_ppc=1, the problem is still there, even though processor.ignore_ppc=1 resolved the problem 48 hours ago.

This bug seems to occur very randomly.

Is there any way to log verbosely all cpufreq switches or ACPI events? It would be very useful for finding the cause of this bug.
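
Short of kernel-side debugging, one userspace workaround is to poll the sysfs file and record every change with a timestamp. The sketch below is not a cpufreq or ACPI facility. The function name watch_value and its arguments are invented for illustration, and polling can miss changes that happen between two reads.

```shell
# Poll one file and print a timestamped line whenever its content changes.
# watch_value is a hypothetical helper name.
# Arguments: file to watch, poll interval in seconds (default 1),
# number of polls (default 0, meaning run until interrupted).
watch_value() {
    file="$1"
    interval="${2:-1}"
    count="${3:-0}"
    last=""
    n=0
    while [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; do
        cur=$(cat "$file" 2>/dev/null)
        if [ "$cur" != "$last" ]; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') $file: ${last:-<start>} -> $cur"
            last="$cur"
        fi
        n=$((n + 1))
        sleep "$interval"
    done
}
```

For example, watch_value /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 1 >> maxfreq.log leaves a log (maxfreq.log is an arbitrary file name) whose timestamps can be compared against dmesg and the system load at the moment the limit dropped.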
Comment 10 ykzhao 2008-06-29 20:35:54 UTC
Hi, Nicolas
    Will you please attach the output of dmesg and acpidump?

Thanks.
Comment 11 Nicolas Vinot 2008-07-06 02:45:59 UTC
Created attachment 16746 [details]
dmesg output
Comment 12 Nicolas Vinot 2008-07-06 02:46:37 UTC
Created attachment 16747 [details]
acpidump output
Comment 13 Nicolas Vinot 2008-07-06 02:47:26 UTC
Sorry for the delay, I was very busy this week.
Here are my dmesg and acpidump outputs.
Comment 14 Thomas Renninger 2008-07-10 09:03:12 UTC
For a T42 it could be the following (from Bjoern Steinbrink). It might be something else; something is definitely fishy. But you may want to check for this BIOS option:

OK, a stop at thinkwiki[1] later, I know what's happening now. The BIOS
has a few settings regarding CPU speed on AC/battery. One is about
balancing power and noise. The above throttling no longer kicks in
if that option is set to "Maximum Power".

Björn

[1] http://www.thinkwiki.org/wiki/How_to_make_use_of_Dynamic_Frequency_Scaling#Troubleshooting
Comment 15 Zhang Rui 2008-08-14 00:20:29 UTC
Hi, Nicolas,
Does the problem still exist in the latest kernel?
Does comment #14 answer your question?
Comment 16 Nicolas Vinot 2008-09-10 14:19:26 UTC
Hi all.

I upgraded to kernel 2.6.26 today.

Same problem: the CPU max frequency still gets locked at random.
Sometimes cpufreq works well, sometimes it doesn't, and
I have not found a way to reproduce the failure reliably.

As for the BIOS settings, I tried all possible settings and the problem remains.
Comment 17 ykzhao 2008-09-24 00:26:06 UTC
Hi, Nicolas
    Do you mean that the problem still exists even when "processor.ignore_ppc=1" is added?
    When the problem appears, please attach the output of /proc/acpi/thermal_zone/thrm/*
    Thanks.
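
One way to collect that output in a single paste is a small loop that prints each file with a header. This is only a sketch: dump_dir is a made-up helper name, and the zone directory name thrm is taken from the request above and may differ on the actual machine.

```shell
# Print every regular file in a directory, each preceded by a header line.
# dump_dir is a hypothetical helper name.
dump_dir() {
    dir="$1"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        echo "== $f"
        cat "$f"
    done
}
```

Running dump_dir /proc/acpi/thermal_zone/thrm > thermal.txt while scaling_max_freq is locked gives one file to attach (thermal.txt is an arbitrary name).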
Comment 18 ykzhao 2008-09-24 00:30:13 UTC
For how to ignore _PPC, see what Len said in comment #1.

thanks.
Comment 19 Zhang Rui 2008-11-06 23:23:51 UTC
ping Nicolas.
Comment 20 Nicolas Vinot 2008-11-07 00:39:07 UTC
ping timeout :)
Sorry for the delay.

With my 2.6.26 kernel, the problem has been resolved after one month of use...
No kernel update and no ACPI or cpufreq update, but it seems to be working correctly now.
cpufreq now locks only at (very) high CPU temperatures (which is advisable :))

A very strange bug and a very strange resolution.
Comment 21 Zhang Rui 2008-11-09 18:31:13 UTC
well,
the problem cannot be reproduced, although we have not actually fixed this issue.
Rejecting it.
Nicolas, please re-open it if the problem shows up again.