Bug 12826 - cpufreq driver do not expose all data and configuration to /sys
Summary: cpufreq driver do not expose all data and configuration to /sys
Status: CLOSED CODE_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: cpufreq
Hardware: All Linux
Importance: P1 normal
Assignee: cpufreq
URL:
Keywords:
Depends on:
Blocks: 12398
Reported: 2009-03-06 05:21 UTC by uzytkownik2@gmail.com
Modified: 2009-04-02 19:35 UTC

See Also:
Kernel Version: 2.6.29-rc7
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg (26.50 KB, text/plain)
2009-03-06 08:53 UTC, uzytkownik2@gmail.com

Description uzytkownik2@gmail.com 2009-03-06 05:21:49 UTC
Latest working kernel version: 2.6.28
Earliest failing kernel version: 2.6.29-rc7 (also -rc5 and -rc6 but tested only with patchset)
Distribution: Gentoo
Hardware Environment: Thinkpad R51e
Software Environment: Standard stack (although reproduced with only /bin/sh as init)
Problem Description:
Contrary to the documentation, there are very few files for controlling cpufreq:

# ls /sys/devices/system/cpu/cpu0/cpufreq/
conservative  stats
# ls /sys/devices/system/cpu/cpu0/cpufreq/conservative 
down_threshold	ignore_nice_load      sampling_rate	 sampling_rate_min
freq_step	sampling_down_factor  sampling_rate_max  up_threshold

Steps to reproduce:
Boot
Comment 1 uzytkownik2@gmail.com 2009-03-06 06:45:21 UTC
In email I was asked to "write down in the bug, which files exactly you mean".
The files present are listed above. According to the documentation (Documentation/cpu-freq/user-guide.txt), /sys/devices/system/cpu/cpu0/cpufreq/ should contain:
cpuinfo_min_freq, cpuinfo_max_freq, scaling_driver, scaling_available_governors, scaling_governor, cpuinfo_cur_freq, scaling_available_frequencies, scaling_min_freq, scaling_max_freq, affected_cpus, related_cpus and scaling_cur_freq (12 instead of 0).
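A quick way to make this comparison repeatable is a small script that lists which of the documented files are absent. This is only a sketch: the file names are the ones listed in this comment, while the function name `missing_cpufreq_files` and the constant `DOCUMENTED_FILES` are made up for illustration and are not part of any kernel tool.

```python
import os

# Attribute names as listed in this comment (from
# Documentation/cpu-freq/user-guide.txt); duplicates removed.
DOCUMENTED_FILES = [
    "cpuinfo_min_freq", "cpuinfo_max_freq", "cpuinfo_cur_freq",
    "scaling_driver", "scaling_available_governors", "scaling_governor",
    "scaling_available_frequencies", "scaling_min_freq", "scaling_max_freq",
    "scaling_cur_freq", "affected_cpus", "related_cpus",
]


def missing_cpufreq_files(cpufreq_dir="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Return the documented attribute names that are absent from cpufreq_dir."""
    try:
        present = set(os.listdir(cpufreq_dir))
    except FileNotFoundError:
        # No cpufreq directory at all: every documented file is missing.
        present = set()
    return [name for name in DOCUMENTED_FILES if name not in present]


if __name__ == "__main__":
    missing = missing_cpufreq_files()
    print(f"{len(missing)} of {len(DOCUMENTED_FILES)} documented files missing")
    for name in missing:
        print("  " + name)
```

On the affected system, where the directory holds only `conservative` and `stats`, every documented name would be reported as missing.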
Comment 2 Thomas Renninger 2009-03-06 07:16:52 UTC
Ah, now I get it. I looked at the conservative files only...
This is indeed strange. Which cpufreq driver is used?
(I hope p4-clockmod was not accidentally loaded there...).
Can you post dmesg, please.
If there is nothing obvious you might want to compile with:
CPU_FREQ_DEBUG=y
and add cpufreq.debug=7 boot param.
Then boot and attach dmesg of the freshly booted system (dmesg buffer might run full after some time).
Comment 3 uzytkownik2@gmail.com 2009-03-06 08:53:04 UTC
Created attachment 20447
dmesg

(In reply to comment #2)
> Ah now I get it. I looked at the conservative files only...
> This is indeed strange. What cpufreq driver is used?
> (I hope the p4-clockmode was not accidently loaded there...).

p4-clockmod. No other driver is working (see log). AFAIR ThinkWiki says that for some Intel Celeron M processors it is the only driver that works. If that is wrong, I will be happy to change.

> Can you post dmesg, please.
> If there is nothing obvious you might want to compile with:
> CPU_FREQ_DEBUG=y
> and add cpufreq.debug=7 boot param.
> Then boot and attach dmesg of the freshly booted system (dmesg buffer might
> run full after some time).
> 

Attached output for "rescue" system (i.e. for /bin/sh as init).
Comment 4 Thomas Renninger 2009-03-06 09:34:06 UTC
Someone else has to help you out here.
IMO this driver should not exist; best do:
blacklist p4-clockmod
in /etc/modprobe.conf.
A Celeron M probably has better ways to save power.
Comment 5 Anonymous Emailer 2009-03-06 15:30:37 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Fri,  6 Mar 2009 05:21:50 -0800 (PST)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=12826
> 
>            Summary: cpufreq driver do not expose all data and configuration
>                     to /sys
>            Product: Power Management
>            Version: 2.5
>      KernelVersion: 2.6.29-rc7
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: cpufreq
>         AssignedTo: cpufreq@vger.kernel.org
>         ReportedBy: uzytkownik2@gmail.com
> 
> 
> Latest working kernel version: 2.6.28
> Earliest failing kernel version: 2.6.29-rc7 (also -rc5 and -rc6 but tested only with patchset)
> Distribution: Gentoo
> Hardware Environment: Thinkpad R51e
> Software Environment: Standard stack (although reproduced with only /bin/sh as init)
> Problem Description:
> Contrary to documentation there are little files controlling the cpufreq
> 
> # ls /sys/devices/system/cpu/cpu0/cpufreq/
> conservative  stats
> # ls /sys/devices/system/cpu/cpu0/cpufreq/conservative 
> down_threshold  ignore_nice_load      sampling_rate      sampling_rate_min
> freq_step       sampling_down_factor  sampling_rate_max  up_threshold
> 
> Steps to reproduce:
> Boot

I'd say that

commit 8529154ec3f3ac20344c65b7a040c604c7af7651
Author: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Date:   Sat Nov 15 17:02:46 2008 -0200

    [CPUFREQ] Add Celeron Core support to p4-clockmod.

has a good chance of being the cause of this regression?
Comment 6 uzytkownik2@gmail.com 2009-03-06 18:39:46 UTC
On Fri, 2009-03-06 at 15:30 -0800, Andrew Morton wrote: 
> I'd say that
> 
> commit 8529154ec3f3ac20344c65b7a040c604c7af7651
> Author: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
> Date:   Sat Nov 15 17:02:46 2008 -0200
> 
>     [CPUFREQ] Add Celeron Core support to p4-clockmod.
> 
> has a good chance of being the cause of this regression?

Unfortunately I reverted this commit and it had no effect.

However, I found this commit:
commit e088e4c9cdb618675874becb91b2fd581ee707e6
Author: Matthew Garrett <mjg@redhat.com>
Date:   Tue Nov 25 13:29:47 2008 -0500

    [CPUFREQ] Disable sysfs ui for p4-clockmod.
    
    p4-clockmod has a long history of abuse.   It pretends to be a CPU
    frequency scaling driver, even though it doesn't actually change
    the CPU frequency, but instead just modulates the frequency with
    wait-states.
    The biggest misconception is that when running at the lower 'frequency'
    p4-clockmod is saving power.  This isn't the case, as workloads running
    slower take longer to complete, preventing the CPU from entering deep C states.
    
    However p4-clockmod does have a purpose.  It can prevent overheating.
    Having it hooked up to the cpufreq interfaces is the wrong way to achieve
    cooling however. It should instead be hooked up to ACPI.
    
    This diff introduces a means for a cpufreq driver to register with the
    cpufreq core, but not present a sysfs interface.


I guess the lack of the sysfs ui is the problem (at least AFAIU 'sysfs ui').
However, the lack of the sysfs ui prevents cpufreq from lowering the frequency on
overheat[1]. I'll check tomorrow (well, this morning) whether this commit
causes it.

While I understand that p4-clockmod shouldn't be used, no other
driver is working. p4-clockmod is recommended on thinkwiki[2]. All I found
is a post from 2006 mentioning a patch for the speedstep driver
which the author was going to try in his spare time, but p4-clockmod is
working. So there is no known replacement for it.

Regards

PS. Even if the commit is not reverted, the documentation should be
updated. For example, the change should be mentioned in the help text
of p4-clockmod.

[1] I found the problem only because the system started to overheat.
Something is wrong, but lowering the 'frequency' drops the temperature
only by 20 degrees (from 9x to 7x).
[2] http://www.thinkwiki.org/wiki/Intel_Celeron_M#Speed_Step
Comment 7 Anonymous Emailer 2009-03-06 18:50:48 UTC
Reply-To: akpm@linux-foundation.org

On Sat, 07 Mar 2009 03:39:31 +0100 Maciej Piechotka <uzytkownik2@gmail.com> wrote:

> Unfortunately I reverted this commit and it had no effect.
> 
> However I found however commit:
> commit e088e4c9cdb618675874becb91b2fd581ee707e6
> Author: Matthew Garrett <mjg@redhat.com>
> Date:   Tue Nov 25 13:29:47 2008 -0500
> 
>     [CPUFREQ] Disable sysfs ui for p4-clockmod.
>     
>     p4-clockmod has a long history of abuse.   It pretends to be a CPU
>     frequency scaling driver, even though it doesn't actually change
>     the CPU frequency, but instead just modulates the frequency with
>     wait-states.
>     The biggest misconception is that when running at the lower 'frequency'
>     p4-clockmod is saving power.  This isn't the case, as workloads running
>     slower take longer to complete, preventing the CPU from entering deep C states.
>     
>     However p4-clockmod does have a purpose.  It can prevent overheating.
>     Having it hooked up to the cpufreq interfaces is the wrong way to achieve
>     cooling however. It should instead be hooked up to ACPI.
>     
>     This diff introduces a means for a cpufreq driver to register with the
>     cpufreq core, but not present a sysfs interface.

eh?  So we deliberately added a regression?

> 
> I guess lack of sysfs ui is the problem  (at lest AFAIU 'sysfs ui').
> However lack of sysfs ui prevents the cpufreq from lowering frequency on
> overheat[1]. I'll try tomorrow (well. today morning) if this commit
> causes it.

Thanks.

> While I understend that the p4-clockmod shouldn't be used no other
> driver is working. p4-clockmod is recommend on thinkwiki[2]. All I found
> is a post from 2006 mentioning there is a patch for speedstep driver
> which the author is going to try in a spare time - but p4-clockmod is
> working. So there are no known replacements for them.
> 
> Regards
> 
> PS, Even if the commit will not be reverted documentation should be
> updated. For example in help of p4-clockmod the change should be
> mentioned.
> 
> [1] I'd find out the problem if the system didn't started to overheat.
> Something is wrong but lowering the 'frequency' drop the temperature
> only by 20 degrees (from 9x to 7x).
> [2] http://www.thinkwiki.org/wiki/Intel_Celeron_M#Speed_Step
> 
Comment 8 Anonymous Emailer 2009-03-06 18:58:02 UTC
Reply-To: mjg@redhat.com

On Fri, Mar 06, 2009 at 06:50:42PM -0800, Andrew Morton wrote:

> eh?  So we deliberately added a regression?

No, we removed functionality that doesn't save power from a subsystem 
that's designed to save power.
Comment 9 Anonymous Emailer 2009-03-06 19:08:02 UTC
Reply-To: akpm@linux-foundation.org

On Sat, 7 Mar 2009 02:57:43 +0000 Matthew Garrett <mjg@redhat.com> wrote:

> On Fri, Mar 06, 2009 at 06:50:42PM -0800, Andrew Morton wrote:
> 
> > eh?  So we deliberately added a regression?
> 
> No, we removed functionality that doesn't save power from a subsystem 
> that's designed to save power.

That's just stupid spin - stop wasting our time.

The userspace interface to this driver changed in an incompatible
fashion.  This can lead to failure (and hence premature termination) of
userspace configuration scripts and in this case (at least) it has led
to CPU overheating.

Please fix this driver so that existing userspace will continue to
function in an unchanged manner.
Comment 10 Anonymous Emailer 2009-03-06 19:17:13 UTC
Reply-To: mjg@redhat.com

On Fri, Mar 06, 2009 at 07:07:59PM -0800, Andrew Morton wrote:
> On Sat, 7 Mar 2009 02:57:43 +0000 Matthew Garrett <mjg@redhat.com> wrote:
> 
> > On Fri, Mar 06, 2009 at 06:50:42PM -0800, Andrew Morton wrote:
> > 
> > > eh?  So we deliberately added a regression?
> > 
> > No, we removed functionality that doesn't save power from a subsystem 
> > that's designed to save power.
> 
> That's just stupid spin - stop wasting our time.

No, really.

> The userspace interface to this driver changed in an incompatible
> fashion.  This can lead to failure (and hence premature termination) of
> userspace configuration scripts and in this case (at least) it has led
> to CPU overheating.

Every single case of userspace using this interface is a bug. It's 
either wasting power or unnecessarily reducing performance.
Comment 11 Anonymous Emailer 2009-03-06 19:31:02 UTC
Reply-To: akpm@linux-foundation.org

On Sat, 7 Mar 2009 03:16:29 +0000 Matthew Garrett <mjg@redhat.com> wrote:

> > The userspace interface to this driver changed in an incompatible
> > fashion.  This can lead to failure (and hence premature termination) of
> > userspace configuration scripts and in this case (at least) it has led
> > to CPU overheating.
> 
> Every single case of userspace using this interface is a bug. It's 
> either wasting power or unnecessarily reducing performance.

I repeat:

  The userspace interface to this driver changed in an incompatible
  fashion.  This can lead to failure (and hence premature termination)
  of userspace configuration scripts and in this case (at least) it has
  led to CPU overheating.

I can promise you that if the changelog to
e088e4c9cdb618675874becb91b2fd581ee707e6 had included the text "this
patch will break existing scripts and will lead to CPU overheating"
then it would not have been applied.

Look, there are better ways of fixing things like this.  Revert the
patch, add some noisy printks (triggered by use of the sysfs interface)
telling people that they are doing the wrong thing and telling them how
to fix it.  After 6-9 months, then we can make the kernel interface
change.

We shouldn't just rip the thing out without any warning and break
stuff.
Comment 12 Anonymous Emailer 2009-03-06 19:45:55 UTC
Reply-To: mjg@redhat.com

The low-level cpufreq drivers have no idea whether a speed request 
originated from userspace or the kernel, so we'd need to either special 
case p4-clockmod in the core or add an argument that everything other 
than p4-clockmod ignores. Or we could figure out why this computer 
overheats and fix that bug.
Comment 13 Anonymous Emailer 2009-03-06 20:28:00 UTC
Reply-To: akpm@linux-foundation.org

On Sat, 7 Mar 2009 03:45:12 +0000 Matthew Garrett <mjg@redhat.com> wrote:

> The low-level cpufreq drivers have no idea whether a speed request 
> originated from userspace or the kernel, so we'd need to either special 
> case p4-clockmod in the core or add an argument that everything other 
> than p4-clockmod ignores. Or we could figure out why this computer 
> overheats and fix that bug.

Please stop deleting and then ignoring everything I say.

The ONLY way of fully repairing this regression is to restore the sysfs
files, and their 2.6.28 functionality.
Comment 14 Anonymous Emailer 2009-03-06 20:39:13 UTC
Reply-To: mjg@redhat.com

On Fri, Mar 06, 2009 at 08:27:56PM -0800, Andrew Morton wrote:
> On Sat, 7 Mar 2009 03:45:12 +0000 Matthew Garrett <mjg@redhat.com> wrote:
> 
> > The low-level cpufreq drivers have no idea whether a speed request 
> > originated from userspace or the kernel, so we'd need to either special 
> > case p4-clockmod in the core or add an argument that everything other 
> > than p4-clockmod ignores. Or we could figure out why this computer 
> > overheats and fix that bug.
> 
> Please stop deleting and then ignoring everything I say.
> 
> The ONLY way of fully repairing this regression is to restore the sysfs
> files, and their 2.6.28 functionality.

Resulting in computers that run slower and consume more power. "My 
script that does something stupid now gives an error" isn't a 
regression. "My computer now overheats" is a bug that was being hidden 
in the first place. Why don't we just fix that bug?
Comment 15 Anonymous Emailer 2009-03-06 20:46:11 UTC
Reply-To: akpm@linux-foundation.org

On Sat, 7 Mar 2009 04:38:57 +0000 Matthew Garrett <mjg@redhat.com> wrote:

> On Fri, Mar 06, 2009 at 08:27:56PM -0800, Andrew Morton wrote:
> > On Sat, 7 Mar 2009 03:45:12 +0000 Matthew Garrett <mjg@redhat.com> wrote:
> > 
> > > The low-level cpufreq drivers have no idea whether a speed request 
> > > originated from userspace or the kernel, so we'd need to either special 
> > > case p4-clockmod in the core or add an argument that everything other 
> > > than p4-clockmod ignores. Or we could figure out why this computer 
> > > overheats and fix that bug.
> > 
> > Please stop deleting and then ignoring everything I say.
> > 
> > The ONLY way of fully repairing this regression is to restore the sysfs
> > files, and their 2.6.28 functionality.
> 
> Resulting in computers that run slower and consume more power. "My 
> script that does something stupid now gives an error" isn't a 
> regression. "My computer now overheats" is a bug that was being hidden 
> in the first place. Why don't we just fix that bug?

Do the thing which I suggested, and which you deleted without comment.

Or something else!  Just don't break existing userspace code, and
people's computers.  I'm sure you can manage this.
Comment 16 Anonymous Emailer 2009-03-06 21:19:22 UTC
Reply-To: mjg@redhat.com

On Fri, Mar 06, 2009 at 08:46:07PM -0800, Andrew Morton wrote:
> On Sat, 7 Mar 2009 04:38:57 +0000 Matthew Garrett <mjg@redhat.com> wrote:
> > Resulting in computers that run slower and consume more power. "My 
> > script that does something stupid now gives an error" isn't a 
> > regression. "My computer now overheats" is a bug that was being hidden 
> > in the first place. Why don't we just fix that bug?
> 
> Do the thing which I suggested, and which you deleted without comment.

As I said, we can't print a warning without either special casing 
p4-clockmod in the core or adding code to every driver that will only be 
relevant for p4-clockmod. It also means another 6 months of computers 
running slower and consuming more power.

> Or something else!  Just don't break existing userspace code, and
> people's computers.  I'm sure you can manage this.

I'd be thrilled to avoid fixing people's computers, but that means they 
need to report the bug about their machine overheating. Breaking code 
that is doing something actively harmful is a feature rather than a bug, 
so I'm less concerned about that.
Comment 17 Anonymous Emailer 2009-03-06 21:39:44 UTC
Reply-To: akpm@linux-foundation.org

On Sat, 7 Mar 2009 05:18:52 +0000 Matthew Garrett <mjg@redhat.com> wrote:

> On Fri, Mar 06, 2009 at 08:46:07PM -0800, Andrew Morton wrote:
> > On Sat, 7 Mar 2009 04:38:57 +0000 Matthew Garrett <mjg@redhat.com> wrote:
> > > Resulting in computers that run slower and consume more power. "My 
> > > script that does something stupid now gives an error" isn't a 
> > > regression. "My computer now overheats" is a bug that was being hidden 
> > > in the first place. Why don't we just fix that bug?
> > 
> > Do the thing which I suggested, and which you deleted without comment.
> 
> As I said, we can't print a warning without either special casing 
> p4-clockmod in the core

That patch already special-cased p4-clockmod in the core.

> or adding code to every driver that will only be 
> relevant for p4-clockmod. It also means another 6 months of computers 
> running slower and consuming more power.

Special-casing p4-clockmod will not affect other drivers.

> > Or something else!  Just don't break existing userspace code, and
> > people's computers.  I'm sure you can manage this.
> 
> I'd be thrilled to avoid fixing people's computers, but that means they 
> need to report the bug about their machine overheating. Breaking code 
> that is doing something actively harmful is a feature rather than a bug, 
> so I'm less concerned about that.

I don't know what that means.

Please find a way to fix these regressions.
Comment 18 uzytkownik2@gmail.com 2009-03-07 03:10:38 UTC
On Sat, 2009-03-07 at 04:38 +0000, Matthew Garrett wrote:
> On Fri, Mar 06, 2009 at 08:27:56PM -0800, Andrew Morton wrote:
> > On Sat, 7 Mar 2009 03:45:12 +0000 Matthew Garrett <mjg@redhat.com> wrote:
> > 
> > > The low-level cpufreq drivers have no idea whether a speed request 
> > > originated from userspace or the kernel, so we'd need to either special 
> > > case p4-clockmod in the core or add an argument that everything other 
> > > than p4-clockmod ignores. Or we could figure out why this computer 
> > > overheats and fix that bug.
> > 
> > Please stop deleting and then ignoring everything I say.
> > 
> > The ONLY way of fully repairing this regression is to restore the sysfs
> > files, and their 2.6.28 functionality.
> 
> Resulting in computers that run slower and consume more power. "My 
> script that does something stupid now gives an error" isn't a 
> regression. "My computer now overheats" is a bug that was being hidden 
> in the first place. Why don't we just fix that bug?
> 

AFAIU from the Dave Jones post I found today, p4-clockmod is
supposed to work with ACPI events.
I've found out that:
# cat /proc/acpi/thermal_zone/THM0/polling_frequency 
<polling disabled>
# cat /proc/acpi/thermal_zone/THM0/state 
state:                   ok
# cat /proc/acpi/thermal_zone/THM0/temperature 
temperature:             81 C
# cat /proc/acpi/thermal_zone/THM0/trip_points 
critical (S5):           99 C
passive:                 95 C: tc1=5 tc2=4 tsp=600 devices= CPU

The problem is that the system seems to halt at about 95 C, at the
moment when cooling should be applied. This might be an ACPI/ibm_acpi
bug.

Also, in one comment[1] it was mentioned that on Celeron M, which
lacks SpeedStep[2] and C-states[3], there are power savings when
p4-clockmod is enabled. So maybe p4-clockmod should be fully usable as a
cpufreq backend only for Celeron M (with appropriate renaming and
description), and for the rest it should warn or not expose
the sysfs interface?

Regards

[1] http://www.codemonkey.org.uk/2009/01/18/forthcoming-p4clockmod/
[2] Confirmed on
http://en.wikipedia.org/w/index.php?title=Celeron&oldid=272955598#Mobile_Celeron_and_Celeron_M
[3] Powertop however reports different C-states on my computer. Maybe it
varies from core to core.
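To make the margin between the two trip points explicit, the `trip_points` output quoted above can be parsed with a few lines of Python. This is a rough sketch: it assumes only the two line shapes shown in this comment, and the helper names `parse_trip_points` and `passive_margin` are invented for illustration.

```python
import re

# Matches lines such as "critical (S5):           99 C" or
# "passive:                 95 C: tc1=5 ...", capturing the trip name
# and its temperature in degrees Celsius.
TRIP_LINE = re.compile(r"^(\w+)(?:\s*\([^)]*\))?:\s+(\d+)\s*C\b")


def parse_trip_points(text):
    """Map trip point name -> temperature in degrees Celsius."""
    trips = {}
    for line in text.splitlines():
        match = TRIP_LINE.match(line.strip())
        if match:
            trips[match.group(1)] = int(match.group(2))
    return trips


def passive_margin(trips):
    """Degrees between the passive and critical trip points, or None if either is absent."""
    if "passive" in trips and "critical" in trips:
        return trips["critical"] - trips["passive"]
    return None


# Sample copied from the output quoted in this comment.
SAMPLE = """\
critical (S5):           99 C
passive:                 95 C: tc1=5 tc2=4 tsp=600 devices= CPU
"""

if __name__ == "__main__":
    trips = parse_trip_points(SAMPLE)
    print(trips, passive_margin(trips))
```

For the values above, passive cooling has only a 4 degree window before the critical trip point is reached.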
Comment 19 Anonymous Emailer 2009-03-07 07:00:40 UTC
Reply-To: mjg@redhat.com

On Sat, Mar 07, 2009 at 12:09:29PM +0100, Maciej Piechotka wrote:

> # cat /proc/acpi/thermal_zone/THM0/trip_points 
> critical (S5):           99 C
> passive:                 95 C: tc1=5 tc2=4 tsp=600 devices= CPU
> 
> The problem is that it seems that the system halts at about 95 C - in
> moment when cooling should be applied. This might be an ACPI/ibm_acpi
> bug.

Halts as in shuts down, or halts as in stops running? The R51e seems to 
have the 32/64-bit ACPI address issue - can you try booting with 
acpi=rsdt as a kernel argument and see whether it behaves any better?
Comment 20 Thomas Renninger 2009-03-07 08:33:40 UTC
On Saturday 07 March 2009 04:00:16 pm Matthew Garrett wrote:
> On Sat, Mar 07, 2009 at 12:09:29PM +0100, Maciej Piechotka wrote:
> > # cat /proc/acpi/thermal_zone/THM0/trip_points
> > critical (S5):           99 C
> > passive:                 95 C: tc1=5 tc2=4 tsp=600 devices= CPU
> >
> > The problem is that it seems that the system halts at about 95 C - in
> > moment when cooling should be applied. This might be an ACPI/ibm_acpi
> > bug.
>
> Halts as in shuts down, or halts as in stops running? The R51e seems to
> have the 32/64-bit ACPI address issue - can you try booting with
> acpi=rsdt as a kernel argument and see whether it behaves any better?

Yes it has.
I posted patches to fix this about three times, first time probably more than 
a year ago, last one should be these (on the acpi list, easy to google):

[PATCH 1/3] ACPICA: Add acpi_gbl_force_rsdt variable
[PATCH 2/3] Introduce acpi_root_table=rsdt boot param and dmi list to force rsdt
[PATCH 3/3] Remove R40e c-state blacklist

Wrong, the last time I sent them was on 19th of Oct. 2008:
[RESEND] [PATCH 0/3] Blacklist broken ThinkPads to use 32 bit FADT addresses
[RESEND] [PATCH 1/3] ACPICA: Add acpi_gbl_force_rsdt variable
[RESEND] [PATCH 2/3] Introduce acpi_root_table=rsdt boot param and dmi list to force rsdt
[RESEND] [PATCH 3/3] Remove R40e c-state blacklist

Now we have:
  - SUSE and mainline kernels are out of sync with boot params,
    acpi_root_table=rsdt vs acpi=rsdt
  - R51e and R40e have been broken in mainline for years; even though we know
    why, they still are

While I agree with Len that we should try to find the root cause, such issues 
should get blacklisted first until the root cause has been found, or we start 
debugging the same issues over and over again and lose track of 
which machines/BIOSes are broken.

       Thomas
Comment 21 uzytkownik2@gmail.com 2009-03-07 10:26:07 UTC
On Sat, 2009-03-07 at 17:33 +0100, Thomas Renninger wrote:
> On Saturday 07 March 2009 04:00:16 pm Matthew Garrett wrote:
> > On Sat, Mar 07, 2009 at 12:09:29PM +0100, Maciej Piechotka wrote:
> > > # cat /proc/acpi/thermal_zone/THM0/trip_points
> > > critical (S5):           99 C
> > > passive:                 95 C: tc1=5 tc2=4 tsp=600 devices= CPU
> > >
> > > The problem is that it seems that the system halts at about 95 C - in
> > > moment when cooling should be applied. This might be an ACPI/ibm_acpi
> > > bug.
> >
> > Halts as in shuts down, or halts as in stops running?

Halts as in shut down. As after halt command but without syncing disks
etc.

> > The R51e seems to
> > have the 32/64-bit ACPI address issue - can you try booting with
> > acpi=rsdt as a kernel argument and see whether it behaves any better?
> 
> Yes it has.
> I posted patches to fix this about three times, first time probably more than 
> a year ago, last one should be these (on the acpi list, easy to google):
> 
> [PATCH 1/3] ACPICA: Add acpi_gbl_force_rsdt variable
> [PATCH 2/3] Introduce acpi_root_table=rsdt boot param and dmi list to force rsdt
> [PATCH 3/3] Remove R40e c-state blacklist
> 
> wrong the last time I sent them was on 19th of Oct. 2008:
> [RESEND] [PATCH 0/3] Blacklist broken ThinkPads to use 32 bit FADT addresses
> [RESEND] [PATCH 1/3] ACPICA: Add acpi_gbl_force_rsdt variable
> [RESEND] [PATCH 2/3] Introduce acpi_root_table=rsdt boot param and dmi list to force rsdt
> [RESEND] [PATCH 3/3] Remove R40e c-state blacklist
> 
> No we have:
>   - SUSE and mainline kernels are out of sync with boot params,
>     acpi_root_table=rsdt vs acpi=rsdt
>   - R51e and R40e are broken in mainline for years, even we know why and they
>     still are
> 
> While I agree with Len that we should try to find the root cause, such issues 
> should get blacklisted first until the root cause has been found or we start 
> debugging the same issues over and over again and loose an overview about 
> which machines/BIOSes are broken.
> 
>        Thomas

Unfortunately those patches cannot be applied to current kernel:
Applying: ACPICA: Add acpi_gbl_force_rsdt variable
error: drivers/acpi/tables/tbutils.c: does not exist in index
error: drivers/acpi/utilities/utglobal.c: does not exist in index
error: include/acpi/acglobal.h: does not exist in index
Patch failed at 0001 ACPICA: Add acpi_gbl_force_rsdt variable

I'll try the boot option. 

Regards
Comment 22 Thomas Renninger 2009-03-07 11:11:12 UTC
Some comments...:

On Saturday 07 March 2009 06:39:40 am Andrew Morton wrote:
> On Sat, 7 Mar 2009 05:18:52 +0000 Matthew Garrett <mjg@redhat.com> wrote:
> > On Fri, Mar 06, 2009 at 08:46:07PM -0800, Andrew Morton wrote:
> > > On Sat, 7 Mar 2009 04:38:57 +0000 Matthew Garrett <mjg@redhat.com> wrote:
> > > > Resulting in computers that run slower and consume more power. "My
> > > > script that does something stupid now gives an error" isn't a
> > > > regression. "My computer now overheats" is a bug that was being
> > > > hidden in the first place. Why don't we just fix that bug?
> > >
> > > Do the thing which I suggested, and which you deleted without comment.
> >
> > As I said, we can't print a warning without either special casing
> > p4-clockmod in the core
>
> That patch already special-cased p4-clockmod in the core.
>
> > or adding code to every driver that will only be
> > relevant for p4-clockmod. It also means another 6 months of computers
> > running slower and consuming more power.
>
> Special-casing p4-clockmod will not affect other drivers.
>
> > > Or something else!  Just don't break existing userspace code, and
> > > people's computers.  I'm sure you can manage this.

Any modification of the p4-clockmod driver is not really worth it, as it is 
broken by design. If you really touch it, please do it properly:
  - move it to drivers/platforms/x86/native_throttling.c
  - let it register at the generic thermal zone code as a cooling device
  - make sure it only loads if acpi throttling does not get activated

This is what you want. Going the "a cpufreq driver automatically is a cooling 
device, let's use it" route and then cutting out the cpufreq capabilities:
---------
    [CPUFREQ] Prevent p4-clockmod from auto-binding to the ondemand governor.

    The latency of p4-clockmod sucks so hard that scaling on a regular
    basis with ondemand is a really bad idea.

    Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
---------
is making things worse not better.

Until this has happened, please mark the p4-clockmod driver experimental,
remove all references to the cpufreq list and, ideally, state at load time that 
it should only be used on broken BIOSes.

      Thomas
Comment 23 uzytkownik2@gmail.com 2009-03-07 17:06:41 UTC
On Sat, 2009-03-07 at 15:00 +0000, Matthew Garrett wrote:
> On Sat, Mar 07, 2009 at 12:09:29PM +0100, Maciej Piechotka wrote:
> 
> > # cat /proc/acpi/thermal_zone/THM0/trip_points 
> > critical (S5):           99 C
> > passive:                 95 C: tc1=5 tc2=4 tsp=600 devices= CPU
> > 
> > The problem is that it seems that the system halts at about 95 C - in
> > moment when cooling should be applied. This might be an ACPI/ibm_acpi
> > bug.
> 
> Halts as in shuts down, or halts as in stops running? The R51e seems to 
> have the 32/64-bit ACPI address issue - can you try booting with 
> acpi=rsdt as a kernel argument and see whether it behaves any better?
> 

Somewhat. It keeps the system at 70-80 C, but on 2.6.29 it freezes the
computer when I log into GNOME (it seems that gdm is not enough and it
occurs after a few minutes) - i.e. I cannot move the pointer, change VT or
even ping the system (I use the in-kernel radeon driver and a Radeon Xpress
200M RC410 card).
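
For anyone comparing readings, here is a minimal sketch of pulling the trip temperatures out of the trip_points output quoted earlier in this comment. The function names are made up for illustration, and the pattern only assumes the two line shapes shown in that output; other machines may print different fields.

```python
import re

# Matches lines such as "critical (S5):   99 C" and "passive:   95 C: tc1=5 ...".
# Only the two line shapes quoted in this report are assumed here.
_TRIP_LINE = re.compile(r"^(\w+)(?:\s*\([^)]*\))?:\s+(\d+)\s*C\b")


def parse_trip_points(text):
    """Return a dict mapping trip point name to temperature in degrees C."""
    trips = {}
    for line in text.splitlines():
        match = _TRIP_LINE.match(line.strip())
        if match:
            trips[match.group(1)] = int(match.group(2))
    return trips


def passive_headroom(trips):
    """Degrees between the passive cooling trip and the critical trip."""
    return trips["critical"] - trips["passive"]
```

With the values quoted above, the passive trip sits only a few degrees below the critical one, which may be part of why there is so little time for cooling to take effect.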

Regards
Comment 24 Anonymous Emailer 2009-03-07 17:37:30 UTC
Reply-To: mjg@redhat.com

On Sun, Mar 08, 2009 at 02:06:33AM +0100, Maciej Piechotka wrote:

> Somewhat. It keeps the system at 70-80 C, but on 2.6.29 it freezes the
> computer when I log into GNOME (it seems that gdm is not enough and it
> occurs after a few minutes) - i.e. I cannot move the pointer, change VT or
> even ping the system (I use the in-kernel radeon driver and a Radeon Xpress
> 200M RC410 card).

Ok. That sounds like a separate bug. Can you try with this patch and no 
kernel argument?

commit 546be50e225261e8379731008cdfec336348f048
Author: Matthew Garrett <mjg@redhat.com>
Date:   Sun Mar 8 01:34:03 2009 +0000

    Use 32-bit FADT values on X86
    
    The ACPI specification says that we should use the 64-bit address offsets
    contained within the FADT if they exist. However, Windows uses the legacy
    address. Various vendors have left incorrect values in the 64-bit field
    which then causes problems later. Since the vast majority of machines have
    never been tested with an OS that uses the 64-bit value by default, we should
    behave like Windows and ignore the spec by only using the 64-bit address if
    it contains something that can't be represented in the legacy field. Since
    system io space is only 16 bits on x86, this should be entirely safe.

diff --git a/drivers/acpi/acpica/tbfadt.c b/drivers/acpi/acpica/tbfadt.c
index 3636e4f..ad0e858 100644
--- a/drivers/acpi/acpica/tbfadt.c
+++ b/drivers/acpi/acpica/tbfadt.c
@@ -361,9 +361,28 @@ static void acpi_tb_convert_fadt(void)
 		    ACPI_ADD_PTR(struct acpi_generic_address, &acpi_gbl_FADT,
 				 fadt_info_table[i].address64);
 
-		/* Expand only if the 64-bit X target is null */
+		/*
+		 * The ACPI specification says that we should use the
+		 * 64-bit address offsets if they exists. However,
+		 * Windows uses the legacy address. Various vendors
+		 * have left incorrect values in the 64-bit field,
+		 * which then causes problems later. Since the vast
+		 * majority of machines have never been tested with an
+		 * OS that uses the 64-bit value by default, we should
+		 * behave like Windows and ignore the spec by only
+		 * using the 64-bit address if it contains something
+		 * that can't be represented in the legacy
+		 * field. Since system io space is only 16 bits on
+		 * x86, this should be entirely safe. We also extend
+		 * the 32-bit value into the 64-bit one if no 64-bit
+		 * address is provided.
+		 */
 
-		if (!target64->address) {
+		if (!target64->address
+#ifdef CONFIG_X86
+		    || (target64->space_id == ACPI_ADR_SPACE_SYSTEM_IO)
+#endif
+			) {
 
 			/* The space_id is always I/O for the 32-bit legacy address fields */
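
The condition this patch changes can be modelled outside the kernel. The following is only a sketch in Python, not the ACPICA code: the function names and the example addresses are hypothetical, and only the two address space ID values are taken from ACPICA's definitions.

```python
# Address space IDs as defined by ACPICA (system memory = 0, system I/O = 1).
ACPI_ADR_SPACE_SYSTEM_MEMORY = 0
ACPI_ADR_SPACE_SYSTEM_IO = 1


def use_legacy_address(address64, space_id, is_x86=True):
    """Model of the patched condition: should the 32-bit legacy field win?

    The legacy value is expanded into the 64-bit slot when no 64-bit
    address is provided, or (on x86 only) when the 64-bit entry refers
    to system I/O space.
    """
    if address64 == 0:
        return True
    return is_x86 and space_id == ACPI_ADR_SPACE_SYSTEM_IO


def effective_address(legacy32, address64, space_id, is_x86=True):
    """Return the address this model would end up using for one FADT entry."""
    if use_legacy_address(address64, space_id, is_x86):
        return legacy32
    return address64
```

For a made-up mismatch such as legacy 0x1028 versus 64-bit 0x1020 in I/O space, the model picks 0x1028 on x86 and 0x1020 elsewhere; a 64-bit entry in memory space is left alone either way.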
Comment 25 uzytkownik2@gmail.com 2009-03-07 19:29:58 UTC
On Sun, 2009-03-08 at 01:37 +0000, Matthew Garrett wrote: 
> On Sun, Mar 08, 2009 at 02:06:33AM +0100, Maciej Piechotka wrote:
> 
> > Somewhat. It keeps the system at 70-80 C

And it finally turned off the computer, so I guess it only slowed the process
down.

> > but on 2.6.29 it freezes the
> > computer when I log into GNOME (it seems that gdm is not enough and it
> > occurs after a few minutes) - i.e. I cannot move the pointer, change VT or
> > even ping the system (I use the in-kernel radeon driver and a Radeon Xpress
> > 200M RC410 card).
> 
> Ok. That sounds like a separate bug. Can you try with this patch and no 
> kernel argument?

The same problems.

Regards
Comment 26 Anonymous Emailer 2009-03-07 19:38:49 UTC
Reply-To: mjg@redhat.com

On Sun, Mar 08, 2009 at 04:29:44AM +0100, Maciej Piechotka wrote:
> On Sun, 2009-03-08 at 01:37 +0000, Matthew Garrett wrote: 
> > On Sun, Mar 08, 2009 at 02:06:33AM +0100, Maciej Piechotka wrote:
> > 
> > > Somewhat. It keeps the system at 70-80 C
> 
> And it finally turned off the computer, so I guess it only slowed the process
> down.

Interesting. The fact that it's shutting down hard is an indication that 
the embedded controller is performing the shutdown and not the OS. I've 
got a similar issue on a machine here - let me look into it.
Comment 27 Thomas Renninger 2009-03-07 23:01:12 UTC
On Sunday 08 March 2009 02:06:33 am Maciej Piechotka wrote:
> On Sat, 2009-03-07 at 15:00 +0000, Matthew Garrett wrote:
> > On Sat, Mar 07, 2009 at 12:09:29PM +0100, Maciej Piechotka wrote:
> > > # cat /proc/acpi/thermal_zone/THM0/trip_points
> > > critical (S5):           99 C
> > > passive:                 95 C: tc1=5 tc2=4 tsp=600 devices= CPU
> > >
> > > The problem is that it seems that the system halts at about 95 C - in
> > > moment when cooling should be applied. This might be an ACPI/ibm_acpi
> > > bug.
> >
> > Halts as in shuts down, or halts as in stops running? The R51e seems to
> > have the 32/64-bit ACPI address issue - can you try booting with
> > acpi=rsdt as a kernel argument and see whether it behaves any better?
>
> Somewhat. It keeps the system at 70-80 C, but on 2.6.29 it freezes the
> computer when I log into GNOME (it seems that gdm is not enough and it
> occurs after a few minutes) - i.e. I cannot move the pointer, change VT or
> even ping the system (I use the in-kernel radeon driver and a Radeon Xpress
> 200M RC410 card).

Hmm, I expect no PowerPlay support is implemented in the in-kernel radeon 
driver yet?
fglrx with aticonfig --list-powerstates and aticonfig --set-powerstate X
could help a lot.
This could at least explain the high temperatures.
That way you could find out how much could be saved by graphics power 
savings.

> i.e. I cannot move the pointer, change VT or
> even ping the system (I use the in-kernel radeon driver and a Radeon Xpress
> 200M RC410 card).
Sounds like the in-kernel radeon driver is not that stable yet?

Yang Zhao once implemented PowerPlay support in the radeonhd userspace driver:
http://yangman.ca/git/xf86-video-radeonhd.git
Diffing it against the official radeonhd driver and implementing the same
in the kernel's radeon driver might be necessary at some point?
I once played with it a bit, and since both should be atombios based, I could 
imagine it would not be that hard to port; it has also already had some testing 
on specific HW.
Matthew: Is it planned to add PowerPlay support to the in-kernel radeon driver 
at some point?
If yes, I can help a bit.

Hm, but all this has nothing to do with cpufreq which a Celeron is not capable 
of.

      Thomas
Comment 28 uzytkownik2@gmail.com 2009-03-08 03:36:21 UTC
On Sun, 2009-03-08 at 08:01 +0100, Thomas Renninger wrote:
> On Sunday 08 March 2009 02:06:33 am Maciej Piechotka wrote:
> > On Sat, 2009-03-07 at 15:00 +0000, Matthew Garrett wrote:
> > > On Sat, Mar 07, 2009 at 12:09:29PM +0100, Maciej Piechotka wrote:
> > > > # cat /proc/acpi/thermal_zone/THM0/trip_points
> > > > critical (S5):           99 C
> > > > passive:                 95 C: tc1=5 tc2=4 tsp=600 devices= CPU
> > > >
> > > > The problem is that it seems that the system halts at about 95 C - in
> > > > moment when cooling should be applied. This might be an ACPI/ibm_acpi
> > > > bug.
> > >
> > > Halts as in shuts down, or halts as in stops running? The R51e seems to
> > > have the 32/64-bit ACPI address issue - can you try booting with
> > > acpi=rsdt as a kernel argument and see whether it behaves any better?
> >
> > Somewhat. It keeps the system at 70-80 C, but on 2.6.29 it freezes the
> > computer when I log into GNOME (it seems that gdm is not enough and it
> > occurs after a few minutes) - i.e. I cannot move the pointer, change VT or
> > even ping the system (I use the in-kernel radeon driver and a Radeon Xpress
> > 200M RC410 card).
> 
> Hmm, I expect no PowerPlay support is implemented in the in-kernel radeon 
> driver yet?
> fglrx with aticonfig --list-powerstates and aticonfig --set-powerstate X
> could help a lot.
> This could at least explain the high temperatures.

In case it helps: the GPU temperature is lower than the CPU's. However, it
prevents cooling, as checked with rovclock. AFAIR, though, a change of 300 ->
50 is not sufficient.

> That way you could find out how much could be saved by graphics power 
> savings.
> 

Ok. I'll check.

> 
> Hm, but all this has nothing to do with cpufreq which a Celeron is not
> capable 
> of.
> 

Should I move and rename the bug? Where should it go (ACPI -
Power-Other? Power Management - Other?)

Regards
Comment 29 Thomas Renninger 2009-03-09 05:43:20 UTC
> Should I move and rename the bug?
The 32 vs 64 bit issue is a duplicate of this one:
[Bug 8246] 32/64X address mismatch in "Gpe0Block" - IBM Thinkpad R51e
It is set to resolved because a boot param was added, which is IMO not sufficient. But it's hard to convince Len to add dmi blacklists...

You might then want to recheck about:
> i.e. I cannot move pointer, change VT nor even ping the system
or other bugs which might be follow-ups, but are more likely independent.
You should open new bugs for unrelated things.

> Should I move and rename the bug?
Hmm, the bug is valid IMO, I'd keep it open. As long as p4-clockmod is a cpufreq driver and is not explicitly declared broken or made to taint the kernel (and even then), it must provide the sysfs cpufreq interface that userspace programs rely on, like every other cpufreq driver does.
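
To make "the sysfs cpufreq interface userspace programs rely on" concrete, here is a rough sketch of a check for the files listed in comment 1 (taken there from Documentation/cpu-freq/user-guide.txt). The helper name is made up, and the default path assumes cpu0; treat it as an illustration rather than a complete test.

```python
import os

# File names as listed in comment 1 of this report.
EXPECTED_CPUFREQ_FILES = (
    "cpuinfo_min_freq",
    "cpuinfo_max_freq",
    "cpuinfo_cur_freq",
    "scaling_driver",
    "scaling_available_governors",
    "scaling_governor",
    "scaling_available_frequencies",
    "scaling_min_freq",
    "scaling_max_freq",
    "scaling_cur_freq",
    "affected_cpus",
    "related_cpus",
)


def missing_cpufreq_files(cpufreq_dir="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Return the expected cpufreq entries that are absent from cpufreq_dir."""
    try:
        present = set(os.listdir(cpufreq_dir))
    except FileNotFoundError:
        # No cpufreq directory at all: everything is missing.
        present = set()
    return [name for name in EXPECTED_CPUFREQ_FILES if name not in present]
```

On the affected kernel, where the directory only holds "conservative" and "stats", this would report every expected entry as missing.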
Comment 30 Rafael J. Wysocki 2009-03-10 02:03:55 UTC
Fixed by commit 129f8ae9b1b5be94517da76009ea956e89104ce8 .
Comment 31 Len Brown 2009-03-30 04:47:21 UTC
This bug was for the p4-clockmod API regression.
Per comment #30, that is reverted, and so this entry
should remain closed.

Thomas and Matthew are right, however, that the interesting
problem is why the r51e needs p4-clockmod to work around the
real bug -- that the r51e is over-heating.
If we can fix that, perhaps
we can finally clear the way to delete p4-clockmod...

Maciej, can you file a new bug against r51e over-heating?
re: RSDT
note that 2.6.29 shipped with "acpi=rsdt" -- so you
can test if that helps w/o needing to patch your kernel.
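
When reporting results, it helps to confirm that the parameter actually reached the kernel. A small sketch follows; the helper names are made up, and it assumes the usual /proc/cmdline file with space-separated arguments.

```python
def has_kernel_arg(cmdline, arg):
    """Return True if arg appears as a whole word on the kernel command line."""
    return arg in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()
```

Calling has_kernel_arg(read_cmdline(), "acpi=rsdt") after boot should then tell you whether the option was picked up.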

You'll probably get asked if windows over-heats on the
same laptop, and it would be ideal if you could try that.

Also, there are some more 32 vs 64-bit fixes from ACPICA
that are staged for 2.6.30-rc1
that may address that issue w/o a bootparam.
We can discuss those in the new bug report.
Comment 32 uzytkownik2@gmail.com 2009-03-30 10:08:32 UTC
(In reply to comment #31)
> This bug was for the p4-clockmod API regression.
> Per comment #30, that is reverted, and so this entry
> should remain closed.
> 
> Thomas and Matthew are right, however, that the interesting
> problem is why the r51e needs p4-clockmod to work around the
> real bug -- that the r51e is over-heating.
> If we can fix that, perhaps
> we can finally clear the way to delete p4-clockmod...
> 
> Maciej, can you file a new bug against r51e over-heating?
> re: RSDT
> note that 2.6.29 shipped with "acpi=rsdt" -- so you
> can test if that helps w/o needing to patch your kernel.
> 
> You'll probably get asked if windows over-heats on the
> same laptop, and it would be ideal if you could try that.
> 

It was a long time ago that I used Windows on this computer. AFAIR it did not overheat, but I'll check.

> Also, there are some more 32 vs 64-bit fixes from ACPICA
> that are staged for 2.6.30-rc1
> that may address that issue w/o a bootparam.
> We can discuss those in the new bug report.

Against what product should I file it?
Comment 33 Henrique de Moraes Holschuh 2009-04-02 19:35:17 UTC
Len,

I don't think we should remove p4-clockmod at all for as long as we support boxes that can use it.

It clearly works well to avoid worse damage on boxes that are suffering some sort of thermal trouble (it really doesn't matter much why the box is overheating, that's orthogonal to the issue at hand).  That's a very good thing.   It is a proven safety net.

Now, maybe thermal throttling should be made more obnoxious when it activates, to make it clear to the user that the box is NOT operating under normal conditions.  Suitably informative rate-limited KERN_CRIT level messages, for example (limit to once every hour, maybe?).
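
As a rough illustration of the "once every hour" idea, here is a userspace sketch in Python rather than kernel code. The class name and the injectable clock are my own choices for the example, and the one-hour default just mirrors the suggestion above.

```python
import time


class RateLimitedWarning:
    """Allow a warning through at most once per interval."""

    def __init__(self, interval_s=3600.0, clock=time.monotonic):
        self.interval_s = interval_s
        self.clock = clock  # injectable so the behaviour can be tested
        self._last_emit = None

    def should_emit(self):
        """Return True if enough time has passed since the last emitted warning."""
        now = self.clock()
        if self._last_emit is None or now - self._last_emit >= self.interval_s:
            self._last_emit = now
            return True
        return False
```

The first throttling event is always reported, and repeats inside the interval are suppressed, so the log stays readable while the user still gets a clear signal.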
