Bug 9227 - Wrong trip points + unable to turn on cooling device
Status: REJECTED INVALID
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Thermal
Hardware: All Linux
Importance: P1 normal
Assignee: Zhang Rui
Reported: 2007-10-25 13:27 UTC by Serge Maneuf
Modified: 2009-04-29 08:51 UTC
Kernel Version: 2.6.22-14-generic
Regression: No


Attachments
result of acpidump, running kernel with acpi=off (96.05 KB, application/octet-stream)
2007-10-25 13:29 UTC, Serge Maneuf
result of dmidecode, running kernel with acpi=off (11.06 KB, application/octet-stream)
2007-10-25 13:30 UTC, Serge Maneuf
(before thermal) acpidump --addr 0x000FF810 --length 0x10 > temm-1 (16 bytes, application/octet-stream)
2008-06-19 07:41 UTC, Serge Maneuf
(after thermal) acpidump > --addr 0x000FF810 --length 0x10 > temm-2 (96.05 KB, application/octet-stream)
2008-06-19 07:42 UTC, Serge Maneuf
( thermal loaded ) acpidump --addr 0x000ff810 --length 0x10 > temm-2 (16 bytes, application/octet-stream)
2008-06-21 02:11 UTC, Serge Maneuf

Description Serge Maneuf 2007-10-25 13:27:19 UTC
Most recent kernel where this bug did not occur:
Distribution: Ubuntu Gutsy (same bug on Ubuntu Feisty and Pardus)
Hardware Environment: Desktop HP Pavilion a629.fr / AMD Athlon(tm) XP2600+ / BIOS 3.08
Software Environment:
Problem Description:

Trip points are wrong:
cat /proc/acpi/thermal_zone/*/*
0 - Active; 1 - Passive
<polling disabled>
state:                   passive 
temperature:             39 C
critical (S5):           100 C
passive:                 -248 C: tc1=4 tc2=3 tsp=60 devices=CPU0 
active[0]:               -266 C: devices= FAN
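
For context on the negative values above: ACPI thermal objects such as _PSV and _AC0 return temperatures in tenths of a kelvin, and the kernel converts them to Celsius for display. The Python sketch below assumes the plain conversion (raw - 2732) / 10; the helper names are hypothetical and the kernel's exact rounding may differ, so the raw values it derives from the displayed readings are approximate. It only illustrates that a reading far below 0 C implies a raw value well under 2732, which is not a plausible deci-kelvin trip point.

```python
# Sketch only: helper names are hypothetical, not kernel APIs.
ZERO_CELSIUS_DK = 2732  # 273.2 K expressed in tenths of a kelvin


def decikelvin_to_celsius(raw: int) -> float:
    """Convert an ACPI temperature (tenths of a kelvin) to degrees Celsius."""
    return (raw - ZERO_CELSIUS_DK) / 10


def celsius_to_decikelvin(celsius: float) -> int:
    """Inverse conversion, used here to estimate the raw value behind a reading."""
    return round(celsius * 10) + ZERO_CELSIUS_DK


def is_plausible_trip_point(raw: int, low_c: float = 0.0, high_c: float = 150.0) -> bool:
    """Return True if the raw value maps into an assumed sane Celsius range."""
    return low_c <= decikelvin_to_celsius(raw) <= high_c


if __name__ == "__main__":
    # Celsius readings taken from the /proc output in this report.
    for label, celsius in [("critical", 100), ("passive", -248), ("active[0]", -266)]:
        raw = celsius_to_decikelvin(celsius)
        print(label, raw, is_plausible_trip_point(raw))
```

The 0 to 150 C window is an arbitrary assumption chosen for illustration, not a limit defined by ACPI.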

and an error is printed in the log files every 6 seconds:
ACPI: Unable to turn cooling device [debcbf18] 'on'

With the previous kernel I "fixed" the issue by overwriting the trip points:
echo -n "100:0:90:60:0" > /proc/acpi/thermal_zone/THRM/trip_points
but this is no longer possible with this kernel.

I can run the kernel with acpi=off, but then power shutdown does not work.

The message disappears if I remove the thermal module: sudo rmmod thermal.

I will provide the results of acpidump and dmidecode (running with acpi=off) once I find out how to do it.

Steps to reproduce:
Comment 1 Serge Maneuf 2007-10-25 13:29:29 UTC
Created attachment 13280
result of acpidump, running kernel with acpi=off

Following up on my bug report ...
Comment 2 Serge Maneuf 2007-10-25 13:30:40 UTC
Created attachment 13281
result of dmidecode, running kernel with acpi=off

Following up on my bug report.
Comment 3 Zhang Rui 2007-10-26 00:42:52 UTC
Please try to boot with "thermal.act=xxx" and "thermal.psv=yyy" to override the active and passive trip points.
Comment 4 Serge Maneuf 2007-10-26 11:21:24 UTC
I tried to boot with these kernel parameters:
ro quiet splash thermal.act=60 thermal.psv=50
and got:
Unknown boot option `thermal.act=60': ignoring
Unknown boot option `thermal.psv=50': ignoring 
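
The command line above mixes plain options (ro, quiet, splash) with module-scoped options of the form module.param=value, and the messages show that the running kernel did not recognise the two thermal ones. As a minimal sketch, the Python function below separates the two kinds of token so the thermal overrides can be checked at a glance. The function name is hypothetical, and quoted values containing spaces are deliberately not handled.

```python
def split_cmdline(cmdline: str):
    """Split a kernel command line into plain options and module.param=value pairs.

    Hypothetical helper for illustration; it does not reproduce the kernel's parser.
    """
    plain = []
    module_params = {}
    for token in cmdline.split():
        key, eq, value = token.partition("=")
        module, dot, param = key.partition(".")
        if eq and dot and module and param:
            # e.g. "thermal.act=60" -> module_params["thermal"]["act"] = "60"
            module_params.setdefault(module, {})[param] = value
        else:
            plain.append(token)
    return plain, module_params


if __name__ == "__main__":
    plain, params = split_cmdline("ro quiet splash thermal.act=60 thermal.psv=50")
    print(plain)
    print(params)
```

On a running system the same function could be fed the contents of /proc/cmdline to confirm which parameters were actually passed at boot.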
Comment 5 Zhang Rui 2007-10-28 18:16:28 UTC
That feature was merged only recently. Please try a newer kernel, e.g. 2.6.23.
Comment 6 Fu Michael 2007-12-03 00:49:15 UTC
resolve this bug due to no response from bug reporter...
Comment 7 Serge Maneuf 2007-12-05 11:01:55 UTC
(In reply to comment #6)
> resolve this bug due to no response from bug reporter...

Hi. Sorry that I have not been able to test with kernel 2.6.23; I'm running
Ubuntu Gutsy, which still uses kernel 2.6.22.
Comment 8 Len Brown 2008-06-13 21:33:05 UTC
please re-open if this is a problem in 2.6.23 or later.

also, i'm curious if you've run windows on this box
and if/why it doesn't run into the same bogus trip points.
Comment 9 Serge Maneuf 2008-06-14 07:37:27 UTC
I'm now using Ubuntu 8.04, Kernel 2.6.24-18-generic

I tried to boot with Kernel parameters : 
ro quiet splash thermal.act=60 thermal.psv=50
and get :
Unknown boot option `thermal.act=60': ignoring
Unknown boot option `thermal.psv=50': ignoring

For the time being, I added "modprobe -r thermal" to /etc/rc.local to unload the thermal module after boot and avoid getting the message "unable to turn on cooling device" every 6 seconds.

This desktop never heats up, so that is OK.

I cannot say whether it worked better under Windows; I never checked the trip points under XP. Maybe there was simply no active thermal management?

It looks like the new kernel still does not work properly with the motherboard of this HP computer (ASUS A7V8X-LA).
Comment 10 Zhang Rui 2008-06-16 00:54:17 UTC
(In reply to comment #9)
> I'm now using Ubuntu 8.04, Kernel 2.6.24-18-generic
> 
> I tried to boot with Kernel parameters : 
> ro quiet splash thermal.act=60 thermal.psv=50
> and get :
> Unknown boot option `thermal.act=60': ignoring
> Unknown boot option `thermal.psv=50': ignoring
> 
Weird, we didn't change this piece of code. Could you double-check?

Look at this AML code. The active[0] and passive trip points are obtained simply by reading a specific system memory address.
            Method (_AC0, 0, NotSerialized)
            {
                If (Or (PLCY, PLCY, Local7))
                {
                    Return (TP2H)
                }
                Else
                {
                    Return (TP1H)
                }
            }
and
 OperationRegion (TEMM, SystemMemory, 0x000FF810, 0x10)
    Field (TEMM, WordAcc, NoLock, Preserve)
    {
        TP1H,   16,
        TP1L,   16,
        TP2H,   16,
        TP2L,   16,
        TRPC,   16,
        SENF,   16,
        TP3H,   16,
        TP3L,   16
    }
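
To make the layout above concrete, here is a minimal Python sketch that decodes a 16-byte dump of the TEMM region into its eight 16-bit fields and mirrors the _AC0 selection logic. It assumes the dump is raw binary, that the words are little-endian (as on x86), and that the value of PLCY is supplied by the caller, since its definition is not shown in this thread. The function names and the sample values are made up for illustration.

```python
import struct

# Field order copied from the Field (TEMM, ...) declaration above.
TEMM_FIELDS = ("TP1H", "TP1L", "TP2H", "TP2L", "TRPC", "SENF", "TP3H", "TP3L")


def decode_temm(dump: bytes) -> dict:
    """Decode a raw 16-byte TEMM dump into named 16-bit words (assumed little-endian)."""
    if len(dump) != 16:
        raise ValueError("expected exactly 16 bytes, got %d" % len(dump))
    return dict(zip(TEMM_FIELDS, struct.unpack("<8H", dump)))


def ac0(fields: dict, plcy: int) -> int:
    """Mirror of _AC0: Or (PLCY, PLCY, Local7) yields PLCY itself, so the
    method returns TP2H when PLCY is non-zero and TP1H otherwise."""
    local7 = plcy | plcy
    return fields["TP2H"] if local7 else fields["TP1H"]


if __name__ == "__main__":
    # Made-up sample values in tenths of a kelvin, NOT taken from the attachments.
    sample = struct.pack("<8H", 3532, 3482, 3632, 3582, 0, 0, 3732, 3682)
    fields = decode_temm(sample)
    print(fields)
    print(ac0(fields, plcy=0), ac0(fields, plcy=1))
```

With a real dump, the bytes could be read from a file such as temm-1 and passed to decode_temm unchanged.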

So could you please run this test:
1. Boot with ACPI enabled.
2. BEFORE loading the thermal driver,
   do "acpidump --addr 0x000FF810 --length 0x10 > temm-1"
3. Load the thermal driver.
4. Do "acpidump > --addr 0x000FF810 --length 0x10 > temm-2"
and attach the test result here.
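
The point of the test above is to see whether the region changes once the thermal driver is loaded. As a rough sketch, the two 16-byte dumps could be compared word by word as below. It assumes both files are raw binary dumps of the same region with little-endian 16-bit words; the function name is hypothetical.

```python
import struct


def changed_words(before: bytes, after: bytes, base: int = 0x000FF810):
    """Return (address, old, new) for every 16-bit word that differs between two dumps."""
    if len(before) != len(after) or len(before) % 2:
        raise ValueError("dumps must have the same, even length")
    count = len(before) // 2
    old_words = struct.unpack("<%dH" % count, before)
    new_words = struct.unpack("<%dH" % count, after)
    return [
        (base + 2 * index, old, new)
        for index, (old, new) in enumerate(zip(old_words, new_words))
        if old != new
    ]


if __name__ == "__main__":
    # Made-up data: identical dumps except for the word at offset 4.
    before = bytes(16)
    after = bytearray(16)
    after[4:6] = struct.pack("<H", 0x0DCC)
    for address, old, new in changed_words(before, bytes(after)):
        print(hex(address), hex(old), hex(new))
```

An empty result would mean that loading the driver did not alter the region.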
Comment 11 Serge Maneuf 2008-06-19 07:41:08 UTC
Created attachment 16547
(before thermal) acpidump --addr 0x000FF810 --length 0x10 > temm-1
Comment 12 Serge Maneuf 2008-06-19 07:42:11 UTC
Created attachment 16548
(after thermal) acpidump > --addr 0x000FF810 --length 0x10 > temm-2
Comment 13 Zhang Rui 2008-06-19 18:25:55 UTC
Oops, sorry. It should be
4. do "acpidump --addr 0x000ff810 --length 0x10 > temm-2"

Please re-attach the test result taken after the thermal driver is loaded.
Comment 14 Serge Maneuf 2008-06-21 02:09:12 UTC
No problem. See below.
Comment 15 Serge Maneuf 2008-06-21 02:11:24 UTC
Created attachment 16569
( thermal loaded ) acpidump --addr 0x000ff810 --length 0x10 > temm-2
Comment 16 Zhang Rui 2008-07-07 22:46:32 UTC
Well, the AML reads invalid data from OperationRegion (TEMM, SystemMemory, 0x000FF810, 0x10),
and to me this is not a Linux/ACPI bug.
Please check the BIOS setup to see if there are any thermal-related options.
Please try upgrading the BIOS to see if that fixes it.
Anyway, this is a BIOS problem and we cannot and will not fix it in the Linux kernel.
Rejecting this bug as INVALID.
Please re-open it if you still have questions.
Comment 17 Serge Maneuf 2009-04-26 08:49:07 UTC
Up to now, I "solved" the problem by removing the thermal module: modprobe -r thermal in /etc/rc.local.
I just updated my system to Ubuntu 9.04 (kernel 2.6.28-11) and the workaround no longer works: there is no "thermal" module anymore, and the message "ACPI: Unable to turn cooling device" is added to the logs every 6 seconds.
I understand that the problem originates in my old BIOS and that it is not a Linux/ACPI bug, but I cannot change the BIOS and would appreciate any help in stopping the warning message.
Comment 18 Zhang Rui 2009-04-27 01:21:12 UTC
Please try the boot option acpi.power_nocheck=1
Comment 19 Serge Maneuf 2009-04-27 08:58:55 UTC
I tried the following boot options in turn, without any success:
1) acpi.power_nocheck=1
2) thermal.nocrt=1
3) thermal.crt=-1
In every case the warning is still printed in the logs every 6 seconds.

Some more information:
Computer = Desktop HP Pavilion a629.fr
Motherboard = ASUS A7V8X-LA
From the kernel log:
[    1.351468] fan PNP0C0B:00: registered as cooling_device0
[    1.351475] ACPI: Fan [FAN] (on)
[    1.351759] processor ACPI_CPU:00: registered as cooling_device1
[    1.356556] thermal LNXTHERM:01: registered as thermal_zone0
[    1.358249] ACPI: Unable to turn cooling device [df42cf18] 'on'
[    1.358255] ACPI: Thermal Zone [THRM] (56 C)
Comment 20 Serge Maneuf 2009-04-29 08:51:48 UTC
I finally succeeded by adding the boot parameter thermal.act=80 to override the wrong trip point value. No more warnings in the log.
