Bug 11421

Summary: ACPI Exception (thermal-0377): AE_OK, No or invalid critical threshold [20080321] - HP dv5z
Product: ACPI
Reporter: Howard Chu (hyc)
Component: BIOS
Assignee: Zhang Rui (rui.zhang)
Status: CLOSED DUPLICATE
Severity: normal
CC: acpi-bugzilla, lenb, trenn
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.26.3
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: acpidump output, dmidecode output

Description Howard Chu 2008-08-24 21:09:10 UTC
Latest working kernel version:
Earliest failing kernel version: any
Distribution: Ubuntu
Hardware Environment:
Software Environment:
Problem Description:
/proc/acpi/thermal_zone is empty after booting. During boot this message is seen:
[    5.895095] ACPI: ACPI0007:00 is registered as cooling_device0
[    5.895107] ACPI: Processor [CPU0] (supports 8 throttling states)
[    5.895307] ACPI: ACPI0007:01 is registered as cooling_device1
[    5.899969] ACPI Exception (thermal-0377): AE_OK, No or invalid critical threshold [20080321]

The DSDT contains checks for Windows 6; in the _TZ scope there are two handlers defined thus:
            Method (_HOT, 0, Serialized)
            {
                If (LEqual (TPOS, 0x40))
                {
                    Return (Add (0x0AAC, Multiply (TPC, 0x0A)))
                }
            }

            Method (_CRT, 0, Serialized)
            {
                If (LLess (TPOS, 0x40))
                {
                    Return (Add (0x0AAC, Multiply (TPC, 0x0A)))
                }
            }

If Windows 6 is detected, TPOS = 0x41. For Linux it defaults to 0x80. In that case these two handlers exit without any return value, which is illegal. Commenting out the If comparisons and recompiling the DSDT allows thermal monitoring to work.
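
As a rough illustration of the control flow in the two methods quoted above, here is a minimal Python model. It is only a sketch: `None` stands in for a method that falls off the end without a Return, and the TPC value of 100 is a hypothetical placeholder, since the real value is not given in this report.

```python
from typing import Optional, Tuple

BASE = 0x0AAC  # constant taken from the quoted DSDT (2732 decimal)
TPC_EXAMPLE = 100  # hypothetical value; the real TPC is not shown in the report


def hot(tpos: int, tpc: int) -> Optional[int]:
    """Model of _HOT: a value is returned only when TPOS == 0x40."""
    if tpos == 0x40:
        return BASE + tpc * 0x0A
    return None  # the ASL method has no Return on this path


def crt(tpos: int, tpc: int) -> Optional[int]:
    """Model of _CRT: a value is returned only when TPOS < 0x40."""
    if tpos < 0x40:
        return BASE + tpc * 0x0A
    return None  # the ASL method has no Return on this path


def summarize(tpos: int) -> Tuple[Optional[int], Optional[int]]:
    """Return the (_HOT, _CRT) results for a given TPOS."""
    return hot(tpos, TPC_EXAMPLE), crt(tpos, TPC_EXAMPLE)


if __name__ == "__main__":
    for tpos in (0x3F, 0x40, 0x41, 0x80):
        print(hex(tpos), summarize(tpos))
```

Because the two conditions (TPOS == 0x40 and TPOS < 0x40) are mutually exclusive, no single TPOS value makes both methods return a value in this model.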

There are several other checks of the OS value in this DSDT; most of them appear to be related to the buttons controlling the VGA display mode (e.g. CRT->LCD->TV). I've commented those checks out in my DSDT now too, but haven't seen any visible effect.

Steps to reproduce:
Just boot any ACPI-enabled Linux kernel and observe the error message, and note the empty /proc/acpi/thermal_zone directory.
Comment 1 Howard Chu 2008-08-24 21:12:03 UTC
Created attachment 17423 [details]
acpidump output
Comment 2 Howard Chu 2008-08-24 21:12:44 UTC
Created attachment 17424 [details]
dmidecode output
Comment 3 Zhang Rui 2008-08-24 23:59:55 UTC
I think this is the laptop that Thomas mentioned some time ago.

ACPICA will return 0 in this case, and it is marked as an invalid critical trip point. That's why the Linux thermal driver doesn't work on this laptop.
This is clearly a BIOS problem, but we have not yet reached a conclusion about how to fix or work around it in Linux.

Well, CC'ing Len and Thomas. :)
Comment 4 Thomas Renninger 2008-08-25 04:32:08 UTC
> /proc/acpi/thermal_zone is empty after booting.
IMO only the wrong trip point should be ignored. But it may be that Windows also invalidates the whole thermal zone, don't know.

I do not have a strong opinion here as long as the machine does not shut down...
You may want to resolve this as a won't fix, not much we can do here?
Comment 5 Howard Chu 2008-08-25 10:35:34 UTC
Resolving WONTFIX is fine with me; I just wanted to document the problem and workaround. Fixing the DSDT is pretty easy. Of course, it would be better if HP wasn't releasing such a boneheaded BIOS in the first place. I'll note that this laptop shipped with BIOS version F.07 and a couple days after I received it F.08 was available for download; the DSDT is identical in both versions.

Given this statement from the lesswatts site:
>>>
In the early days of Linux/ACPI, DSDT modifications were common to work around both BIOS bugs and Linux bugs. However, the stated goal of the Linux/ACPI project today is that Linux should run on un-modified firmware.
<<<
I wanted to make the point that BIOS bugs are still a problem, and unless you guys have any particular leverage on vendors to get them to release properly written firmware, you shouldn't be ruling out DSDT modifications as a valid approach to getting full functionality from a system.
Comment 6 Zhang Rui 2008-08-27 19:59:55 UTC
This is a duplicate of bug 10686.
_CRT returns an invalid value here, probably 0x40. As this is below 0 C, the thermal zone is disabled by commit a39a2d7c72b358c6253a2ec28e17b023b7f6f41c.
But IMO, commit a39a2d7c72b358c6253a2ec28e17b023b7f6f41c is still not the right solution. Anyway, let's discuss this problem in that thread. :)
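
For reference, ACPI thermal objects report temperatures in tenths of a kelvin, which is why a value such as 0x40 reads as far below 0 C. The Python sketch below shows the arithmetic. The 2732 offset is taken from the DSDT's own 0x0AAC constant, and `looks_like_valid_critical` is a hypothetical helper that illustrates a "below 0 C" check; it is not the kernel's actual code.

```python
ZERO_CELSIUS_DECIKELVIN = 0x0AAC  # 2732, the base constant used in the quoted DSDT


def decikelvin_to_celsius(value: int) -> float:
    """Convert a value in tenths of a kelvin to degrees Celsius."""
    return (value - ZERO_CELSIUS_DECIKELVIN) / 10.0


def looks_like_valid_critical(value: int) -> bool:
    """Hypothetical sanity check: reject critical trip points at or below 0 C."""
    return value > ZERO_CELSIUS_DECIKELVIN


if __name__ == "__main__":
    # 0x40 and 0 are the bogus values discussed in this thread;
    # the last entry is what _CRT would compute for a hypothetical TPC of 100.
    for raw in (0x40, 0, 0x0AAC + 100 * 0x0A):
        print(raw, decikelvin_to_celsius(raw), looks_like_valid_critical(raw))
```

Under this conversion, the expression in the DSDT, 0x0AAC + TPC * 0x0A, corresponds to TPC degrees Celsius.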
Comment 7 Zhang Rui 2008-08-27 20:00:08 UTC

*** This bug has been marked as a duplicate of bug 10686 ***
Comment 8 Len Brown 2008-10-16 17:41:42 UTC
re: comment #1

> If Windows 6 is detected, TPOS = 0x41. For Linux it defaults to 0x80.

Linux has not returned true for OSI("Linux") since 2.6.22.
This bug is filed against 2.6.26, so here we are executing
the path the same way Vista does: 

                If (_OSI ("Windows 2006"))
                {
                    Store (0x40, OSTB)
                    Store (0x40, TPOS)
                }

                If (_OSI ("Windows 2006 SP1"))
                {
                    Store (0x41, OSTB)
                    Store (0x40, TPOS)
                }

i.e., Linux will see (TPOS == 0x40)
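
A small Python model of the two If blocks above may help show why TPOS ends up at 0x40. It is a sketch under stated assumptions: the initial TPOS of 0x80 is the default reported in the description, the initial OSTB is unknown and modeled as `None`, and the sets of accepted _OSI strings are hypothetical inputs, not a statement of what any particular kernel answers.

```python
from typing import Iterable, Optional, Tuple


def evaluate_os_flags(
    accepted_osi: Iterable[str],
    initial_tpos: int = 0x80,  # default reported in the bug description
) -> Tuple[Optional[int], int]:
    """Model the two quoted _OSI checks and return (OSTB, TPOS)."""
    accepted = set(accepted_osi)
    ostb: Optional[int] = None  # assumption: initial OSTB is not shown in the excerpt
    tpos = initial_tpos

    if "Windows 2006" in accepted:
        ostb = 0x40
        tpos = 0x40

    if "Windows 2006 SP1" in accepted:
        ostb = 0x41  # note: 0x41 goes to OSTB, TPOS stays 0x40
        tpos = 0x40

    return ostb, tpos


if __name__ == "__main__":
    print(evaluate_os_flags([]))
    print(evaluate_os_flags(["Windows 2006"]))
    print(evaluate_os_flags(["Windows 2006", "Windows 2006 SP1"]))
```

In this excerpt the value 0x41 is only ever stored to OSTB, so any OS that answers true to either string sees TPOS == 0x40.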

So when we come to the thermal zone:

            Method (_HOT, 0, Serialized)
            {
                If (LEqual (TPOS, 0x40))
                {
                    Return (Add (0x0AAC, Multiply (TPC, 0x0A)))
                }
# BIOS bug:
# everything except Vista (TPOS==0x40)
# and Linux -- which pretends to be Vista
# will "implicit return" here.
# But Linux and Vista will have a valid return here.
            }

            Method (_CRT, 0, Serialized)
            {
                If (LLess (TPOS, 0x40))
                {
                    Return (Add (0x0AAC, Multiply (TPC, 0x0A)))
                }
# BIOS bug:
# releases older than Vista (TPOS < 0x40)
# will get a valid _CRT, but
# Vista and Linux pretending to be Vista will
# get an "implicit return", which depending
# on the version of Linux will be 0 or some random value.
            }

So Linux is doing the right thing by rejecting the _CRT.
The only question is if we should nuke the whole thermal zone
or not.  I'm inclined to think not, but we'll deal with that
in bug 10686.


re: comment #5
Yes, I understand that BIOS bugs are still a problem, and always will be.
The point of the quoted text (which I wrote)
is that our goal is for Linux to handle
"common industry practice" aka BIOS bugs by default, and not require
users to modify their DSDT to get their systems working.

If somebody still wants to modify their DSDT, more power to them.
However, the result is not a system that any distro can support.
Comment 9 Zhang Rui 2008-10-16 18:00:12 UTC
(In reply to comment #8)
> So Linux is doing the right thing by rejecting the _CRT.
> The only question is if we should nuke the whole thermal zone
> or not.  I'm inclined to think not, but we'll deal with that
> in bug 10686.
> 
Thomas will generate a patch in his "Firmware Bug Interface" patch set.