Bug 9513 - cat /proc/acpi/thermal_zone/THRM/temperature hangs with new BIOS
Summary: cat /proc/acpi/thermal_zone/THRM/temperature hangs with new BIOS
Status: REJECTED INVALID
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Thermal
Hardware: All Linux
Importance: P1 normal
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-12-06 14:09 UTC by Matthias Schnatbaum
Modified: 2008-06-03 18:00 UTC
CC List: 2 users

See Also:
Kernel Version: 2.6.22.13-0.3
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
acpidump, dmesg, cpuinfo, dmidecode outputs; etc. (93.41 KB, application/octet-stream)
2007-12-06 14:12 UTC, Matthias Schnatbaum
acpidump output, Asus M2N-E Bios 1102 (working correctly) (145.21 KB, application/octet-stream)
2007-12-10 11:54 UTC, Matthias Schnatbaum
acpidump output, Asus M2N-E Bios 1202 (hang) (145.50 KB, application/octet-stream)
2007-12-10 11:55 UTC, Matthias Schnatbaum
.../THRM/* files of working BIOS 1102 (596 bytes, text/plain)
2008-01-05 15:09 UTC, Matthias Schnatbaum
readable .../THRM/* files of BIOS 1202 (285 bytes, text/plain)
2008-01-05 15:10 UTC, Matthias Schnatbaum
dmesg of working BIOS 1102 (22.19 KB, text/plain)
2008-01-05 15:11 UTC, Matthias Schnatbaum
Asus M2N-E Bios 1102 (338.39 KB, application/zip)
2008-05-12 09:58 UTC, Matthias Schnatbaum

Description Matthias Schnatbaum 2007-12-06 14:09:07 UTC
Most recent kernel where this bug did not occur: same, but with Asus M2N-E Bios 1102
Distribution: openSuse 10.3, x86_64
Hardware Environment: see appended files
Software Environment: Asus M2N-E Bios 1202
Problem Description: cat /proc/acpi/thermal_zone/THRM/temperature hangs (but the rest of the system works normally)

Steps to reproduce: cat /proc/acpi/thermal_zone/THRM/temperature
Comment 1 Matthias Schnatbaum 2007-12-06 14:12:59 UTC
Created attachment 13897
acpidump, dmesg, cpuinfo, dmidecode outputs; etc.

Attachment is a zip file.
Comment 2 Matthias Schnatbaum 2007-12-08 07:25:47 UTC
Compared DSDT.dsl of Asus M2N-E Bios versions:

DSDT.dsl of Asus M2N-E Bios 1102 contains:

...
    Scope (\_PR)
    {
        Processor (\_PR.CPU0, 0x00, 0x00000000, 0x00) {}
        Processor (\_PR.CPU1, 0x01, 0x00000000, 0x00) {}
    }
...


DSDT.dsl of Asus M2N-E Bios 1202 contains:

...
    External (\_PR_.CPU0)

    Scope (\_PR)
    {
        Processor (\_PR.C000, 0x00, 0x00000000, 0x00) {}
        Processor (\_PR.C001, 0x01, 0x00000000, 0x00) {}
        Processor (\_PR.C002, 0x02, 0x00000000, 0x00) {}
        Processor (\_PR.C003, 0x03, 0x00000000, 0x00) {}
    }
...

I assume that the main change from Bios 1102 to 1202 is the addition of quad-core processor support (Phenom).
Presumably they extended the DSDT for that reason.
But I am still running the same AMD Athlon(tm) 64 X2 Dual Core Processor 3800+.
What does Linux ACPI do in this case? Does it compare DSDT definitions with the real number of processors?
N.B. Under Windows XP some temperature reading programs work normally with Asus M2N-E Bios 1202.
Comment 3 Zhang Rui 2007-12-09 18:10:29 UTC
So this problem only occurs after upgrading the BIOS, right?
Can you attach the full acpidump for 1102 and 1202 please?
Comment 4 Matthias Schnatbaum 2007-12-10 11:54:00 UTC
Created attachment 13948
acpidump output, Asus M2N-E Bios 1102 (working correctly)

Here are the requested acpidump outputs of Asus M2N-E Bios 1102 (working correctly) and 1202 (hang).
Comment 5 Matthias Schnatbaum 2007-12-10 11:55:25 UTC
Created attachment 13949
acpidump output, Asus M2N-E Bios 1202 (hang)
Comment 6 Matthias Schnatbaum 2007-12-10 11:58:14 UTC
Yes, that's right: the problem occurs only after the BIOS upgrade to 1202, with the same hardware and operating system.
Comment 7 Len Brown 2007-12-27 21:49:15 UTC
With the working BIOS, please paste the output from
more /proc/acpi/thermal_zone/THRM/* | cat

And for the failing BIOS, please do the same,
except exclude access to the problematic temperature file.
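
A minimal sketch of how such a dump could be collected while skipping one file. The function name dump_zone and its arguments are hypothetical, not part of any ACPI tool; the header lines only imitate what more prints for multiple files.

```shell
# Print every regular file in a directory, each preceded by a header,
# skipping the one file name given as the second argument.
# dump_zone is a hypothetical helper, not an existing tool.
dump_zone() {
    dir=$1
    skip=${2:-}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Never open the file that is known to hang.
        [ "$(basename "$f")" = "$skip" ] && continue
        printf '::::::::::::::\n%s\n::::::::::::::\n' "$f"
        cat "$f"
    done
}
# usage: dump_zone /proc/acpi/thermal_zone/THRM temperature
```

The skipped file is never opened, so the read that hangs is not triggered.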

Can you attach the dmesg from the working 1102 BIOS?
I don't see anything special in the 1202 dmesg in attachment
in comment #1, but maybe the original dmesg will give some clues.

It looks like there is a bug in the AML of BIOS 1202.
They re-named CPU0 to be C000, in both the DSDT
and the SSDT, but didn't update the reference to
it from the thermal zone. (which is why iasl created
an External reference to the missing symbol)

I don't know if this causes the hang or not, but
it is certainly curious and it is in the same neck
of the woods...

$ cat DSDT.dsl.diff
5c5
<  * Disassembly of DSDT.dat, Fri Dec 28 00:18:35 2007
---
>  * Disassembly of DSDT.dat, Fri Dec 28 00:18:48 2007
10c10
<  *     Length           0x000078E1 (30945)
---
>  *     Length           0x00007908 (30984)
19a20,21
>     External (\_PR_.CPU0)
>
22,23c24,27
<         Processor (\_PR.CPU0, 0x00, 0x00000000, 0x00) {}
<         Processor (\_PR.CPU1, 0x01, 0x00000000, 0x00) {}
---
>         Processor (\_PR.C000, 0x00, 0x00000000, 0x00) {}
>         Processor (\_PR.C001, 0x01, 0x00000000, 0x00) {}
>         Processor (\_PR.C002, 0x02, 0x00000000, 0x00) {}
>         Processor (\_PR.C003, 0x03, 0x00000000, 0x00) {}
2949c2953
<                             0x80,               // Length
---
>                             0x02,               // Length

$ cat SSDT.dsl.diff
5c5
<  * Disassembly of SSDT.dat, Fri Dec 28 00:18:35 2007
---
>  * Disassembly of SSDT.dat, Fri Dec 28 00:18:48 2007
20,21c20,21
<     External (\_PR_.CPU1, DeviceObj)
<     External (\_PR_.CPU0, DeviceObj)
---
>     External (\_PR_.C001, DeviceObj)
>     External (\_PR_.C000, DeviceObj)
23c23
<     Scope (\_PR.CPU0)
---
>     Scope (\_PR.C000)
91c91
<     Scope (\_PR.CPU1)
---
>     Scope (\_PR.C001)

While in the DSDT, there is still an access to CPU0 in _PSL:
        ThermalZone (THRM)
        {
            Name (_AL0, Package (0x01)
            {
                FAN
            })
            Method (_INI, 0, NotSerialized)
            {
            }

            Method (_AC0, 0, NotSerialized)
            {
                If (Or (PLCY, PLCY, Local7))
                {
                    Return (KELA (TP2H))
                }
                Else
                {
                    Return (KELA (TP1H))
                }
            }

            Name (_PSL, Package (0x01)
            {
                \_PR.CPU0
            })
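
The mismatch above (a name referenced under \_PR but no longer declared) can be looked for mechanically. This is a rough sketch, assuming a disassembled DSDT.dsl in which processors are declared with full \_PR paths as in the excerpts here; stale_pr_refs is a hypothetical helper name, and the pattern is a heuristic rather than an ASL parser.

```shell
# List \_PR names that are referenced somewhere in a disassembled table
# but never declared by a Processor () statement in that same file.
# stale_pr_refs is a hypothetical helper; the regex is only a heuristic.
stale_pr_refs() {
    dsl=$1
    # Names declared as Processor (\_PR.XXXX, ...)
    defined=$(grep -oE 'Processor \(\\_PR_?\.[A-Z0-9_]{1,4}' "$dsl" |
              sed -E 's/.*\.//' | sort -u)
    # Every \_PR.XXXX or \_PR_.XXXX token, declared or merely referenced
    grep -oE '\\_PR_?\.[A-Z0-9_]{1,4}' "$dsl" | sed -E 's/.*\.//' | sort -u |
    while read -r name; do
        printf '%s\n' "$defined" | grep -qx "$name" || printf '%s\n' "$name"
    done
}
# usage: stale_pr_refs DSDT.dsl
```

Run against the SSDT it would report every name, because the Processor objects are declared in the DSDT, so it is only meaningful for the table that contains the declarations.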

The other change is that the size of a system I/O resource shrank
from 0x80 to 0x02:

                        IO (Decode16,
                            0x0800,             // Range Minimum
                            0x0800,             // Range Maximum
                            0x01,               // Alignment
                            0x02,               // Length

I have no idea what this resource is used for.
But it would be interesting if you could correct the DSDT
and boot with a DSDT override to see if either the CPU0
or the resource size changes make the hang go away.
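
For the CPU0 part, the edit to the disassembled source could be scripted roughly like this. It is only a sketch: fix_stale_cpu0 is a hypothetical helper name, it assumes the stale name appears exactly as \_PR.CPU0 and the generated declaration exactly as External (\_PR_.CPU0), and it does not touch the resource length, which has to be edited by hand.

```shell
# Write a copy of the disassembled DSDT to stdout in which the stale
# \_PR.CPU0 reference points at \_PR.C000 and the untyped External line
# generated by iasl for the missing symbol is dropped.
# fix_stale_cpu0 is a hypothetical helper name.
fix_stale_cpu0() {
    sed -e '/^[[:space:]]*External (\\_PR_\.CPU0)[[:space:]]*$/d' \
        -e 's/\\_PR\.CPU0/\\_PR.C000/g' "$1"
}
# usage: fix_stale_cpu0 DSDT.dsl > DSDT.fixed.dsl
```

The original file is left untouched so the result can be diffed before it is compiled.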

How to Build a custom DSDT into the kernel
------------------------------------------
Get original DSDT:
# cp /proc/acpi/dsdt DSDT
or
# acpidump > acpidump.out
$ acpixtract DSDT acpidump.out > DSDT.dat

Disassemble it:
$ iasl -d DSDT.dat
Make your changes:
$ vi DSDT.dsl
Build it:
$ iasl -tc DSDT.dsl
Put it where the kernel build can include it:
$ cp DSDT.hex $SRC/include/

Add this to the kernel .config:

CONFIG_STANDALONE=n
CONFIG_ACPI_CUSTOM_DSDT=y
CONFIG_ACPI_CUSTOM_DSDT_FILE="DSDT.hex"

Make the kernel and off you go!
You should see in dmesg:
Table [DSDT] replaced by host OS
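
A small sketch for checking that message after boot. The function name dsdt_override_active is hypothetical; it simply looks for the fixed string quoted above on standard input.

```shell
# Succeed if the kernel log fed on stdin contains the DSDT override message.
# dsdt_override_active is a hypothetical helper name.
dsdt_override_active() {
    grep -qF 'Table [DSDT] replaced by host OS'
}
# usage: dmesg | dsdt_override_active && echo "custom DSDT in use"
```

If the message is missing, the kernel is still running with the DSDT supplied by the BIOS.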
Comment 8 Matthias Schnatbaum 2008-01-05 15:09:12 UTC
Created attachment 14302
.../THRM/* files of working BIOS 1102
Comment 9 Matthias Schnatbaum 2008-01-05 15:10:14 UTC
Created attachment 14303
readable .../THRM/* files of BIOS 1202
Comment 10 Matthias Schnatbaum 2008-01-05 15:11:15 UTC
Created attachment 14304
dmesg of working BIOS 1102
Comment 11 Matthias Schnatbaum 2008-01-05 15:15:57 UTC
Rebuilt the kernel with a custom DSDT as proposed.
Tried both renaming CPU0 to C000 and setting the resource size to 0x80, but the hang still occurs.
Comment 12 Zhang Rui 2008-04-29 01:07:58 UTC
There are not many changes between these two BIOSes,
and I don't know what may cause this issue.
Is it fixed in a newer BIOS release?
Comment 13 Matthias Schnatbaum 2008-04-29 02:42:46 UTC
Tested again with Asus M2N-E Bios 1401.
The described problem still persists.
I will stay with Bios 1102, because it works fine on my PC.
Comment 14 Zhang Rui 2008-05-11 20:18:45 UTC
I searched the Asus support website and couldn't find the 1102 BIOS.
The release listed before 1202 is 0802.
Could you give me the URL where you downloaded the 1102 BIOS, please?

We should contact Asus as well because this problem is caused by a BIOS upgrade rather than downgrade. :(
At least we should get a release note of 1202 so that we can guess where the problem is. :(
Comment 15 Matthias Schnatbaum 2008-05-12 09:58:05 UTC
Created attachment 16117
Asus M2N-E Bios 1102

Downloaded this Bios from dlsvr02.asus.com/pub/ASUS/mb/socketAM2/M2N-E/1102.zip a short time ago, but it no longer seems to be there.
Comment 16 Matthias Schnatbaum 2008-05-12 10:05:29 UTC
Added Asus M2N-E Bios 1102 as an attachment, because I am not sure it can still be found on any download server.
I cannot remember the Bios 1202 release notes, but I think it was the first one to support Phenom CPUs.
-
In the vip.asus.com forum about the M2N-E mainboard I found this comment about M2N-E Bios 1304:
"
Very disappointed, nowhere on the bios update does it list that you cannot revert after updating.  Thanks for the warning ASUS.  Now I have a BIOS that has a buggy ASL ACPI because they changed the name in the bios from CPU(0,1) to C00(0,1,2,3) and in the process forgot to change the reference to the passive trip point device.  This causes powersaving routines to freak out since it looks for a device that no longer exists but is still referenced elsewhere.  This shoddy practice can be found in all the new Phenom enabled BIOS's on AM2 boards (M2N32 SLI, M2A VM HDMI, M2N-E, etc etc)
"
(http://vip.asus.com/forum/view.aspx?id=20080222165933328&board_id=1&model=M2N-E&page=1&SLanguage=en-us)
Comment 17 Zhang Rui 2008-06-03 18:00:46 UTC
Well,
this is a BIOS problem, and I don't think we can fix it in the kernel.
Closing this bug and marking it as INVALID.

Matthias,
thanks for your effort on this.
I suggest you contact Asus and report this problem to see if we can get it fixed in the BIOS.
