Bug 8110 - insanely high temperature on bootup (e.g. 3517 C) - HP/Compaq nx8220, nc6000, nc8000 - 2.6.19 regression
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: EC
Hardware: i386 Linux
Importance: P2 high
Assignee: Alexey Starikovskiy
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-03-02 04:34 UTC by Martin
Modified: 2008-10-17 21:40 UTC

See Also:
Kernel Version: 2.6.19
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
don't read status until it's updated (5.18 KB, patch)
2007-03-07 11:40 UTC, Alexey Starikovskiy

Description Martin 2007-03-02 04:34:06 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.18.2
Distribution: Debian/SID
Hardware Environment: HP/Compaq nx8220 (and nc6000, nc8000)
Software Environment:
Problem Description:

Since I started using kernel 2.6.20, on some fresh start-ups ACPI reports insanely
high temperatures of several thousand degrees C for thermal zone TZ3. I don't
know exactly what this sensor measures (TZ1 is the CPU temperature, TZ2 the graphics
chip, TZ4 shows the CPU fan speed in %), but I guess TZ3 reports some kind of case
temperature. Normal values for TZ3 are about 32 C.

Another user on a forum has the same problem with an nc6000, and yet another one
with an nc8000 and kernel 2.6.19 (which I personally never installed or used). So I
guess the problem began with 2.6.19.

As I'm using powersaved, the system shuts down immediately at boot time, as soon as
powersaved starts. The messages in syslog look something like:

ACPI: Critical trip point
Critical temperature reached (3264 C), shutting down.

The temperature shown is different every time, mostly around 3000 C. A new startup
after this shutdown (booting the same kernel) leads to the same situation, until I:
- boot with acpi=off (this only resolves the issue for that one boot)
- OR boot the older kernel 2.6.18.2 (after that I can reboot into 2.6.20 without the
problem)
- OR remove AC and battery and plug them in again

Steps to reproduce:
The issue happens sporadically on a "fresh" power-on; I have found no way to reproduce it reliably yet...
Comment 1 Alexey Starikovskiy 2007-03-07 11:40:17 UTC
Created attachment 10642
don't read status until it's updated

Please check whether this patch helps.
Comment 2 Alexey Starikovskiy 2007-03-07 11:41:46 UTC
patch is against 2.6.21-rc3.
Comment 3 Martin 2007-03-07 12:23:17 UTC
Hi Alexey, thanks for your reply! I'll try it ASAP... But I guess it'll take a
while to test whether the bug is gone. As I said: It only occurs from time to time.
Comment 4 Alexey Starikovskiy 2007-03-07 12:33:07 UTC
I was able to make an Acer Ferrari 3200 fall into such a "broken EC" mode every time,
and this patch helps it recover from that state.
I will wait for your tests...
Comment 5 Len Brown 2007-03-07 15:20:22 UTC
patch in comment #1 applied to acpi-test
Comment 6 Martin 2007-03-09 01:06:32 UTC
I get this error compiling 2.6.21-rc3 with your patch:

  CC      drivers/acpi/ec.o
drivers/acpi/ec.c: In function 'acpi_fake_ecdt_callback':
drivers/acpi/ec.c:815: error: 'ec' undeclared (first use in this function)
drivers/acpi/ec.c:815: error: (Each undeclared identifier is reported only once
drivers/acpi/ec.c:815: error: for each function it appears in.)

What to do?
Comment 7 Alexey Starikovskiy 2007-03-09 01:29:27 UTC
change "ec" to "ec_ecdt" on this line.
Comment 8 Len Brown 2007-03-10 21:24:14 UTC
shipped in 2.6.21-rc3-git6
closed
Comment 9 Martin 2007-03-16 00:32:43 UTC
I wonder why the bug got closed although I haven't reported any success yet...

Anyway. I finally encountered the thousand-degrees bug with 2.6.20 again. So my
strategy was to test whether the patched 2.6.21-rc3 boots right after 2.6.20
shuts itself down. And it did, just like 2.6.18 does! So I guess the problem is
really fixed and the bug can really be closed ;-)

The only thing I did not do was use 2.6.21-rc3 for a longer period of time, as
it is missing some other kernel patches I cannot do without. Another thing:
while booting 2.6.18 always "cures" the problem for the following bootups of
2.6.20 (just like removing the battery does), booting 2.6.21-rc3 only fixes the
issue for the next single bootup of 2.6.20.

Thanks for the patch & your help! Now let's wait for the kernel to be released!
Martin
Comment 10 Len Brown 2007-03-16 15:56:49 UTC
Thanks for the confirmation, Martin.

Re: bug states
RESOLVED means there is a patch available to test.
CLOSED means that the patch that is thought to address
the symptom has shipped upstream.

In this case there was another machine with a similar
problem that suggested that this fix would work,
and we reviewed and discussed the patch in detail,
so I closed it when the fix shipped upstream.

If it turns out that I was premature in closing the report,
and that sometimes happens, anybody can simply re-open it
when it is confirmed that the issue is still present.
Comment 11 ykzhao 2008-09-11 04:09:45 UTC
Hi, Martin
   Can you attach the output of acpidump?
   thanks.
Comment 12 ykzhao 2008-09-11 07:17:50 UTC
Hi, Martin
    Could you please confirm whether the system works well on the latest kernel, 2.6.27-rc5?
    From the patch in comment #1 it seems that the EC status register does not reflect the correct status of the EC controller until the EC GPE interrupt is triggered. That seems odd according to the ACPI spec.
    
    Could you also confirm whether Windows works on this system? If Windows works, please attach the output of acpidump.
    Thanks.
