Latest working kernel version: 220.127.116.11-92.fc8
Earliest failing kernel version: 18.104.22.168-2.fc8
Distribution: Fedora 8
Hardware Environment: Clevo M720R notebook with Intel T8100 processor
Software Environment: i686
The contents of /proc/acpi/thermal_zone/THRM/temperature reported a temperature of 3428C. This seemed to happen shortly after the hardware active cooling trip point was reached and the fans activated (around 53C in THRM). The machine continued to operate without any problems, despite the erroneous ACPI temperature report. coretemp reported:
Adapter: ISA adapter
Core 0: +42°C (high = +100°C)
Adapter: ISA adapter
Core 1: +42°C (high = +100°C)
The following was in dmesg:
ACPI Exception (thermal-0469): AE_ERROR, ACPI thermal trip point state changed
Please send acpidump to email@example.com
ACPI: EC: acpi_ec_wait timeout, status = 0x09, event = "b0=1"
ACPI: EC: read timeout, command = 130
For this machine's acpidump, see attachment 17030 [details] for bug 11170.
> The following was in dmesg:
> ACPI Exception (thermal-0469): AE_ERROR, ACPI thermal trip point state
> Please send acpidump to firstname.lastname@example.org
We can also see this in the 2.6.24 kernel.
> ACPI: EC: acpi_ec_wait timeout, status = 0x09, event = "b0=1"
> ACPI: EC: read timeout, command = 130
This is new in 2.6.26.
Could you please verify whether the temperature is correct up until this message appears?
Please attach the full dmesg output from 2.6.26.
Created attachment 17200 [details]
Please try to override the DSDT with the one I attached.
Set CONFIG_ACPI_DEBUG and recompile the kernel.
Reboot with "acpi.debug_level=0x0f" and attach the dmesg output after the temperature becomes wrong.
Created attachment 17202 [details]
try the debug patch
Will you please try this debug patch?
After the system is booted, please cat /proc/acpi/thermal_zone/THRM/temperature and attach the output of dmesg.
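For anyone repeating this check by script rather than by hand, here is a minimal Python sketch that reads the thermal zone file and flags an implausible value such as the 3428 C seen in this report. The "temperature: NN C" file layout and the 0 to 120 C plausibility window are assumptions for illustration; neither is taken from the kernel source or from this thread.

```python
import re

# Assumed layout of the procfs file, e.g. "temperature:             53 C".
TEMP_RE = re.compile(r"temperature:\s*(-?\d+)\s*C")

# Hypothetical plausibility window in degrees Celsius; not a kernel value.
PLAUSIBLE_MIN_C = 0
PLAUSIBLE_MAX_C = 120


def parse_temperature(text):
    """Return the temperature in Celsius from procfs-style text, or None."""
    match = TEMP_RE.search(text)
    return int(match.group(1)) if match else None


def is_plausible(temp_c):
    """True if the reading falls inside the assumed plausibility window."""
    return temp_c is not None and PLAUSIBLE_MIN_C <= temp_c <= PLAUSIBLE_MAX_C


def read_zone(path="/proc/acpi/thermal_zone/THRM/temperature"):
    """Read and parse the thermal zone file named in this report."""
    with open(path) as handle:
        return parse_temperature(handle.read())


if __name__ == "__main__":
    try:
        reading = read_zone()
    except OSError as exc:
        print(f"could not read thermal zone: {exc}")
    else:
        status = "plausible" if is_plausible(reading) else "suspect"
        print(f"THRM reports {reading} C ({status})")
```

Run in a loop alongside dmesg, something like this could help narrow down when the reading first goes wrong.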
James, any updates?
Sorry I've not updated you on this. I've not seen it happen on any more recent kernels. Shall I close it unless it surfaces again?
Okay, it seems that the bug is already fixed in the latest kernel.
This seems like a rare thing, and it just came back to bite me. Seen in kernel-PAE-22.214.171.124-22.fc8: a bogus reading of 3428C, and this time it shut down the machine.
Sep 16 12:34:38 rhapsody kernel: ACPI: Critical trip point
Sep 16 12:34:38 rhapsody kernel: Critical temperature reached (3428 C), shutting down.
This happened not long after I resumed from suspend-to-RAM. If I find the time, I'll try the debug patch above.
I also see
ACPI: EC: GPE storm detected, disabling EC GPE
which wasn't present in the 2.6.24 series, plus the usual trip-point message.
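The "GPE storm" message suggests the driver saw far more EC interrupts than it expected. How the kernel actually makes that decision is not shown in this thread; the following is only a generic sliding-window sketch of the idea, with a hypothetical class name, threshold and window length.

```python
from collections import deque


class StormDetector:
    """Flag an interrupt storm when too many events land in a short window."""

    def __init__(self, max_events=5, window_s=1.0):
        # Both limits are hypothetical values chosen for illustration.
        self.max_events = max_events
        self.window_s = window_s
        self._stamps = deque()

    def record(self, now_s):
        """Record one event at time now_s; return True if a storm is detected."""
        self._stamps.append(now_s)
        # Drop events that have fallen out of the sliding window.
        while self._stamps and now_s - self._stamps[0] > self.window_s:
            self._stamps.popleft()
        return len(self._stamps) > self.max_events
```

A caller would invoke record() from its interrupt path and disable the interrupt source once it returns True.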
Thanks for your info.
From the description in comment #7 it seems that this issue is related to the EC. In the AML code the temperature is obtained by reading an EC internal register, and on your laptop there is an EC GPE storm.
Will you please try the attached four patches on the latest kernel (2.6.27-rc6) and see whether the problem still exists?
Created attachment 17820 [details]
patch 1/4: Don't issue the burst disable command if EC exits the burst mode
Created attachment 17821 [details]
Patch 2/4: Clear the query_pending bit only after processing EC notification event
Created attachment 17822 [details]
Patch 3/4: Switch to polling mode when there is no EC GPE interrupt for some EC transactions
If some EC transactions receive no EC GPE confirmation, the driver switches to polling mode. Accesses to the EC internal registers then work in polling mode, but the EC GPE remains enabled.
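The fallback described for patch 3/4 can be pictured with a toy model. This is a sketch of the general idea only, not the code in the attachment: the class, the max_missed threshold and the bounded poll loop are all hypothetical.

```python
class ToyEC:
    """Toy model of a controller that falls back from interrupts to polling."""

    def __init__(self, max_missed=1):
        # max_missed is a hypothetical threshold, not taken from the patch.
        self.max_missed = max_missed
        self.missed = 0
        self.polling = False
        self.gpe_enabled = True  # stays enabled even after the switch

    def transaction(self, interrupt_arrived, poll_status):
        """Run one transaction; return how completion was observed.

        interrupt_arrived: whether the interrupt confirmed the transaction.
        poll_status: callable returning True once the status register is ready.
        """
        if not self.polling:
            if interrupt_arrived:
                return "interrupt"
            self.missed += 1
            if self.missed >= self.max_missed:
                self.polling = True
        # Polling path: check the status register a bounded number of times.
        for _ in range(100):
            if poll_status():
                return "polled"
        return "timeout"
```

Once the model has switched, later transactions are completed by polling even if an interrupt does arrive, while gpe_enabled stays True to mirror the note that the EC GPE is still enabled.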
Created attachment 17823 [details]
patch 4/4: Add some delay in EC GPE handler to avoid EC GPE storm
Will you please try the attached patch set on the latest kernel (2.6.27-rc6) and see whether the problem still exists?
Please add the boot options "acpi.debug_layer=0x04010000 acpi.debug_level=0x17" and attach the dmesg output after the test.
Created attachment 17829 [details]
Please check whether this patch works for you. It is supposed to be a better solution to the storm problem, but your case may differ.
Do you have an opportunity to do the test as mentioned in comment #13?
yzhao, I managed to build the kernel with the patch applied, but couldn't get it to boot: it didn't find the root logical volume for some reason and stopped at switchroot with "Booting has failed". I don't know what I've done wrong yet. I'll try the patch later on a Fedora development kernel and see if that works.
Maybe you should use the same .config file as the 126.96.36.199 Fedora kernel.
Created attachment 18046 [details]
patch vs 2.6.27-rc7
This version of Alexey's fast transaction patch has been checked into the acpi-test tree.
Please let us know if you have any troubles with it.
shipped in linux-2.6.28-rc1
Author: Alexey Starikovskiy <email@example.com>
Date: Thu Sep 25 21:00:31 2008 +0400
ACPI: EC: do transaction from interrupt context
Sorry I've not been able to provide further info over the past few weeks --- I'll try giving a 2.6.28-series kernel a go and check that this problem has been fixed.