Latest working kernel version: 2.6.24.7-92.fc8
Earliest failing kernel version: 2.6.26.2-2.fc8
Distribution: Fedora 8
Hardware Environment: Clevo M720R notebook with Intel T8100 processor
Software Environment: i686

Problem Description:
The contents of /proc/acpi/thermal_zone/THRM/temperature reported a temperature of 3428C. This seemed to happen shortly after the hardware active cooling trip-point was reached and the fans activated (around 53C in THRM). The machine continued to operate without any problems, despite the erroneous ACPI temperature report.

coretemp reported:

$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +42°C (high = +100°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1: +42°C (high = +100°C)

The following was in dmesg:

ACPI Exception (thermal-0469): AE_ERROR, ACPI thermal trip point state changed Please send acpidump to linux-acpi@vger.kernel.org [20080321]
ACPI: EC: acpi_ec_wait timeout, status = 0x09, event = "b0=1"
ACPI: EC: read timeout, command = 130

For this machine's acpidump, see attachment 17030 [details] for bug 11170.
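A reading like the one above can be caught with a small sanity check. The sketch below is illustrative only: it assumes the procfs file holds a single line of the form "temperature: 53 C", and the plausibility bound PLAUSIBLE_MAX_C is a hypothetical value chosen for the example, not something taken from this report or from the kernel.

```python
import re

# Hypothetical upper bound for a believable reading (an assumption for
# this example, not a value from the report or the kernel).
PLAUSIBLE_MAX_C = 150


def parse_acpi_temperature(text):
    """Extract the integer Celsius value from text such as 'temperature:   53 C'."""
    match = re.search(r"temperature:\s*(-?\d+)\s*C", text)
    if match is None:
        raise ValueError("unrecognised temperature format: %r" % text)
    return int(match.group(1))


def is_plausible(celsius, low=0, high=PLAUSIBLE_MAX_C):
    """Return True when the reading lies inside the assumed believable range."""
    return low <= celsius <= high
```

On an affected machine, the text passed to parse_acpi_temperature would be the contents of /proc/acpi/thermal_zone/THRM/temperature, and a False result from is_plausible would be the cue to capture dmesg right away.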
> The following was in dmesg:
> ACPI Exception (thermal-0469): AE_ERROR, ACPI thermal trip point state
> changed
> Please send acpidump to linux-acpi@vger.kernel.org
> [20080321]

We can also see this in the 2.6.24 kernel.

> ACPI: EC: acpi_ec_wait timeout, status = 0x09, event = "b0=1"
> ACPI: EC: read timeout, command = 130

This is new in 2.6.26.
Could you please verify whether the temperature is correct up until this message appears? Please also attach the full dmesg output of 2.6.26.
Created attachment 17200 [details]
customized DSDT

Please try to override the DSDT with the one I attached. Set CONFIG_ACPI_DEBUG and recompile the kernel. Reboot with "acpi.debug_level=0x0f" and attach the dmesg output after the temperature becomes wrong.
Created attachment 17202 [details]
try the debug patch

Will you please try this debug patch? After the system has booted, please cat /proc/acpi/thermal_zone/THRM/temperature and attach the output of dmesg.
James, any updates?
Sorry I've not updated you on this. I've not seen it happen on any more recent kernels. Shall I close it unless it surfaces again?
Okay. It seems that the bug is already fixed in the latest kernel.
This seems like a rare thing, and it just came back to bite me. Seen in kernel-PAE-2.6.26.5-22.fc8: a bogus reading of 3428C, and this time it shut down the machine.

Sep 16 12:34:38 rhapsody kernel: ACPI: Critical trip point
Sep 16 12:34:38 rhapsody kernel: Critical temperature reached (3428 C), shutting down.

This happened not long after I resumed from suspend-to-RAM. If I find the time, I'll try the debug patch above. I also see

ACPI: EC: GPE storm detected, disabling EC GPE

which wasn't present in the 2.6.24 series, plus the usual trip-point message.
Hi, James
Thanks for your info.
From the description in comment #7 it seems that this issue is related to the EC. In the AML code the temperature is obtained by reading an EC internal register, and an EC GPE storm occurs on your laptop.
Will you please try the attached four patches on the latest kernel (2.6.27-rc6) and see whether the problem still exists?
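To show how a bad EC read could turn into a figure like 3428 C, here is a minimal sketch of the unit conversion. It assumes the thermal zone follows the ACPI convention of reporting temperature in tenths of a kelvin, and it uses a round-to-nearest rule modelled on the kernel's conversion of that era; the exact rounding is an assumption, and the function name is made up for this example. The thread does not state what raw value the EC actually returned.

```python
def decikelvin_to_celsius(raw):
    """Convert a reading in tenths of a kelvin to whole degrees Celsius.

    Rounds to the nearest degree, away from zero on ties (assumed rule).
    """
    offset = raw - 2732  # 273.2 K expressed in tenths of a kelvin
    if offset >= 0:
        return (offset + 5) // 10
    return -((-offset + 5) // 10)


# A sane raw value near the reported active trip point:
print(decikelvin_to_celsius(3262))   # 53
# A raw value of 37012 would yield the bogus figure seen in this bug:
print(decikelvin_to_celsius(37012))  # 3428
```

Under these assumptions, a reported 3428 C corresponds to a raw value of about 37012, far outside anything a healthy sensor would return. That fits a failed or timed-out EC read better than real heating.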
Created attachment 17820 [details] patch 1/4: Don't issue the burst disable command if EC exits the burst mode
Created attachment 17821 [details] Patch 2/4: Clear the query_pending bit only after processing EC notification event
Created attachment 17822 [details]
Patch 3/4: Switch to polling mode when there is no EC GPE interrupt for some EC transactions

If some EC transactions receive no EC GPE confirmation, the EC driver switches to polling mode. From then on, accesses to the EC internal registers work in polling mode, but the EC GPE remains enabled.
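The fallback described in patch 3/4 can be pictured with a toy model. This is not the patch's code and not the kernel's EC driver: the class name, the missed_limit threshold, and the poll-count parameters are all hypothetical, chosen only to illustrate switching from interrupt-driven to polled transactions while leaving the GPE enabled.

```python
class EcTransactionModel:
    """Toy model of an EC that falls back to polling (illustrative only)."""

    def __init__(self, missed_limit=1):
        self.mode = "interrupt"       # start out interrupt (GPE) driven
        self.gpe_enabled = True       # the GPE is never disabled in this model
        self.missed = 0               # transactions with no GPE confirmation
        self.missed_limit = missed_limit  # hypothetical threshold

    def transact(self, gpe_confirmed, status_ready_after_polls=1, max_polls=10):
        """Run one transaction and report how it completed."""
        if self.mode == "interrupt":
            if gpe_confirmed:
                return "completed via GPE"
            # No confirmation arrived: count it and maybe switch modes.
            self.missed += 1
            if self.missed >= self.missed_limit:
                self.mode = "polling"
        # Poll the (simulated) status register until it reports ready.
        for attempt in range(1, max_polls + 1):
            if attempt >= status_ready_after_polls:
                return "completed via polling"
        return "timeout"


ec = EcTransactionModel()
print(ec.transact(gpe_confirmed=True))   # completed via GPE
print(ec.transact(gpe_confirmed=False))  # completed via polling
print(ec.mode, ec.gpe_enabled)           # polling True
```

Once the model has switched, later transactions are polled even if a GPE would have arrived, matching the comment's statement that register accesses then work in polling mode while the GPE stays enabled.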
Created attachment 17823 [details] patch 4/4: Add some delay in EC GPE handler to avoid EC GPE storm
Hi, James
Will you please try the attached patch set on the latest kernel (2.6.27-rc6) and see whether the problem still exists?
Please add the boot options "acpi.debug_layer=0x04010000 acpi.debug_level=0x17" and attach the output of dmesg after the test.
Thanks.
Created attachment 17829 [details]
fast transaction

Hi, please check whether this patch works for you. It is supposed to be a better solution to the storm problem, but your case may differ.
Hi, James
Have you had an opportunity to run the test mentioned in comment #13?
Thanks.
yzhao, I managed to build the kernel with the patch applied, but couldn't get it to boot (it didn't find the root logical volume for some reason, stopped at switchroot with "Booting has failed"). I don't know what I've done wrong yet, I'll try the patch later on a Fedora development kernel and see if that works.
Hi, James
Maybe you should use the same .config file as the 2.6.26.2 Fedora kernel.
Thanks.
Created attachment 18046 [details] patch vs 2.6.27-rc7 This version of Alexey's fast transaction patch has been checked into the acpi-test tree. Please let us know if you have any troubles with it. thanks, -Len
shipped in linux-2.6.28-rc1
closed

commit 7c6db4e050601f359081fde418ca6dc4fc2d0011
Author: Alexey Starikovskiy <astarikovskiy@suse.de>
Date: Thu Sep 25 21:00:31 2008 +0400

    ACPI: EC: do transaction from interrupt context
Sorry I've not been able to provide further info over the past few weeks --- I'll try giving a 2.6.28-series kernel a go and check that this problem has been fixed.