Kernel Bug Tracker – Bug 9041
kernel acpi reads wrong temperature - critical shutdown
Last modified: 2007-12-14 03:14:49 UTC
Most recent kernel where this bug did not occur: unknown
Hardware Environment: https://bugzilla.novell.com/attachment.cgi?id=159317
Software Environment: openSUSE 10.2
My system regularly shuts down on a "wrong" temperature alert from ACPI:
the trip point is reached by a bogus value from ACPI.
Sep 19 23:27:44 zion kernel: ACPI: Critical trip point
Sep 19 23:27:44 zion kernel: Critical temperature reached (91 C), shutting down.
Sep 19 23:27:44 zion kernel: ACPI: Unable to turn cooling device [dffecd88] 'on'
Sep 19 23:27:44 zion shutdown: shutting down for system halt
Sep 19 23:27:44 zion powersaved: WARNING (checkTemperatureStateChanges:218) Temperature state changed to c
Sep 19 23:27:46 zion init: Switching to runlevel: 0
Sep 19 23:27:50 zion kernel: Critical temperature reached (43 C), shutting down.
Thanks for the support.
Steps to reproduce: wait for it to happen
https://bugzilla.novell.com/show_bug.cgi?id=259992 has more history
The trip point is set to 60°C, and I have never reached that value. Six seconds later the value is read correctly, but since powersaved (or whatever else is involved) has no margin for retrying the ACPI read, the system is doomed to shut down.
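The "retry margin" described above can be sketched as follows. This is only an illustration of the idea, not code from the kernel or from powersaved, and the developers quoted in this report consider tolerating bogus values the wrong fix. The function name `confirm_critical`, its parameters, and the retry count and delay are all hypothetical.

```python
import time
from typing import Callable


def confirm_critical(
    read_temp_c: Callable[[], float],
    trip_c: float,
    retries: int = 3,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True only if the first reading and every re-read are at or
    above the trip point. All names and defaults here are hypothetical."""
    if read_temp_c() < trip_c:
        return False
    for _ in range(retries):
        sleep(delay_s)
        if read_temp_c() < trip_c:
            # A later reading is below the trip point, so the earlier
            # one is treated as a transient glitch.
            return False
    return True


if __name__ == "__main__":
    # Readings taken from the log in this report: 91 C, then 43 C.
    readings = iter([91.0, 43.0])
    # The sleep is replaced by a no-op so the example runs instantly.
    print(confirm_critical(lambda: next(readings), trip_c=60.0,
                           sleep=lambda s: None))  # prints False
```

With the two readings from the log, the second read falls below the 60°C trip point, so the sketch would not treat the 91 C value as a real critical event.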
Developers mentioned that it is no good to recode routines to handle bogus values. Is this a bug in the kernel chipset driver?
I also reported this to my BIOS supplier. They (of course) refused support, saying they have a magnificent BIOS without errors (it is only release 12).
There seem to be some areas in the driver code that are incomplete.
My system: AMD 3800 X2 on an RS480 mainboard (Shuttle ST20G5). The chipset module is "it87". Disabling thermal monitoring by stopping lm_sensors kind of defeats the purpose, so I think it must be fixed in the driver.
I think there are several issues with power saving on this hardware. The WHITE SCREEN crash ( http://forums.suselinuxsupport.de/index.php?showtopic=36370 ) also happens on this hardware, but since no mere mortal has access to the FGLRX source code, debugging it does not look easy.
I want to get rid of this issue. Please comment if you need more debug information and I will post it. Just tell me how I can help :-)
Did you have a chance to try a newer kernel, such as a recent 2.6.24+? This kernel version is definitely too old.
lm_sensors and ACPI use the same hardware for getting temperature and are not able
to coexist on many systems, yours included.
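A quick way to see whether the conflicting sensor driver is active is to look for the module named in this report ("it87") in the loaded-module list. The sketch below is only an illustration: `module_loaded` is a hypothetical helper, and the sample text is made up to follow the /proc/modules layout, where the module name is the first field of each line.

```python
def module_loaded(name: str, proc_modules_text: str) -> bool:
    """Return True if `name` is the first field of any line in the
    given /proc/modules-style text. Hypothetical helper."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    # Illustrative sample only; on a Linux system the text would come
    # from open("/proc/modules").read().
    sample = (
        "it87 23456 0 - Live 0x00000000\n"
        "thermal 12345 0 - Live 0x00000000\n"
    )
    print(module_loaded("it87", sample))
```

If the module is loaded while the ACPI thermal zone is also in use, both are reading the same hardware, which is the situation described above.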