Bug 53101 - Thinkpad T410 CPU overheat/emergency shutdown after suspend/resume cycle
Summary: Thinkpad T410 CPU overheat/emergency shutdown after suspend/resume cycle
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: All Linux
Importance: P1 normal
Assignee: Rafael J. Wysocki
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-01-27 14:32 UTC by Florian Lohoff
Modified: 2013-09-23 05:35 UTC
CC List: 3 users

See Also:
Kernel Version: 3.2.32-1~bpo60+1 and 3.7.3-1~experimental.1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Boot log - suspend/resume cycle and shutdown - next boot. (304.83 KB, text/plain)
2013-01-27 14:34 UTC, Florian Lohoff
acpidump output (320.92 KB, text/plain)
2013-01-28 09:13 UTC, Florian Lohoff
turbostat -v output - 8 parallel gzips (25.77 KB, application/octet-stream)
2013-01-29 09:19 UTC, Florian Lohoff
grep from kernel log resumes/shutdowns/kernel versions (13.13 KB, application/octet-stream)
2013-01-29 09:40 UTC, Florian Lohoff

Description Florian Lohoff 2013-01-27 14:32:54 UTC
In about 1 in 10 suspend/resume cycles, the T410 shuts down after a short period of use because the CPU overheats. The fan stays off or runs at very low RPM.

I first observed this with a 3.2 kernel (Debian) and then tried 3.7.3 from experimental, which shows the same behaviour.


	Jan 25 10:12:38 p2 kernel: [64778.121330] Critical temperature reached (128 C), shutting down.
	Jan 25 10:12:38 p2 kernel: [64778.125309] Critical temperature reached (128 C), shutting down.

Debian Bug: http://bugs.debian.org/698917

This is the log from the initial boot, two suspend/resume cycles, the emergency shutdown, and the next boot for 3.2.35.

** Model information
sys_vendor: LENOVO
product_name: 25379UG
product_version: ThinkPad T410
chassis_vendor: LENOVO
chassis_version: Not Available
bios_vendor: LENOVO
bios_version: 6IET74WW (1.34 )
board_vendor: LENOVO
board_name: 25379UG
board_version: Not Available

Flo
Comment 1 Florian Lohoff 2013-01-27 14:34:16 UTC
Created attachment 91961
Boot log - suspend/resume cycle and shutdown - next boot.
Comment 2 Jonathan Nieder 2013-01-28 09:09:41 UTC
Please attach full "acpidump" output.
Comment 3 Florian Lohoff 2013-01-28 09:13:39 UTC
Created attachment 92031
acpidump output
Comment 4 Len Brown 2013-01-29 04:17:46 UTC
This may be related to bug 45291.
Comment 5 Len Brown 2013-01-29 04:19:50 UTC
Can you reproduce the failure without using suspend?

E.g., run a few copies of a cycle soaker:
cat /dev/zero > /dev/null &

and monitor the frequency and temperature with turbostat
(get the latest version from the kernel source tree to see the temperature).
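
A minimal sketch of such a load test, assuming a POSIX shell. The names start_soakers, stop_soakers, and SOAKER_PIDS are hypothetical helpers introduced for illustration, not part of turbostat or any tool mentioned here. It starts N copies of the soaker in the background, remembers their PIDs, and kills them again so the load does not outlive the test.

```shell
# Hypothetical helpers: run N background cycle soakers and stop them again.
SOAKER_PIDS=""

start_soakers() {
    n="${1:-1}"                      # number of soakers, default 1
    i=0
    while [ "$i" -lt "$n" ]; do
        cat /dev/zero > /dev/null &  # the soaker suggested in this comment
        SOAKER_PIDS="$SOAKER_PIDS $!"
        i=$((i + 1))
    done
}

stop_soakers() {
    for pid in $SOAKER_PIDS; do
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true   # reap the job quietly
    done
    SOAKER_PIDS=""
}
```

For example, start_soakers 8 would load a machine with 8 logical CPUs; run turbostat in a second terminal while the load is active and call stop_soakers afterwards.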
Comment 6 Florian Lohoff 2013-01-29 09:02:26 UTC
No, it does not trigger after a normal boot.

I have been working on that machine for about 1 1/2 years now without a problem, compiling stuff all day. It started after installing a backports kernel.

After installing 3.7.3 I tried to reproduce it quickly by running something like the above, gzip -c </dev/zero >/dev/null, in a couple of screen sessions, without success, even after a resume.

I was running 

while true; do egrep . /sys/class/thermal/thermal_zone0/t*; echo; echo; sleep 10; done

in another session and could see the temperature go up to 80-85°C, but no higher. Then I killed all my experiments and decided 3.7.3 was okay. The next morning I resumed, and after 50 minutes the notebook died at 128°C.

I had the feeling that polling (reading) the files in /sys/class/thermal/thermal_zone0/ cured the problem.
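
A variant of the polling loop above that prints one line per thermal zone could look like the following sketch. The function name read_zone_temps and its optional root-directory argument are assumptions made for illustration; the only kernel interface relied on is the sysfs temp file, which reports millidegrees Celsius.

```shell
# Hypothetical helper: print "<zone> <degrees C>" for every thermal zone.
# The optional first argument overrides the sysfs root (handy for testing).
read_zone_temps() {
    root="${1:-/sys/class/thermal}"
    for f in "$root"/thermal_zone*/temp; do
        [ -r "$f" ] || continue          # skip if no zone matched the glob
        read -r milli < "$f"             # value is in millidegrees Celsius
        zone=$(basename "$(dirname "$f")")
        printf '%s %d\n' "$zone" "$((milli / 1000))"
    done
}
```

It can be wrapped in the same kind of loop, e.g. while true; do date; read_zone_temps; sleep 10; done >> thermal.log. Since this also reads the thermal zone files, it may hide the problem in the same way the original loop seemed to.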
Comment 7 Florian Lohoff 2013-01-29 09:19:10 UTC
Created attachment 92101
turbostat -v output - 8 parallel gzips

Here is the turbostat output while running 8 parallel gzips in the background. The temperature goes up to 96°C and the CPU throttles.

I'll play around with it a bit ...
Comment 8 Florian Lohoff 2013-01-29 09:40:50 UTC
Created attachment 92121
grep from kernel log resumes/shutdowns/kernel versions

Short overview of the resumes/boots/shutdowns since the end of December. It starts with 3.2.35, which already had this problem.

Produced with:

(zcat kern.log.4.gz kern.log.3.gz kern.log.2.gz ; cat kern.log.1 kern.log) | egrep -i "Linux version |temperature|PM: Preparing system for mem sleep"
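
To turn that grep output into counts, for instance to check the "about 1 in 10" estimate, a sketch along these lines could be used. The function name summarize_events is hypothetical, and unlike the egrep -i above, the awk patterns are case-sensitive. Note that the critical-temperature message can be logged more than once per shutdown, so the last figure counts log lines, not shutdowns.

```shell
# Hypothetical helper: count boots, suspends, and critical-temperature
# log lines in kernel log text read from standard input.
summarize_events() {
    awk '
        /Linux version /                     { boots++ }
        /PM: Preparing system for mem sleep/ { suspends++ }
        /Critical temperature reached/       { criticals++ }
        END {
            printf "boots=%d suspends=%d critical_lines=%d\n",
                   boots, suspends, criticals
        }
    '
}
```

Usage: pipe the same concatenated logs into it, e.g. (zcat kern.log.4.gz kern.log.3.gz kern.log.2.gz ; cat kern.log.1 kern.log) | summarize_events.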
Comment 9 Aaron Lu 2013-07-12 02:09:08 UTC
Is this problem still present on the latest upstream kernel?
