Bug 12998
Summary: | S3 suspend: power shuts off completely every 20 or so suspends (T60) | ||
---|---|---|---|
Product: | ACPI | Reporter: | Sanjoy Mahajan (sanjoy) |
Component: | Power-Sleep-Wake | Assignee: | acpi_power-sleep-wake |
Status: | REJECTED INVALID | ||
Severity: | normal | CC: | rjw, rui.zhang, yakui.zhao |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.29 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 7216 | ||
Attachments: | acpidump output; dmesg from current bootup (from /var/log/dmesg); /var/log/messages showing a few S3 suspend/resumes and a reboot; dmesg showing several successful suspend/resume cycles; lspci -vxxx output; dmesg_after after resuming (booted with acpi_sleep=s3_beep); final dmesg (all worked) |
Created attachment 20774 [details]
dmesg from current bootup (from /var/log/dmesg)
I put the machine into suspend using just echo mem > /sys/power/state (i.e. not with s2ram or acpid scripts) and wake it up by opening the lid or pushing the Fn key if the lid is already open.

please attach the dmesg output after a successful suspend/resume cycle.
does this problem exist in hibernation?

> Every 20 or S3 suspends, the machine shuts completely off:
every 20 what?
> the crescent-moon LED turns on,
how many LEDs are on in all?
> I had noticed this problem with vanilla 2.6.27.4, and upgraded to 2.6.29 in
> the hope that it would go away. But if anything it has become more frequent.
what do you mean more frequent? This bug is not always reproducible in every S3, is it?
I suspect this may be a thermal problem, please attach the /var/log/messages file.
and make sure there is no fan spinning when the laptop is suspended.

Will you please attach the output of lspci -vxxx?

Will you please do the following test and see whether the box can be resumed by power button?
a. boot the system with the boot option of "acpi_sleep=s3_beep"
b. kill the process using the /proc/acpi/event (use the command of "lsof /proc/acpi/event" to get the process)
c. echo mem > /sys/power/state; dmesg >dmesg_after;
d. after the system enters the suspended state, press the power button and see whether the box can be resumed. Of course please confirm whether the beep can be heard.
e. If the box can't be resumed, please reboot the system and check whether there exists the file of dmesg_after. If it exists, please attach it.
Thanks.

Created attachment 20777 [details]
/var/log/messages showing a few S3 suspend/resumes and a reboot
>> Every 20 or S3 suspends, the machine shuts completely off:
> every 20 what?
Sorry, I meant to type "every 20 or so S3 suspends...".
>> the crescent-moon LED turns on,
> how many LEDs are on in all?
If I have the AC plugged in while suspending, all three LEDs (on the outside of the case) are on, including the crescent moon. As soon as I unplug the AC, only the crescent moon remains on.
>> I had noticed this problem with vanilla 2.6.27.4, and upgraded to
>> 2.6.29 in the hope that it would go away. But if anything it has
>> become more frequent.
> what do you mean more frequent?
With 2.6.27.4, it happened maybe once per 50 suspends; with 2.6.29 it happens more frequently, maybe once per 20 suspends.
> This bug is not always reproducible in every S3, is it?
No, unfortunately. It happens about once every 20 cycles. It seems more likely to happen if I have it suspended for longer. I've never seen it happen with a 5- or 10-minute suspend; only with suspends lasting, say, an hour or longer.
> I suspect this may be a thermal problem, please attach the
> /var/log/messages file.
I will attach a current /var/log/messages showing a few suspend/resume cycles as well as a reboot (I think I rebooted due to one of these resume failures).
> and make sure there is no fan spinning when the laptop is suspended.
The fan is almost never spinning when I suspend; but when it is, it always shuts off during the suspend (as far as I remember). But I'll listen carefully from now on to be sure. If the fan is spinning, I cannot turn it off by hand because the T60 fan is controlled only by the BIOS. There is nothing under /proc/acpi/fan/ for example (there are two thermal zones under /proc/acpi/thermal_zone/).

Created attachment 20778 [details]
dmesg showing several successful suspend/resume cycles
> please attach the dmesg output after a successful suspend/resume
> cycle.
I've just attached it.
> does this problem exist in hibernation?
I haven't ever tested hibernation (I use only S3 suspend/resume).

Created attachment 20779 [details]
lspci -vxxx output
> Will you please attach the output of lspci -vxxx?
Attached.
I'll next try the test you suggest.
Created attachment 20781 [details]
dmesg_after after resuming (booted with acpi_sleep=s3_beep)
> Will you please do the following test and see whether the box can be
> resumed by power button?
I did that test. acpid was using /proc/acpi/event, so I killed it. I
heard the beep during the suspend. It resumed using the power button,
making several beeps in the process. For completeness, I've attached
the dmesg_after file.
please
1. set CONFIG_PM_DEBUG and rebuild
2. echo core > /sys/power/pm_test
3. echo mem > /sys/power/state
4. run this test for 50 times and see if the problem is reproducible.

> 4. run this test for 50 times and see if the problem is reproducible.
After recompiling and "echo core > /sys/power/pm_test", I did
for n in `seq 50`; do
echo ==== RUN $n ====;
echo mem > /sys/power/state
sleep 2
done
and all 50 suspend/resume cycles worked fine.
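Before a run like the loop above, it can help to confirm which pm_test mode is actually active, since with "core" selected the kernel only exercises part of the suspend path. A minimal sketch, assuming /sys/power/pm_test lists the available modes on one line with the active one in square brackets; the function name and the optional file argument are hypothetical, added only so the parsing can be tried on a sample file:

```shell
#!/bin/sh
# Print the active pm_test mode, i.e. the entry shown in square brackets.
# The optional argument (a hypothetical convenience) points at an
# alternative file, for machines without a CONFIG_PM_DEBUG kernel.
current_pm_test_mode() {
    file=${1:-/sys/power/pm_test}
    # e.g. "none [core] processors platform devices freezer" -> "core"
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}
```

Calling current_pm_test_mode with no argument before the loop and echoing the result into the run log records which mode the 50 cycles were run under.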
what about this test:
"echo none > /sys/power/pm_test"
for n in `seq 50`; do
dmesg -c
echo mem > /sys/power/state
dmesg > dmesg-$n
sleep 2
done
when the S3 fails, please attach the latest dmesg output.

Created attachment 20938 [details]
final dmesg (all worked)
I ran that test, and (unfortunately) it suspended and resumed fine all 50 times. For completeness, here is the last dmesg.
Yesterday the bug repeated itself with vanilla 2.6.29 (without PM_DEBUG), but there was nothing useful in the logs.
If I keep running the PM_DEBUG kernel, is there a better chance of finding something in the logs if the problem recurs?
(In reply to comment #17)
> Created an attachment (id=20938) [details]
> final dmesg (all worked)
>
> I ran that test, and (unfortunately) it suspended and resumed fine all 50
> times. For completeness, here is the last dmesg.
>
> Yesterday the bug repeated itself with vanilla 2.6.29 (without PM_DEBUG), but
> there was nothing useful in the logs.
>
so you get that dmesg output and there is nothing abnormal? will you please attach that dmesg output please?
> If I keep running the PM_DEBUG kernel, is there a better chance of finding
> something in the logs if the problem recurs?
I don't think so, unless you can reproduce this bug in PM_DEBUG kernel.

> so you get that dmesg output and there is nothing abnormal? will you
> please attach that dmesg output please?
(It wasn't a PM_DEBUG kernel.) Because the system crashed, I had to reboot, which reset the kernel ring buffer. And because it crashed before resuming, there was also nothing from the suspend (or the resume) in the syslog.
> I don't think so, unless you can reproduce this bug in PM_DEBUG
> kernel.
I'll keep running with this PM_DEBUG kernel and see what happens if the bug reproduces itself in ordinary use (as it has in the past).

okay, but I think you'd better use a script to save the dmesg before suspend every time when you want to do a S3. :)

> okay, but I think you'd better use a script to save the dmesg before suspend
> every time when you want to do a S3.
Good idea. But if I use a script like
dmesg > /root/s3hang/before.dmesg
echo mem > /sys/power/state
how will it work? The before.dmesg file will contain all dmesgs, but
only before the S3 suspend started. The messages generated during the
suspend won't be saved anywhere. Or is there a way to save those too?
The obvious solutions like
dmesg > /root/s3hang/before.dmesg
echo mem > /sys/power/state
dmesg > /root/s3hang/after.dmesg
won't work since the second dmesg won't run until resume, and even then,
it won't save the file until a sync.
Or am I missing a clever method?
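One partial answer to the sync concern, sketched under assumptions: the "before" snapshot at least can be forced onto disk by calling sync explicitly before the suspend is triggered, so it survives a later power-off. This does not capture messages generated during the suspend itself. The function name, the DMESG_CMD override and the timestamped file name are hypothetical, introduced only for illustration:

```shell
#!/bin/sh
# Save the current kernel log to a timestamped file and flush it to disk.
# $1: directory to store the snapshot in (e.g. /root/s3hang)
# DMESG_CMD: hypothetical override for the log source; defaults to dmesg.
snapshot_dmesg() {
    log_dir=$1
    mkdir -p "$log_dir" || return 1
    out="$log_dir/before-$(date +%Y%m%d-%H%M%S).dmesg"
    ${DMESG_CMD:-dmesg} > "$out" || return 1
    # Force buffered writes out so the file survives a crash during suspend.
    sync
    echo "$out"
}

# Usage (as root):
#   snapshot_dmesg /root/s3hang && echo mem > /sys/power/state
```

Keeping one file per suspend means that after a failed resume the most recent file is the state just before the failing cycle.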
well, it's difficult to get the dmesg output when system hangs during suspend. As this bug is not reproducible in every S3, what we can do is to see if there is anything abnormal before the failed suspend, e.g. any device works in an incorrect state before suspend, etc.

ping Sanjoy

I've been running the 2.6.29 kernel w/ PM_DEBUG since my last report. My current theory is that the laptop gets squeezed by books in my backpack, pushing the power button and shutting off the machine. I've therefore been careful over the last few weeks to stand the backpack upright in order to reduce the chance of that happening. Since starting that habit, I haven't been able to reproduce the problem. Which means it may be my fault to begin with, and not an ACPI or even a hardware problem (except that the T60 lid isn't as sturdy as I would like). Sorry for the likely noise.

hah, thanks for finding out the ROOT CAUSE of the problem. :p
close this bug. |
Created attachment 20773 [details]
acpidump output
Every 20 or S3 suspends, the machine shuts completely off: It goes to sleep fine, the crescent-moon LED turns on, I put it in my backpack to go home, and then a few hours later when I open the lid to wake it up, nothing happens. Whereupon I notice that the crescent-moon LED was not on anymore, and the machine needs to be rebooted.
My first theory was that the battery ran out. But that was never the case. Most always the battery (checked after rebooting) is around 90% or 95%. I don't know how, but maybe it oopsed a while after going to sleep? There's nothing in the log files, of course, so it's been hard to debug.
The hardware is a Thinkpad T60 with Intel graphics, wireless (untainted kernel). The machine runs Debian unstable but with the vanilla 2.6.29 kernel. I had noticed this problem with vanilla 2.6.27.4, and upgraded to 2.6.29 in the hope that it would go away. But if anything it has become more frequent.