Bug 11634

Summary: Sometimes my laptop is dead on resume from RAM
Product: Power Management
Reporter: Romano Giannetti (romano.giannetti)
Component: Hibernation/Suspend
Assignee: Rafael J. Wysocki (rjw)
Status: CLOSED CODE_FIX
Severity: high
CC: acpi-bugzilla
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.27-rc6
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216, 11167
Attachments: lspci -vnn && lsusb -v
dmesg
lsmod
acpidump
dmesg: boot and a following successful suspend/resume cycle

Description Romano Giannetti 2008-09-24 01:12:31 UTC
Latest working kernel version: 2.6.26.2

Earliest failing kernel version: 2.6.27-rc

Distribution: Ubuntu 8.04

Hardware Environment: Toshiba Satellite U305-S5077 (lspci and boot dmesg will be attached right away)

Software Environment: doing native (by menu) suspend/resume

Problem Description: 

Sometimes resume from RAM fails, leaving a completely dead machine. Nothing works, not even SysRq-B, and nothing is left in the logs. It happens roughly once every 7-9 successful resumes. It never happened in 2.6.26 (that kernel had problems with video not resuming, but that was a different kind of problem). The problem is hard to bisect...

The main difference could be that I now use the ath5k.ko driver instead of ndiswrapper for the wireless, but I have no proof whatsoever that the wireless is the culprit. I can try running with ndiswrapper instead for a couple of days, but I'm not sure this is the correct strategy.

Any hints?
Comment 1 Romano Giannetti 2008-09-24 01:14:51 UTC
Created attachment 17996
lspci -vnn && lsusb -v
Comment 2 Romano Giannetti 2008-09-24 01:15:57 UTC
Created attachment 17997
dmesg
Comment 3 Romano Giannetti 2008-09-24 01:17:33 UTC
Created attachment 17998
lsmod
Comment 4 Rafael J. Wysocki 2008-09-24 11:01:49 UTC
Have you tried to unload the ath5k module before suspend and reload it after the resume?
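
For reference, the test suggested here could be scripted roughly as follows. This is only a sketch: it assumes root privileges and that suspend is triggered through /sys/power/state rather than the desktop menu. The names `run`, `DRY_RUN` and `suspend_without_module` are hypothetical and not part of any existing tool.

```shell
#!/bin/sh
# Sketch: unload a module, suspend to RAM, reload the module after resume.
# Assumes root. DRY_RUN, run and suspend_without_module are made-up names.

run() {
    # In dry-run mode only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

suspend_without_module() {
    module="$1"
    run modprobe -r "$module" || return 1
    # Writing "mem" to /sys/power/state requests suspend-to-RAM;
    # the write returns once the machine has resumed.
    run sh -c 'echo mem > /sys/power/state'
    status=$?
    run modprobe "$module" || return 1
    return "$status"
}
```

Calling `suspend_without_module ath5k` would then perform one cycle; setting `DRY_RUN=1` first only prints the commands.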
Comment 5 Romano Giannetti 2008-09-25 01:16:59 UTC
I am trying just that now. It has survived 5 cycles, but that does not mean anything yet.
Comment 6 ykzhao 2008-09-25 20:35:29 UTC
Could you please add the boot option "acpi_sleep=s3_beep" and check whether a beep can be heard when the system resumes?
   Please also attach the output of acpidump.
   Thanks.
Comment 7 ykzhao 2008-09-25 20:37:12 UTC
If the beep can be heard, please keep the boot option "acpi_sleep=s3_beep" in place and check whether the beep can still be heard when resume fails.
   Thanks.
Comment 8 Romano Giannetti 2008-09-27 11:35:12 UTC
OK, I will add it at the next reboot (compiling rc7 now).
I will attach the acpidump output now.
I have done around 10 cycles now with ath5k loaded, and I can't trigger it anymore.
Heisenbug?
Comment 9 Romano Giannetti 2008-09-27 11:36:24 UTC
Created attachment 18077
acpidump
Comment 10 Romano Giannetti 2008-09-27 11:37:42 UTC
By the way, I do not know if it is a *real* regression. It never happened without ath5k loaded, and this is a new driver, so...
Your call, anyway.
Comment 11 Romano Giannetti 2008-09-27 13:46:18 UTC
Booted 2.6.27-rc7 with

[    0.000000] Kernel command line: root=UUID=995a9794-b3d4-4066-b9f8-36b84d4525ff ro nosplash log_buf_len=2M acpi_sleep=s3_beep

but there is no beep on a successful resume (I do not know whether that is expected...).

I will update if I get another lockup.
Comment 12 Zhang Rui 2008-09-27 23:13:55 UTC
(In reply to comment #10)
> By the way, I do not know if it is a *real* regression. It never happened
> without ath5k loaded, and this is a new driver, so...
> 
If that's true, this is an ath5k driver bug rather than a generic suspend/resume bug.
Could you please confirm that the problem can only be reproduced when the ath5k driver is loaded?
Comment 13 Romano Giannetti 2008-09-28 04:17:22 UTC
It's quite difficult to be sure. I am running now with ath5k loaded.
The problem is that it seems independent of the number of suspend/resume cycles (doing a lot in a row doesn't trigger it); it appears randomly some time after boot.
Comment 14 Romano Giannetti 2008-10-03 05:33:25 UTC
Just to keep this up to date: I have been unable to trigger it again.

Adrian Knoth told me in private mail that he sees something similar on his HP nx6325 if he unplugs the AC adapter while suspended and resumes on battery; I tried this too, but no lockups happened.

So, I have been running the same -rc7 for almost a week without any problem. I suppose that acpi_sleep=s3_beep could not have made a difference, true?
Comment 15 Rafael J. Wysocki 2008-10-03 07:51:17 UTC
(In reply to comment #14)
> Just to keep this up to date: I have been unable to trigger it again.
> 
> So, I have been running the same -rc7 for almost a week without any problem. I
> suppose that acpi_sleep=s3_beep could not have made a difference, true?

True.

Do you think the bug can be closed, then?
Comment 16 Romano Giannetti 2008-10-03 07:54:58 UTC
Well, I suppose it can be closed as UNREPRODUCIBLE; if it happens again, I will reopen it or open another one. 
Comment 17 Romano Giannetti 2008-10-04 02:28:35 UTC
I spoke too soon: it happened again this morning. Completely dead, no ACPI beep, no SysRq keys. I *did* have ath5k loaded.
Comment 18 Rafael J. Wysocki 2008-10-04 03:04:36 UTC
Can you please apply the patch from http://marc.info/?l=linux-kernel&m=122307130419753&w=4 and see if the problem is reproducible with that?  Please also attach a dmesg with a successful suspend/resume cycle with this patch applied.
Comment 19 Romano Giannetti 2008-10-04 08:30:17 UTC
OK, I will. I can't do it right now because I won't have the laptop with me until tomorrow evening, but I will ASAP. Does the patch apply to Linus' current kernel?
Comment 20 Rafael J. Wysocki 2008-10-04 11:01:39 UTC
(In reply to comment #19)
> Does the patch apply to Linus' current kernel?

Yes, it should apply.
Comment 21 Romano Giannetti 2008-10-05 09:48:22 UTC
Patch applied on v2.6.27-rc8-84-gfec6ed1. I will attach the dmesg in a minute.
Comment 22 Romano Giannetti 2008-10-05 09:50:10 UTC
Created attachment 18174
dmesg: boot and a following successful suspend/resume cycle
Comment 23 Rafael J. Wysocki 2008-10-05 11:17:41 UTC
Well, it looks like your BIOS doesn't enable ACPI on resume, and that may lead to the problems described in your report. You need the patch anyway, so please keep it applied.

Still, if it doesn't fix your resume problems, we'll have more debugging to do.
Comment 24 Romano Giannetti 2008-10-06 01:13:48 UTC
So far, all OK. I did 4 cycles without problems.
Comment 25 Rafael J. Wysocki 2008-10-06 05:58:32 UTC
How many cycles did it take to reproduce the problem before?
Comment 26 Romano Giannetti 2008-10-06 07:12:39 UTC
Uff. It happened after one week of use, at maybe 4-5 cycles per day. But it seems to happen randomly: sometimes it triggered twice in a row, sometimes it worked OK for days... I am trying to stress it a bit today, suspending and resuming many times. So far no problem.
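
A stress run like this could be automated along the following lines. It is a rough sketch that assumes root privileges, the rtcwake utility from util-linux, and an RTC wake alarm that actually works on this laptop. `run`, `DRY_RUN` and `stress_suspend` are hypothetical names.

```shell
#!/bin/sh
# Sketch: repeat suspend-to-RAM with a timed RTC wakeup.
# Assumes root and a working RTC alarm. Names below are made up.

run() {
    # In dry-run mode only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stress_suspend() {
    cycles="$1"       # number of suspend/resume cycles to attempt
    wake_after="$2"   # seconds to stay suspended in each cycle
    i=1
    while [ "$i" -le "$cycles" ]; do
        echo "cycle $i of $cycles"
        # rtcwake programs the RTC alarm and then enters the given sleep mode.
        run rtcwake -m mem -s "$wake_after" || return 1
        # Give the system some time to settle before the next cycle.
        run sleep 10
        i=$((i + 1))
    done
}
```

For example, `stress_suspend 20 30` would attempt 20 cycles of 30 seconds each. Since the hang described here leaves the machine dead, the last cycle number printed before the hang is the only trace to expect.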
Comment 27 Rafael J. Wysocki 2008-10-09 08:50:35 UTC
OK, do I understand correctly that the patch does help?
Comment 28 Romano Giannetti 2008-10-09 22:39:59 UTC
Yes, I think it helps. Although I cannot be sure given the randomness of the bug, it has behaved correctly for more than a week now. The patch certainly does no harm, and it seems to help.
I think you can close it; if it happens again with the patch, I will reopen it.
Comment 29 Romano Giannetti 2008-10-14 08:17:58 UTC
Still OK after 9 days. Yes, I think the patch definitely helps. Please push it to -stable; if you want, you can add a

Tested-by: Romano Giannetti <romano.giannetti@gmail.com>
Comment 30 Rafael J. Wysocki 2008-10-14 12:05:02 UTC
The patch is on its way to the mainline and it will go into -stable after it's been merged.
Comment 31 Len Brown 2008-10-16 11:18:03 UTC
thanks for testing, Romano.
marking as RESOLVED, since fix is in the queue
Comment 33 Len Brown 2008-10-24 23:16:37 UTC
shipped in linux-2.6.28-rc1
closed

commit a68823ee5285e65b51ceb96f8b13a5b4f99a6888
Author: Matthew Garrett <mjg59@srcf.ucam.org>
Date:   Wed Aug 6 19:12:04 2008 +0100

    ACPI: Clear WAK_STS on resume