Bug 5555
Summary: | suspend/resume unstable between 2.6.11 and 2.6.12/13/14 | ||
---|---|---|---|
Product: | ACPI | Reporter: | steve (stevenm) |
Component: | Power-Sleep-Wake | Assignee: | Shaohua (shaohua.li) |
Status: | REJECTED INSUFFICIENT_DATA | ||
Severity: | high | CC: | acpi-bugzilla, kernel, sziwan |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.11 upgraded to 2.6.12 2.6.13 2.6.14 2.6.15 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 7216 | ||
Attachments: |
debug
lspci output, 2.6.11, after bootup
lspci output, 2.6.11, after first suspend/resume cycle
lspci output, 2.6.15, after bootup
lspci output, 2.6.15, after first suspend/resume cycle
lspci output, 2.6.15, right before a bad suspend 'cycle' |
Description
steve
2005-11-05 09:09:49 UTC
Since I don't have your laptop, I can't reproduce it here. Could you please narrow the problem down? Between 2.6.11 and 2.6.12 there are six -rc releases (2.6.12-rc1 to 2.6.12-rc6). You can get the patches from http://www.kernel.org/pub/linux/kernel/v2.6/testing/. Could you please figure out which one is the first release that breaks your system? Thanks!

All right. Given the intermittent nature of this bug, I'll test all of those. It usually happens once every few days, so it will probably take a while. I just downloaded RC1 and will try building it. Hopefully we'll find out soon enough...

I think I'm hitting this too: the same symptoms and completely erratic behaviour. I had 2.6.12-rc6 going through literally hundreds of suspends (stress testing for this specific issue, with or without X, USB, C3, etc.) and then failing. I tested every -bk snapshot from 2.6.12-rc4 (which, at that time, seemed stable) to 2.6.12 final, only to conclude that no kernel version (including those older than 2.6.11) has ever been 100% stable on my machine. Some versions are better (they last hundreds of suspends), some worse (15% of resumes fail), and sometimes a new build of the same version behaves differently from the previous one. I finally gave up and moved to 2.6.15-rc, which still seems to exhibit this problem. I'm getting the impression that the bug is obscurely related to the compiler version, code alignment, or something similar. For the record, I'm using gcc 3.3.6, but I remember that 3.4 didn't really help much. I also need acpi_sleep=s3_bios for my backlight to work.

David: any idea on how to debug this? Serial console? Is this likely a hardware problem?

Created attachment 6935 [details]
debug
So both of your issues are S3 stress-test failures. The attached patch emulates an S3
process. Let's see whether it passes your stress test.
You might also check whether the lspci -vv output is significantly different before
suspend and after resume in a real S3 cycle.
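
The before/after comparison suggested above can be sketched roughly as follows. This is only an illustration, assuming the two `lspci -vv` dumps have already been saved as text; the function name `changed_lines` is made up for this example and is not part of any tool mentioned in this report.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the lines that differ between two saved `lspci -vv` dumps.

    Lines prefixed with "- " exist only in the before-suspend dump,
    lines prefixed with "+ " exist only in the after-resume dump.
    (Hypothetical helper, named for this sketch only.)
    """
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? ");
    # keep just the real removals and additions.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Example use with two files saved earlier (file names are placeholders):
#   with open("before.txt") as b, open("after.txt") as a:
#       for line in changed_lines(b.read(), a.read()):
#           print(line)
```

An empty result means the PCI configuration looks identical before suspend and after resume, at least as far as `lspci -vv` reports it.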
Hello again. I switched to kernel 2.6.15 and this is still happening. The first resume freeze happened 2 days after making the switch. Again, 2.6.11 works just fine, for months without errors. I will try to get a serial console going to see whether it produces any output, but I do not even know what to look for. Is anyone still working on this? Any ideas why this happens?

Created attachment 7514 [details]
lspci output, 2.6.11, after bootup
Created attachment 7515 [details]
lspci output, 2.6.11, after first suspend/resume cycle
Created attachment 7516 [details]
lspci output, 2.6.15, after bootup
Created attachment 7517 [details]
lspci output, 2.6.15, after first suspend/resume cycle
I know this is hard, but can you give me the lspci -vv output from just before the failed suspend/resume cycle? We got several failure reports (two IIRC) caused by 2.6.11 - 2.6.12 changes, but I looked at the changesets and there aren't any significant suspend/resume changes.

Will do. This will take time, and hopefully I will never have to post it! I will add a command to echo the output of lspci -vv to a file and sync right before suspending. Next time it fails, I will post the output.

It's been about a week and sure enough, when I opened my laptop this morning, I was greeted by a locked-up system. I open it up, the HD spins up, and nothing else happens. The LCD backlight does not turn on, and there is no HD activity during resume (usually the light blinks a few times during resume). I am posting the output of lspci -vv from right before the bad suspend cycle. I should probably mention that my video card (Radeon Mobility M9000) does not get POSTed by the BIOS during resume. This is instead done by the Radeon driver in Xorg. I highly doubt this is responsible for the lockups, as this all works fine with kernel 2.6.11. Still... anyway, I hope the output helps.

Created attachment 7571 [details]
lspci output, 2.6.15, right before a bad suspend 'cycle'
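
For anyone wanting to collect the same data, the "dump lspci -vv to a file and sync before suspending" step could look something like the sketch below. The reporter's actual command is not shown in this report, so everything here is an assumption: the function names, the file naming pattern, and the choice of `os.fsync` on the one file instead of a system-wide `sync`.

```python
import os
import subprocess
import time


def save_snapshot(text: str, directory: str) -> str:
    """Write text to a timestamped file and force it onto the disk.

    The "lspci-<timestamp>.txt" naming pattern is a made-up convention
    for this sketch.
    """
    name = time.strftime("lspci-%Y%m%d-%H%M%S.txt")
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write(text)
        f.flush()
        # Flush the kernel's cached copy too, so the data survives
        # a resume that never completes.
        os.fsync(f.fileno())
    return path


def snapshot_lspci(directory: str) -> str:
    """Capture `lspci -vv` and persist it; run this just before suspending."""
    result = subprocess.run(
        ["lspci", "-vv"], capture_output=True, text=True, check=True
    )
    return save_snapshot(result.stdout, directory)
```

Calling `snapshot_lspci` from whatever script triggers the suspend leaves one dump per cycle, and the newest file after a failed resume is the one worth attaching. Some fields in `lspci -vv` are only visible to root, so the script likely needs to run as root to be comparable with the attachments here.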
Downstream bug: http://bugs.gentoo.org/126051 (for my reference only, no useful info to add)

I switched from letting X wake up the graphics card to using vbetool. Same kind of behavior still. I open the lid and the power light goes solid, but the hard drive doesn't even click a few times (I guess this happens as tasks begin to resume).

I upgraded from 2.6.15 to 2.6.16, thinking "well, maybe they fixed it." In 2.6.15, I had been using a serial console to look at the output of various things before and after suspend. Of course, if the freeze happened before the console was resumed, that would be a problem. The trouble is that in 2.6.16 the serial console is not properly restored after suspend until the machine is fully resumed and a userspace command is issued. This is Bug 6259. How would I capture/store any debug messages that occur during a failed suspend/resume cycle if getting the console working is contingent on a successful resume in 2.6.16?

Is this still an issue on the latest kernel, currently 2.6.21?

Please reopen when responding to comment #16 |