Kernel Bug Tracker – Bug 10609
System hangs after resume from hibernation
Last modified: 2011-07-30 05:00:44 UTC
Latest working kernel version:
Earliest failing kernel version: at least 2.6.20
Hardware Environment: HP nx6310, HP nx7400 and probably others
The problem is described here:
I can reproduce it every time. I can do some more tests. Just tell me what to do.
Steps to reproduce:
Hibernate. Resume from hibernation. Use the system for a few minutes. It hangs.
Is this a 32-bit or 64-bit kernel?
Have you tried 2.6.25 or 2.6.26-rc1?
Does suspend to RAM work on this box?
It's 32-bit kernel.
Suspend to RAM works perfectly.
I'll try these kernels later.
I attached some logs in the original ubuntu bug report.
This looks like some sort of memory corruption to me.
Do you use highmem?
Which method of hibernation do you use (the in-kernel one, TuxOnIce or uswsusp)?
Also, can you please attach the log here uncompressed (as a plain text attachment)?
Created attachment 16050
uname -a output (default ubuntu 8.04 kernel)
Created attachment 16051
lspci -vvnn output
Created attachment 16052
dmesg output before hibernation
Created attachment 16053
dmesg output after resume
To check if the problem is related to highmem, please test the kernel with CONFIG_NOHIGHMEM set.
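For anyone who wants to confirm which highmem option a given kernel was built with before and after recompiling, here is a minimal Python sketch. It assumes the build config is available as a plain-text file such as /boot/config-<release>; that location is an assumption and differs between distributions. The helper name highmem_setting is made up for this example.

```python
import platform
from pathlib import Path

# The three mutually exclusive highmem choices on 32-bit x86.
HIGHMEM_OPTIONS = ("CONFIG_NOHIGHMEM", "CONFIG_HIGHMEM4G", "CONFIG_HIGHMEM64G")


def highmem_setting(config_text):
    """Return the highmem option set to 'y' in a kernel config, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_NOHIGHMEM is not set"
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name in HIGHMEM_OPTIONS and value == "y":
            return name
    return None


if __name__ == "__main__":
    # Assumed path; adjust for your distribution.
    path = Path("/boot") / ("config-" + platform.release())
    try:
        print(path, "->", highmem_setting(path.read_text()))
    except OSError as err:
        print("could not read", path, ":", err)
```

A result of CONFIG_NOHIGHMEM means the kernel under test really was built without highmem, so a hang on that kernel cannot be blamed on highmem handling.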
So far I tried 2.6.25.x vanilla with the same configuration as the default ubuntu 2.6.24 kernel (HIGHMEM4G set). I didn't encounter this bug.
The hibernation method seems to be the in-kernel one. I attach dmesg output after resume.
What should I try next? I think that recompiling the ubuntu kernel with NOHIGHMEM would be a good option to test.
Created attachment 16059
dmesg output after resume, kernel 2.6.25.x vanilla
The bug is still there in the recompiled ubuntu kernel with NOHIGHMEM.
I tested 2.6.24.x vanilla. It's the same version that ubuntu uses.
- with HIGHMEM4G - bug still there
- with NOHIGHMEM - bug still there
Apparently, there's something in 2.6.25 that fixes the problem for you, but I have no idea what it is. I suspect that it might be related to the ADSL device you have.
Can you please check if the problem is present in 2.6.26-rc1?
I doubt it is related to my ADSL device. I tested ubuntu kernel with the modem unplugged and got another system hang.
I compiled 2.6.25 with debugging on by mistake. Can it make any difference?
2.6.26-rc1 works fine
Created attachment 16075
dmesg output after resume, kernel 2.6.26-rc1 vanilla
Referring to Comment #15:
The debugging can make a difference in theory.
Referring to Comment #16:
Thanks for testing, it's good to know the current mainline is fine.
Unfortunately, I have no idea what's wrong with the previous kernels. Multiple hibernation-related patches went in before 2.6.25 and one of them might fix the problem. Also, it might have been fixed by an architecture patch or a driver fix.
By "fine" I meant that I don't experience hangs with newer kernels. However, there were some "minor" issues. The computer didn't always power off after hibernating. The resumed system sometimes didn't recognize the dual-core processor and hung during shutdown.
Another problem was that configuration from /boot/config-2.6.24-16-generic wasn't really identical to the ubuntu kernel configuration. Debugging was on, some drivers were disabled.
I'll check a 2.6.25 or 2.6.26 kernel when it gets packaged for the next ubuntu release.
I can reproduce the hang with 2.6.26 on NX7400. Using uswsusp (0.8), the system always hangs after resuming from suspend to disk, within at most 30 minutes. Tuxonice (with 2.6.25.x kernels) generally works, although I have noticed occasional hangs similar to the one with uswsusp, yet I'm not convinced that it's the same bug and not something specific to tuxonice. I haven't tried the in-kernel suspend yet. Suspend to ram works perfectly.
Although I'm using a kernel with gentoo patches, I don't think those are interfering with suspend, but if it helps debugging the problem I can use vanilla 2.6.26 too.
I installed the 2.6.27 kernel which is to be included in Ubuntu 8.10. Unfortunately the same problem still exists. So the mainline is not fine. Maybe there was something missing in the vanilla kernels that made them work.
Today I tried uswsusp and still the same issue - system hangs a few seconds after resume.
OK, can you please test 2.6.28-rc4 from kernel.org, _not_ from Ubuntu? Also, if that fails as well, please check what the following does:
# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state
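To double-check that the first write took effect before triggering hibernation, the selected mode can be read back from /sys/power/disk, where the kernel marks the active entry with square brackets. A minimal Python sketch follows; the function names are made up for this example, and reading the file is the only thing it does (it does not hibernate anything).

```python
def selected_mode(sysfs_text):
    """Return the bracketed (currently selected) entry, or None."""
    for word in sysfs_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def available_modes(sysfs_text):
    """Return every mode listed, with the brackets removed."""
    return [word.strip("[]") for word in sysfs_text.split()]


if __name__ == "__main__":
    try:
        with open("/sys/power/disk") as fh:
            text = fh.read()
        print("available:", available_modes(text))
        print("selected: ", selected_mode(text))
    except OSError as err:
        # Kernel built without hibernation support, or not running on Linux.
        print("cannot read /sys/power/disk:", err)
```

If the selected entry is still "platform" after the first echo, the test would be exercising the ACPI path rather than a plain shutdown.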
Today I checked tuxonice and the same problem occurred. So it seems to be unrelated to hibernation method. I suspect the new SLUB allocator may be guilty. I'll recompile the kernel with SLAB on and try tuxonice and maybe uswsusp.
Did that happen with the Ubuntu kernel or with a kernel.org kernel?
Ubuntu 2.6.24 kernel patched with tuxonice:
http://kernel.ubuntu.com/git (ubuntu-hardy+tuxonice branch)
I've checked the kernel.org 2.6.28-rc4 and it still has the weird behaviour(*)/hangs after resume from suspend to disk using uswsusp. STR works perfectly.
Tuxonice worked fine for me most of the time, although it did have similar problems occasionally (about once in every 20 resumes).
(*) By weird behaviour I mean various processes getting stuck in D state, or having unexpected sluggishness, freezing for seconds while the system is not under any kind of load. For example, xterm not responding to input for seconds, while top running in another xterm keeps on updating and shows no activity. In one case I was running mplayer with strace when it stuck on a semaphore operation.
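One way to capture which processes are stuck in D state while the system is still responsive is to scan /proc. This is only a rough Python sketch; the function names are made up for this example, and it reports nothing about why a task is blocked.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for pid, comm in uninterruptible_tasks():
            print(pid, comm)
```

Running it every few seconds after resume and saving the output would show whether the same processes stay blocked or the set keeps changing.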
SLAB doesn't fix it. For me tuxonice doesn't improve the situation at all. System hangs about 1-3 minutes after resume. More specifically I mean:
The system works normally for a moment, then it slows down considerably: the mouse cursor refreshes once a second. This doesn't last long; usually I'm not able to click the shutdown button. Soon it hangs completely. SysRq + REISUB works.
I've tried the SLAB allocator with 2.6.28-rc4 and did not experience any problems with either the in-kernel swsusp or uswsusp during 10+ suspend-resume cycles, but I'll keep testing during the weekend and report back later.
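As a rough guide to how many clean cycles are needed before calling a configuration good: if a hang showed up about once in every 20 resumes (the rate mentioned earlier for tuxonice, borrowed here purely as an assumption) and resumes were independent of each other (also an assumption), the arithmetic can be sketched in Python like this. The function names are made up for this example.

```python
import math


def prob_no_failure(per_cycle_failure_rate, cycles):
    """Chance of seeing zero hangs in `cycles` independent resumes."""
    return (1.0 - per_cycle_failure_rate) ** cycles


def cycles_needed(per_cycle_failure_rate, confidence):
    """Smallest n such that at least one hang shows up with the given probability."""
    return math.ceil(
        math.log(1.0 - confidence) / math.log(1.0 - per_cycle_failure_rate)
    )


if __name__ == "__main__":
    p = 1.0 / 20.0  # assumed failure rate per resume
    print("P(no hang in 10 cycles) = %.3f" % prob_no_failure(p, 10))
    print("cycles for 95%% confidence = %d" % cycles_needed(p, 0.95))
```

Under those assumptions, 10 clean cycles would still happen about 60% of the time on a broken kernel, and it takes roughly 59 clean cycles to be 95% sure, so extended testing is worthwhile.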
How about the boot option nohz=off?
Neither nohz=off nor kernel 2.6.29-rc3 fixes it for me.
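When testing boot options like this it is easy to edit the wrong bootloader entry, so it may be worth confirming that the running kernel actually received the option by looking at /proc/cmdline. A minimal Python sketch; the function names are made up for this example.

```python
def has_boot_option(cmdline, option):
    """True if `option` appears as a whole token on the kernel command line."""
    return option in cmdline.split()


def boot_option_value(cmdline, key):
    """Return the value of key=value, '' for a bare key, or None if absent."""
    for token in cmdline.split():
        if token == key:
            return ""
        if token.startswith(key + "="):
            return token[len(key) + 1:]
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as fh:
            cmdline = fh.read()
        print("nohz=off present:", has_boot_option(cmdline, "nohz=off"))
    except OSError as err:
        print("cannot read /proc/cmdline:", err)
```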
Will you please try the latest kernel (2.6.31-rc3) and see whether the issue still exists?
If the system still hangs, will you please capture a picture of the screen when it hangs?
Will you please add the following boot option and see whether the behaviour can be changed?
Will you please stop the ethernet and see whether the issue still exists?
As there has been no response from the bug reporter for more than two months, this bug will be rejected.