Bug 10609 - system hangs after resume from hibernation
Status: CLOSED INSUFFICIENT_DATA
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: All Linux
Importance: P1 normal
Assigned To: ykzhao
Depends on:
Blocks: 7216
Reported: 2008-05-06 10:27 UTC by Michał Świtakowski
Modified: 2011-07-30 05:00 UTC (History)

See Also:
Kernel Version: 2.6.24
Tree: Mainline
Regression: No


Attachments
uname -a output (default ubuntu 8.04 kernel) (90 bytes, text/plain)
2008-05-06 13:50 UTC, Michał Świtakowski
lspci -vvnn output (12.99 KB, text/plain)
2008-05-06 13:52 UTC, Michał Świtakowski
dmesg output before hibernation (32.75 KB, text/plain)
2008-05-06 13:52 UTC, Michał Świtakowski
dmesg output after resume (45.38 KB, text/plain)
2008-05-06 13:53 UTC, Michał Świtakowski
dmesg output after resume, kernel 2.6.25.1 vanilla (45.81 KB, text/plain)
2008-05-07 06:13 UTC, Michał Świtakowski
dmesg output after resume, kernel 2.6.26-rc1 vanilla (42.49 KB, text/plain)
2008-05-08 14:39 UTC, Michał Świtakowski

Description Michał Świtakowski 2008-05-06 10:27:44 UTC
Latest working kernel version: 
Earliest failing kernel version: at least 2.6.20
Distribution: Ubuntu 
Hardware Environment: HP nx6310, HP nx7400 and probably others
Software Environment:
Problem Description:

The problem is described here: 
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/110581

I can reproduce it every time. I can do some more tests. Just tell me what to do.


Steps to reproduce:
Hibernate. Resume from hibernation. Use the system for a few minutes. It hangs.
Comment 1 Rafael J. Wysocki 2008-05-06 12:51:14 UTC
Is this a 32-bit or 64-bit kernel?

Have you tried 2.6.25 or 2.6.26-rc1?

Does suspend to RAM work on this box?
Comment 2 Michał Świtakowski 2008-05-06 13:14:14 UTC
It's a 32-bit kernel.
Suspend to RAM works perfectly.
I'll try these kernels later.
I attached some logs in the original ubuntu bug report.
Comment 3 Rafael J. Wysocki 2008-05-06 13:33:19 UTC
This looks like some sort of memory corruption to me.

Do you use highmem?

Which method of hibernation do you use (the in-kernel one, TuxOnIce or uswsusp)?
Comment 4 Rafael J. Wysocki 2008-05-06 13:34:54 UTC
Also, can you please attach the log here uncompressed (as a plain text attachment)?
Comment 5 Michał Świtakowski 2008-05-06 13:50:56 UTC
Created attachment 16050 [details]
uname -a output (default ubuntu 8.04 kernel)
Comment 6 Michał Świtakowski 2008-05-06 13:52:04 UTC
Created attachment 16051 [details]
lspci -vvnn output
Comment 7 Michał Świtakowski 2008-05-06 13:52:38 UTC
Created attachment 16052 [details]
dmesg output before hibernation
Comment 8 Michał Świtakowski 2008-05-06 13:53:24 UTC
Created attachment 16053 [details]
dmesg output after resume
Comment 9 Rafael J. Wysocki 2008-05-06 14:15:09 UTC
To check if the problem is related to highmem, please test the kernel with CONFIG_NOHIGHMEM set.
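One way to confirm which highmem option a given kernel was built with is to look at its config file. This is only a sketch: the helper name highmem_options is hypothetical, and the default path /boot/config-$(uname -r) is an assumption based on the Ubuntu/Debian convention (other distributions may keep the config elsewhere).

```shell
#!/bin/sh
# Print the highmem-related options that are enabled in a kernel config file.
# Hypothetical helper; the default path follows the Ubuntu/Debian convention
# and is an assumption, not something every distribution provides.
highmem_options() {
    config_file="${1:-/boot/config-$(uname -r)}"
    # Only lines of the form CONFIG_...=value match; options that are
    # "not set" appear as comments in the config and are skipped.
    grep -E '^CONFIG_(NOHIGHMEM|HIGHMEM|HIGHMEM4G|HIGHMEM64G)=' "$config_file"
}
```

Calling highmem_options with no argument inspects the running kernel's config; passing a path inspects a different build, for example the .config of a freshly compiled tree.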
Comment 10 Michał Świtakowski 2008-05-07 06:12:27 UTC
So far I have tried 2.6.25.1 vanilla with the same configuration as the default Ubuntu 2.6.24 kernel (HIGHMEM4G set). I didn't encounter this bug.
The hibernation method seems to be the in-kernel one. I am attaching the dmesg output after resume.
What should I try next? I think that recompiling the Ubuntu kernel with NOHIGHMEM would be a good option to test.
Comment 11 Michał Świtakowski 2008-05-07 06:13:20 UTC
Created attachment 16059 [details]
dmesg output after resume, kernel 2.6.25.1 vanilla
Comment 12 Michał Świtakowski 2008-05-07 07:39:03 UTC
The bug is still there in the recompiled Ubuntu kernel with NOHIGHMEM.
Comment 13 Michał Świtakowski 2008-05-07 10:21:37 UTC
I tested 2.6.24.3 vanilla. It's the same version that ubuntu uses.
- with HIGHMEM4G - bug still there
- with NOHIGHMEM - bug still there
Comment 14 Rafael J. Wysocki 2008-05-07 11:20:04 UTC
Apparently, there's something in 2.6.25 that fixes the problem for you, but I have no idea what it is.  I suspect that it might be related to the ADSL device you have.

Can you please check if the problem is present in 2.6.26-rc1?
Comment 15 Michał Świtakowski 2008-05-07 15:38:00 UTC
I doubt it is related to my ADSL device. I tested ubuntu kernel with the modem unplugged and got another system hang.
I compiled 2.6.25 with debugging on by mistake. Can it make any difference?
Comment 16 Michał Świtakowski 2008-05-08 14:37:36 UTC
2.6.26-rc1 works fine
Comment 17 Michał Świtakowski 2008-05-08 14:39:00 UTC
Created attachment 16075 [details]
dmesg output after resume, kernel 2.6.26-rc1 vanilla
Comment 18 Rafael J. Wysocki 2008-05-08 15:01:38 UTC
Referring to Comment #15:
The debugging can make a difference in theory.

Referring to Comment #16:
Thanks for testing, it's good to know the current mainline is fine.

Unfortunately, I have no idea what's wrong with the previous kernels.  Multiple hibernation-related patches went in before 2.6.25 and one of them might fix the problem.  Also, it might have been fixed by an architecture patch or a driver fix.
Comment 19 Michał Świtakowski 2008-05-09 10:40:14 UTC
By "fine" I meant that I don't experience hangs with the newer kernels. However, there were some "minor" issues. The computer didn't always power off after hibernating. The resumed system sometimes didn't recognize the dual-core processor and hung during shutdown.
Another problem was that the configuration from /boot/config-2.6.24-16-generic wasn't really identical to the Ubuntu kernel configuration. Debugging was on and some drivers were disabled.
I'll check the 2.6.25 or 2.6.26 kernel when it gets packaged for the next Ubuntu release.
Comment 20 Kovacs, Tamas 2008-07-22 07:15:40 UTC
I can reproduce the hang with 2.6.26 on NX7400. Using uswsusp (0.8), the system always hangs after resuming from suspend to disk, within at most 30 minutes. Tuxonice (with 2.6.25.x kernels) generally works, although I have noticed occasional hangs similar to the one with uswsusp, yet I'm not convinced that it's the same bug and not something specific to tuxonice. I haven't tried the in-kernel suspend yet. Suspend to ram works perfectly.

Although I'm using a kernel with Gentoo patches, I don't think they interfere with suspend, but if it helps to debug the problem I can use vanilla 2.6.26 too.
Comment 21 Michał Świtakowski 2008-09-12 02:52:09 UTC
I installed the 2.6.27 kernel, which is to be included in Ubuntu 8.10. Unfortunately, the same problem still exists, so the mainline is not fine. Maybe there was something missing in the vanilla kernels that made them work.
Comment 22 Michał Świtakowski 2008-11-10 08:41:04 UTC
Today I tried uswsusp and still the same issue - system hangs a few seconds after resume. 
Comment 23 Rafael J. Wysocki 2008-11-10 09:55:01 UTC
OK, can you please test 2.6.28-rc4 from kernel.org, _not_ from Ubuntu?  Also, if that fails as well, please check what

# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state

does.
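Before running the two commands above, it may help to confirm which hibernation mode is actually selected. A minimal sketch, assuming /sys/power/disk lists the supported modes with the selected one in square brackets (for example "platform [shutdown] reboot"); the helper name current_disk_mode is hypothetical.

```shell
#!/bin/sh
# Print the currently selected hibernation mode.
# Assumes the sysfs file marks the selected mode with square brackets,
# e.g. "platform [shutdown] reboot". Hypothetical helper name.
current_disk_mode() {
    disk_file="${1:-/sys/power/disk}"
    # Keep only the word inside the brackets; print nothing if none is found.
    sed -n 's/.*\[\([a-z_]*\)\].*/\1/p' "$disk_file"
}
```

After "echo shutdown > /sys/power/disk", current_disk_mode would be expected to print "shutdown" if the write was accepted.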
Comment 24 Michał Świtakowski 2008-11-12 15:25:09 UTC
Today I checked tuxonice and the same problem occurred. So it seems to be unrelated to hibernation method. I suspect the new SLUB allocator may be guilty. I'll recompile the kernel with SLAB on and try tuxonice and maybe uswsusp.
Comment 25 Rafael J. Wysocki 2008-11-12 15:31:17 UTC
Did that happen with the Ubuntu kernel or with a kernel.org kernel?
Comment 26 Michał Świtakowski 2008-11-13 03:01:59 UTC
Ubuntu 2.6.24 kernel patched with tuxonice: 
http://kernel.ubuntu.com/git (ubuntu-hardy+tuxonice branch)
Comment 27 Kovacs, Tamas 2008-11-13 03:04:32 UTC
I've checked the kernel.org 2.6.28-rc4 and it still has the weird behaviour(*)/hangs after resume from suspend to disk using uswsusp. STR works perfectly.

Tuxonice worked fine for me most of the time, although it did have similar problems occasionally (about once in every 20 resumes).

(*) By weird behaviour I mean various processes getting stuck in D state, or having unexpected sluggishness, freezing for seconds while the system is not under any kind of load. For example, xterm not responding to input for seconds, while top running in another xterm keeps on updating and shows no activity. In one case I was running mplayer with strace when it stuck on a semaphore operation.
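Processes stuck in D state, as described above, can be listed with ps while the system is still responsive. The following is a sketch only: the filter name filter_d_state is hypothetical, and it assumes the procps version of ps with the pid, stat, wchan and comm output columns.

```shell
#!/bin/sh
# Keep the header line plus every row whose second column (STAT) starts
# with "D" (uninterruptible sleep). Hypothetical helper name.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Intended use (assumes procps ps):
#   ps -eo pid,stat,wchan:32,comm | filter_d_state
```

The WCHAN column shows the kernel function each task is sleeping in, which may hint at whether the stuck processes are all waiting on the same resource.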
Comment 28 Michał Świtakowski 2008-11-13 09:39:20 UTC
SLAB doesn't fix it. For me, tuxonice doesn't improve the situation at all. The system hangs about 1-3 minutes after resume. More specifically:
The system works normally for a moment, then it slows down considerably: the mouse cursor refreshes once a second. This doesn't last long; usually I'm not able to click the shutdown button before it hangs completely. SysRq + REISUB works.
Comment 29 Kovacs, Tamas 2008-11-14 02:20:15 UTC
I've tried the SLAB allocator with 2.6.28-rc4 and did not experience any problems with either the in-kernel swsusp or uswsusp during 10+ suspend-resume cycles, but I'll keep testing during the weekend and report back later.
Comment 30 Shaohua 2008-12-23 22:47:47 UTC
How about the boot option nohz=off?
Comment 31 Michał Świtakowski 2009-02-06 07:29:45 UTC
Neither nohz=off nor kernel 2.6.29-rc3 fixes it for me.
Comment 32 ykzhao 2009-07-20 14:09:48 UTC
Will you please try the latest kernel (2.6.31-rc3) and see whether the issue still exists?

If the system still hangs, will you please capture a picture of the screen when it hangs?

Will you please add the following boot options and see whether the behaviour changes?
    a. idle=poll
    b. nolapic_timer

thanks.
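When testing boot options such as nohz=off, idle=poll or nolapic_timer, it is worth verifying after reboot that the option really reached the kernel. A minimal sketch, assuming the running kernel's command line is readable from /proc/cmdline; the helper name has_boot_option is hypothetical.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given option appears as a whole token on
# the kernel command line. Hypothetical helper; reads /proc/cmdline by default.
has_boot_option() {
    option="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # Rely on shell word splitting to break the command line into tokens.
    for token in $(cat "$cmdline_file"); do
        [ "$token" = "$option" ] && return 0
    done
    return 1
}
```

For example, "has_boot_option nohz=off && echo active" prints "active" only when the option is present as an exact token, so a partial match such as nohz=offset would not count.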
Comment 33 ykzhao 2009-07-21 09:24:03 UTC
Hi
   Will you please stop the ethernet and see whether the issue still exists?
   
Thanks.
Comment 34 ykzhao 2009-09-29 07:12:44 UTC
As there is no response from the bug reporter for more than two months, this bug will be rejected.
Thanks.
