Bug 13058

Summary: First hibernation attempt fails
Product: Power Management
Reporter: Alan Jenkins (alan-jenkins)
Component: Hibernation/Suspend
Assignee: power-management_other
Status: CLOSED CODE_FIX
Severity: normal
CC: rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.30-rc1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216, 13070
Attachments: v2.6.30-rc1-136-g62b8e68 dmesg - after hibernation failure
             Kernel config for above dmesg

Description Alan Jenkins 2009-04-10 10:58:04 UTC
Last working version: 2.6.29

On my laptop (EEE 4G), the first attempt at hibernation fails. 
Subsequent attempts succeed.

If I use s2disk as normal, I see "snapshotting system", but before it
gets to write out the image, it aborts and switches back to X. I can't
see any error message, even if I check the log with "dmesg".

If I use "echo disk > /sys/power/state", echo reports the error "Cannot
allocate memory".
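
"Cannot allocate memory" is the standard message for ENOMEM, so the failure can also be captured programmatically. The following is a minimal sketch, not something taken from this report: the helper name write_sleep_state is hypothetical, and it only assumes that writing the state name to /sys/power/state behaves as the echo command above does.

```python
import errno
import os


def write_sleep_state(state, state_path="/sys/power/state"):
    """Write `state` (e.g. "disk") to the sysfs power state file.

    Returns None on success, or the symbolic errno name (e.g. "ENOMEM")
    if opening or writing the file fails. The helper name and default
    path are assumptions made for illustration.
    """
    try:
        fd = os.open(state_path, os.O_WRONLY)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    try:
        # On a real system this call blocks until resume, or fails
        # with the error that echo would have reported.
        os.write(fd, state.encode("ascii"))
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        os.close(fd)
    return None
```

Calling write_sleep_state("disk") as root would attempt a real hibernation; on the failing first attempt described here it would presumably return "ENOMEM".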

References: http://marc.info/?l=linux-kernel&m=123928022321917&w=2
Comment 1 Alan Jenkins 2009-04-10 11:06:29 UTC
Created attachment 20919
v2.6.30-rc1-136-g62b8e68 dmesg - after hibernation failure

Dmesg with PM_DEBUG.  Attempted hibernation using my usual s2disk (triggered by gnome-power-manager).
Comment 2 Alan Jenkins 2009-04-10 11:12:11 UTC
Created attachment 20920
Kernel config for above dmesg
Comment 3 Alan Jenkins 2009-04-10 11:22:26 UTC
I reproduced this by booting with my wireless device removed (thanks to eeepc-laptop's peculiar rfkill), and logging in to a sparse KDE session.

This session includes kmix, gnome-power-manager, and knetworkmanager running as system notification icons, but no windows.  I hibernate by pressing the "sleep" button, which causes gnome-power-manager to (indirectly) run s2disk.

If I start up my normal KDE session, which adds konsole, akregator, and konqueror set to "about:", and enable wireless, then I cannot reproduce it.

[The original v2.6.30-rc1 caused ath5k to fail to load.  When I tried to reproduce on the latest git, it seemed necessary to disable the wireless.  Also, simply running Konsole seemed to avoid the bug].

I found some of these details halfway through a bisection run, which is par for the course :-).  I'll try again now.
Comment 4 Alan Jenkins 2009-04-10 13:58:55 UTC
1faa16d22877f4839bd433547d770c676d1d964c is first bad commit
commit 1faa16d22877f4839bd433547d770c676d1d964c
Author: Jens Axboe <jens.axboe@oracle.com>
Date:   Mon Apr 6 14:48:01 2009 +0200

    block: change the request allocation/congestion logic to be sync/async based

    This makes sure that we never wait on async IO for sync requests, instead
    of doing the split on writes vs reads.

    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

:040000 040000 36195bfe0020ec7b5b563287c2c3fc69d9414d86 644c5a877cac09d760644e579cc77c0b92aeb393 M      block
:040000 040000 08a56ecc325fe1e43399cfb2548be4cfbeb5077c 6d21b6115b5bcc74f63d5d0a8397d6c5cd5cd905 M      include
:040000 040000 8de3ffe4d98143b9590d41bf23bdedfcb9d61ad5 9299a338610d09b41a9ffc67902a2c7613707e52 M      mm
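
The three lines beginning with ":040000" are git raw diff entries: old mode, new mode, old object ID, new object ID, status, and path. Mode 040000 denotes a tree, so they show that the first bad commit modified something under block, include, and mm. A small sketch of reading such a line follows; the names RawDiffEntry and parse_raw_diff_line are hypothetical and not part of git.

```python
from typing import NamedTuple


class RawDiffEntry(NamedTuple):
    old_mode: str
    new_mode: str
    old_sha: str
    new_sha: str
    status: str
    path: str


def parse_raw_diff_line(line: str) -> RawDiffEntry:
    """Split one ':<old mode> <new mode> <old sha> <new sha> <status> <path>' line."""
    # Drop the leading colon, then split on whitespace at most five times
    # so that a path containing spaces stays in one piece.
    fields = line.strip().lstrip(":").split(None, 5)
    if len(fields) != 6:
        raise ValueError(f"not a raw diff line: {line!r}")
    return RawDiffEntry(*fields)
```

Applied to the first entry above, the parsed status is "M" (modified) and the path is "block".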
Comment 5 Rafael J. Wysocki 2009-04-11 22:01:59 UTC
First-Bad-Commit : 1faa16d22877f4839bd433547d770c676d1d964c
Comment 6 Rafael J. Wysocki 2009-04-17 21:01:01 UTC
References : http://lkml.org/lkml/2009/4/17/53