Bug 18162

Summary: userspace software suspend hangs at "s2disk: Snapshotting system"
Product: Power Management
Reporter: Martin Steigerwald (Martin)
Component: Hibernation/Suspend
Assignee: power-management_other
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: lenb, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.36-rc3 and 2.6.35.4
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    
Attachments: hardware of my ThinkPad T42

Description Martin Steigerwald 2010-09-09 12:04:35 UTC
Created attachment 29442
hardware of my ThinkPad T42

Last kernel that worked:
- 2.6.33.7-tp42-toi-3.1.1.1-05125-ga817cb9 for sure. Absolutely stable.
- I am not sure whether I tested 2.6.34 with userspace software suspend, because that is the kernel with which the freezes I reported in bug #16376 "random - possibly Radeon DRM KMS related - freezes" appeared.

This is with Nigel's TuxOnIce tree 2.6.36-rc3 + TOI 3.2-rc1. Rafael, if you have a well-founded suspicion that this might be related to some TOI change, I can test with Linus' tree as well.

Please bear with me if I take my time, though: after compiling a kernel a zillion times to bisect bug #16376, while on holiday and with a Radeon DRM/KMS kernel that works with TuxOnIce (2.6.33 only worked stably with userspace software suspend), I just want to enjoy that kernel for a while before rebooting and compiling a whole lot again.

If you have some other debug tips I can try, please let me know. AFAIR there is some test stuff that could be activated via sysfs.
Comment 1 Rafael J. Wysocki 2010-09-09 20:46:17 UTC
It would be good to know in what time frame exactly it regressed.  There's
quite a distance from 2.6.33 to 2.6.36-rc3.

Still, can you try (as root)

# echo devices > /sys/power/pm_test && echo disk > /sys/power/state

with the current Linus' tree and see if that freezes too (if not, it will
return to the command prompt after 5 - 10 seconds)?
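
A minimal sketch of the same test wrapped in a check, since /sys/power/pm_test is only present on kernels built with CONFIG_PM_DEBUG. The function name run_devices_test and its optional directory argument are made up for illustration; the argument only exists so the logic can be tried against a scratch directory instead of the real /sys/power.

```shell
#!/bin/sh
# Sketch only: run the hibernation "devices" test level if the kernel offers it.
# run_devices_test and its directory argument are hypothetical helpers.
run_devices_test() {
    pm_dir="${1:-/sys/power}"

    # pm_test is missing when the kernel lacks CONFIG_PM_DEBUG.
    if [ ! -e "$pm_dir/pm_test" ]; then
        echo "no pm_test under $pm_dir (CONFIG_PM_DEBUG not set?)" >&2
        return 1
    fi

    # Select the "devices" test level, then start a hibernation cycle.
    echo devices > "$pm_dir/pm_test" || return 1
    echo disk > "$pm_dir/state"
    pm_status=$?

    # Switch test mode off again so later cycles are real ones.
    echo none > "$pm_dir/pm_test"
    return "$pm_status"
}
```

Called as root without arguments (run_devices_test), it either reports that the test file is missing or runs the test and resets pm_test to "none" afterwards.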
Comment 2 Martin Steigerwald 2010-09-14 12:40:38 UTC
I tested with 2.6.35.4-tp42-vmembase-0, which is a plain vanilla 2.6.35.4 from Greg's stable tree. I forgot to include the vmembase-0 patch to "fix" bug #16376, so it's really 100% 2.6.35.4.

It hangs as well, no matter whether I use hibernate or pm-utils. I normally use hibernate, because I find pm-utils utterly overengineered and error-prone, for example when it doesn't let me hibernate right after I upgraded a kernel that I am not even running. Versions:

martin@shambhala:~> apt-show-versions | egrep "(hibernate|pm-utils)"
hibernate/squeeze uptodate 1.99-1.1
pm-utils/squeeze uptodate 1.3.0-2

I also tried the test thing, but /sys/power/pm_test doesn't exist. Thus I just tried:

echo disk > /sys/power/state

It hangs as well. Before it printed:

PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
PM: Preallocating image memory...

I waited over a minute with nothing apparent happening.

But: it worked once from KDM with the hibernate script. So it seems to stop working once the KDE desktop has started and initialized OpenGL compositing. I tried once with full KDE 4.5.1 Nepomuk/Virtuoso desktop search scan activity, and once after pausing the scan for new files. It hung on both attempts.

So the plain result is: with plain 2.6.35.4, userspace software suspend and even the kernel-based variant just don't work. FWIW, TuxOnIce currently doesn't work nicely with 2.6.36-rc3 and 2.6.35.4 either, so I just downgraded once again to the last known fully working kernel, 2.6.33.7, with userspace software suspend.

And now I don't get how to remove that "NEEDINFO" flag.
Comment 3 Martin Steigerwald 2010-09-14 13:19:49 UTC
TuxOnIce made at least *some* cycles with a started KDE desktop, while userspace software suspend hung on every attempt. TuxOnIce shows problems on some resumes instead, so I think that's a different, TuxOnIce-related issue; I chose tuxonice-devel for reporting it.
Comment 4 Rafael J. Wysocki 2010-09-14 18:12:27 UTC
This appears to be the issue fixed by this commit:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6715045ddc7472a22be5e49d4047d2d89b391f45

Can you please try to apply it on top of 2.6.35.y and see if it helps?
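
One possible way to do that, sketched under assumptions: the commit object is already available in the local repository (for example after fetching Linus' tree), and the checkout is on a 2.6.35.y branch. The helper names commit_from_gitweb_url and apply_fix and the GIT override variable are made up for illustration.

```shell
#!/bin/sh
# Sketch only: cherry-pick the commit named in a gitweb URL onto a checkout.
# GIT is a hypothetical override, e.g. GIT=echo prints the command instead
# of running it.
GIT="${GIT:-git}"

# Extract the commit id from a gitweb URL of the form "...;a=commit;h=<id>".
commit_from_gitweb_url() {
    printf '%s\n' "${1##*;h=}"
}

# Apply commit $2 to the checkout in directory $1.
# -x records the original commit id in the new commit message.
apply_fix() {
    $GIT -C "$1" cherry-pick -x "$2"
}
```

Usage would look like apply_fix /path/to/linux-2.6.35.y "$(commit_from_gitweb_url "$url")", where the path is a placeholder. If the cherry-pick does not apply cleanly, the conflicts have to be resolved by hand.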
Comment 5 Martin Steigerwald 2010-09-15 09:31:23 UTC
2.6.35.4-tp42-vmembase-0-pm-avoid-oom-dirty works here. Tried two times, one time with some applications open - a situation where it sometimes doesn't suspend because it cannot free enough low memory pages, the same issue I have with TuxOnIce after switching to Radeon KMS. Thanks.

Hopefully Nigel can take some time soon to look into fixing up TuxOnIce. It's just faster and more responsive after a cycle.

Userspace software suspend basically seems to swap out lots of stuff prior to creating the image - it takes about a minute from "returned to userspace" before I can basically use the desktop again:

martin@shambhala:~/Computer/Shambhala/Kernel/2.6.35> free -m
             total       used       free     shared    buffers     cached
Mem:          2024        791       1232          0         56        212
-/+ buffers/cache:        523       1500
Swap:         3906        361       3545

(This is after two cycles with a fresh system which usually doesn't yet swap. KDE + applications can go into swap with 2 GiB, but not with just Kontact and a Konqueror window.)
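
To compare swap usage before and after a cycle without reading the free output by eye, something like the following might do. It is only a sketch: swap_used_kib is a made-up helper name, and the optional file argument exists solely so the function can be tried on a saved copy of /proc/meminfo.

```shell
#!/bin/sh
# Sketch only: print the amount of swap in use, in KiB, from a meminfo-style
# file (defaults to /proc/meminfo). swap_used_kib is a hypothetical helper.
swap_used_kib() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { avail = $2 }
         END           { print total - avail }' "${1:-/proc/meminfo}"
}
```

Running swap_used_kib once before hibernating and once after resume, then subtracting the two numbers, gives a rough idea of how much was pushed to swap by the cycle.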

Anyway, it works, and if it continues to do so, I finally have a usable 2.6.35 on my ThinkPad T42. So thanks again.