Bug 60802

Summary: regression bisected: s2disk fails to resume image: Processes could not be frozen, cannot continue resuming
Product: Process Management
Reporter: Andrew Savchenko (bircoph)
Component: Other
Assignee: process_other
Status: NEW
Severity: normal
CC: funtoos
Priority: P1
Hardware: i386
OS: Linux
Kernel Version: 3.11-rc7
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: linuxrc-workaround.patch, config.xz

Description Andrew Savchenko 2013-08-27 03:42:10 UTC
Hello,

I found a fairly old regression, introduced in 3.7-rc1, which is still present in 3.11-rc7 and 3.10.9. I started an LKML thread about this bug two weeks ago, but there has been no reply so far: http://marc.info/?l=linux-kernel&m=137633669228353&w=2

User-space resume (from suspend-1.0, aka uswsusp) no longer works. Kernel-space suspend and resume work fine (e.g. echo disk > /sys/power/state); the problem is with user-space support. (I need the user-space version because it supports image encryption.)

After the resume application (essentially linuxrc) loads the image, it fails to apply it:

========================================================
Processes could not be frozen, cannot continue resuming.
Error 11: Resource temporarily unavailable

You can now boot the system and lose the saved state
or reboot and try again.

[Notice that if you decide to reboot, you MUST NOT mount
any filesystems before a successful resume.
Resuming after some filesystems have been mounted
will badly damage these filesystems.]

Do you want to continue booting (Y/n)?
========================================================

The error code wasn't originally shown; I added it to the suspend tool to aid debugging. Essentially, the freeze ioctl on /dev/snapshot fails with this error.

I bisected the problem to the commit that introduced this bug:
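
For readers who want to check the same call outside the suspend tool, here is a minimal Python sketch of it. It is not code from suspend-1.0: the helper names try_freeze and describe_errno are made up for illustration, the ioctl numbers are assumed to be the x86 encodings of SNAPSHOT_FREEZE and SNAPSHOT_UNFREEZE from <linux/suspend_ioctls.h>, and the device is opened write-only on the assumption that this is how a resume tool opens it.

```python
import errno
import fcntl
import os
import sys

# Assumption: SNAPSHOT_FREEZE is _IO('3', 1) and SNAPSHOT_UNFREEZE is
# _IO('3', 2), as in <linux/suspend_ioctls.h>. On x86, _IO(type, nr)
# encodes as (type << 8) | nr; other architectures may differ.
SNAPSHOT_FREEZE = (ord("3") << 8) | 1
SNAPSHOT_UNFREEZE = (ord("3") << 8) | 2


def describe_errno(code):
    """Format an errno value as 'Error N: message' (hypothetical helper)."""
    return "Error %d: %s" % (code, os.strerror(code))


def try_freeze(device):
    """Issue the freeze ioctl on `device`; return 0 or the errno value."""
    try:
        fd = os.open(device, os.O_WRONLY)
    except OSError as exc:
        return exc.errno
    try:
        fcntl.ioctl(fd, SNAPSHOT_FREEZE)
        # Thaw right away: this sketch only tests whether freezing works.
        fcntl.ioctl(fd, SNAPSHOT_UNFREEZE)
        return 0
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        code = try_freeze(sys.argv[1])
        print("freeze OK" if code == 0 else describe_errno(code))
    else:
        print("usage: pass the snapshot device path, e.g. /dev/snapshot")
```

Run as root with /dev/snapshot as the only argument, it freezes user space briefly and reports any failure in the same "Error N: message" form as the output above. Note that the failure reported here happens when the tool is started as linuxrc from the initrd, so a run from a fully booted system may not reproduce it.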

========================================================
commit ba4df2808a86f8b103c4db0b8807649383e9bd13 
Author: Al Viro <viro@zeniv.linux.org.uk> 
Date:   Tue Oct 2 15:29:10 2012 -0400 

    don't bother with kernel_thread/kernel_execve for launching linuxrc
    exec_usermodehelper_fns() will do just fine... 
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 
========================================================

In fact, this commit triggered at least two bugs: the first is the one I'm facing now; the second was fixed in commit f0de17c0babe7f29381892def6b37e9181a53410 ("make sure that /linuxrc has std{in,out,err}").

As a temporary workaround for this issue, I reverted all changes to init/do_mounts_initrd.c back to the last working commit, cb450766bcafc7bd7d40e9a5a0050745e8c68b3e, while accounting for the kernel API change (kernel_execve -> sys_execve). See linuxrc-workaround.patch. I understand this isn't a proper solution; I just want to show which code works for me.

I also found an interesting LKML discussion about an s2disk and freezer issue: http://www.spinics.net/lists/linux-nfs/msg38160.html It may be related to this bug, but the patch proposed there doesn't help in my case.

The kernel config which fails with ba4df2808a86f8b103c4db0b8807649383e9bd13 and works with f0de17c0babe7f29381892def6b37e9181a53410 is also attached.

As this issue may be hardware related: the system is a 32-bit Eee PC 1000H with an Atom N270, 2 GB RAM, and a 750 GB SATA drive.

Additional (but probably useless) information on this bug may be found here: https://forums.gentoo.org/viewtopic-p-7371120.html

If you need some additional info, testing or debugging, please let me know!
Comment 1 Andrew Savchenko 2013-08-27 03:43:22 UTC
Created attachment 107331
linuxrc-workaround.patch

This workaround just reverts all offending changes, but it works fine here.
Comment 2 Andrew Savchenko 2013-08-27 03:44:26 UTC
Created attachment 107332
config.xz

Kernel config which works with f0de17c0babe7f29381892def6b37e9181a53410 and fails with ba4df2808a86f8b103c4db0b8807649383e9bd13.
Comment 3 devsk 2013-12-12 21:47:40 UTC
I am running into the exact same issue. Any chance of this getting some attention? It's a serious regression not to be able to use s2disk.

In particular, the filer has already done quite a bit of analysis.
Comment 4 Andrew Savchenko 2013-12-12 22:05:06 UTC
You may find a proper patch here:
http://lkml.org/lkml/2013/9/18/455

It was acked by Rafael J. Wysocki:
http://lkml.org/lkml/2013/10/17/558
but as far as I can tell it is still not in the tree.

Works fine here.