Bug 60802 - regression bisected: s2disk fails to resume image: Processes could not be frozen, cannot continue resuming
Status: NEW
Alias: None
Product: Process Management
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P1 normal
Assignee: process_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-08-27 03:42 UTC by Andrew Savchenko
Modified: 2013-12-12 22:05 UTC
CC List: 1 user

See Also:
Kernel Version: 3.11-rc7
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
linuxrc-workaround.patch (2.96 KB, patch)
2013-08-27 03:43 UTC, Andrew Savchenko
config.xz (9.59 KB, application/x-xz)
2013-08-27 03:44 UTC, Andrew Savchenko

Description Andrew Savchenko 2013-08-27 03:42:10 UTC
Hello,

I found a fairly old regression, introduced in 3.7-rc1, that is still present in 3.11-rc7 and 3.10.9. I started an LKML thread about this bug two weeks ago, but there has been no reply so far: http://marc.info/?l=linux-kernel&m=137633669228353&w=2

User-space resume (from suspend-1.0, aka uswsusp) no longer works. Kernel-space suspend and resume work fine (e.g. echo disk > /sys/power/state); the problem is only with user-space support. (I need the user-space version because it supports image encryption.)

After the resume application (essentially linuxrc) loads the image, it fails to apply it:

========================================================
Processes could not be frozen, cannot continue resuming.
Error 11: Resource temporarily unavailable

You can now boot the system and lose the saved state
or reboot and try again.

[Notice that if you decide to reboot, you MUST NOT mount
any filesystems before a successful resume.
Resuming after some filesystems have been mounted
will badly damage these filesystems.]

Do you want to continue booting (Y/n)?
========================================================

The error code wasn't shown originally; I added it to the suspend tool to aid debugging. Essentially, the freeze ioctl on /dev/snapshot fails with this error.
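
For readers who want to see what that failing call looks like, here is a minimal sketch in C. It is not the code from the suspend tool: the helper names format_errno() and try_freeze() are hypothetical, and the real resume path loads the image before freezing. Only the SNAPSHOT_FREEZE and SNAPSHOT_UNFREEZE ioctls from <linux/suspend_ioctls.h> are taken from the kernel interface. On Linux, errno 11 is EAGAIN, which is the "Resource temporarily unavailable" text shown above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/suspend_ioctls.h>   /* SNAPSHOT_FREEZE, SNAPSHOT_UNFREEZE */

/* Hypothetical helper: format an errno value as "Error <n>: <text>",
 * matching the extra line added to the tool's output for debugging. */
void format_errno(int err, char *buf, size_t len)
{
    snprintf(buf, len, "Error %d: %s", err, strerror(err));
}

/* Hypothetical helper: open the snapshot device for writing (as a resume
 * tool would) and ask the kernel to freeze user space.
 * Returns 0 on success, or the errno value of the step that failed.
 * If the freeze succeeds, processes are thawed again before returning. */
int try_freeze(const char *dev_path)
{
    int err = 0;
    int fd = open(dev_path, O_WRONLY);

    if (fd < 0)
        return errno;

    if (ioctl(fd, SNAPSHOT_FREEZE, 0) < 0)
        err = errno;                    /* EAGAIN (11) in this bug report */
    else
        ioctl(fd, SNAPSHOT_UNFREEZE, 0);

    close(fd);
    return err;
}
```

A caller would run try_freeze("/dev/snapshot") as root from the initrd and, on a non-zero result, pass it to format_errno() to print the extra diagnostic line. Do not run this on a system with mounted filesystems you care about unless you understand the consequences.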

I bisected the problem to the commit that introduced it:

========================================================
commit ba4df2808a86f8b103c4db0b8807649383e9bd13 
Author: Al Viro <viro@zeniv.linux.org.uk> 
Date:   Tue Oct 2 15:29:10 2012 -0400 

    don't bother with kernel_thread/kernel_execve for launching linuxrc
    exec_usermodehelper_fns() will do just fine... 
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> 
========================================================

In fact, this commit triggered at least two bugs: the first is the one I'm facing now, and the second was fixed in commit f0de17c0babe7f29381892def6b37e9181a53410: make sure that /linuxrc has std{in,out,err}.

As a temporary workaround, I reverted all changes to init/do_mounts_initrd.c back to the last working commit, cb450766bcafc7bd7d40e9a5a0050745e8c68b3e, while accounting for the kernel API changes (kernel_execve -> sys_execve). See linuxrc-workaround.patch. I understand this isn't a proper solution; I just want to show what code works for me.

I also found an interesting LKML discussion about an s2disk and freezer issue: http://www.spinics.net/lists/linux-nfs/msg38160.html Maybe it is related to this bug, but the patch proposed there doesn't help in my case.

The kernel config which fails with ba4df2808a86f8b103c4db0b8807649383e9bd13 and works with f0de17c0babe7f29381892def6b37e9181a53410 is also attached.

As this issue may be hardware related: the system is a 32-bit Eee PC 1000H with an Atom N270, 2 GB RAM, and a 750 GB SATA drive.

Additional (but probably useless) information on this bug may be found here: https://forums.gentoo.org/viewtopic-p-7371120.html

If you need some additional info, testing or debugging, please let me know!
Comment 1 Andrew Savchenko 2013-08-27 03:43:22 UTC
Created attachment 107331
linuxrc-workaround.patch

This workaround just reverts all offending changes, but it works fine here.
Comment 2 Andrew Savchenko 2013-08-27 03:44:26 UTC
Created attachment 107332
config.xz

Kernel config which works with f0de17c0babe7f29381892def6b37e9181a53410 and fails with ba4df2808a86f8b103c4db0b8807649383e9bd13.
Comment 3 devsk 2013-12-12 21:47:40 UTC
I am running into the exact same issue. Any chance of this getting some attention? It's a serious regression not to be able to use s2disk.

In particular, the reporter has already done quite a bit of analysis.
Comment 4 Andrew Savchenko 2013-12-12 22:05:06 UTC
You may find a proper patch here:
http://lkml.org/lkml/2013/9/18/455

It was acked by Rafael J. Wysocki:
http://lkml.org/lkml/2013/10/17/558
but as far as I can tell it is still not in the tree.

Works fine here.
