Bug 22732

Summary: 2.6.37-rc1: hibernation breaks swap
Product: Power Management
Reporter: Zhang Rui (rui.zhang)
Component: Hibernation/Suspend
Assignee: Rafael J. Wysocki (rjw)
Status: CLOSED CODE_FIX
Severity: normal
CC: acpi-bugzilla, florian, maciej.rutecki, pascal.chapperon, rjw, stefan.seyfried
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.37-rc1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216, 21782
Attachments: init from initrd

Description Zhang Rui 2010-11-11 08:38:14 UTC
This should be a 2.6.36/2.6.37-rc1 regression.
In 2.6.36-rc8, everything works well.
In 2.6.37-rc1, however, the system fails to resume from disk, and it can no longer find the swap partition: swapon fails to probe it.
Comment 1 Zhang Rui 2010-11-11 08:43:23 UTC
The problem seems to be fixed on a 32-bit kernel.
Here is what I did:
1. boot into bash
2. mkswap /dev/sdaX
3. swapon -a shows the swap partition is probed successfully
4. reboot
With the 32-bit kernel, hibernation starts to work.
With the x86_64 kernel, the kernel can find the swap partition until it tries to hibernate; that is, the system suspends okay but reboots instead of resuming from disk, and the kernel fails to probe the swap partition after the reboot.
Comment 2 Rafael J. Wysocki 2010-11-16 22:14:11 UTC
I cannot reproduce this issue.

It looks like your resume kernel is different from the hibernated kernel.
Please verify that this isn't the case.

When exactly does the reboot happen when you're trying to resume?

Also, do you use the built-in hibernation or s2disk?
Comment 3 Zhang Rui 2010-11-17 07:50:23 UTC
(In reply to comment #2)
> I cannot reproduce this issue.
> 
> It looks like your resume kernel is different from the hibernated kernel.
> Please verify that this isn't the case.
> 
No, I hibernate and resume with the same kernel.

> When exactly does the reboot happen when you're trying to resume?
> 
It hibernates normally, but this step seems to break the swap partition.
When I press the power button to resume, the kernel fails to find the swap partition, so it boots up normally instead of resuming.

> Also, do you use the built-in hibernation or s2disk?

I run "echo disk > /sys/power/state" to hibernate.

I double-checked: hibernation works well in 2.6.36-rc8 and breaks in 2.6.36.
Comment 4 Rafael J. Wysocki 2010-11-17 20:36:09 UTC
Hmm.  There were no changes in kernel/power/ between .36-rc8 and .36.  Moreover,
there's only one x86 change related to KVM from that time.

Any chance to bisect?
Comment 5 Pascal Chapperon 2010-12-11 12:22:57 UTC
Hi,

Same issue here. Hibernation works with the 2.6.36-rc8 kernel and does not work with later kernels (x86_64). I'm currently trying to bisect (bad = v2.6.37-rc1, good = v2.6.36-rc8): a long task...

One noticeable thing: with working kernels, I get the following message when trying to hibernate:
 PM: Creating hibernation image:

and not with non-working kernels...
Comment 6 Pascal Chapperon 2010-12-11 18:38:34 UTC
bisect result =>
[root@strange linux-2.6]# git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[3624eb04c24861ab296842414f9752a393e68372] PM / Hibernate: Modify signature used to mark swap
Comment 7 Rafael J. Wysocki 2010-12-11 19:25:12 UTC
What kind of user space do you have?
Comment 8 Pascal Chapperon 2010-12-11 19:31:29 UTC
Mostly Fedora 14 (with a customised environment, e.g. the kernel)
Comment 9 Rafael J. Wysocki 2010-12-11 19:33:49 UTC
Is your resume kernel (ie. the one used for reading the image) the same
as your hibernated kernel?  Please double check.
Comment 10 Rafael J. Wysocki 2010-12-11 19:37:33 UTC
Also please let me know what your kernel command line is.
Comment 11 Pascal Chapperon 2010-12-11 19:42:13 UTC
I double-checked at every step of the git bisect process; there is no doubt (git bisect, compile kernel, reboot, try hibernate, git bisect, compile, reboot, try hibernate, etc.)
Comment 12 Pascal Chapperon 2010-12-11 19:43:54 UTC
[    0.000000] Command line: ro root=UUID=3354f340-fc60-4b71-aad8-3eefb1c5dfa1 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=fr_FR.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=fr-latin9 rhgb quiet selinux=0 init=/sbin/bootchartd
Comment 13 Rafael J. Wysocki 2010-12-11 19:50:39 UTC
OK, so Fedora seems to do some tricks in initramfs to feed the kernel with
the image location and doesn't find it due to a different signature.

I'll post a patch that will change the signature back to the old one for
2.6.37, but I'd like to make sure I know what the problem is.

Which one is your resume partition?
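
A minimal sketch of the kind of check an initramfs script or cleanup tool might perform, assuming the signature occupies the last 10 bytes of the first page of the swap device. The page size and the signature strings below are assumptions from general knowledge of the swap format, not values taken from this report, and `classify_first_page` is a hypothetical helper name. A tool that recognises only one hibernation signature would report "unknown" for the other, which would match the symptom described in this bug.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86

# Assumed signature strings; none of them are quoted in this report.
SWAP_SIGNATURES = {b"SWAP-SPACE", b"SWAPSPACE2"}
HIBERNATION_SIGNATURES = {b"S1SUSPEND", b"LINHIB0001"}


def classify_first_page(page: bytes) -> str:
    """Classify a swap device from the first page read off it."""
    if len(page) < PAGE_SIZE:
        raise ValueError("need the full first page of the device")
    # The signature field is 10 bytes wide; shorter strings are NUL-padded.
    sig = page[PAGE_SIZE - 10:PAGE_SIZE].rstrip(b"\x00")
    if sig in SWAP_SIGNATURES:
        return "swap"
    if sig in HIBERNATION_SIGNATURES:
        return "hibernation image"
    return "unknown"


def classify_device(path: str) -> str:
    """Read the first page of a block device (needs read permission)."""
    with open(path, "rb") as dev:
        return classify_first_page(dev.read(PAGE_SIZE))
```

For example, `classify_device("/dev/sda5")` run as root would report how the partition is currently marked.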
Comment 14 Pascal Chapperon 2010-12-11 20:29:38 UTC
Created attachment 39862
init from initrd

my init from initrd
Comment 15 Pascal Chapperon 2010-12-11 20:33:20 UTC
>> Which one is your resume partition?
/dev/sda5 a plain sw
Comment 16 Pascal Chapperon 2010-12-11 20:36:45 UTC
Sorry, there is no way to edit previous comments...

So the swap partition is a plain partition:
[root@strange img]# swapon -s
Filename				Type		Size	Used	Priority
/dev/sda5                               partition	3481596	0	-1
Comment 17 Rafael J. Wysocki 2010-12-11 20:38:49 UTC
Does resume from hibernation work if you add resume=/dev/sda5 to the kernel
command line?
Comment 18 Pascal Chapperon 2010-12-11 21:06:06 UTC
You are right: with "resume=/dev/sda5" on the command line, my unmodified 2.6.37-rc5 no longer breaks hibernation; it works as expected.
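
A short sketch of how a script could check whether a `resume=` parameter is present on a kernel command line such as the one quoted in comment 12. The helper name `find_resume_device` is hypothetical, and the sketch simply returns the first `resume=` token it sees; it does not model how the kernel itself parses the parameter.

```python
from typing import Optional


def find_resume_device(cmdline: str) -> Optional[str]:
    """Return the value of the first resume= token, or None if absent."""
    prefix = "resume="
    for token in cmdline.split():
        if token.startswith(prefix):
            return token[len(prefix):]
    return None


def running_kernel_resume_device() -> Optional[str]:
    """Check the command line of the currently running kernel."""
    with open("/proc/cmdline", "r") as f:
        return find_resume_device(f.read())
```

On the command line from comment 12 this returns `None`; once `resume=/dev/sda5` is appended it returns `/dev/sda5`.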
Comment 19 Rafael J. Wysocki 2010-12-11 21:15:32 UTC
OK, thanks for verifying.

The patch to restore the old signature is at:
https://patchwork.kernel.org/patch/400092/
Comment 20 Rafael J. Wysocki 2010-12-11 21:16:36 UTC
Patch : https://patchwork.kernel.org/patch/400092/
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
Comment 21 Pascal Chapperon 2010-12-11 21:45:15 UTC
Thank you for the quick fix ;-)
Comment 22 Stefan Seyfried 2010-12-14 08:27:00 UTC
Unfortunately I did not see the original patch or I would have NAK'ed it anyway ;)

The suspend signature is IMHO one of the kernel interfaces that need to go through feature-removal.txt to get changed, as hopefully every distribution has some kind of "remove unresumed image in case of normal boot"-tool which needs to know the signature.

Stuff like versioning the suspend image probably needs an extra field in the suspend header.
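
To illustrate why such a tool depends on the signature, here is a hedged sketch of a "remove unresumed image" step. It assumes the header layout keeps the original swap signature in the 10 bytes just before the hibernation signature at the end of the first page; that layout, the page size, and the signature strings are assumptions not stated in this report, and `restore_swap_signature` is a hypothetical name. The function works on an in-memory copy of the page and writes nothing to disk.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
SIG_LEN = 10

# Assumed hibernation signatures; a tool unaware of a new one does nothing.
HIBERNATION_SIGNATURES = {b"S1SUSPEND", b"LINHIB0001"}


def restore_swap_signature(page: bytes) -> bytes:
    """Return the first page with the saved swap signature put back.

    If the page does not carry a known hibernation signature, it is
    returned unchanged.
    """
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one page")
    sig = page[-SIG_LEN:].rstrip(b"\x00")
    if sig not in HIBERNATION_SIGNATURES:
        return page
    # Assumed layout: [... | orig_sig (10 bytes) | sig (10 bytes)]
    orig_sig = page[-2 * SIG_LEN:-SIG_LEN]
    return page[:-SIG_LEN] + orig_sig
```

With only one signature in `HIBERNATION_SIGNATURES`, an image marked with any other signature is left untouched, so the partition stays unusable as swap after a normal boot.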