Bug 12239

Summary: Swap breaks after 2x hibernation failure due to lack of swap
Product: Power Management Reporter: Alan Jenkins (alan-jenkins)
Component: Hibernation/SuspendAssignee: power-management_other
Status: CLOSED CODE_FIX    
Severity: normal CC: odie, pavel, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.27-rc9 Subsystem:
Regression: --- Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    
Attachments: Proposed fix

Description Alan Jenkins 2008-12-16 13:14:49 UTC
Distribution: Ubuntu 8.10
Hardware Environment: x86-64
Problem Description:

If there is not enough swap to hibernate sucessfuly, and you try to hibernate twice in a row, the status of swap partitions is corrupted.

Steps to reproduce:

1) Conspire to ensure insufficient swap space for hibernation
2) Attempt to hibernate, and fail
3) Attempt to hibernate *again*, and fail
4) try swapoff

Outcome:

# swapoff /dev/sda6
swapoff: /dev/sda6: Invalid argument
# cat /proc/swaps
Filename                                Type            Size    Used   
Priority
/dev/sda6                               partition       1494004 1350564 -1

More details about the resulting behaviour:

Incidentally "swapoff -a" fails silently, which I think is a bug in
swapoff (i.e. it should report the error).

I don't see any backtraces in dmesg; the only errors it shows is the
failure to hibernate

If I free up some memory and try to hibernate again, I get a different
error - "Cannot find swap device, try swapon -a.".  Now swapon -a
doesn't work, but if I target the specific device...

# swapon /dev/sda6
# cat /proc/swaps
Filename                                Type            Size    Used   
Priority
/dev/sda6                               partition       1494004 331060  -1
/dev/sda6                               partition       1494004 0       -2

Wheee!  Interestingly I can now hibernate and resume.  But it's not
safe; my desktop session dies horribly.

Thanks
Alan

[ 1541.499324] kdeinit[10614]: segfault at 0 ip 00007fbd8515139a sp
00007fff8e04cf80 error 4 in libc-2.7.so[7fbd850dc000+158000]
[ 1541.520988] kdeinit[7322]: segfault at 0 ip 00007fbd8515139a sp
00007fff8e04cf50 error 4 in libc-2.7.so[7fbd850dc000+158000]
[ 1541.545512] kio_file[7343]: segfault at ffffffff00000000 ip
ffffffff00000000 sp 00007fff8e04c8a8 error 14
[ 1541.680883] Xorg[6928]: segfault at 692 ip 00007fb391c5bb92 sp
00007fff99e6a6c0 error 4 in ld-2.7.so[7fb391c4e000+1d000]

(ignore this bit, I think it happens every time X starts)
[ 1541.686628] Xorg:6928 freeing invalid memtype d0020000-d002a000
[ 1541.686748] Xorg:6928 freeing invalid memtype d002a000-d0032000
[ 1541.686848] Xorg:6928 freeing invalid memtype d0032000-d0042000
[ 1541.686948] Xorg:6928 freeing invalid memtype d0042000-d0043000
[ 1541.687221] Xorg:6928 freeing invalid memtype d0100000-d0740000
[ 1541.687421] Xorg:6928 freeing invalid memtype d077f000-d1a00000
[ 1541.687549] Xorg:6928 freeing invalid memtype d1a00000-d2040000
[ 1541.687649] Xorg:6928 freeing invalid memtype d2040000-d2680000
[ 1541.687750] Xorg:6928 freeing invalid memtype d2680000-d4680000
[ 1542.116794] Xorg:6928 freeing invalid memtype d0000000-d0020000

[ 1542.343673] pidgin[7505] general protection ip:7fcfdab92c92
sp:7fffe4e46ca0 error:0 in libX11.so.6.2.0[7fcfdab52000+ff000]
[ 1542.366374] update-notifier[7372] general protection ip:7f543a2acb92
sp:7fff424bc4a0 error:0 in ld-2.7.so[7f543a29f000+1d000]
[ 1545.569879] skge eth2: disabling interface
[ 1545.575729] skge eth2: enabling interface
[ 1545.579087] ADDRCONF(NETDEV_UP): eth2: link is not ready
[ 1546.966466] gedit[7423]: segfault at ffff88003740d448 ip
00007fb76a5d9053 sp 00007fff7ecf11d0 error 5 in
libqt-mt.so.3.3.8[7fb76a1f8000+9df000]
[ 1547.262323] skge eth2: Link is up at 100 Mbps, full duplex, flow
control both
[ 1547.262656] ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[ 1547.831519] evince[7916] general protection ip:7f92c525b794
sp:7fffcd46a680 error:0 in ld-2.7.so[7f92c524d000+1d000]
[ 1547.964072] python[7526]: segfault at 200515 ip 00007f98515237e3 sp
00007fff59737530 error 4 in ld-2.7.so[7f985151a000+1d000]
[ 1549.013419] console-kit-dae[6506]: segfault at 28 ip 00007f310cf8b7c5
sp 00007fff1519ef40 error 4 in ld-2.7.so[7f310cf82000+1d000]
[ 1557.926527] eth2: no IPv6 routers present
[ 1568.747175] dcopserver[7325]: segfault at 4617998 ip 00007fbd85111409
sp 00007fff8e04c7e0 error 4 in libc-2.7.so[7fbd850dc000+158000]
[ 1569.959891] eth2: no IPv6 routers present
Comment 1 Pavel Machek 2008-12-17 02:23:34 UTC
I tried it on 2.6.28-rc8-git, and did not get anything wrong. Can you retest with recent mainline?
Comment 2 Alan Jenkins 2008-12-20 06:16:46 UTC
Reproduced on 2.6.28-rc9 without difficulty.

I had to mount *three* failed hibernation attempts though.  I rebooted and had another go, and it seems 3 is the magic number.

I should say that I'm using uwswsusp.  I installed it from a badly-supported package in Universe; the version is 0.6~cvs20070618-1ubuntu2.  

I have a feeling I had to create some symlinks so pm-utils could find the s2ram binary. (I'm suspending from X using the option on the Kubuntu "logout" dialog).

I have 2GB of ram and 1.5GB of swap.  To provoke hibernate to fail due to lack of swap, I do two things

1. Start & use 10 of my favourite memory hogging GUI apps.
2. Create a 1Gb file on tmpfs.
Comment 3 Alan Jenkins 2009-01-19 10:15:44 UTC
I investigated a bit with gdb.

Look at swapon() and swapoff() in mm/swapfile.c.  They first open the file, then check for a duplicate in the list of swap files.  The comparison is made using the f_mapping member of struct file.

1) Before hibernation

 # cat /proc/swaps
 Filename                                Type            Size    Used    Priority
 /dev/sda7                               partition       1494004 1104332 -1

 (gdb) p nr_swapfiles
 $1 = 1
 (gdb) p swap_list
 $2 = {head = 0, next = 0}
 (gdb) p swap_info
 $3 = {flags = 3, prio = -1, next = -1, swap_file = 0xffff88007ad28b40,
    bdev = 0xffff88007ec26d80, extent_list = {next = 0xffff88007b9b7240,
     prev = 0xffff88007b9b7240}, curr_swap_extent = 0xffff88007b9b7240,
    swap_map = 0xffffc200104f6000, lowest_bit = 1, highest_bit = 373501,
    lowest_alloc = 0, highest_alloc = 0, cluster_next = 1, cluster_nr = 0,
    pages = 373501, max = 373502, inuse_pages = 0, old_block_size = 4096}

 (gdb) p swap_info[0].swap_file->f_mapping
 $4 = (struct address_space *) 0xffff88007ec26f78

2) After 3 failed hibernation cycles, swap_info[0].swap_file->f_mapping is unchanged

 (gdb) p nr_swapfiles
 $1 = 1
 (gdb) p swap_list
 $2 = {head = 0, next = 0}
 (gdb) p swap_info[0]
 $3 = {flags = 3, prio = -1, next = -1, swap_file = 0xffff88007ad28b40,
  bdev = 0xffff88007ec26d80, extent_list = {next = 0xffff88007b9b7240,
    prev = 0xffff88007b9b7240}, curr_swap_extent = 0xffff88007b9b7240,
  swap_map = 0xffffc200104f6000, lowest_bit = 4353, highest_bit = 373501,
  lowest_alloc = 0, highest_alloc = 0, cluster_next = 287568,
  cluster_nr = 177, pages = 373501, max = 373502, inuse_pages = 275248,
  old_block_size = 4096}
 (gdb) p swap_info[0].swap_file->f_mapping
 $4 = (struct address_space *) 0xffff88007ec26f78

3) But "swapon /dev/sda7" succeeds because when the swap partition is opened, a *different* f_mapping is returned:

# swapon /dev/sda7
# cat /proc/swaps
Filename                                Type            Size    Used    Priority
/dev/sda7                               partition       1494004 1097944 -1
/dev/sda7                               partition       1494004 0       -2

(gdb) p nr_swapfiles
$1 = 2
(gdb) p swap_list
$2 = {head = 0, next = 0}
(gdb) p swap_info[0]
$3 = {flags = 3, prio = -1, next = 1, swap_file = 0xffff88007ad28b40,
  bdev = 0xffff88007ec26d80, extent_list = {next = 0xffff88007b9b7240,
    prev = 0xffff88007b9b7240}, curr_swap_extent = 0xffff88007b9b7240,
  swap_map = 0xffffc200104f6000, lowest_bit = 4353, highest_bit = 373501,
  lowest_alloc = 0, highest_alloc = 0, cluster_next = 287568,
  cluster_nr = 177, pages = 373501, max = 373502, inuse_pages = 274766,
  old_block_size = 4096}
(gdb) p swap_info[1]
$4 = {flags = 3, prio = -2, next = -1, swap_file = 0xffff88005b01f300,
  bdev = 0xffff88007ee73a80, extent_list = {next = 0xffff88006712f300,
    prev = 0xffff88006712f300}, curr_swap_extent = 0xffff88006712f300,
  swap_map = 0xffffc20010402000, lowest_bit = 1, highest_bit = 373501,
  lowest_alloc = 0, highest_alloc = 0, cluster_next = 1, cluster_nr = 0,
  pages = 373501, max = 373502, inuse_pages = 0, old_block_size = 512}
(gdb) p swap_info[0].swap_file->f_mapping
$5 = (struct address_space *) 0xffff88007ec26f78
(gdb) p swap_info[1].swap_file->f_mapping
$6 = (struct address_space *) 0xffff88007ee73c78


Hypothesis
 - Maybe the swap code is making a wrong assumption, and f_mapping is allowed to differ between different opens of the same file.  But this seems unlikely to me.

 - Maybe the failed hibernation cycles end up closing swap_file.  All the data in swap_file is now stale (it is freed memory).  When the file is re-opened, it has a new f_mapping allocated at a different address.
Comment 4 Rafael J. Wysocki 2009-01-19 11:08:40 UTC
On Monday 19 January 2009, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12239
> 
> 
> 
> 
> 
> ------- Comment #3 from alan-jenkins@tuffmail.co.uk  2009-01-19 10:15 -------
> I investigated a bit with gdb.
> 
> Look at swapon() and swapoff() in mm/swapfile.c.  They first open the file,
> then check for a duplicate in the list of swap files.  The comparison is made
> using the f_mapping member of struct file.
>
>  - Maybe the failed hibernation cycles end up closing swap_file.  All the
>  data
> in swap_file is now stale (it is freed memory).  When the file is re-opened,
> it
> has a new f_mapping allocated at a different address.

Hibernation modifies the swap signature and if the resume fails, the signature
is never restored and swapon fails.  This is a known issue and it has always
worked like this.
Comment 5 Alan Jenkins 2009-01-20 09:57:56 UTC
(In reply to comment #4)
> On Monday 19 January 2009, bugme-daemon@bugzilla.kernel.org wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=12239
> > 
> > 
> > 
> > 
> > 
> > ------- Comment #3 from alan-jenkins@tuffmail.co.uk  2009-01-19 10:15
> -------
> > I investigated a bit with gdb.
> > 
> > Look at swapon() and swapoff() in mm/swapfile.c.  They first open the file,
> > then check for a duplicate in the list of swap files.  The comparison is
> made
> > using the f_mapping member of struct file.
> >
> >  - Maybe the failed hibernation cycles end up closing swap_file.  All the
> data
> > in swap_file is now stale (it is freed memory).  When the file is
> re-opened, it
> > has a new f_mapping allocated at a different address.
> 
> Hibernation modifies the swap signature and if the resume fails, the
> signature
> is never restored and swapon fails.  This is a known issue and it has always
> worked like this.

Sorry, I'm not resuming.  I shouldn't have said "cycles"; that's not what I meant.  I'm only hibernating - the hibernation fails because I don't have enough swap.  After exactly three failed hibernation attempts, something goes horribly wrong.

It leaves swapon() / swapoff() unable to recognise the current swap partition. So I can't "swapoff /dev/sda7", but I can "swapon /dev/sda7", and I get a /proc/swaps with two entries for /dev/sda7 (see above).

Similarly, I can't hibernate again, even if I free up memory.  I get the "try swappon -a" message.  Because when the kernel scans the swap list, it doesn't recognise the entry for "/dev/sda7" as being the same as the resume device any more.  Because resume_bdev doesn't match swap_list[0].bdev any more.
Comment 6 Rafael J. Wysocki 2009-01-20 12:06:01 UTC
Okay, sorry.

There probably is something wrong in the error path of swsusp, although I can find it at the moment.

At least, it should check if you have enough swap and fail cleanly if not.  Theoretically, this is, but for some obscure reason it doesn't work on your setup.  I haven't had the time to investigate it in detail recently.
Comment 7 Alan Jenkins 2009-01-23 09:21:33 UTC
Ok, I managed to hunt this down myself.

-----------------------------------------------
From: Alan Jenkins <alan-jenkins@tuffmail.co.uk>

The current image writing code drops a reference to the current swap device.  This doesn't show up if the hibernation succeeds - because it doesn't affect the image which gets resumed.  But it means multiple failed hibernations will free the swap device while it is still use.

swsusp_write() finds the block_device for the swap file using swap_type_of().  It then calls blkdev_get() / blkdev_put() in symmetric pairs.

Unfortunately, blkdev_get() assumes ownership of the inode of the block_device passed to it.  So blkdev_put() calls iput() on the inode.  This is by design and other callers expect this behaviour.  The fix is for swap_type_of() to take a reference on the inode using bdget().

diff --git a/mm/swapfile.c b/mm/swapfile.c
index f48b831..7740478 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -635,7 +635,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p)
 
 		if (!bdev) {
 			if (bdev_p)
-				*bdev_p = sis->bdev;
+				*bdev_p = bdget(sis->bdev->bd_dev);
 
 			spin_unlock(&swap_lock);
 			return i;
@@ -647,7 +647,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p)
 					struct swap_extent, list);
 			if (se->start_block == offset) {
 				if (bdev_p)
-					*bdev_p = sis->bdev;
+					*bdev_p = bdget(sis->bdev->bd_dev);
 
 				spin_unlock(&swap_lock);
 				bdput(bdev);
Comment 8 Alan Jenkins 2009-01-23 09:24:54 UTC
Created attachment 19955 [details]
Proposed fix
Comment 9 Rafael J. Wysocki 2009-01-23 13:56:46 UTC
On Friday 23 January 2009, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12239
> 
> 
> 
> 
> 
> ------- Comment #7 from alan-jenkins@tuffmail.co.uk  2009-01-23 09:21 -------
> Ok, I managed to hunt this down myself.

Great, please send the patch to me via e-mail (discard this message if you
already have).

Thanks!
Comment 10 Zhang Rui 2009-03-18 19:50:12 UTC
fixed by a1bb7d61233ba5fb5cd865f907a9ddcc8f8c02bd
Comment 11 Simon Holm Thøgersen 2009-03-19 04:05:01 UTC
*** Bug 12241 has been marked as a duplicate of this bug. ***