Bug 12239
Summary: | Swap breaks after 2x hibernation failure due to lack of swap | ||
---|---|---|---|
Product: | Power Management | Reporter: | Alan Jenkins (alan-jenkins) |
Component: | Hibernation/Suspend | Assignee: | power-management_other |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | odie, pavel, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.27-rc9 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 7216 | ||
Attachments: | Proposed fix |
Description
Alan Jenkins
2008-12-16 13:14:49 UTC
I tried it on 2.6.28-rc8-git, and did not get anything wrong. Can you retest with recent mainline? Reproduced on 2.6.28-rc9 without difficulty. I had to mount *three* failed hibernation attempts though. I rebooted and had another go, and it seems 3 is the magic number. I should say that I'm using uwswsusp. I installed it from a badly-supported package in Universe; the version is 0.6~cvs20070618-1ubuntu2. I have a feeling I had to create some symlinks so pm-utils could find the s2ram binary. (I'm suspending from X using the option on the Kubuntu "logout" dialog). I have 2GB of ram and 1.5GB of swap. To provoke hibernate to fail due to lack of swap, I do two things 1. Start & use 10 of my favourite memory hogging GUI apps. 2. Create a 1Gb file on tmpfs. I investigated a bit with gdb. Look at swapon() and swapoff() in mm/swapfile.c. They first open the file, then check for a duplicate in the list of swap files. The comparison is made using the f_mapping member of struct file. 1) Before hibernation # cat /proc/swaps Filename Type Size Used Priority /dev/sda7 partition 1494004 1104332 -1 (gdb) p nr_swapfiles $1 = 1 (gdb) p swap_list $2 = {head = 0, next = 0} (gdb) p swap_info $3 = {flags = 3, prio = -1, next = -1, swap_file = 0xffff88007ad28b40, bdev = 0xffff88007ec26d80, extent_list = {next = 0xffff88007b9b7240, prev = 0xffff88007b9b7240}, curr_swap_extent = 0xffff88007b9b7240, swap_map = 0xffffc200104f6000, lowest_bit = 1, highest_bit = 373501, lowest_alloc = 0, highest_alloc = 0, cluster_next = 1, cluster_nr = 0, pages = 373501, max = 373502, inuse_pages = 0, old_block_size = 4096} (gdb) p swap_info[0].swap_file->f_mapping $4 = (struct address_space *) 0xffff88007ec26f78 2) After 3 failed hibernation cycles, swap_info[0].swap_file->f_mapping is unchanged (gdb) p nr_swapfiles $1 = 1 (gdb) p swap_list $2 = {head = 0, next = 0} (gdb) p swap_info[0] $3 = {flags = 3, prio = -1, next = -1, swap_file = 0xffff88007ad28b40, bdev = 0xffff88007ec26d80, extent_list = {next = 0xffff88007b9b7240, prev = 0xffff88007b9b7240}, curr_swap_extent = 0xffff88007b9b7240, swap_map = 0xffffc200104f6000, lowest_bit = 4353, highest_bit = 373501, lowest_alloc = 0, highest_alloc = 0, cluster_next = 287568, cluster_nr = 177, pages = 373501, max = 373502, inuse_pages = 275248, old_block_size = 4096} (gdb) p swap_info[0].swap_file->f_mapping $4 = (struct address_space *) 0xffff88007ec26f78 3) But "swapon /dev/sda7" succeeds because when the swap partition is opened, a *different* f_mapping is returned: # swapon /dev/sda7 # cat /proc/swaps Filename Type Size Used Priority /dev/sda7 partition 1494004 1097944 -1 /dev/sda7 partition 1494004 0 -2 (gdb) p nr_swapfiles $1 = 2 (gdb) p swap_list $2 = {head = 0, next = 0} (gdb) p swap_info[0] $3 = {flags = 3, prio = -1, next = 1, swap_file = 0xffff88007ad28b40, bdev = 0xffff88007ec26d80, extent_list = {next = 0xffff88007b9b7240, prev = 0xffff88007b9b7240}, curr_swap_extent = 0xffff88007b9b7240, swap_map = 0xffffc200104f6000, lowest_bit = 4353, highest_bit = 373501, lowest_alloc = 0, highest_alloc = 0, cluster_next = 287568, cluster_nr = 177, pages = 373501, max = 373502, inuse_pages = 274766, old_block_size = 4096} (gdb) p swap_info[1] $4 = {flags = 3, prio = -2, next = -1, swap_file = 0xffff88005b01f300, bdev = 0xffff88007ee73a80, extent_list = {next = 0xffff88006712f300, prev = 0xffff88006712f300}, curr_swap_extent = 0xffff88006712f300, swap_map = 0xffffc20010402000, lowest_bit = 1, highest_bit = 373501, lowest_alloc = 0, highest_alloc = 0, cluster_next = 1, cluster_nr = 0, pages = 373501, max = 373502, inuse_pages = 0, old_block_size = 512} (gdb) p swap_info[0].swap_file->f_mapping $5 = (struct address_space *) 0xffff88007ec26f78 (gdb) p swap_info[1].swap_file->f_mapping $6 = (struct address_space *) 0xffff88007ee73c78 Hypothesis - Maybe the swap code is making a wrong assumption, and f_mapping is allowed to differ between different opens of the same file. But this seems unlikely to me. - Maybe the failed hibernation cycles end up closing swap_file. All the data in swap_file is now stale (it is freed memory). When the file is re-opened, it has a new f_mapping allocated at a different address. On Monday 19 January 2009, bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=12239 > > > > > > ------- Comment #3 from alan-jenkins@tuffmail.co.uk 2009-01-19 10:15 ------- > I investigated a bit with gdb. > > Look at swapon() and swapoff() in mm/swapfile.c. They first open the file, > then check for a duplicate in the list of swap files. The comparison is made > using the f_mapping member of struct file. > > - Maybe the failed hibernation cycles end up closing swap_file. All the > data > in swap_file is now stale (it is freed memory). When the file is re-opened, > it > has a new f_mapping allocated at a different address. Hibernation modifies the swap signature and if the resume fails, the signature is never restored and swapon fails. This is a known issue and it has always worked like this. (In reply to comment #4) > On Monday 19 January 2009, bugme-daemon@bugzilla.kernel.org wrote: > > http://bugzilla.kernel.org/show_bug.cgi?id=12239 > > > > > > > > > > > > ------- Comment #3 from alan-jenkins@tuffmail.co.uk 2009-01-19 10:15 > ------- > > I investigated a bit with gdb. > > > > Look at swapon() and swapoff() in mm/swapfile.c. They first open the file, > > then check for a duplicate in the list of swap files. The comparison is > made > > using the f_mapping member of struct file. > > > > - Maybe the failed hibernation cycles end up closing swap_file. All the > data > > in swap_file is now stale (it is freed memory). When the file is > re-opened, it > > has a new f_mapping allocated at a different address. > > Hibernation modifies the swap signature and if the resume fails, the > signature > is never restored and swapon fails. This is a known issue and it has always > worked like this. Sorry, I'm not resuming. I shouldn't have said "cycles"; that's not what I meant. I'm only hibernating - the hibernation fails because I don't have enough swap. After exactly three failed hibernation attempts, something goes horribly wrong. It leaves swapon() / swapoff() unable to recognise the current swap partition. So I can't "swapoff /dev/sda7", but I can "swapon /dev/sda7", and I get a /proc/swaps with two entries for /dev/sda7 (see above). Similarly, I can't hibernate again, even if I free up memory. I get the "try swappon -a" message. Because when the kernel scans the swap list, it doesn't recognise the entry for "/dev/sda7" as being the same as the resume device any more. Because resume_bdev doesn't match swap_list[0].bdev any more. Okay, sorry. There probably is something wrong in the error path of swsusp, although I can find it at the moment. At least, it should check if you have enough swap and fail cleanly if not. Theoretically, this is, but for some obscure reason it doesn't work on your setup. I haven't had the time to investigate it in detail recently. Ok, I managed to hunt this down myself. ----------------------------------------------- From: Alan Jenkins <alan-jenkins@tuffmail.co.uk> The current image writing code drops a reference to the current swap device. This doesn't show up if the hibernation succeeds - because it doesn't affect the image which gets resumed. But it means multiple failed hibernations will free the swap device while it is still use. swsusp_write() finds the block_device for the swap file using swap_type_of(). It then calls blkdev_get() / blkdev_put() in symmetric pairs. Unfortunately, blkdev_get() assumes ownership of the inode of the block_device passed to it. So blkdev_put() calls iput() on the inode. This is by design and other callers expect this behaviour. The fix is for swap_type_of() to take a reference on the inode using bdget(). diff --git a/mm/swapfile.c b/mm/swapfile.c index f48b831..7740478 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -635,7 +635,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) if (!bdev) { if (bdev_p) - *bdev_p = sis->bdev; + *bdev_p = bdget(sis->bdev->bd_dev); spin_unlock(&swap_lock); return i; @@ -647,7 +647,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) struct swap_extent, list); if (se->start_block == offset) { if (bdev_p) - *bdev_p = sis->bdev; + *bdev_p = bdget(sis->bdev->bd_dev); spin_unlock(&swap_lock); bdput(bdev); Created attachment 19955 [details]
Proposed fix
On Friday 23 January 2009, bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=12239 > > > > > > ------- Comment #7 from alan-jenkins@tuffmail.co.uk 2009-01-23 09:21 ------- > Ok, I managed to hunt this down myself. Great, please send the patch to me via e-mail (discard this message if you already have). Thanks! fixed by a1bb7d61233ba5fb5cd865f907a9ddcc8f8c02bd *** Bug 12241 has been marked as a duplicate of this bug. *** |