Bug 12241

Summary: Crash during suspend to disk
Product: Power Management
Reporter: Simon Holm Thøgersen (odie)
Component: Hibernation/Suspend
Assignee: power-management_other
Status: CLOSED DUPLICATE
Severity: normal
CC: lenb, pavel, rjw, rui.zhang
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.28-rc6-00007-ged31348
Subsystem:
Regression: No
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216
Attachments: oops part 1, oops part 2, oops part 3

Description Simon Holm Thøgersen 2008-12-16 14:39:38 UTC
Latest working kernel version: Long term bug
Earliest failing kernel version: At least 2.6.27
Distribution: Gentoo
Hardware/Software: 768 MB RAM, 1 GB swap partition, reiserfs filesystems

Problem Description:
The problem occurred on my fourth attempt in quick succession to suspend to disk. The first three did not succeed due to insufficient free swap. I then closed a few running programs to free some extra memory and made a fourth suspend to disk attempt, which provoked the BUG with the trace below.

This is at least the third time I've seen this, but as I reported earlier at [1] I did not capture the OOPS properly. I'm pretty sure the bug dates from at least 2.6.27, but it could have been there for years as far as I'm concerned, since suspend to disk works well under most circumstances and I haven't been using it that much earlier.

I'm running into this bug when my computer has had an uptime of weeks, so it is not exactly easy to reproduce. I have tried to provoke the crash on a fresh bootup without luck, but I haven't tried really hard and I'd say it is reliably reproducible if I put the right effort into it.

The key to reproducing it seems to be allocating enough memory for suspend to disk to fail due to insufficient swap space, making multiple attempts in succession, and then freeing a bit of memory and attempting again. I'm not sure whether the multiple attempts in succession are really required, but I believe that I've been doing that each time the BUG has shown its ugly face.
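
As a rough illustration of the swap condition described above (not part of the original report), the sketch below parses /proc/meminfo-style text and compares free swap against memory currently in use. The function names and the "free swap minus memory in use" heuristic are assumptions made for illustration; the kernel's actual image sizing is more involved, so treat the result only as a hint about whether an attempt is likely to fail for lack of swap.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def swap_headroom_kb(fields):
    """Crude estimate (assumption): free swap minus memory in use, in kB.

    A negative value suggests a hibernation image of the memory in use
    would not fit in the free swap.
    """
    used_kb = fields["MemTotal"] - fields["MemFree"]
    return fields["SwapFree"] - used_kb


def read_live_meminfo(path="/proc/meminfo"):
    """Read the live values on a Linux system."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

On a Linux machine, swap_headroom_kb(read_live_meminfo()) gives the estimate for the current state; checking it before and after closing programs shows roughly how much memory was freed relative to the available swap.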

The OOPS looks like this; the full details can be seen in the pictures I'm attaching. I'd gladly type in the text from the pictures if anyone finds it useful and requests it.

kernel BUG at fs/inode.c:1153!ks ...
invalid opcode 0000 [#1] PREEMPT

EIP is at iput
Pid: 2515, comm: bash Tainted: G         W (2.6.28-rc6-00007-ged31348 #137)
Call Trace:
  __blkdev_put
  swsusp_write
  hibernate
  state_store
  state_store
  state_store
  kobj_attr_store
  sysfs_write_file
  sysfs_write_file
  vfs_write
  sys_write
  sysenter_do_call
  down_write
Code: ......
EIP: [<...>] iput
---[ end trace ........................... ]---

About the following:
Tainted: G         W

I checked my logs and could not find any warning backtraces in them so I'm not sure why W is there.

[1] http://lkml.org/lkml/2008/11/20/146
Comment 1 Simon Holm Thøgersen 2008-12-16 14:43:09 UTC
Created attachment 19334 [details]
oops part 1
Comment 2 Simon Holm Thøgersen 2008-12-16 14:44:13 UTC
Created attachment 19335 [details]
oops part 2
Comment 3 Simon Holm Thøgersen 2008-12-16 14:44:47 UTC
Created attachment 19336 [details]
oops part 3
Comment 4 Pavel Machek 2008-12-17 02:25:14 UTC
root@amd:~# echo disk > /sys/power/state
-bash: echo: write error: No space left on device
root@amd:~# echo disk > /sys/power/state
-bash: echo: write error: No space left on device
root@amd:~# swapoff /dev/sda1
root@amd:~# mkswap /dev/sda1
Setting up swapspace version 1, size = 1011703 kB
no label, UUID=7805f4f4-b42f-4f54-bd18-5fbe81fb5284
root@amd:~# swapon /dev/sda1
root@amd:~# echo disk > /sys/power/state


My naive attempt at reproducing this failed :-(.
Comment 5 Simon Holm Thøgersen 2008-12-19 02:45:05 UTC
(In reply to comment #4)
> My naive attempt at reproducing this failed :-(.

Well, first of all, I've got quite a bit of the swap partition in use when doing the suspend to disk attempts, including the ones that fail. Also, I'm normally able to suspend to disk, and I've sometimes successfully suspended to disk even though I got "No space left on device" just before.

Is there any debug instrumentation I could add? I'm able to reproduce, though not very easily.

Would it be a good idea to report this on linux-fsdevel, since the BUG I'm hitting is in the VFS code after all?
Comment 6 Zhang Rui 2009-03-18 19:53:40 UTC
please verify if this is a duplicate of bug #12239
Comment 7 Simon Holm Thøgersen 2009-03-19 04:05:01 UTC
(In reply to comment #6)
> please verify if this is a duplicate of bug #12239

It is, or at least my issue is gone using a1e4ee22863d41a6fbb24310d7951836cb6dafe7. Thanks a lot for pointing that bug out to me, Zhang.


*** This bug has been marked as a duplicate of bug 12239 ***