References: http://lkml.org/lkml/2009/10/4/64
I'm seeing a variety of BUGs on my EeePC 701 after hibernation. Sometimes they cause a hang during resume; sometimes they happen just after resume. The problem is intermittent, too: I've just hibernated three times in a row with no problems. It's most perplexing.
One resume hang showed a series of SCSI backtraces and errors. Unfortunately I wasn't able to capture it at the time. They were most probably related to the root device, an SSD controlled by ata_piix.
I have full dmesgs for two different sets of backtraces following resume: "bad swap file entry" backtraces, and BUGs in fget_light(). The fget_light() bug was on a slightly older kernel (still a descendant of 2.6.32-rc1). I will attach them forthwith.
Created attachment 23258 [details] Current kconfig
Created attachment 23259 [details] Dmesg showing "bad swap file entry" on current kernel
Created attachment 23260 [details] Another dmesg showing "bad swap file entry" on current kernel
Created attachment 23261 [details] Dmesg showing BUG in fget_light() on slightly earlier kernel
And then suspend-to-ram failed with a long freezer/scheduler debug spew, followed by a set of hung task warnings. I used suspend-to-ram a couple of times in the past few days before finding this problem.
Created attachment 23262 [details] /var/log/messages showing freezer/scheduler debug spew after suspend to ram
Ah, I should also have said that the suspend actually failed, unlike the hibernation problems.
Created attachment 23263 [details] Dmesg showing hung tasks after suspend to ram
This is the same event as the /var/log/messages attachment. Dmesg shows the "hung task" message which /var/log/messages misses. (/var/log/messages is otherwise more complete, since the kernel log buffer has overflowed.)
And here's ps showing the hung tasks:
$ ps ax | grep D
  PID TTY STAT TIME COMMAND
 3521 ?   D    0:00 /usr/lib/ConsoleKit/run-session.d/udev-acl.ck session_active_changed
 3691 ?   D    0:00 setfacl -m u:1000:rw /dev/audio -m u:1000:rw /dev/dsp -m u:1000:rw /dev/mixer -m u:1000:rw /dev/snd/controlC0 -m u:1000:rw /dev/snd/hwC0D0 -m u:1000:rw /dev/snd/pcmC0D0c -m u:1000:rw /dev/snd/pcmC0D0p -m u:1000:rw /dev/snd/timer -m u:1000:rw /dev/video0
 3874 ?   D    0:00 /usr/lib/ConsoleKit/run-session.d/udev-acl.ck session_active_changed
 4050 ?   D    0:00 setfacl -m u:1000:rw /dev/audio -m u:1000:rw /dev/dsp -m u:1000:rw /dev/mixer -m u:1000:rw /dev/snd/controlC0 -m u:1000:rw /dev/snd/hwC0D0 -m u:1000:rw /dev/snd/pcmC0D0c -m u:1000:rw /dev/snd/pcmC0D0p -m u:1000:rw /dev/snd/timer -m u:1000:rw /dev/video0
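A side note on the ps listing above: "grep D" matches any line containing a capital D, including the header and any command name with a D in it, so it can over-report. Below is a minimal sketch of a stricter filter that looks only at the STAT column. The function name uninterruptible_tasks and the second sample row are hypothetical, made up for illustration; the first sample row is taken from the listing above. It assumes the default five-column "ps ax" layout.

```python
def uninterruptible_tasks(ps_output):
    """Return (pid, command) pairs for `ps ax` rows whose STAT starts with 'D'."""
    tasks = []
    for line in ps_output.splitlines():
        # Assumed `ps ax` columns: PID TTY STAT TIME COMMAND.
        # COMMAND may contain spaces, so split at most four times.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip the header and any malformed rows
        pid, _tty, stat, _time, command = fields
        if stat.startswith("D"):
            tasks.append((int(pid), command))
    return tasks


# The 3600 row is hypothetical: a sleeping task whose name contains a "D",
# which a plain `grep D` would match but this filter ignores.
sample = """\
  PID TTY      STAT   TIME COMMAND
 3521 ?        D      0:00 /usr/lib/ConsoleKit/run-session.d/udev-acl.ck session_active_changed
 3600 ?        Ss     0:00 /usr/sbin/Daemon
"""

hung = uninterruptible_tasks(sample)
```

In practice the input would be the captured text of "ps ax"; only rows in uninterruptible sleep end up in the result.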
Is this reproducible without KMS?
The issue doesn't appear to be reproducible without KMS.
> 2. Did it work with 2.6.31 (and with KMS)?
I thought so, but my recollection is hazy. If I did test it, it wasn't for very long. I've tried it now and 2.6.31 behaves pretty similarly. Sorry for ringing the regression bell. (Corrected in bugzilla.)
Firstly the _hibernation_ process hung (after a couple of suspend-to-ram cycles). No text on the console (despite using s2disk). It echoed keypresses and responded to SysRq keys. No messages from lockdep or the hung task detector (after waiting 5 minutes). SysRq-P said we were in the idle loop; SysRq-T said both events/0 and hald-addon-input were runnable.
Then suspend-to-ram hung (following a hibernation cycle). This time it showed the contents of vt1, but didn't appear to respond to anything short of SysRq+B.
Can you bisect it? Sounds like we may be corrupting memory...
I don't have a "last known good version" with KMS enabled. It seems to be working fine now (2.6.32-rc8). I use hibernation frequently and I've left KMS enabled, so I'll notice if it breaks again :).
Ok, thanks for the update.