Created attachment 72224 [details] syslog messages

Sometimes it even crashes when I log out of my GNOME session, but resuming from hibernation or suspend crashes more often (about 30% of the time). I use Fedora 16 on an ASRock H61M-ITX with an Intel Core i7-2600K and no second graphics card. Maybe it is not hibernate/suspend related? I can't say whether it is a new bug, because the box is new. I play Second Life a lot, using /usr/lib64/xorg/modules/drivers/intel_drv.so and the onboard audio driver (HDMI and analog stereo).

Linux version 3.2.2-1.fc16.x86_64 (mockbuild@x86-13.phx2.fedoraproject.org) (gcc version 4.6.2 20111027 (Red Hat 4.6.2-1) (GCC) ) #1 SMP Thu Jan 26 03:21:58 UTC 2012
Created attachment 72225 [details] contents of /proc/cpuinfo
Is this really related to suspend or hibernate? Can you reproduce it (by logging out, or whatever) on a system where you have never suspended or hibernated since booting?
Last night it refused to hibernate (it got stuck at a black screen with a white cursor). For the next few days I will just turn off the monitors instead of using suspend/hibernate. Playing Second Life once after a fresh reboot and then logging off and on again doesn't trigger the bug. But I'm quite sure that the uptime before each hang/panic was more than a day, which means I had hibernated/suspended at least once, IIRC. Using rtcwake or not makes no difference.
Created attachment 72245 [details] output of lspci -v
Created attachment 72246 [details] output of lsusb -v
Created attachment 72247 [details] output of lspci -v
No crash after 28 hours of uptime and normal use (just no suspend). :-) -arne
The intel-gfx@lists.freedesktop.org mailing list told me this: http://lists.freedesktop.org/archives/intel-gfx/2012-January/014825.html It seems to be a known problem. -arne
Last night I hibernated again (after 65 hours of uptime), and today after thawing it again showed this: list_add corruption. next->prev should be prev (ffff88023017cbf8), but was (null). (next=ffff88023017cbf8). It also failed to reboot. Now I will try TuxOnIce and the shutdown method. -arne
/sys/power/disk: "shutdown" isn't better than "platform". TuxOnIce: I don't know how to activate it; it seems I need a custom kernel. As a workaround, I will now use "halt -p" and restore the applications every morning. :-) -arne
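For anyone comparing the "shutdown" and "platform" methods, here is a minimal sketch for checking which hibernation mode is selected. It relies on the kernel listing the supported modes in /sys/power/disk and marking the active one with brackets. The function names are made up for illustration, and the optional file argument exists only so the parsing can be tried on a copy of the file.

```shell
#!/bin/sh
# Print the hibernation mode currently selected in a /sys/power/disk-style
# file. The kernel lists the supported modes and puts brackets around the
# active one, e.g. "[platform] shutdown reboot".
# current_hibernation_mode is a hypothetical helper name.
current_hibernation_mode() {
    disk_file="${1:-/sys/power/disk}"
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$disk_file"
}

# Succeed if the named mode is offered at all (selected or not).
# mode_supported is a hypothetical helper name.
mode_supported() {
    mode="$1"
    disk_file="${2:-/sys/power/disk}"
    tr -d '[]' < "$disk_file" | tr ' ' '\n' | grep -qx "$mode"
}
```

To switch modes, root can write the mode name into the file (for example `echo shutdown > /sys/power/disk`) and then start hibernation with `echo disk > /sys/power/state`.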
Today I got a new symptom: I hibernated the box, and when I came back I tried to thaw it. Everything went fairly well at first (for some time just a cursor, then the background image on the right monitor). But then the left monitor still had a black background with very fast scrolling messages that I couldn't read. Is there a workaround? I mean, KMS is quite old... or isn't this KMS related? Would buying a dedicated graphics card increase overall stability? Why are graphics cards so complicated? :-) -arne
3.2.5-3.fc16.x86_64 has this bug, too: I just ran "find /sys | grep fan" after two otherwise successful hibernate/thaw cycles, and GNOME crashed (black-and-white text mode with panic messages) and I had to reboot. -arne
It did it again. This time it seems like it tried to execute an invalid instruction. :-)

kernel BUG at fs/dcache.c:154!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: usblp tcp_lp ftdi_sio binfmt_misc bnep bluetooth rfkill ppdev parport_pc lp parport fuse nfs fscache auth_rpcgss nfs_acl lockd nf_conntrack_tftp ipt_LOG ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables coretemp w83627ehf hwmon_vid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd iTCO_wdt cdc_acm r8169 mii iTCO_vendor_support i2c_i801 soundcore snd_page_alloc microcode sunrpc uinput i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
Pid: 70, comm: kswapd0 Not tainted 3.2.6-3.fc16.x86_64 #1 To Be Filled By O.E.M. To Be Filled By O.E.M./H61M-ITX
RIP: 0010:[<ffffffff8118dd04>] [<ffffffff8118dd04>] d_free+0x64/0x70
RSP: 0018:ffff88022e19db20 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8802303d79c0 RCX: 00000000ffffffff
RDX: 0000000000000002 RSI: ffff880214dab150 RDI: ffff8802303d79c0
RBP: ffff88022e19db30 R08: ffff8802303d7a70 R09: ffffc90000002000
R10: 000000000001ccf0 R11: 0000000000000002 R12: ffff880214dab0d0
R13: ffff8802303d7cc0 R14: ffff8802303d7a70 R15: ffff8802303d7cc0
FS: 0000000000000000(0000) GS:ffff88023fa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000e8c0a040 CR3: 0000000211dad000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 70, threadinfo ffff88022e19c000, task ffff88022e1ddc80)
Stack:
 ffff8802303d7cc0 ffff8802303d79c0 ffff88022e19db60 ffffffff8118ee17
 ffff8802303d79c0 ffff88022e19dbf0 ffff880214dab0d0 ffff8802303d7a1c
 ffff88022e19dbc0 ffffffff8118eff7 0000000000000001 ffff8802303d79c0
Call Trace:
 [<ffffffff8118ee17>] d_kill+0xa7/0x100
 [<ffffffff8118eff7>] shrink_dentry_list+0x187/0x1e0
 [<ffffffff8118fc11>] prune_dcache_sb+0x121/0x140
 [<ffffffff8117c060>] prune_super+0x130/0x1a0
 [<ffffffff8112bab4>] shrink_slab+0x154/0x310
 [<ffffffff8112f22a>] balance_pgdat+0x4fa/0x6c0
 [<ffffffff8112f568>] kswapd+0x178/0x3d0
 [<ffffffff815df2c4>] ? __schedule+0x3d4/0x8c0
 [<ffffffff81090440>] ? remove_wait_queue+0x50/0x50
 [<ffffffff8112f3f0>] ? balance_pgdat+0x6c0/0x6c0
 [<ffffffff8108fb9c>] kthread+0x8c/0xa0
 [<ffffffff815ebaf4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8108fb10>] ? kthread_worker_fn+0x190/0x190
 [<ffffffff815ebaf0>] ? gs_change+0x13/0x13
Code: bb 90 00 00 00 74 18 48 c7 c6 e0 da 18 81 e8 b4 6a f5 ff 48 83 c4 08 5b 5d c3 0f 1f 44 00 00 e8 e3 fd ff ff 48 83 c4 08 5b 5d c3 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41
RIP [<ffffffff8118dd04>] d_free+0x64/0x70
 RSP <ffff88022e19db20>
HDMI hot plug event: Codec=3 Pin=7 Presence_Detect=0 ELD_Valid=0
HDMI status: Codec=3 Pin=7 Presence_Detect=0 ELD_Valid=0
HDMI hot plug event: Codec=3 Pin=7 Presence_Detect=1 ELD_Valid=0
HDMI status: Codec=3 Pin=7 Presence_Detect=1 ELD_Valid=0

-arne
Today (after the first thaw since the last reboot) it says this:

# find /sys | wc -l
find: WARNING: file `/sys/kernel/debug/dri/64/i915_blt_ringbuffer_data' appears to have mode 0000
23480
# ls -l /sys/kernel/debug/dri/64/i915_blt_ringbuffer_data
?--------- 1 root root 0 Feb 21 20:14 /sys/kernel/debug/dri/64/i915_blt_ringbuffer_data

Looks like it needs a reboot. Why does it do that? -arne
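To check quickly after a thaw whether other entries have ended up in the same state, something like the following could be used. It is only a sketch: the function name is made up, the default directory is simply the one from the transcript above, and `-perm 0000` matches on permission bits only, so it cannot tell why an entry looks like this.

```shell
#!/bin/sh
# List every entry below a directory whose permission bits are all zero,
# which is the condition find warned about ("appears to have mode 0000").
# list_mode_zero is a hypothetical helper name; the default directory is
# taken from the transcript in this report and may differ on other systems.
list_mode_zero() {
    dir="${1:-/sys/kernel/debug/dri}"
    # Errors from unreadable entries are discarded so the output is
    # just the list of matching paths.
    find "$dir" -perm 0000 2>/dev/null
}
```

Running `list_mode_zero /sys` as root after each thaw and comparing against a fresh boot would show whether the damage is limited to the i915 debugfs files.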
With 3.2.7-1.fc16.x86_64 it still doesn't thaw properly (in spite of "3.2.6-4 Freeze all filesystems during system suspend/hibernate."):

------------[ cut here ]------------
kernel BUG at fs/inode.c:429!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: tcp_lp usblp binfmt_misc usb_storage ppdev parport_pc lp parport fuse bnep bluetooth rfkill nfs fscache auth_rpcgss nfs_acl lockd ipt_LOG nf_conntrack_ipv4 nf_conntrack_tftp ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables coretemp w83627ehf hwmon_vid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore r8169 snd_page_alloc iTCO_wdt mii cdc_acm i2c_i801 microcode iTCO_vendor_support sunrpc uinput i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
Pid: 26157, comm: crond Tainted: G I 3.2.7-1.fc16.x86_64 #1 To Be Filled By O.E.M. To Be Filled By O.E.M./H61M-ITX
RIP: 0010:[<ffffffff811921dc>] [<ffffffff811921dc>] end_writeback+0x9c/0xa0
RSP: 0018:ffff880231539e08 EFLAGS: 00010207
RAX: ffff880230180c00 RBX: ffff880230180a30 RCX: dead000000200200
RDX: 000000000000002f RSI: ffff880230180ab0 RDI: ffff880230180b88
RBP: ffff880231539e18 R08: ffff88020cadc5f0 R09: 0000000000000001
R10: ffff88009e0aeb10 R11: 0000000000000001 R12: ffff880230180b28
R13: ffffffff816790a0 R14: ffffffff816790a0 R15: ffff880230180a30
FS: 00007f49d17e07c0(0000) GS:ffff88023fa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff9405c030 CR3: 0000000209629000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process crond (pid: 26157, threadinfo ffff880231538000, task ffff8802314f2e40)
Stack:
 0000000000000000 ffff880230180a30 ffff880231539e48 ffffffff81192412
 ffff880231539e58 ffff880230180a30 ffff880230180ab0 ffff880232498800
 ffff880231539e78 ffffffff81192593 ffff88020cadc540 ffff880230180a30
Call Trace:
 [<ffffffff81192412>] evict+0x122/0x1a0
 [<ffffffff81192593>] iput+0x103/0x200
 [<ffffffff8118f0e0>] d_kill+0xf0/0x100
 [<ffffffff8118f772>] dput+0xe2/0x1b0
 [<ffffffff8117aad6>] fput+0x176/0x220
 [<ffffffff81177216>] filp_close+0x66/0x90
 [<ffffffff811772d8>] sys_close+0x98/0xf0
 [<ffffffff815e9d82>] system_call_fastpath+0x16/0x1b
Code: 02 00 00 00 48 c7 c2 a0 10 19 81 be 07 00 00 00 e8 fa e5 44 00 48 c7 83 98 00 00 00 60 00 00 00 48 83 c4 08 5b 5d c3 0f 0b 0f 0b <0f> 0b 0f 0b 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb
RIP [<ffffffff811921dc>] end_writeback+0x9c/0xa0
 RSP <ffff880231539e08>
HDMI hot plug event: Codec=3 Pin=7 Presence_Detect=0 ELD_Valid=0
HDMI status: Codec=3 Pin=7 Presence_Detect=0 ELD_Valid=0
HDMI hot plug event: Codec=3 Pin=7 Presence_Detect=1 ELD_Valid=0
HDMI status: Codec=3 Pin=7 Presence_Detect=1 ELD_Valid=0
---[ end trace a7919e7f17c0a727 ]---

-arne
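Since the failures so far have shown up with three different messages (list_add corruption, a BUG in fs/dcache.c, a BUG in fs/inode.c), a small filter may help when comparing saved logs from several thaws. This is only a sketch: the function name is made up, the patterns are just the message types quoted in this report, and the log location is an assumption that depends on the syslog setup.

```shell
#!/bin/sh
# Print only the lines of a saved log that identify a kernel failure.
# The patterns are the three message types quoted in this report; other
# failures would need additional patterns.
# oops_summary is a hypothetical helper name.
oops_summary() {
    log_file="$1"
    grep -E 'kernel BUG at |list_add corruption|invalid opcode: ' "$log_file"
}
```

For example, `oops_summary /var/log/messages` would work if the system logs kernel messages there; that path is an assumption, not something confirmed in this report.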
Since I started disabling the write cache of my hard disk (WDC WD10EARS-00Y5B1) 10 seconds before hibernation (with "shutdown"), I have been able to thaw 4 times without any intermediate reboot/oops/panic. :-) -arne
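The sequence described above could be scripted roughly like this. It is a sketch under assumptions, not the exact commands used: it assumes hdparm is installed, that the disk is /dev/sda (a hypothetical default), and that hibernation is started through /sys/power/state. Both function names are made up. Setting DRY_RUN prints the commands instead of running them.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set, so the sequence
# can be checked without touching the disk or the power state.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Disable the drive's write cache, wait, then hibernate with the
# "shutdown" method.
# $1: block device (hypothetical default /dev/sda)
# $2: delay in seconds between disabling the cache and hibernating
hibernate_without_write_cache() {
    dev="${1:-/dev/sda}"
    delay="${2:-10}"
    run hdparm -W0 "$dev"    # -W0 switches the drive's write cache off
    run sync                 # flush dirty pages to the disk
    run sleep "$delay"
    run sh -c 'echo shutdown > /sys/power/disk'
    run sh -c 'echo disk > /sys/power/state'
}
```

It needs to run as root, for example `hibernate_without_write_cache /dev/sda 10`. The cache can be switched back on with `hdparm -W1` after the thaw.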
Neither (1) disabling the cache of my hard disk nor (2) emptying the swap area is a workaround for this bug. Now I will try TuxOnIce from ATrpms.
Seems to work now, with 3.3.2-1.fc17.x86_64.
works now...