Bug 54311 - computer hangs during resume from S3 in i915 drm
Status: RESOLVED CODE_FIX
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: All Linux
Importance: P1 normal
Assigned To: intel-gfx-bugs@lists.freedesktop.org
Depends on:
Blocks: 56331
Reported: 2013-02-23 10:56 UTC by Thomas Meyer
Modified: 2013-04-09 06:23 UTC (History)

See Also:
Kernel Version: 3.8.0
Tree: Mainline
Regression: Yes

Description Thomas Meyer 2013-02-23 10:56:46 UTC
I put my machine into suspend-to-RAM in the evening. The next day I resumed from RAM and the computer hung (it was frozen).

The strange thing is that the 3.8.0 kernel seems to resume fine after short sleep periods (a few seconds).

I'll switch back to 3.7.8, which doesn't show this behaviour.
Comment 1 Aaron Lu 2013-02-25 05:23:28 UTC
(In reply to comment #0)
> I put my machine into suspend-to-RAM in the evening. The next day I resumed
> from RAM and the computer hung (it was frozen).

Did you see any output when it hangs?
You can try booting into console mode and adding nomodeset no_console_suspend to the kernel command line to see what happened.

> 
> The strange thing is that the 3.8.0 kernel seems to resume fine after short
> sleep periods (a few seconds).
> 
> I'll switch back to 3.7.8, which doesn't show this behaviour.

Using git-bisect to find the offending commit would be best :-)
Thanks.
Comment 2 Thomas Meyer 2013-03-03 12:28:42 UTC
Hi,

Some more info:

The crash is non-deterministic and does not seem to depend on the duration of the S2R, as the bug's title suggests.

The kernel wakes up correctly and I can ssh into the machine. The problem seems to be the graphics subsystem: the X server appears to hang. This is the i915 driver.


The problem still occurs in 3.8.1

I tried to attach gdb to the Xorg process, but that got stuck somehow!
Comment 3 Thomas Meyer 2013-03-03 12:34:58 UTC
$ su -c 'echo w > /proc/sysrq-trigger'

[...]
[81535.305064] SysRq : Show Blocked State
[81535.305072]   task                        PC stack   pid father
[81535.305096] Xorg            D 7fffffffffffffff     0   552      1 0x00400084
[81535.305102]  ffff88012f6bfad8 0000000000000086 ffff88012a085f40 ffff88012f6bffd8
[81535.305106]  ffff88012f6bffd8 ffff88012f6bffd8 ffff88010f9d8000 ffff88012a085f40
[81535.305109]  0000000100000000 0000000000000304 003fffffffc00c2a 0000000000000000
[81535.305113] Call Trace:
[81535.305123]  [<ffffffff814f9163>] schedule+0x23/0x60
[81535.305127]  [<ffffffff814f7e35>] schedule_timeout+0x105/0x140
[81535.305131]  [<ffffffff814f8ffa>] wait_for_common+0xaa/0x140
[81535.305137]  [<ffffffff810553e0>] ? try_to_wake_up+0x80/0x80
[81535.305141]  [<ffffffff814f9138>] wait_for_completion+0x18/0x20
[81535.305146]  [<ffffffff810478ea>] flush_workqueue+0x10a/0x390
[81535.305152]  [<ffffffff81317223>] intel_crtc_page_flip+0x133/0x350
[81535.305157]  [<ffffffff812e49a5>] drm_mode_page_flip_ioctl+0x235/0x2a0
[81535.305161]  [<ffffffff812df161>] ? drm_mode_object_find+0x61/0x90
[81535.305165]  [<ffffffff812defcc>] ? drm_crtc_convert_to_umode+0xcc/0x150
[81535.305170]  [<ffffffff812d3bd3>] drm_ioctl+0x4c3/0x570
[81535.305174]  [<ffffffff812e4770>] ? drm_mode_gamma_get_ioctl+0x120/0x120
[81535.305180]  [<ffffffff810f7fca>] do_vfs_ioctl+0x8a/0x560
[81535.305186]  [<ffffffff811b9635>] ? inode_has_perm.isra.40.constprop.70+0x25/0x30
[81535.305190]  [<ffffffff811babaf>] ? file_has_perm+0x8f/0xa0
[81535.305190]  [<ffffffff810f8531>] sys_ioctl+0x91/0xb0
[81535.305190]  [<ffffffff814ffa50>] system_call_fastpath+0x16/0x1b
[81535.305190] kworker/u:52    D ffffffff8150b740     0  4107      2 0x00000080
[81535.305190]  ffff88012b1d7d28 0000000000000046 ffff8800af349900 ffff88012b1d7fd8
[81535.305190]  ffff88012b1d7fd8 ffff88012b1d7fd8 ffffffff816d4460 ffff8800af349900
[81535.305190]  ffff88012b1d7d68 ffffffff814f8bfd ffff8800af349900 ffff88013b21bb30
[81535.305190] Call Trace:
[81535.305190]  [<ffffffff814f8bfd>] ? __schedule+0x22d/0x500
[81535.305190]  [<ffffffff814f9163>] schedule+0x23/0x60
[81535.305190]  [<ffffffff814f92c9>] schedule_preempt_disabled+0x9/0x10
[81535.305190]  [<ffffffff814f843d>] __mutex_lock_slowpath+0x5d/0x90
[81535.305190]  [<ffffffff814f817d>] mutex_lock+0x1d/0x30
[81535.305190]  [<ffffffff812f3348>] i915_hotplug_work_func+0x28/0xa0
[81535.305190]  [<ffffffff812f3320>] ? i915_error_work_func+0x100/0x100
[81535.305190]  [<ffffffff810471cd>] process_one_work+0x11d/0x420
[81535.305190]  [<ffffffff81048355>] worker_thread+0x135/0x3d0
[81535.305190]  [<ffffffff81048220>] ? manage_workers+0x240/0x240
[81535.305190]  [<ffffffff8104c94a>] kthread+0xba/0xc0
[81535.305190]  [<ffffffff8104c890>] ? kthread_create_on_node+0x110/0x110
[81535.305190]  [<ffffffff814ff9aa>] ret_from_fork+0x7a/0xb0
[81535.305190]  [<ffffffff8104c890>] ? kthread_create_on_node+0x110/0x110
[81535.305190] Xorg            D ffff88009cc2f200     0  4757   4754 0x00400084
[81535.305190]  ffff8800a579dd68 0000000000000082 ffff88009cc2f200 ffff8800a579dfd8
[81535.305190]  ffff8800a579dfd8 ffff8800a579dfd8 ffff88013b08cc80 ffff88009cc2f200
[81535.305190]  ffff8800a579ddd0 ffff88013b21b800 00000000fffffff2 ffff88013b21bb30
[81535.305190] Call Trace:
[81535.305190]  [<ffffffff814f9163>] schedule+0x23/0x60
[81535.305190]  [<ffffffff814f92c9>] schedule_preempt_disabled+0x9/0x10
[81535.305190]  [<ffffffff814f843d>] __mutex_lock_slowpath+0x5d/0x90
[81535.305190]  [<ffffffff814f817d>] mutex_lock+0x1d/0x30
[81535.305190]  [<ffffffff812e39da>] drm_fb_release+0x2a/0x80
[81535.305190]  [<ffffffff812d4738>] drm_release+0x568/0x600
[81535.305190]  [<ffffffff810e9027>] __fput+0xe7/0x220
[81535.305190]  [<ffffffff810e91f9>] ____fput+0x9/0x10
[81535.305190]  [<ffffffff81049c57>] task_work_run+0x77/0xc0
[81535.305190]  [<ffffffff81002846>] do_notify_resume+0x56/0x80
[81535.305190]  [<ffffffff814ffcd4>] int_signal+0x12/0x17
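
The blocked-state dump above can be condensed to one line per task. What follows is a minimal sketch in Python, not a tool anyone in this thread used. It assumes the dump is available as text in the format shown; the `GENERIC` set of scheduler and lock primitives is a hand-picked assumption, and frames marked `?` are skipped because the kernel flags them as unreliable.

```python
import re

# Task header, e.g. "[ts] Xorg   D <pc>   0   552   1 0x00400084"
TASK_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+(\S+)\s+D\s+[0-9a-f]+\s+\d+\s+(\d+)\s+\d+\s+0x[0-9a-f]+$"
)
# Stack frame, e.g. "[ts]  [<addr>] ? symbol+0x23/0x60"; group 1 is the "?" marker
FRAME_RE = re.compile(r"^\[\s*[\d.]+\]\s+\[<[0-9a-f]+>\]\s+(\? )?([\w.]+)\+0x")

# Generic sleep/lock primitives that say little about *why* a task waits
# (hand-picked assumption, extend as needed).
GENERIC = {
    "schedule", "schedule_timeout", "schedule_preempt_disabled",
    "__mutex_lock_slowpath", "wait_for_common", "wait_for_completion",
}

# Shortened excerpt of the dump in this comment.
SAMPLE = """\
[81535.305096] Xorg            D 7fffffffffffffff     0   552      1 0x00400084
[81535.305123]  [<ffffffff814f9163>] schedule+0x23/0x60
[81535.305127]  [<ffffffff814f7e35>] schedule_timeout+0x105/0x140
[81535.305131]  [<ffffffff814f8ffa>] wait_for_common+0xaa/0x140
[81535.305137]  [<ffffffff810553e0>] ? try_to_wake_up+0x80/0x80
[81535.305141]  [<ffffffff814f9138>] wait_for_completion+0x18/0x20
[81535.305146]  [<ffffffff810478ea>] flush_workqueue+0x10a/0x390
[81535.305152]  [<ffffffff81317223>] intel_crtc_page_flip+0x133/0x350
[81535.305190] kworker/u:52    D ffffffff8150b740     0  4107      2 0x00000080
[81535.305190]  [<ffffffff814f8bfd>] ? __schedule+0x22d/0x500
[81535.305190]  [<ffffffff814f9163>] schedule+0x23/0x60
[81535.305190]  [<ffffffff814f92c9>] schedule_preempt_disabled+0x9/0x10
[81535.305190]  [<ffffffff814f843d>] __mutex_lock_slowpath+0x5d/0x90
[81535.305190]  [<ffffffff814f817d>] mutex_lock+0x1d/0x30
[81535.305190]  [<ffffffff812f3348>] i915_hotplug_work_func+0x28/0xa0
[81535.305190] Xorg            D ffff88009cc2f200     0  4757   4754 0x00400084
[81535.305190]  [<ffffffff814f9163>] schedule+0x23/0x60
[81535.305190]  [<ffffffff814f92c9>] schedule_preempt_disabled+0x9/0x10
[81535.305190]  [<ffffffff814f843d>] __mutex_lock_slowpath+0x5d/0x90
[81535.305190]  [<ffffffff814f817d>] mutex_lock+0x1d/0x30
[81535.305190]  [<ffffffff812e39da>] drm_fb_release+0x2a/0x80
"""


def blocked_tasks(log):
    """Return one dict per blocked task with its reliable stack frames."""
    tasks = []
    for raw in log.splitlines():
        line = raw.strip()
        task = TASK_RE.match(line)
        if task:
            tasks.append({"name": task.group(1), "pid": int(task.group(2)), "frames": []})
            continue
        frame = FRAME_RE.match(line)
        if frame and tasks and not frame.group(1):  # drop "?" frames
            tasks[-1]["frames"].append(frame.group(2))
    return tasks


def summarize(task):
    """First two frames that are not generic sleep/lock primitives."""
    return [f for f in task["frames"] if f not in GENERIC][:2]


for t in blocked_tasks(SAMPLE):
    print(f"{t['name']} (pid {t['pid']}): {' <- '.join(summarize(t))}")
```

On this excerpt the summary shows Xorg (pid 552) waiting in flush_workqueue, called from intel_crtc_page_flip, while kworker/u:52 and the second Xorg (pid 4757) are both waiting in mutex_lock.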
Comment 4 Aaron Lu 2013-03-04 01:11:45 UTC
Hi Thomas,

Attaching the full dmesg after you ssh into the system would be helpful, thanks.
Comment 5 Aaron Lu 2013-03-04 01:13:00 UTC
And I'll move this bug to drivers/i915.
Comment 6 Jani Nikula 2013-03-04 10:19:40 UTC
(In reply to comment #1)
> Using git-bisect to find the offending commit would be best :-)
> Thanks.

Seconded.

(In reply to comment #4)
> Attaching the full dmesg after you ssh into the system would be helpful, thanks.

Please do this with the drm.debug=0xe module parameter.
Comment 7 Daniel Vetter 2013-03-04 10:30:42 UTC
Hm, smells like a deadlock on the mode_config lock. Can you please re-hang your machine with lockdep enabled too? That should spit out all current lock holders.

Also, please retest with the latest drm-intel-nightly from http://cgit.freedesktop.org/~danvet/drm-intel (or a 3.9 kernel, since all the patches are currently merged upstream). We've fixed a bunch of bugs in that area recently.
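
For readers unfamiliar with this kind of hang, here is a minimal userspace sketch in Python of the pattern suspected above: one thread flushes a work queue while holding a lock that a queued work item also needs. It is an analogy only, not the i915 code, and the thread never confirmed the root cause. The names `mode_config_lock`, `hotplug_work` and `page_flip` are stand-ins chosen to mirror the traces in comment 3. The timeout exists only so the demo terminates; the kernel's flush_workqueue has none, so there the wait would never end.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FlushTimeout

# Stand-in for the lock named in this comment (assumption, not kernel code).
mode_config_lock = threading.Lock()


def hotplug_work():
    """Work item that needs the lock, like the blocked kworker in comment 3."""
    with mode_config_lock:
        return "hotplug handled"


def page_flip(workqueue, hold_lock_while_flushing, timeout=0.5):
    """Queue the work item, then wait for it ("flush"), with or without the lock."""
    if hold_lock_while_flushing:
        with mode_config_lock:
            pending = workqueue.submit(hotplug_work)
            try:
                # The worker cannot take the lock we hold, so this cannot finish.
                pending.result(timeout=timeout)
                return "flushed"
            except FlushTimeout:
                return "would deadlock"
    # Safe ordering: flush first, take the lock afterwards.
    pending = workqueue.submit(hotplug_work)
    pending.result(timeout=timeout)
    with mode_config_lock:
        return "flushed"


if __name__ == "__main__":
    # A single worker thread plays the role of the driver's work queue.
    with ThreadPoolExecutor(max_workers=1) as wq:
        print(page_flip(wq, hold_lock_while_flushing=True))
        print(page_flip(wq, hold_lock_while_flushing=False))
```

In this sketch, holding the lock across the flush always times out, while flushing before taking the lock completes. Any third thread that wants the same lock in the meantime simply queues up behind it.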
Comment 8 Thomas Meyer 2013-03-06 21:20:34 UTC
Hi, hmm. Strange: with DEBUG_LOCKDEP enabled, the hang does not occur anymore... I'll update you if I can catch the hang again with this debug option enabled.
Comment 9 Thomas Meyer 2013-03-28 11:28:22 UTC
I'm using 3.9.0-rc4+ right now. The 3.9-rcX kernels seem to be okay: I haven't encountered the hang in the last few weeks with the current development kernel. I will stay on 3.9 for now. Feel free to close this bug.
Comment 10 Daniel Vetter 2013-03-28 11:33:03 UTC
Ok, sounds like some piece of duct tape in 3.9 helps. Thanks for reporting this issue, and please reopen if it pops up again.