Bug 60552

Summary: Lock up using 3.10.0 + dpm patches
Product: Drivers
Reporter: Bernd Steinhauser (linux)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: CLOSED CODE_FIX
Severity: normal
CC: alan, markus
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 3.10.0
Subsystem:
Regression: No
Bisected commit-id:
Attachments: log messages

Description Bernd Steinhauser 2013-07-13 19:58:30 UTC
I built a kernel based on 3.10.0 into which I merged Alex Deucher's dpm patches up to:
commit 9b5de59629d2e58eab41e2f0e5cc60b3c395f1c3
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Tue Jul 2 18:52:10 2013 -0400

    drm/radeon/dpm: implement force performance level for TN
    
    Allows you to force the selected performance level via sysfs.
    
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

I'm not sure about the follow-up commits. I could try the latest revision, but they seem to apply mostly to IDs later than r600.
Hardware: 01:05.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] RS880 [Radeon HD 4290] [1002:9714]
The kernel boots and dpm seems to work (although I can't fully check, since debugfs support for r600 is still missing). At some seemingly random point, the display freezes while the rest of the computer keeps working just fine.
So far it has always happened during drag & drop (or in that case only "drag" ;-)), but I have not found a reliable way to reproduce it.

In my log I find these messages:
2013-07-12T16:53:55.681371+02:00 orionis kernel: [121491.842729]
2013-07-12T16:53:55.681406+02:00 orionis kernel: [121491.842743] ================================================
2013-07-12T16:53:55.681409+02:00 orionis kernel: [121491.842749] [ BUG: lock held when returning to user space! ]
2013-07-12T16:53:55.681412+02:00 orionis kernel: [121491.842759] 3.10.0-00731-ga08e3f1 #1 Not tainted
2013-07-12T16:53:55.681414+02:00 orionis kernel: [121491.842765] ------------------------------------------------
2013-07-12T16:53:55.681417+02:00 orionis kernel: [121491.842771] X/1472 is leaving the kernel with locks still held!
2013-07-12T16:53:55.681458+02:00 orionis kernel: [121491.842779] 2 locks held by X/1472:
2013-07-12T16:53:55.681462+02:00 orionis kernel: [121491.842783]  #0:  (reservation_ww_class_acquire){......}, at: [<ffffffff81325d1d>] radeon_bo_list_validate+0x1c/0xc1
2013-07-12T16:53:55.681465+02:00 orionis kernel: [121491.842810]  #1:  (reservation_ww_class_mutex){......}, at: [<ffffffff8130d526>] ttm_eu_reserve_buffers+0x113/0x401
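
For anyone triaging a similar report, here is a minimal Python sketch for pulling the lock class and the acquiring function out of lockdep "held lock" lines like the two above. It is not part of the original report: the `held_locks` helper and its regular expression are assumptions derived only from the line format shown here, and other kernel versions may print these lines differently.

```python
import re

# Hypothetical pattern, modelled only on the two held-lock lines in this report, e.g.
#  #0:  (reservation_ww_class_acquire){......}, at: [<ffffffff81325d1d>] radeon_bo_list_validate+0x1c/0xc1
HELD_LOCK = re.compile(
    r"#(?P<index>\d+):\s+\((?P<lock>[^)]+)\)\{[^}]*\}, at: "
    r"\[<(?P<addr>[0-9a-f]+)>\] (?P<symbol>[\w.]+)\+"
)


def held_locks(log_text):
    """Return (index, lock class, acquiring symbol) for each held-lock line."""
    return [
        (int(m.group("index")), m.group("lock"), m.group("symbol"))
        for m in HELD_LOCK.finditer(log_text)
    ]


# Sample input copied from the log excerpt in this report (timestamps shortened).
sample = """\
[121491.842779] 2 locks held by X/1472:
[121491.842783]  #0:  (reservation_ww_class_acquire){......}, at: [<ffffffff81325d1d>] radeon_bo_list_validate+0x1c/0xc1
[121491.842810]  #1:  (reservation_ww_class_mutex){......}, at: [<ffffffff8130d526>] ttm_eu_reserve_buffers+0x113/0x401
"""

for index, lock, symbol in held_locks(sample):
    print(f"#{index}: {lock} taken in {symbol}")
```

Applied to the excerpt, this lists the acquire context taken in radeon_bo_list_validate and the mutex taken in ttm_eu_reserve_buffers, which are the two places to look for a missing unlock on an error path.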

When I try to switch to the console via SysRq+V, I get a blank screen with a non-blinking dash in the upper left corner, but still no working console. I found this in my log three times:
2013-07-12T16:54:28.673494+02:00 orionis kernel: [121524.876529] SysRq : Restore framebuffer console
2013-07-12T16:54:28.673533+02:00 orionis kernel: [121524.876690] ------------[ cut here ]------------
2013-07-12T16:54:28.673549+02:00 orionis kernel: [121524.876719] WARNING: at drivers/gpu/drm/drm_crtc.c:87 drm_warn_on_modeset_not_all_locked+0x43/0x7a()
2013-07-12T16:54:28.673616+02:00 orionis kernel: [121524.876733] Modules linked in: xts snd_hda_codec_via snd_hda_intel snd_hda_codec k10temp snd_pcm r8169 mii snd_page_alloc snd_timer snd i2c_piix4 soundcore ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat w83627ehf
2013-07-12T16:54:28.673620+02:00 orionis kernel: [121524.876796] CPU: 3 PID: 22105 Comm: kworker/3:2 Not tainted 3.10.0-00731-ga08e3f1 #1
2013-07-12T16:54:28.673623+02:00 orionis kernel: [121524.876804] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890GX Extreme3, BIOS L1.19 03/30/2010
2013-07-12T16:54:28.673632+02:00 orionis kernel: [121524.876816] Workqueue: events drm_fb_helper_restore_work_fn
2013-07-12T16:54:28.673645+02:00 orionis kernel: [121524.876838]  ffffffff81807e8b ffff8801e6b01c38 ffffffff815580cd ffff8801e6b01c78
2013-07-12T16:54:28.674039+02:00 orionis kernel: [121524.876863]  ffffffff8103781f ffff8801e6b01c88 0000000000000000 ffff8802164ed000
2013-07-12T16:54:28.674044+02:00 orionis kernel: [121524.876882]  ffff8802164ec000 ffff8802164ec9e8 ffff880215b90c00 ffff8801e6b01c88
2013-07-12T16:54:28.674047+02:00 orionis kernel: [121524.876894] Call Trace:
2013-07-12T16:54:28.674049+02:00 orionis kernel: [121524.876910]  [<ffffffff815580cd>] dump_stack+0x19/0x1b
2013-07-12T16:54:28.674052+02:00 orionis kernel: [121524.876922]  [<ffffffff8103781f>] warn_slowpath_common+0x62/0x7b
2013-07-12T16:54:28.674054+02:00 orionis kernel: [121524.876932]  [<ffffffff8103784d>] warn_slowpath_null+0x15/0x17
2013-07-12T16:54:28.674057+02:00 orionis kernel: [121524.876943]  [<ffffffff812fc0d2>] drm_warn_on_modeset_not_all_locked+0x43/0x7a
2013-07-12T16:54:28.674059+02:00 orionis kernel: [121524.876954]  [<ffffffff812eb1b8>] drm_fb_helper_restore_fbdev_mode+0x1d/0xab
2013-07-12T16:54:28.674061+02:00 orionis kernel: [121524.876964]  [<ffffffff812eb285>] drm_fb_helper_force_kernel_mode+0x3f/0x73
2013-07-12T16:54:28.674064+02:00 orionis kernel: [121524.876975]  [<ffffffff812ec491>] drm_fb_helper_restore_work_fn+0x9/0x24
2013-07-12T16:54:28.674066+02:00 orionis kernel: [121524.876987]  [<ffffffff8104dd1f>] process_one_work+0x227/0x3a7
2013-07-12T16:54:28.674069+02:00 orionis kernel: [121524.876997]  [<ffffffff8104dcba>] ? process_one_work+0x1c2/0x3a7
2013-07-12T16:54:28.674071+02:00 orionis kernel: [121524.877010]  [<ffffffff8104e61f>] worker_thread+0x1d1/0x2cc
2013-07-12T16:54:28.674073+02:00 orionis kernel: [121524.877022]  [<ffffffff8104e44e>] ? manage_workers.isra.28+0x1c6/0x1c6
2013-07-12T16:54:28.674076+02:00 orionis kernel: [121524.877032]  [<ffffffff8105334f>] kthread+0xd0/0xd8
2013-07-12T16:54:28.674078+02:00 orionis kernel: [121524.877045]  [<ffffffff8105327f>] ? __init_kthread_worker+0x55/0x55
2013-07-12T16:54:28.674080+02:00 orionis kernel: [121524.877057]  [<ffffffff8155f7ac>] ret_from_fork+0x7c/0xb0
2013-07-12T16:54:28.674083+02:00 orionis kernel: [121524.877073]  [<ffffffff8105327f>] ? __init_kthread_worker+0x55/0x55
2013-07-12T16:54:28.674085+02:00 orionis kernel: [121524.877081] ---[ end trace 2d4e7f44058d01cf ]---
2013-07-12T16:54:28.674087+02:00 orionis kernel: [121524.877086] ------------[ cut here ]------------

When I use SAK (SysRq+k), the X process is killed and the system becomes usable again.
I ran plain 3.10.0 for 1-2 days and didn't see this issue, so it might be a regression introduced by the dpm patches, but I'm not 100% sure about that.
Comment 1 Bernd Steinhauser 2013-07-13 20:01:35 UTC
Created attachment 106886
log messages

Log messages as attachment for better readability.
Comment 2 Markus Trippelsdorf 2013-07-13 20:10:29 UTC
See:
http://thread.gmane.org/gmane.comp.video.dri.devel/87584
Comment 3 Bernd Steinhauser 2013-07-13 21:07:34 UTC
Ah, thanks, looks like that's the same issue. I'll try the patch.
Guess this one can be closed.