[ 278.448193] ====================================================== [ 278.448194] [ INFO: possible circular locking dependency detected ] [ 278.448197] 3.16.0-0.rc3.git3.1.fc21.x86_64 #1 Not tainted [ 278.448198] ------------------------------------------------------- [ 278.448200] Xorg.bin/1249 is trying to acquire lock: [ 278.448201] (&(&priv->lock)->rlock#2){-.-...}, at: [<ffffffffa0108618>] nouveau_therm_update+0x48/0x350 [nouveau] [ 278.448251] but task is already holding lock: [ 278.448253] (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [<ffffffffa010a284>] alarm_timer_callback+0x54/0xe0 [nouveau] [ 278.448273] which lock already depends on the new lock. [ 278.448275] the existing dependency chain (in reverse order) is: [ 278.448276] -> #1 (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}: [ 278.448279] [<ffffffff81102104>] lock_acquire+0xa4/0x1d0 [ 278.448283] [<ffffffff818113f7>] _raw_spin_lock_irqsave+0x57/0xa0 [ 278.448287] [<ffffffffa010a284>] alarm_timer_callback+0x54/0xe0 [nouveau] [ 278.448303] [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau] [ 278.448319] [<ffffffffa010c470>] nv04_timer_alarm+0x60/0xd0 [nouveau] [ 278.448334] [<ffffffffa01088d7>] nouveau_therm_update+0x307/0x350 [nouveau] [ 278.448349] [<ffffffffa010893a>] nouveau_therm_alarm+0x1a/0x20 [nouveau] [ 278.448365] [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau] [ 278.448380] [<ffffffffa010c54b>] nv04_timer_intr+0x6b/0x90 [nouveau] [ 278.448395] [<ffffffffa0105bf1>] nouveau_mc_intr+0x141/0x1c0 [nouveau] [ 278.448410] [<ffffffff81116127>] handle_irq_event_percpu+0x77/0x340 [ 278.448413] [<ffffffff8111642d>] handle_irq_event+0x3d/0x60 [ 278.448415] [<ffffffff811193e6>] handle_edge_irq+0x66/0x130 [ 278.448418] [<ffffffff8101c3e4>] handle_irq+0x84/0x150 [ 278.448421] [<ffffffff8181482d>] do_IRQ+0x4d/0xe0 [ 278.448423] [<ffffffff81812472>] ret_from_intr+0x0/0x1a [ 278.448426] [<ffffffff81415aba>] debug_dma_assert_idle+0xea/0x220 [ 
278.448429] [<ffffffff811f3d75>] do_wp_page+0xe5/0x970 [ 278.448432] [<ffffffff811f6c9c>] handle_mm_fault+0x8ec/0xfd0 [ 278.448434] [<ffffffff81064379>] __do_page_fault+0x239/0x620 [ 278.448437] [<ffffffff81064782>] do_page_fault+0x22/0x30 [ 278.448439] [<ffffffff818139f8>] page_fault+0x28/0x30 [ 278.448441] -> #0 (&(&priv->lock)->rlock#2){-.-...}: [ 278.448444] [<ffffffff8110163b>] __lock_acquire+0x1abb/0x1ca0 [ 278.448446] [<ffffffff81102104>] lock_acquire+0xa4/0x1d0 [ 278.448448] [<ffffffff818113f7>] _raw_spin_lock_irqsave+0x57/0xa0 [ 278.448450] [<ffffffffa0108618>] nouveau_therm_update+0x48/0x350 [nouveau] [ 278.448465] [<ffffffffa010893a>] nouveau_therm_alarm+0x1a/0x20 [nouveau] [ 278.448480] [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau] [ 278.448496] [<ffffffffa010c470>] nv04_timer_alarm+0x60/0xd0 [nouveau] [ 278.448511] [<ffffffffa010a30d>] alarm_timer_callback+0xdd/0xe0 [nouveau] [ 278.448526] [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau] [ 278.448542] [<ffffffffa010c54b>] nv04_timer_intr+0x6b/0x90 [nouveau] [ 278.448557] [<ffffffffa0105bf1>] nouveau_mc_intr+0x141/0x1c0 [nouveau] [ 278.448572] [<ffffffff81116127>] handle_irq_event_percpu+0x77/0x340 [ 278.448574] [<ffffffff8111642d>] handle_irq_event+0x3d/0x60 [ 278.448576] [<ffffffff811193e6>] handle_edge_irq+0x66/0x130 [ 278.448578] [<ffffffff8101c3e4>] handle_irq+0x84/0x150 [ 278.448581] [<ffffffff8181482d>] do_IRQ+0x4d/0xe0 [ 278.448583] [<ffffffff81812472>] ret_from_intr+0x0/0x1a [ 278.448585] [<ffffffff81137c72>] __module_text_address+0x12/0x70 [ 278.448588] [<ffffffff8113c196>] is_module_text_address+0x16/0x30 [ 278.448590] [<ffffffff810c566a>] __kernel_text_address+0x3a/0x90 [ 278.448592] [<ffffffff8101da72>] print_context_stack+0x62/0x100 [ 278.448594] [<ffffffff8101c620>] dump_trace+0x170/0x350 [ 278.448596] [<ffffffff8102b47b>] save_stack_trace+0x2b/0x50 [ 278.448599] [<ffffffff81413529>] dma_entry_alloc+0x59/0x90 [ 278.448601] [<ffffffff81413b8f>] 
debug_dma_alloc_coherent+0x2f/0x90 [ 278.448603] [<ffffffffa008f755>] ttm_dma_populate+0x545/0xaa0 [ttm] [ 278.448613] [<ffffffffa015727c>] nouveau_ttm_tt_populate+0x14c/0x170 [nouveau] [ 278.448639] [<ffffffffa0084d80>] ttm_tt_bind+0x40/0x80 [ttm] [ 278.448644] [<ffffffffa008748f>] ttm_bo_handle_move_mem+0x5bf/0x650 [ttm] [ 278.448649] [<ffffffffa00883ef>] ttm_bo_validate+0x2df/0x300 [ttm] [ 278.448654] [<ffffffffa0088663>] ttm_bo_init+0x253/0x3b0 [ttm] [ 278.448658] [<ffffffffa0157c82>] nouveau_bo_new+0x202/0x310 [nouveau] [ 278.448677] [<ffffffffa015a42b>] nouveau_gem_new+0x6b/0x160 [nouveau] [ 278.448698] [<ffffffffa015a5d6>] nouveau_gem_ioctl_new+0xb6/0x220 [nouveau] [ 278.448718] [<ffffffffa003dcdf>] drm_ioctl+0x1df/0x6a0 [drm] [ 278.448733] [<ffffffffa0151a45>] nouveau_drm_ioctl+0x65/0xa0 [nouveau] [ 278.448753] [<ffffffff812628d0>] do_vfs_ioctl+0x2f0/0x520 [ 278.448756] [<ffffffff81262b81>] SyS_ioctl+0x81/0xa0 [ 278.448758] [<ffffffff818118e9>] system_call_fastpath+0x16/0x1b [ 278.448760] other info that might help us debug this: [ 278.448762] Possible unsafe locking scenario: [ 278.448764] CPU0 CPU1 [ 278.448765] ---- ---- [ 278.448766] lock(&(&priv->sensor.alarm_program_lock)->rlock); [ 278.448767] lock(&(&priv->lock)->rlock#2); [ 278.448770] lock(&(&priv->sensor.alarm_program_lock)->rlock); [ 278.448771] lock(&(&priv->lock)->rlock#2); [ 278.448773] *** DEADLOCK *** [ 278.448775] 2 locks held by Xorg.bin/1249: [ 278.448776] #0: (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffa00886e1>] ttm_bo_init+0x2d1/0x3b0 [ttm] [ 278.448783] #1: (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [<ffffffffa010a284>] alarm_timer_callback+0x54/0xe0 [nouveau] [ 278.448800] stack backtrace: [ 278.448803] CPU: 0 PID: 1249 Comm: Xorg.bin Not tainted 3.16.0-0.rc3.git3.1.fc21.x86_64 #1 [ 278.448804] Hardware name: System manufacturer System Product Name/M4A78LT-M, BIOS 0802 08/24/2010 [ 278.448806] 0000000000000000 000000007a5a7c22 ffff88011aa03b00 ffffffff81807cec 
[ 278.448809] ffffffff82bc2ef0 ffff88011aa03b40 ffffffff8180508c ffff88011aa03ba0 [ 278.448812] ffff8801185b9a40 ffff8801185b99d0 0000000000000002 ffff8801185ba5a8 [ 278.448815] Call Trace: [ 278.448816] <IRQ> [<ffffffff81807cec>] dump_stack+0x4d/0x66 [ 278.448823] [<ffffffff8180508c>] print_circular_bug+0x201/0x20f [ 278.448825] [<ffffffff8110163b>] __lock_acquire+0x1abb/0x1ca0 [ 278.448828] [<ffffffff810242de>] ? native_sched_clock+0x2e/0xb0 [ 278.448831] [<ffffffff81102104>] lock_acquire+0xa4/0x1d0 [ 278.448847] [<ffffffffa0108618>] ? nouveau_therm_update+0x48/0x350 [nouveau] [ 278.448850] [<ffffffff818113f7>] _raw_spin_lock_irqsave+0x57/0xa0 [ 278.448866] [<ffffffffa0108618>] ? nouveau_therm_update+0x48/0x350 [nouveau] [ 278.448882] [<ffffffffa0108618>] nouveau_therm_update+0x48/0x350 [nouveau] [ 278.448898] [<ffffffffa010893a>] nouveau_therm_alarm+0x1a/0x20 [nouveau] [ 278.448915] [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau] [ 278.448931] [<ffffffffa010c470>] nv04_timer_alarm+0x60/0xd0 [nouveau] [ 278.448948] [<ffffffffa010a30d>] alarm_timer_callback+0xdd/0xe0 [nouveau] [ 278.448964] [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau] [ 278.448981] [<ffffffffa010c54b>] nv04_timer_intr+0x6b/0x90 [nouveau] [ 278.448998] [<ffffffffa0105bf1>] nouveau_mc_intr+0x141/0x1c0 [nouveau] [ 278.449000] [<ffffffff81116127>] handle_irq_event_percpu+0x77/0x340 [ 278.449003] [<ffffffff8111642d>] handle_irq_event+0x3d/0x60 [ 278.449005] [<ffffffff811193e6>] handle_edge_irq+0x66/0x130 [ 278.449007] [<ffffffff8101c3e4>] handle_irq+0x84/0x150 [ 278.449010] [<ffffffff810e2145>] ? irqtime_account_irq+0xc5/0xd0 [ 278.449012] [<ffffffff8181482d>] do_IRQ+0x4d/0xe0 [ 278.449015] [<ffffffff81812472>] common_interrupt+0x72/0x72 [ 278.449016] <EOI> [<ffffffffa0084000>] ? 0xffffffffa0083fff [ 278.449023] [<ffffffff81137b39>] ? __module_address+0x29/0x150 [ 278.449026] [<ffffffff81137c03>] ? 
__module_address+0xf3/0x150 [ 278.449029] [<ffffffff81137c72>] __module_text_address+0x12/0x70 [ 278.449031] [<ffffffff8113c196>] is_module_text_address+0x16/0x30 [ 278.449034] [<ffffffff810c566a>] __kernel_text_address+0x3a/0x90 [ 278.449036] [<ffffffff8101da72>] print_context_stack+0x62/0x100 [ 278.449038] [<ffffffff8101c620>] dump_trace+0x170/0x350 [ 278.449041] [<ffffffff8102b47b>] save_stack_trace+0x2b/0x50 [ 278.449043] [<ffffffff81413529>] dma_entry_alloc+0x59/0x90 [ 278.449045] [<ffffffff81413b8f>] debug_dma_alloc_coherent+0x2f/0x90 [ 278.449051] [<ffffffffa008f755>] ttm_dma_populate+0x545/0xaa0 [ttm] [ 278.449072] [<ffffffffa015727c>] nouveau_ttm_tt_populate+0x14c/0x170 [nouveau] [ 278.449078] [<ffffffffa0084d80>] ttm_tt_bind+0x40/0x80 [ttm] [ 278.449084] [<ffffffffa008748f>] ttm_bo_handle_move_mem+0x5bf/0x650 [ttm] [ 278.449089] [<ffffffffa0087c59>] ? ttm_bo_mem_space+0x179/0x370 [ttm] [ 278.449092] [<ffffffff810fc24f>] ? lock_release_holdtime.part.28+0xf/0x200 [ 278.449098] [<ffffffffa00883ef>] ttm_bo_validate+0x2df/0x300 [ttm] [ 278.449100] [<ffffffff810ff72d>] ? trace_hardirqs_on_caller+0x15d/0x200 [ 278.449106] [<ffffffffa0088663>] ttm_bo_init+0x253/0x3b0 [ttm] [ 278.449126] [<ffffffffa0157c82>] nouveau_bo_new+0x202/0x310 [nouveau] [ 278.449147] [<ffffffffa0156660>] ? nv10_bo_put_tile_region+0x50/0x50 [nouveau] [ 278.449168] [<ffffffffa015a42b>] nouveau_gem_new+0x6b/0x160 [nouveau] [ 278.449189] [<ffffffffa015a5d6>] nouveau_gem_ioctl_new+0xb6/0x220 [nouveau] [ 278.449197] [<ffffffffa003dcdf>] drm_ioctl+0x1df/0x6a0 [drm] [ 278.449201] [<ffffffff810ff72d>] ? trace_hardirqs_on_caller+0x15d/0x200 [ 278.449203] [<ffffffff810ff7dd>] ? trace_hardirqs_on+0xd/0x10 [ 278.449223] [<ffffffffa0151a45>] nouveau_drm_ioctl+0x65/0xa0 [nouveau] [ 278.449226] [<ffffffff812628d0>] do_vfs_ioctl+0x2f0/0x520 [ 278.449228] [<ffffffff81262b81>] SyS_ioctl+0x81/0xa0 [ 278.449231] [<ffffffff8115fb9c>] ? 
__audit_syscall_entry+0x9c/0xf0 [ 278.449234] [<ffffffff818118e9>] system_call_fastpath+0x16/0x1b
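The "Possible unsafe locking scenario" in the log above is an AB-BA ordering inversion. As far as the trace suggests, alarm_program_lock was first taken underneath nouveau_therm_update (which takes priv->lock), and is now held while priv->lock is being requested. Below is a minimal userspace sketch of the idea behind such a check: record "B was taken while A was held" edges and refuse any edge that closes a cycle. This is not lockdep's actual implementation. The enum values and the functions reset_graph, reachable and acquire_while_holding are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum { ALARM_PROGRAM_LOCK, THERM_LOCK, NUM_LOCKS };

/* depends[a][b] is true once lock b has been taken while a was held. */
static bool depends[NUM_LOCKS][NUM_LOCKS];

/* Forget every recorded ordering. */
static void reset_graph(void)
{
	memset(depends, 0, sizeof(depends));
}

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NUM_LOCKS; next++)
		if (depends[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record that `acquired` is taken while `held` is held.
 * Returns true if the ordering is consistent with everything seen so far,
 * false if the new edge would close a cycle (a possible deadlock).
 */
static bool acquire_while_holding(int held, int acquired)
{
	bool seen[NUM_LOCKS] = { false };

	assert(held >= 0 && held < NUM_LOCKS);
	assert(acquired >= 0 && acquired < NUM_LOCKS);

	/* If `held` is already reachable from `acquired`, the order is inverted. */
	if (reachable(acquired, held, seen))
		return false;

	depends[held][acquired] = true;
	return true;
}
```

Calling acquire_while_holding(THERM_LOCK, ALARM_PROGRAM_LOCK) and then acquire_while_holding(ALARM_PROGRAM_LOCK, THERM_LOCK) makes the second call return false, which corresponds to the inversion reported above.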
Thanks for reporting. I'll have a look at it during the week! Please do not report Nouveau bugs to the kernel's bugtracker. We have our own which we monitor much more attentively.
And which bug tracker is it?
I was referring to freedesktop's bug tracker. Reporting bugs to Nouveau is explained here: http://nouveau.freedesktop.org/wiki/Bugs/ But that's ok, no need to report it once again ;)
static void ttm_bo_cleanup_memtype_use(struct ttm_buffer_object *bo)
{
	if (bo->bdev->driver->move_notify)
		bo->bdev->driver->move_notify(bo, NULL);

	if (bo->ttm) {
		ttm_tt_unbind(bo->ttm);
		ttm_tt_destroy(bo->ttm);
		bo->ttm = NULL;
	}
	ttm_bo_mem_put(bo, &bo->mem);

	ww_mutex_unlock (&bo->resv->lock);
}

What about the last line? Does it also need to change, as in this commit:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/ttm/ttm_bo.c?id=c75230833ce4fbbfaa257c07b55f97912fb1dc02
Created attachment 142831 [details]
drm/nouveau/therm: fix a potential deadlock in the therm monitoring code

Sorry for the wait. Can you try to reproduce the issue with this patch?
Looks okay.
(In reply to Martin Peres from comment #5)
> Sorry for the wait. Can you try to reproduce the issue with this patch?

Your patch works. Thanks. I cannot reproduce it with this patch.
(In reply to Stefan Ringel from comment #7)
> (In reply to Martin Peres from comment #5)
> > Sorry for the wait. Can you try to reproduce the issue with this patch?
>
> Your patch works. Thanks. I cannot reproduce it with this patch.

Great! I've asked for inclusion. I'll close this bug.