Bug 79591 - possible circular locking dependency detected
Summary: possible circular locking dependency detected
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 high
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-07 08:11 UTC by Stefan Ringel
Modified: 2014-07-14 09:56 UTC

See Also:
Kernel Version: 3.16.0-0.rc3.git3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
drm/nouveau/therm: fix a potential deadlock in the therm monitoring code (1.14 KB, patch)
2014-07-12 21:50 UTC, Martin Peres

Description Stefan Ringel 2014-07-07 08:11:24 UTC
[  278.448193] ======================================================
[  278.448194] [ INFO: possible circular locking dependency detected ]
[  278.448197] 3.16.0-0.rc3.git3.1.fc21.x86_64 #1 Not tainted
[  278.448198] -------------------------------------------------------
[  278.448200] Xorg.bin/1249 is trying to acquire lock:
[  278.448201]  (&(&priv->lock)->rlock#2){-.-...}, at: [<ffffffffa0108618>] nouveau_therm_update+0x48/0x350 [nouveau]
[  278.448251] 
but task is already holding lock:
[  278.448253]  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [<ffffffffa010a284>] alarm_timer_callback+0x54/0xe0 [nouveau]
[  278.448273] 
which lock already depends on the new lock.

[  278.448275] 
the existing dependency chain (in reverse order) is:
[  278.448276] 
-> #1 (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}:
[  278.448279]        [<ffffffff81102104>] lock_acquire+0xa4/0x1d0
[  278.448283]        [<ffffffff818113f7>] _raw_spin_lock_irqsave+0x57/0xa0
[  278.448287]        [<ffffffffa010a284>] alarm_timer_callback+0x54/0xe0 [nouveau]
[  278.448303]        [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau]
[  278.448319]        [<ffffffffa010c470>] nv04_timer_alarm+0x60/0xd0 [nouveau]
[  278.448334]        [<ffffffffa01088d7>] nouveau_therm_update+0x307/0x350 [nouveau]
[  278.448349]        [<ffffffffa010893a>] nouveau_therm_alarm+0x1a/0x20 [nouveau]
[  278.448365]        [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau]
[  278.448380]        [<ffffffffa010c54b>] nv04_timer_intr+0x6b/0x90 [nouveau]
[  278.448395]        [<ffffffffa0105bf1>] nouveau_mc_intr+0x141/0x1c0 [nouveau]
[  278.448410]        [<ffffffff81116127>] handle_irq_event_percpu+0x77/0x340
[  278.448413]        [<ffffffff8111642d>] handle_irq_event+0x3d/0x60
[  278.448415]        [<ffffffff811193e6>] handle_edge_irq+0x66/0x130
[  278.448418]        [<ffffffff8101c3e4>] handle_irq+0x84/0x150
[  278.448421]        [<ffffffff8181482d>] do_IRQ+0x4d/0xe0
[  278.448423]        [<ffffffff81812472>] ret_from_intr+0x0/0x1a
[  278.448426]        [<ffffffff81415aba>] debug_dma_assert_idle+0xea/0x220
[  278.448429]        [<ffffffff811f3d75>] do_wp_page+0xe5/0x970
[  278.448432]        [<ffffffff811f6c9c>] handle_mm_fault+0x8ec/0xfd0
[  278.448434]        [<ffffffff81064379>] __do_page_fault+0x239/0x620
[  278.448437]        [<ffffffff81064782>] do_page_fault+0x22/0x30
[  278.448439]        [<ffffffff818139f8>] page_fault+0x28/0x30
[  278.448441] 
-> #0 (&(&priv->lock)->rlock#2){-.-...}:
[  278.448444]        [<ffffffff8110163b>] __lock_acquire+0x1abb/0x1ca0
[  278.448446]        [<ffffffff81102104>] lock_acquire+0xa4/0x1d0
[  278.448448]        [<ffffffff818113f7>] _raw_spin_lock_irqsave+0x57/0xa0
[  278.448450]        [<ffffffffa0108618>] nouveau_therm_update+0x48/0x350 [nouveau]
[  278.448465]        [<ffffffffa010893a>] nouveau_therm_alarm+0x1a/0x20 [nouveau]
[  278.448480]        [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau]
[  278.448496]        [<ffffffffa010c470>] nv04_timer_alarm+0x60/0xd0 [nouveau]
[  278.448511]        [<ffffffffa010a30d>] alarm_timer_callback+0xdd/0xe0 [nouveau]
[  278.448526]        [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau]
[  278.448542]        [<ffffffffa010c54b>] nv04_timer_intr+0x6b/0x90 [nouveau]
[  278.448557]        [<ffffffffa0105bf1>] nouveau_mc_intr+0x141/0x1c0 [nouveau]
[  278.448572]        [<ffffffff81116127>] handle_irq_event_percpu+0x77/0x340
[  278.448574]        [<ffffffff8111642d>] handle_irq_event+0x3d/0x60
[  278.448576]        [<ffffffff811193e6>] handle_edge_irq+0x66/0x130
[  278.448578]        [<ffffffff8101c3e4>] handle_irq+0x84/0x150
[  278.448581]        [<ffffffff8181482d>] do_IRQ+0x4d/0xe0
[  278.448583]        [<ffffffff81812472>] ret_from_intr+0x0/0x1a
[  278.448585]        [<ffffffff81137c72>] __module_text_address+0x12/0x70
[  278.448588]        [<ffffffff8113c196>] is_module_text_address+0x16/0x30
[  278.448590]        [<ffffffff810c566a>] __kernel_text_address+0x3a/0x90
[  278.448592]        [<ffffffff8101da72>] print_context_stack+0x62/0x100
[  278.448594]        [<ffffffff8101c620>] dump_trace+0x170/0x350
[  278.448596]        [<ffffffff8102b47b>] save_stack_trace+0x2b/0x50
[  278.448599]        [<ffffffff81413529>] dma_entry_alloc+0x59/0x90
[  278.448601]        [<ffffffff81413b8f>] debug_dma_alloc_coherent+0x2f/0x90
[  278.448603]        [<ffffffffa008f755>] ttm_dma_populate+0x545/0xaa0 [ttm]
[  278.448613]        [<ffffffffa015727c>] nouveau_ttm_tt_populate+0x14c/0x170 [nouveau]
[  278.448639]        [<ffffffffa0084d80>] ttm_tt_bind+0x40/0x80 [ttm]
[  278.448644]        [<ffffffffa008748f>] ttm_bo_handle_move_mem+0x5bf/0x650 [ttm]
[  278.448649]        [<ffffffffa00883ef>] ttm_bo_validate+0x2df/0x300 [ttm]
[  278.448654]        [<ffffffffa0088663>] ttm_bo_init+0x253/0x3b0 [ttm]
[  278.448658]        [<ffffffffa0157c82>] nouveau_bo_new+0x202/0x310 [nouveau]
[  278.448677]        [<ffffffffa015a42b>] nouveau_gem_new+0x6b/0x160 [nouveau]
[  278.448698]        [<ffffffffa015a5d6>] nouveau_gem_ioctl_new+0xb6/0x220 [nouveau]
[  278.448718]        [<ffffffffa003dcdf>] drm_ioctl+0x1df/0x6a0 [drm]
[  278.448733]        [<ffffffffa0151a45>] nouveau_drm_ioctl+0x65/0xa0 [nouveau]
[  278.448753]        [<ffffffff812628d0>] do_vfs_ioctl+0x2f0/0x520
[  278.448756]        [<ffffffff81262b81>] SyS_ioctl+0x81/0xa0
[  278.448758]        [<ffffffff818118e9>] system_call_fastpath+0x16/0x1b
[  278.448760] 
other info that might help us debug this:

[  278.448762]  Possible unsafe locking scenario:

[  278.448764]        CPU0                    CPU1
[  278.448765]        ----                    ----
[  278.448766]   lock(&(&priv->sensor.alarm_program_lock)->rlock);
[  278.448767]                                lock(&(&priv->lock)->rlock#2);
[  278.448770]                                lock(&(&priv->sensor.alarm_program_lock)->rlock);
[  278.448771]   lock(&(&priv->lock)->rlock#2);
[  278.448773] 
 *** DEADLOCK ***

[  278.448775] 2 locks held by Xorg.bin/1249:
[  278.448776]  #0:  (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffa00886e1>] ttm_bo_init+0x2d1/0x3b0 [ttm]
[  278.448783]  #1:  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [<ffffffffa010a284>] alarm_timer_callback+0x54/0xe0 [nouveau]
[  278.448800] 
stack backtrace:
[  278.448803] CPU: 0 PID: 1249 Comm: Xorg.bin Not tainted 3.16.0-0.rc3.git3.1.fc21.x86_64 #1
[  278.448804] Hardware name: System manufacturer System Product Name/M4A78LT-M, BIOS 0802    08/24/2010
[  278.448806]  0000000000000000 000000007a5a7c22 ffff88011aa03b00 ffffffff81807cec
[  278.448809]  ffffffff82bc2ef0 ffff88011aa03b40 ffffffff8180508c ffff88011aa03ba0
[  278.448812]  ffff8801185b9a40 ffff8801185b99d0 0000000000000002 ffff8801185ba5a8
[  278.448815] Call Trace:
[  278.448816]  <IRQ>  [<ffffffff81807cec>] dump_stack+0x4d/0x66
[  278.448823]  [<ffffffff8180508c>] print_circular_bug+0x201/0x20f
[  278.448825]  [<ffffffff8110163b>] __lock_acquire+0x1abb/0x1ca0
[  278.448828]  [<ffffffff810242de>] ? native_sched_clock+0x2e/0xb0
[  278.448831]  [<ffffffff81102104>] lock_acquire+0xa4/0x1d0
[  278.448847]  [<ffffffffa0108618>] ? nouveau_therm_update+0x48/0x350 [nouveau]
[  278.448850]  [<ffffffff818113f7>] _raw_spin_lock_irqsave+0x57/0xa0
[  278.448866]  [<ffffffffa0108618>] ? nouveau_therm_update+0x48/0x350 [nouveau]
[  278.448882]  [<ffffffffa0108618>] nouveau_therm_update+0x48/0x350 [nouveau]
[  278.448898]  [<ffffffffa010893a>] nouveau_therm_alarm+0x1a/0x20 [nouveau]
[  278.448915]  [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau]
[  278.448931]  [<ffffffffa010c470>] nv04_timer_alarm+0x60/0xd0 [nouveau]
[  278.448948]  [<ffffffffa010a30d>] alarm_timer_callback+0xdd/0xe0 [nouveau]
[  278.448964]  [<ffffffffa010c3b8>] nv04_timer_alarm_trigger+0x138/0x190 [nouveau]
[  278.448981]  [<ffffffffa010c54b>] nv04_timer_intr+0x6b/0x90 [nouveau]
[  278.448998]  [<ffffffffa0105bf1>] nouveau_mc_intr+0x141/0x1c0 [nouveau]
[  278.449000]  [<ffffffff81116127>] handle_irq_event_percpu+0x77/0x340
[  278.449003]  [<ffffffff8111642d>] handle_irq_event+0x3d/0x60
[  278.449005]  [<ffffffff811193e6>] handle_edge_irq+0x66/0x130
[  278.449007]  [<ffffffff8101c3e4>] handle_irq+0x84/0x150
[  278.449010]  [<ffffffff810e2145>] ? irqtime_account_irq+0xc5/0xd0
[  278.449012]  [<ffffffff8181482d>] do_IRQ+0x4d/0xe0
[  278.449015]  [<ffffffff81812472>] common_interrupt+0x72/0x72
[  278.449016]  <EOI>  [<ffffffffa0084000>] ? 0xffffffffa0083fff
[  278.449023]  [<ffffffff81137b39>] ? __module_address+0x29/0x150
[  278.449026]  [<ffffffff81137c03>] ? __module_address+0xf3/0x150
[  278.449029]  [<ffffffff81137c72>] __module_text_address+0x12/0x70
[  278.449031]  [<ffffffff8113c196>] is_module_text_address+0x16/0x30
[  278.449034]  [<ffffffff810c566a>] __kernel_text_address+0x3a/0x90
[  278.449036]  [<ffffffff8101da72>] print_context_stack+0x62/0x100
[  278.449038]  [<ffffffff8101c620>] dump_trace+0x170/0x350
[  278.449041]  [<ffffffff8102b47b>] save_stack_trace+0x2b/0x50
[  278.449043]  [<ffffffff81413529>] dma_entry_alloc+0x59/0x90
[  278.449045]  [<ffffffff81413b8f>] debug_dma_alloc_coherent+0x2f/0x90
[  278.449051]  [<ffffffffa008f755>] ttm_dma_populate+0x545/0xaa0 [ttm]
[  278.449072]  [<ffffffffa015727c>] nouveau_ttm_tt_populate+0x14c/0x170 [nouveau]
[  278.449078]  [<ffffffffa0084d80>] ttm_tt_bind+0x40/0x80 [ttm]
[  278.449084]  [<ffffffffa008748f>] ttm_bo_handle_move_mem+0x5bf/0x650 [ttm]
[  278.449089]  [<ffffffffa0087c59>] ? ttm_bo_mem_space+0x179/0x370 [ttm]
[  278.449092]  [<ffffffff810fc24f>] ? lock_release_holdtime.part.28+0xf/0x200
[  278.449098]  [<ffffffffa00883ef>] ttm_bo_validate+0x2df/0x300 [ttm]
[  278.449100]  [<ffffffff810ff72d>] ? trace_hardirqs_on_caller+0x15d/0x200
[  278.449106]  [<ffffffffa0088663>] ttm_bo_init+0x253/0x3b0 [ttm]
[  278.449126]  [<ffffffffa0157c82>] nouveau_bo_new+0x202/0x310 [nouveau]
[  278.449147]  [<ffffffffa0156660>] ? nv10_bo_put_tile_region+0x50/0x50 [nouveau]
[  278.449168]  [<ffffffffa015a42b>] nouveau_gem_new+0x6b/0x160 [nouveau]
[  278.449189]  [<ffffffffa015a5d6>] nouveau_gem_ioctl_new+0xb6/0x220 [nouveau]
[  278.449197]  [<ffffffffa003dcdf>] drm_ioctl+0x1df/0x6a0 [drm]
[  278.449201]  [<ffffffff810ff72d>] ? trace_hardirqs_on_caller+0x15d/0x200
[  278.449203]  [<ffffffff810ff7dd>] ? trace_hardirqs_on+0xd/0x10
[  278.449223]  [<ffffffffa0151a45>] nouveau_drm_ioctl+0x65/0xa0 [nouveau]
[  278.449226]  [<ffffffff812628d0>] do_vfs_ioctl+0x2f0/0x520
[  278.449228]  [<ffffffff81262b81>] SyS_ioctl+0x81/0xa0
[  278.449231]  [<ffffffff8115fb9c>] ? __audit_syscall_entry+0x9c/0xf0
[  278.449234]  [<ffffffff818118e9>] system_call_fastpath+0x16/0x1b
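
For readers unfamiliar with this kind of report: as the trace reads, alarm_program_lock was first taken while priv->lock was held (chain entry #1), and now priv->lock is requested while alarm_program_lock is held (chain entry #0). The following is a minimal user-space sketch of that ordering check, not the kernel's lockdep code. All names in it (lock_graph, lock_graph_init, reachable, lock_graph_acquire, MAX_LOCKS) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8 /* illustrative limit, not a kernel constant */

/* depends[a][b] is true when lock b was acquired while lock a was held. */
struct lock_graph {
	bool depends[MAX_LOCKS][MAX_LOCKS];
};

static void lock_graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can `to` be reached from `from` along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (g->depends[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquiring `wanted` while holding `held`".
 * Returns false, and records nothing, if `held` is already reachable
 * from `wanted`, because the new edge would then close a cycle.
 */
static bool lock_graph_acquire(struct lock_graph *g, int held, int wanted)
{
	bool seen[MAX_LOCKS] = { false };

	assert(held >= 0 && held < MAX_LOCKS);
	assert(wanted >= 0 && wanted < MAX_LOCKS);

	if (reachable(g, wanted, held, seen))
		return false;
	g->depends[held][wanted] = true;
	return true;
}
```

With priv->lock numbered 0 and alarm_program_lock numbered 1, lock_graph_acquire(&g, 0, 1) succeeds and records the first ordering, while a later lock_graph_acquire(&g, 1, 0) returns false. That second call corresponds to the inversion reported above.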
Comment 1 Martin Peres 2014-07-07 18:24:25 UTC
Thanks for reporting. I'll have a look at it during the week!

Please do not report Nouveau bugs to the kernel's bugtracker. We have our own which we monitor much more attentively.
Comment 2 Stefan Ringel 2014-07-07 18:28:55 UTC
And which bug tracker is it?
Comment 3 Martin Peres 2014-07-07 19:05:33 UTC
I was referring to freedesktop's bug tracker. Reporting bugs to Nouveau is explained here: http://nouveau.freedesktop.org/wiki/Bugs/

But that's ok, no need to report it once again ;)
Comment 4 Stefan Ringel 2014-07-07 20:16:16 UTC
static void ttm_bo_cleanup_memtype_use(struct ttm_buffer_object *bo)
{
	if (bo->bdev->driver->move_notify)
		bo->bdev->driver->move_notify(bo, NULL);

	if (bo->ttm) {
		ttm_tt_unbind(bo->ttm);
		ttm_tt_destroy(bo->ttm);
		bo->ttm = NULL;
	}
	ttm_bo_mem_put(bo, &bo->mem);

	ww_mutex_unlock (&bo->resv->lock);
}

What about the last line? Must it also change, as in this commit:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/ttm/ttm_bo.c?id=c75230833ce4fbbfaa257c07b55f97912fb1dc02
Comment 5 Martin Peres 2014-07-12 21:50:56 UTC
Created attachment 142831
drm/nouveau/therm: fix a potential deadlock in the therm monitoring code

Sorry for the wait. Can you try to reproduce the issue with this patch?
Comment 6 Stefan Ringel 2014-07-13 12:08:08 UTC
Looks okay.
Comment 7 Stefan Ringel 2014-07-14 06:47:32 UTC
(In reply to Martin Peres from comment #5)

> Sorry for the wait. Can you try to reproduce the issue with this patch?

Your patch works. Thanks. I cannot reproduce it with this patch.
Comment 8 Martin Peres 2014-07-14 09:56:55 UTC
(In reply to Stefan Ringel from comment #7)
> (In reply to Martin Peres from comment #5)
> 
> > Sorry for the wait. Can you try to reproduce the issue with this patch?
> 
> Your patch works. Thanks. I cannot reproduce it with this patch.

Great! I've asked for inclusion. I'll close this bug.
