Bug 13030 - i915: possible circular locking dependency detected in i915_gem_execbuffer
Summary: i915: possible circular locking dependency detected in i915_gem_execbuffer
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel) (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: drivers_video-dri-intel@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-06 23:18 UTC by Sami Liedes
Modified: 2009-09-17 03:46 UTC (History)
3 users (show)

See Also:
Kernel Version: 2.6.29.1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Dmesg output, except the very first seconds that I didn't capture (248.03 KB, text/x-log)
2009-04-07 16:53 UTC, Sami Liedes
Details

Description Sami Liedes 2009-04-06 23:18:20 UTC
Hi,

When running glxgears on a recent X server from Debian experimental on 2.6.29.1 I get the following warning:

------------------------------------------------------------
[  361.948133]                                                                                                                                
[  361.948137] =======================================================                                                                        
[  361.948142] [ INFO: possible circular locking dependency detected ]                                                                        
[  361.948146] 2.6.29.1-tuxonice #1                                                                                                           
[  361.948148] -------------------------------------------------------                                                                        
[  361.948150] glxgears/4533 is trying to acquire lock:                                                                                       
[  361.948153]  (&mm->mmap_sem){----}, at: [<ffffffff802e3eba>] might_fault+0x6a/0xc0                                                         
[  361.948164]                                                                                                                                
[  361.948165] but task is already holding lock:                                                                                              
[  361.948167]  (&dev->struct_mutex){--..}, at: [<ffffffffa02f1dce>] i915_gem_execbuffer+0xfe/0xe80 [i915]                                    
[  361.948185]                                                                                                                                
[  361.948185] which lock already depends on the new lock.                                                                                    
[  361.948186]                                                                                                                                
[  361.948188]                                                                                                                                
[  361.948189] the existing dependency chain (in reverse order) is:                                                                           
[  361.948192]                                                                                                                                
[  361.948192] -> #2 (&dev->struct_mutex){--..}:                                                                                              
[  361.948198]        [<ffffffff8027be99>] __lock_acquire+0x1499/0x1de0                                                                       
[  361.948205]        [<ffffffff8027c871>] lock_acquire+0x91/0xc0                                                                             
[  361.948209]        [<ffffffff8065461c>] mutex_lock_nested+0xec/0x380                                                                       
[  361.948215]        [<ffffffffa02c76f0>] drm_vm_open+0x40/0x60 [drm]                                                                        
[  361.948235]        [<ffffffff8024a6d5>] dup_mm+0x2c5/0x410                                                                                 
[  361.948240]        [<ffffffff8024b637>] copy_process+0xdc7/0x14a0                                                                          
[  361.948244]        [<ffffffff8024bda0>] do_fork+0x90/0x4a0                                                                                 
[  361.948248]        [<ffffffff8020a888>] sys_clone+0x28/0x30                                                                                
[  361.948253]        [<ffffffff8020cc73>] stub_clone+0x13/0x20                                                                               
[  361.948257]        [<ffffffffffffffff>] 0xffffffffffffffff                                                                                 
[  361.948277]                                                                                                                                
[  361.948277] -> #1 (&mm->mmap_sem/1){--..}:                                                                                                 
[  361.948284]        [<ffffffff8027be99>] __lock_acquire+0x1499/0x1de0                                                                       
[  361.948289]        [<ffffffff8027c871>] lock_acquire+0x91/0xc0                                                                             
[  361.948293]        [<ffffffff8026a522>] down_write_nested+0x52/0x90                                                                        
[  361.948299]        [<ffffffff8024a4f7>] dup_mm+0xe7/0x410                                                                                  
[  361.948303]        [<ffffffff8024b637>] copy_process+0xdc7/0x14a0                                                                          
[  361.948307]        [<ffffffff8024bda0>] do_fork+0x90/0x4a0                                                                                 
[  361.948311]        [<ffffffff8020a888>] sys_clone+0x28/0x30                                                                                
[  361.948315]        [<ffffffff8020cc73>] stub_clone+0x13/0x20                                                                               
[  361.948319]        [<ffffffffffffffff>] 0xffffffffffffffff                                                                                 
[  361.948327]                                                                                                                                
[  361.948328] -> #0 (&mm->mmap_sem){----}:                                                                                                   
[  361.948333]        [<ffffffff8027c065>] __lock_acquire+0x1665/0x1de0                                                                       
[  361.948338]        [<ffffffff8027c871>] lock_acquire+0x91/0xc0
[  361.948342]        [<ffffffff802e3ee7>] might_fault+0x97/0xc0
[  361.948347]        [<ffffffffa02eafdf>] i915_emit_box+0x3f/0x300 [i915]
[  361.948357]        [<ffffffffa02f2903>] i915_gem_execbuffer+0xc33/0xe80 [i915]
[  361.948368]        [<ffffffffa02c0bc2>] drm_ioctl+0x122/0x350 [drm]
[  361.948380]        [<ffffffff80314425>] vfs_ioctl+0x85/0xb0
[  361.948385]        [<ffffffff803144e2>] do_vfs_ioctl+0x92/0x5b0
[  361.948389]        [<ffffffff80314a4f>] sys_ioctl+0x4f/0x80
[  361.948393]        [<ffffffff8020c88b>] system_call_fastpath+0x16/0x1b
[  361.948397]        [<ffffffffffffffff>] 0xffffffffffffffff
[  361.948402]
[  361.948403] other info that might help us debug this:
[  361.948404]
[  361.948406] 1 lock held by glxgears/4533:
[  361.948409]  #0:  (&dev->struct_mutex){--..}, at: [<ffffffffa02f1dce>] i915_gem_execbuffer+0xfe/0xe80 [i915]
[  361.948423]
[  361.948423] stack backtrace:
[  361.948427] Pid: 4533, comm: glxgears Not tainted 2.6.29.1-tuxonice #1
[  361.948430] Call Trace:
[  361.948435]  [<ffffffff8027a560>] print_circular_bug_tail+0xe0/0xf0
[  361.948440]  [<ffffffff8027c065>] __lock_acquire+0x1665/0x1de0
[  361.948444]  [<ffffffff8027c871>] lock_acquire+0x91/0xc0
[  361.948448]  [<ffffffff802e3eba>] ? might_fault+0x6a/0xc0
[  361.948453]  [<ffffffff802e3ee7>] might_fault+0x97/0xc0
[  361.948456]  [<ffffffff802e3eba>] ? might_fault+0x6a/0xc0
[  361.948466]  [<ffffffffa02eafdf>] i915_emit_box+0x3f/0x300 [i915]
[  361.948476]  [<ffffffffa02f2903>] i915_gem_execbuffer+0xc33/0xe80 [i915]
[  361.948482]  [<ffffffff802fe775>] ? check_object+0x265/0x270
[  361.948494]  [<ffffffffa02c0bc2>] drm_ioctl+0x122/0x350 [drm]
[  361.948504]  [<ffffffffa02f1cd0>] ? i915_gem_execbuffer+0x0/0xe80 [i915]
[  361.948508]  [<ffffffff80314425>] vfs_ioctl+0x85/0xb0
[  361.948512]  [<ffffffff803144e2>] do_vfs_ioctl+0x92/0x5b0
[  361.948516]  [<ffffffff8020c8bc>] ? sysret_check+0x27/0x62
[  361.948520]  [<ffffffff8020c8bc>] ? sysret_check+0x27/0x62
[  361.948524]  [<ffffffff80314a4f>] sys_ioctl+0x4f/0x80
[  361.948528]  [<ffffffff8020c88b>] system_call_fastpath+0x16/0x1b
------------------------------------------------------------
Comment 1 Sérgio M Basto 2009-04-07 16:04:04 UTC
same here , on boot
Running glxgears
I saw in dmesg 
[drm:drm_wait_vblank] *ERROR* failed to acquire vblank counter, -22

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.29.1-42.rc1.sb3.fc10.i686.PAE #1                   
-------------------------------------------------------
X/2749 is trying to acquire lock:                      
 (&mm->mmap_sem){----}, at: [<c049c6f2>] might_fault+0x48/0x85

but task is already holding lock:
 (&dev->struct_mutex){--..}, at: [<f8353b97>] i915_gem_execbuffer+0x105/0xad7 [i915]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&dev->struct_mutex){--..}:
       [<c045a398>] __lock_acquire+0x9a8/0xb1b
       [<c045a566>] lock_acquire+0x5b/0x81    
       [<c073a7c0>] __mutex_lock_common+0xda/0x32e
       [<c073aabb>] mutex_lock_nested+0x33/0x3b   
       [<f81bc7b0>] drm_gem_mmap+0x34/0xf8 [drm]  
       [<c04a395f>] mmap_region+0x255/0x3f9       
       [<c04a3d4a>] do_mmap_pgoff+0x247/0x297     
       [<c040c739>] sys_mmap2+0x5f/0x80           
       [<c040966b>] sysenter_do_call+0x12/0x3f    
       [<ffffffff>] 0xffffffff                    

-> #0 (&mm->mmap_sem){----}:
       [<c045a26d>] __lock_acquire+0x87d/0xb1b
       [<c045a566>] lock_acquire+0x5b/0x81    
       [<c049c70f>] might_fault+0x65/0x85     
       [<c057dbcc>] copy_from_user+0x2f/0x117 
       [<f8353d55>] i915_gem_execbuffer+0x2c3/0xad7 [i915]
       [<f81bb8c2>] drm_ioctl+0x1c4/0x241 [drm]           
       [<c04c19d7>] vfs_ioctl+0x55/0x6e                   
       [<c04c1f28>] do_vfs_ioctl+0x46f/0x4a8              
       [<c04c1fa6>] sys_ioctl+0x45/0x5f                   
       [<c040966b>] sysenter_do_call+0x12/0x3f            
       [<ffffffff>] 0xffffffff                            

other info that might help us debug this:

1 lock held by X/2749:
 #0:  (&dev->struct_mutex){--..}, at: [<f8353b97>] i915_gem_execbuffer+0x105/0xad7 [i915]

stack backtrace:
Pid: 2749, comm: X Not tainted 2.6.29.1-42.rc1.sb3.fc10.i686.PAE #1
Call Trace:                                                        
 [<c0739533>] ? printk+0x14/0x19                                   
 [<c04597d9>] print_circular_bug_tail+0x5d/0x68                    
 [<c045a26d>] __lock_acquire+0x87d/0xb1b                           
 [<c045a566>] lock_acquire+0x5b/0x81                               
 [<c049c6f2>] ? might_fault+0x48/0x85                              
 [<c049c70f>] might_fault+0x65/0x85                                
 [<c049c6f2>] ? might_fault+0x48/0x85                              
 [<c057dbcc>] copy_from_user+0x2f/0x117                            
 [<f8353d55>] i915_gem_execbuffer+0x2c3/0xad7 [i915]               
 [<c04b0e39>] ? __slab_alloc+0x3d0/0x445                           
 [<c049c72d>] ? might_fault+0x83/0x85                              
 [<c057dbcc>] ? copy_from_user+0x2f/0x117                          
 [<f81bb8c2>] drm_ioctl+0x1c4/0x241 [drm]                          
 [<f8353a92>] ? i915_gem_execbuffer+0x0/0xad7 [i915]               
 [<c04c19d7>] vfs_ioctl+0x55/0x6e                                  
 [<c04c1f28>] do_vfs_ioctl+0x46f/0x4a8                             
 [<c0580d08>] ? _raw_spin_unlock+0x74/0x78                         
 [<c04b6ca2>] ? fsnotify_modify+0x54/0x5f                          
 [<c04b6e93>] ? do_sync_write+0x0/0xee                             
 [<c04b77af>] ? vfs_write+0xa9/0xe4                                
 [<c04c1fa6>] sys_ioctl+0x45/0x5f                                  
 [<c040966b>] sysenter_do_call+0x12/0x3f
Comment 2 Sami Liedes 2009-04-07 16:53:22 UTC
Created attachment 20861 [details]
Dmesg output, except the very first seconds that I didn't capture

Here's my dmesg. The first seconds are missing, I guess, because of the vast amount of kobject: debug messages before the disk logging started.
Comment 3 Eric Anholt 2009-04-07 23:53:29 UTC
This has been fixed in master, but hasn't been sent to stable due to known regressions that will be fixed soon.
Comment 4 Sérgio M Basto 2009-04-08 01:15:07 UTC
(In reply to comment #3)
> This has been fixed in master, but hasn't been sent to stable due to known
> regressions that will be fixed soon.

the master of what ? (drm-next, libdrm, mesa , drv-intel or xserver)
Comment 5 Eric Anholt 2009-04-08 02:34:41 UTC
A lock order fix could only occur in the kernel, and master refers to linus.
Comment 6 Gordon Jin 2009-09-17 03:46:06 UTC
This should have been fixed in 2.6.30 according to Eric's comment.

Note You need to log in before you can comment on or make changes to this bug.