Latest working kernel version: unknown
Earliest failing kernel version:
Distribution: fedora
Hardware Environment:
Software Environment:

Problem Description:
INFO trace appears in the log - but no other problems visible

Steps to reproduce:
  losetup /dev/loop0 file
  losetup -o 32256 /dev/loop1 /dev/loop0
  losetup -d /dev/loop1
  losetup -d /dev/loop0

--------------------
EXT3 FS on loop1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.25 #57
-------------------------------------------------------
losetup/9902 is trying to acquire lock:
 (&bdev->bd_mutex){--..}, at: [<ffffffff810e8db8>] __blkdev_put+0x38/0x1e0

but task is already holding lock:
 (&lo->lo_ctl_mutex){--..}, at: [<ffffffffa03e499e>] lo_ioctl+0x4e/0xb40 [loop]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&lo->lo_ctl_mutex){--..}:
       [<ffffffff81066249>] __lock_acquire+0x1149/0x11d0
       [<ffffffff81066366>] lock_acquire+0x96/0xe0
       [<ffffffff812ebfd0>] mutex_lock_nested+0xc0/0x330
       [<ffffffffa03e3384>] lo_open+0x34/0x50 [loop]
       [<ffffffff810e9329>] do_open+0xa9/0x350
       [<ffffffff810e960d>] blkdev_open+0x3d/0x90
       [<ffffffff810b7f94>] __dentry_open+0x114/0x390
       [<ffffffff810b82f4>] nameidata_to_filp+0x44/0x60
       [<ffffffff810c76ab>] do_filp_open+0x1fb/0xa40
       [<ffffffff810b7d96>] do_sys_open+0x76/0x100
       [<ffffffff810b7e4b>] sys_open+0x1b/0x20
       [<ffffffff8100c5db>] system_call_after_swapgs+0x7b/0x80
       [<ffffffffffffffff>] 0xffffffffffffffff

-> #0 (&bdev->bd_mutex){--..}:
       [<ffffffff810660e5>] __lock_acquire+0xfe5/0x11d0
       [<ffffffff81066366>] lock_acquire+0x96/0xe0
       [<ffffffff812ebfd0>] mutex_lock_nested+0xc0/0x330
       [<ffffffff810e8db8>] __blkdev_put+0x38/0x1e0
       [<ffffffff810e8f6b>] blkdev_put+0xb/0x10
       [<ffffffff810e8f98>] blkdev_close+0x28/0x40
       [<ffffffff810bb3c5>] __fput+0xc5/0x1f0
       [<ffffffff810bb50d>] fput+0x1d/0x30
       [<ffffffffa03e491d>] loop_clr_fd+0x1ad/0x1e0 [loop]
       [<ffffffffa03e4acf>] lo_ioctl+0x17f/0xb40 [loop]
       [<ffffffff8116b36d>] blkdev_driver_ioctl+0x8d/0xa0
       [<ffffffff8116b5fd>] blkdev_ioctl+0x27d/0x7e0
       [<ffffffff810e86db>] block_ioctl+0x1b/0x20
       [<ffffffff810c8ea1>] vfs_ioctl+0x31/0xa0
       [<ffffffff810c9193>] do_vfs_ioctl+0x283/0x2f0
       [<ffffffff810c9299>] sys_ioctl+0x99/0xa0
       [<ffffffff8100c5db>] system_call_after_swapgs+0x7b/0x80
       [<ffffffffffffffff>] 0xffffffffffffffff

other info that might help us debug this:

1 lock held by losetup/9902:
 #0: (&lo->lo_ctl_mutex){--..}, at: [<ffffffffa03e499e>] lo_ioctl+0x4e/0xb40 [loop]

stack backtrace:
Pid: 9902, comm: losetup Not tainted 2.6.25 #57

Call Trace:
 [<ffffffff81064e44>] print_circular_bug_tail+0x84/0x90
 [<ffffffff810660e5>] __lock_acquire+0xfe5/0x11d0
 [<ffffffff81064e90>] ? check_usage+0x40/0x2b0
 [<ffffffff810e8db8>] ? __blkdev_put+0x38/0x1e0
 [<ffffffff81066366>] lock_acquire+0x96/0xe0
 [<ffffffff810e8db8>] ? __blkdev_put+0x38/0x1e0
 [<ffffffff812ebfd0>] mutex_lock_nested+0xc0/0x330
 [<ffffffff810e8db8>] ? __blkdev_put+0x38/0x1e0
 [<ffffffff810e8db8>] __blkdev_put+0x38/0x1e0
 [<ffffffff810e8f6b>] blkdev_put+0xb/0x10
 [<ffffffff810e8f98>] blkdev_close+0x28/0x40
 [<ffffffff810bb3c5>] __fput+0xc5/0x1f0
 [<ffffffff810bb50d>] fput+0x1d/0x30
 [<ffffffffa03e491d>] :loop:loop_clr_fd+0x1ad/0x1e0
 [<ffffffffa03e4acf>] :loop:lo_ioctl+0x17f/0xb40
 [<ffffffff81013458>] ? native_sched_clock+0x78/0x80
 [<ffffffff81065464>] ? __lock_acquire+0x364/0x11d0
 [<ffffffff81013458>] ? native_sched_clock+0x78/0x80
 [<ffffffff81013458>] ? native_sched_clock+0x78/0x80
 [<ffffffff81065464>] ? __lock_acquire+0x364/0x11d0
 [<ffffffff81013458>] ? native_sched_clock+0x78/0x80
 [<ffffffff81065464>] ? __lock_acquire+0x364/0x11d0
 [<ffffffff81062afa>] ? get_lock_stats+0x2a/0x70
 [<ffffffff81062b4e>] ? put_lock_stats+0xe/0x30
 [<ffffffff8105a3e3>] ? down+0x33/0x50
 [<ffffffff812ee295>] ? _spin_unlock_irqrestore+0x65/0x90
 [<ffffffff810649e1>] ? trace_hardirqs_on+0x131/0x190
 [<ffffffff812ee275>] ? _spin_unlock_irqrestore+0x45/0x90
 [<ffffffff8105a3e3>] ? down+0x33/0x50
 [<ffffffff8116b36d>] blkdev_driver_ioctl+0x8d/0xa0
 [<ffffffff8116b5fd>] blkdev_ioctl+0x27d/0x7e0
 [<ffffffff81013458>] ? native_sched_clock+0x78/0x80
 [<ffffffff81065464>] ? __lock_acquire+0x364/0x11d0
 [<ffffffff810c4a21>] ? putname+0x31/0x50
 [<ffffffff810b39f5>] ? check_object+0x265/0x270
 [<ffffffff810b3230>] ? init_object+0x50/0x90
 [<ffffffff810e86db>] block_ioctl+0x1b/0x20
 [<ffffffff810c8ea1>] vfs_ioctl+0x31/0xa0
 [<ffffffff810c9193>] do_vfs_ioctl+0x283/0x2f0
 [<ffffffff812edbd9>] ? trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff810c9299>] sys_ioctl+0x99/0xa0
 [<ffffffff8100c5db>] system_call_after_swapgs+0x7b/0x80
This only happens when the backing file is itself a block device file, but IMO it's safe in practice. There are two paths that create the "bd_mutex -> lo_ctl_mutex" dependency:

1. open: lo_refcnt could be changed to an atomic_t; taking lo_ctl_mutex there is not necessary.

2. release: if LO_FLAGS_AUTOCLEAR is set, lo_release calls loop_clr_fd. How can we fix that one?
Created attachment 15904 [details]
don't hold mutex while fput in loop_clr_fd

Don't hold lo_ctl_mutex across the fput(), to silence lockdep. Jens, what do you think about this?
Ping, we still hit this bug even in current kernels.

Care to send the patch to the list?
> ------- Comment #3 from jirislaby@gmail.com 2009-02-27 00:13 -------
> Ping, we still hit this bug even in current kernels.
>
> Care to send the patch to the list?

Please see: http://lkml.org/lkml/2008/4/27/322

Al Viro said there is a side effect, but I have no better fix for this; maybe I need to learn more about the block/fs side.
See http://lkml.org/lkml/2009/3/12/86

Fix is in
http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commit;h=7345a058962e92a284eeb27e73cae29686766421