Kernel Bug Tracker – Bug 29172
releasing loop on top of other loop leads to deadlock
Last modified: 2011-03-06 12:33:41 UTC
The following steps lead to a deadlock in the kernel:
dd if=/dev/zero of=img bs=512 count=1000
losetup -f img
mount -t ext2 -o loop /dev/loop0 mnt
Unmounting mnt afterwards gets stuck in the kernel (verified with strace); the umount task sits in uninterruptible sleep (state D):
[ 609.086012] umount D 0000000000000000 0 3832 3831 0x00000001
[ 609.086012] ffff8800ae4c9d90 0000000000000086 ffff8800b882f830 0000000000012640
[ 609.086012] ffff8800ae4c9fd8 0000000000012640 ffff8800ae4c9fd8 ffff8800ae4c9fd8
[ 609.086012] ffff8800ae58e5b0 0000000000012640 0000000000012640 ffff8800ae4c8000
[ 609.086012] Call Trace:
[ 609.086012] [<ffffffff8152197b>] __mutex_lock_slowpath+0x11b/0x1d0
[ 609.086012] [<ffffffff8152151a>] mutex_lock+0x1a/0x40
[ 609.086012] [<ffffffff8118583d>] __blkdev_put+0x3d/0x190
[ 609.086012] [<ffffffff811547fe>] __fput+0xae/0x240
[ 609.086012] [<ffffffffa0579b8b>] loop_clr_fd+0x1fb/0x260 [loop]
[ 609.086012] [<ffffffffa0579c6a>] lo_release+0x7a/0x80 [loop]
[ 609.086012] [<ffffffff811858d0>] __blkdev_put+0xd0/0x190
[ 609.086012] [<ffffffff81155573>] deactivate_locked_super+0x43/0x70
[ 609.086012] [<ffffffff811705a9>] sys_umount+0x59/0xd0
[ 609.086012] [<ffffffff8100318b>] tracesys+0xd9/0xde
[ 609.086012] [<00007f222827f8c7>] 0x7f222827f8c7
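For anyone trying to confirm the same symptom without a full task dump: the "D" after "umount" in the first trace line is the uninterruptible-sleep state, and ps reports it in its STAT column as well. Below is a minimal sketch of a filter for that. The function name filter_d_state and the sample rows are made up for illustration (the PIDs merely mirror the trace above); only the ps column layout is assumed.

```shell
#!/bin/sh
# filter_d_state (hypothetical helper): read ps-style rows on stdin
# (header line, then PID STAT COMMAND) and print the PID and the last
# field of every task whose state starts with D (uninterruptible sleep).
filter_d_state() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $NF }'
}

# Made-up sample input, not real output from the affected machine.
filter_d_state <<'EOF'
 PID STAT COMMAND
   1 Ss   init
3831 S+   strace
3832 D+   umount
EOF
```

On a live system the same filter could be fed from something like `ps -eo pid,stat,comm`; command names containing spaces would be cut to their last word.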
This is a regression: it broke somewhere between 2.6.36 and 2.6.37. I tried bisecting[*] the problem, and the bisection identified commit 5704e44d283 as the first bad commit. That looks odd to me; I would rather suspect commit 2a48fc0ab24241755dc9, which introduced the private loop_mutex as part of the BKL removal process.
Judging from the stack trace, the deadlock seems to depend on the LO_FLAGS_AUTOCLEAR flag being set.
I removed the locking and unlocking of loop_mutex from lo_release() and the problem disappeared. I don't know whether this is the proper solution, because I don't understand whether (or why) anything in loop_clr_fd() needs to be protected by loop_mutex.
[*] This was my first bisection, so it is likely that I did something wrong.
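The trace shows the release path being entered twice in one task (__blkdev_put, lo_release, loop_clr_fd, __fput, then __blkdev_put again). One reading, which is an assumption and not something confirmed in this report, is that the inner pass blocks on a non-recursive lock that the outer pass still holds. The sketch below is a user-space model of that pattern only, not kernel code: a lock file stands in for loop_mutex, release_dev is a hypothetical stand-in for lo_release(), and flock(1) from util-linux is used with -n so that the second attempt fails immediately instead of hanging.

```shell
#!/bin/sh
# User-space model only: the lock file plays the role of loop_mutex
# (an assumption about the bug, see the text above).
lock=$(mktemp)

# release_dev DEPTH (hypothetical name): models releasing a loop device
# whose backing file is itself a loop device.
release_dev() {
    (
        # Every call opens the lock file afresh, so a lock held by an
        # outer call conflicts with this one. -n makes flock fail at
        # once where a blocking lock would wait forever.
        if ! flock -n 9; then
            echo "depth $1: lock already held, a blocking lock would hang here"
            exit 1
        fi
        echo "depth $1: lock taken"
        # Dropping the backing file re-enters the release path once.
        if [ "$1" -lt 2 ]; then
            release_dev $(( $1 + 1 ))
        fi
    ) 9>"$lock"
}

result=$(release_dev 1)
status=$?
printf '%s\n' "$result"
rm -f "$lock"
```

In this model the nested call fails at depth 2 and the non-zero status propagates back to the caller. Taking the lock out of release_dev lets both levels complete, which matches the observation that the problem disappears once lo_release() no longer takes loop_mutex. It says nothing about whether other code relied on that lock.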
Still happens with 2.6.38-rc6. I sent a patch that fixes it for me:
Merged for 2.6.38-rc8 (or final):
Author: Petr Uzel <firstname.lastname@example.org>
Date: Thu Mar 3 11:48:50 2011 -0500
block: kill loop_mutex