Bug 202985 - loop device deadlocks
Summary: loop device deadlocks
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Jan Kara
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-21 10:23 UTC by Jan Kara
Modified: 2019-03-21 10:25 UTC
CC List: 0 users

See Also:
Kernel Version: 4.4-stable
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Kernel messages from hung kernel (262.14 KB, text/plain)
2019-03-21 10:23 UTC, Jan Kara

Description Jan Kara 2019-03-21 10:23:32 UTC
Created attachment 281943
Kernel messages from hung kernel

The systemd testsuite deadlocks when running with recent 4.4-stable kernels. (Note: the original report is actually for our distribution's (SLES15) 4.12-based kernel, but the commit identified as the culprit is also present in the 4.4-stable tree.)
Comment 1 Jan Kara 2019-03-21 10:24:21 UTC
Looking at the backtraces I can see:

systemd-udevd
__mutex_lock.isra.5+0x178/0x4a0
lo_release+0x44/0xa0 [loop]
__blkdev_put+0x19d/0x1f0
blkdev_close+0x21/0x30
__fput+0xd2/0x210

-> so it holds bdev->bd_mutex and loop_index_mutex, and waits for loop_ctl_mutex.

losetup
__mutex_lock.isra.5+0x178/0x4a0
blkdev_reread_part+0x16/0x30
loop_reread_partitions+0x23/0x50 [loop]
loop_set_status+0x4c8/0x530 [loop]
loop_set_status64+0x40/0x70 [loop]
lo_ioctl+0xfb/0x6f0 [loop]
blkdev_ioctl+0x847/0x940
block_ioctl+0x39/0x40
do_vfs_ioctl+0x90/0x5f0

-> so it holds loop_ctl_mutex and waits for bdev->bd_mutex.

So this is a classic ABBA deadlock. It has actually been there for a long time, and it was fixed upstream by a rather intrusive locking rework of the loop device.

Is this reproducible or a one-time occurrence?

I'm asking because I think that commit 8f611d6dde in our tree (block/loop: Use global lock for ioctl() operation (bsc#1124974)) could make the long-present deadlock easier to hit because loop_ctl_mutex got converted from a per-device one to a global one.
Comment 2 Jan Kara 2019-03-21 10:25:19 UTC
Answer wrt reproducibility:

@Jan Kara: it is reproduced on every run of systemd_testuite in openQA with the kernel update candidate.
