Created attachment 273975 [details]
dmesg output

We recently upgraded from 3.16 (Debian) to 4.9 (Debian) (and I also tested
4.14 (Debian) and vanilla 4.15) and noticed that our temporary raid1 setup,
created via device-mapper's raid target (see script below), hangs on writes.
One can see an in-flight request that never finishes in /proc/diskstats, and
any dmsetup call trying to remove/suspend the device hangs as well.

# grep dm /proc/diskstats
254 0 dm-0 87 0 4152 0 1 0 8 0 1 13728 13728

Using a raid1 without metadata devices worked in 3.16 and 3.2. Since the
kernel doesn't report any error when creating the device-mapper target and
appears to sync the devices successfully, I suspect the behaviour we now see
is a bug. I've attached the dmesg output (including
echo t > /proc/sysrq-trigger).

You can reproduce the problem with this little script:

#!/bin/sh
dd if=/dev/zero of=./disk1 bs=10M count=10
dd if=/dev/zero of=./disk2 bs=10M count=10

LO1=$(losetup --show -f ./disk1)
LO2=$(losetup --show -f ./disk2)

SIZE=$(blockdev --getsz ${LO1})

echo "0 ${SIZE} linear ${LO1} 0" | dmsetup create dm-raid1-bug
dmsetup suspend dm-raid1-bug
echo "0 ${SIZE} raid raid1 2 0 sync 2 - ${LO1} - ${LO2}" | dmsetup load dm-raid1-bug
dmsetup resume dm-raid1-bug
dmsetup message dm-raid1-bug 0 resync

echo "Waiting for sync to finish."
while ! dmsetup status dm-raid1-bug | grep -q idle; do
    sleep 1
done

echo "Writing to first sector."
dd if=/dev/zero of=/dev/mapper/dm-raid1-bug bs=512 count=1
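For anyone else reproducing this: the script leaves the dm target, the loop
devices and the two image files behind. A cleanup sketch (it assumes the
stuck write is gone, e.g. after a reboot, since on an affected kernel
"dmsetup remove" hangs just like "dmsetup suspend"):

#!/bin/sh
# Tear down the reproducer's device-mapper target and loop devices.
dmsetup remove dm-raid1-bug
# Detach whatever loop devices are still backed by the two image files.
losetup -j ./disk1 | cut -d: -f1 | xargs -r losetup -d
losetup -j ./disk2 | cut -d: -f1 | xargs -r losetup -d
rm -f ./disk1 ./disk2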
Without any metadata devices, and hence no raid superblock(s),
md.c:md_write_start() still schedules a wait for them to be written,
causing the deadlock. The solution is to not wait in this case.

Quick hack allowing your setup to succeed:

 bool md_write_start(struct mddev *mddev, struct bio *bi)
 {
 	int did_change = 0;
+	struct md_rdev *rdev;
+
 	if (bio_data_dir(bi) != WRITE)
 		return true;
@@ -8081,6 +8086,11 @@ bool md_write_start(struct mddev *mddev, struct bio *bi)
 	rcu_read_unlock();
 	if (did_change)
 		sysfs_notify_dirent_safe(mddev->sysfs_state);
+	rdev_for_each(rdev, mddev)
+		if (rdev->sb_page)
+			goto wait;
+	return true;
+wait:
 	wait_event(mddev->sb_wait,
 		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) ||
 		   mddev->suspended);
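For reference, a quick way to check whether a given dm-raid mapping is
running without metadata devices (and can therefore end up in this wait) is
to look at its table: the "-" placeholders in front of the data devices are
the absent metadata devices. A sketch, using the device name from the
reproducer above (the exact parameters printed may differ):

# dmsetup table dm-raid1-bug
0 204800 raid raid1 2 0 sync 2 - 7:0 - 7:1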
FWIW: you should be able to hit this deadlock with any raid1/4/5/6/10 mapping created without metadata devices.
Revised patch sent off to dm-devel, linux-kernel, linux-raid.