Most recent kernel where this bug did not occur: ??? (not tested with anything earlier than 2.6.19)
Distribution: Gentoo

Hardware Environment:
* USB disk 120GB Western Digital, model 00UE-00KVT0 (according to udev), serial DEF10CD7F64C
* SATA disk 200GB Seagate 7200.7, model ST3200822AS
* Motherboard Asus A8N5X, nForce4 chipset

Software Environment:
* Offending file system: reiserfs v3.6, mounted with noatime,barrier=flush
* dm-crypt using aes-256 with cbc-essiv:sha256; using assembly-optimized AES on x86_64 (CONFIG_CRYPTO_AES_X86_64)
* LVM utilities version: 2.02.17 (2006-12-14)
* LVM library version: 1.02.12 (2006-10-13)
* LVM driver version: 4.11.0
* cryptsetup-luks 1.0.5 (user-space interface to dm-crypt)

---- Problem Description: ----

I'm reporting this here because my bug reports half a year ago on the LKML didn't turn up any solutions. I'd like to stress that this is a *FULLY REPRODUCIBLE* bug.

After pvmove'ing a dm-crypted LVM volume from a SATA disk to a USB disk, reiserfs starts spewing I/O errors, with "bio too big device dm-1 (256 > 240)" in dmesg. fsck reports no corruption, and problems occur only at certain lengths in different files, such as 112k, 240k, 368k, and only in files that existed before the move, not in newly written files. When the partition is copied back to its original disk, everything works again.

The same issue affects ext3 to a lesser extent: the error messages still appear in dmesg, but instead of breaking, ext3 simply becomes _insanely_ slow; much slower than the USB disk normally is. So this sounds like a workaround in ext3 rather than a bug in reiserfs.

None of these problems occur when either (a) dm-crypt is missing from the picture, or (b) the file system was initialized on the USB disk in the first place.

The original bug reports on LKML contain much more detail:
* http://article.gmane.org/gmane.linux.kernel/502020
* http://article.gmane.org/gmane.linux.file-systems/13619
* http://article.gmane.org/gmane.linux.kernel/508104

---- Steps to reproduce: ----

$VOLGROUP     - name of the LVM volume group containing both the USB and SATA PVs
$SATA_PV_DISK - physical volume path of the non-USB disk, e.g. /dev/sda5
$USB_PV_DISK  - physical volume path of the USB disk, e.g. /dev/sdc1

mkdir /mnt/punchbag
lvcreate -n punchbag --extents=60 primary $SATA_PV_DISK
cryptsetup luksFormat /dev/mapper/$VOLGROUP-punchbag
cryptsetup luksOpen /dev/mapper/$VOLGROUP-punchbag crypt-punchbag
mkfs.reiserfs /dev/mapper/crypt-punchbag
mount /dev/mapper/crypt-punchbag /mnt/punchbag -o rw,noatime,barrier=flush
# write some stuff onto /mnt/punchbag
dd if=/dev/zero of=/mnt/punchbag/junk bs=1M count=10
# make sure that nothing is written onto the disk hereafter
mount /mnt/punchbag -o remount,ro
pvmove -i2 -npunchbag $SATA_PV_DISK $USB_PV_DISK
sync
# drop caches: otherwise the newly-written file will already be cached
echo 3 > /proc/sys/vm/drop_caches
# witness the breakage
sha1sum /mnt/punchbag/*
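For reference, the limit mismatch behind the "bio too big (256 > 240)" message can be inspected by comparing the maximum request size each layer advertises in sysfs. This is only a minimal sketch with example device names; the names will differ between setups, and the dm-N entries may not expose a queue directory on older kernels:

# Compare per-queue I/O size limits of the stacked devices (device names are examples).
# After the pvmove, the dm devices on top still advertise the SATA disk's limit
# while the USB disk underneath accepts less.
for dev in sda sdc dm-0 dm-1; do
    echo -n "$dev: "
    cat /sys/block/$dev/queue/max_sectors_kb
done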
Created attachment 13598: dm-crypt-bug.sh

This is the script I wrote to trigger the bug.

By the way, the first 'lvcreate' line in the bug report should have contained $VOLGROUP instead of "primary".
bugme-daemon@bugzilla.kernel.org wrote:
> This is the script I wrote to trigger the bug.

Thanks for the report and the script; I have finally reproduced this bug. I will try to find out what is happening there.

Milan
Well, the problem is not in dm-crypt; it is more generic: stacking block devices and block queue restrictions.

Here are our simple stacked devices:

Physical volume (USB vs SATA)
 \_ LV volume (primary-punchbag)
     \_ CRYPT volume (crypt-punchbag)

The pvmove operation changes the underlying physical volume, which unfortunately has different hardware parameters (max_hw_sectors). The mapping table for the LV volume is correctly reloaded and its block queue parameters are properly set, but this does not happen for the CRYPT volume on top of it. So the crypt volume still sends bigger bios than the underlying device allows. Of course this will not happen if we use the USB disk (the device with the smallest max_sectors) in the first place.

We can take dm-crypt out of the picture entirely: instead of cryptsetup, use a simple linear mapping over the LV volume and you get the same error:

dmsetup create crypt-punchbag --table "0 `blockdev --getsize /dev/mapper/$VOLGROUP-punchbag` linear /dev/mapper/$VOLGROUP-punchbag 0"

Also, if the crypt device's mapping table is reloaded (to force the new restrictions to apply), it will work correctly:

echo 3 > /proc/sys/vm/drop_caches
dmsetup suspend crypt-punchbag
dmsetup reload crypt-punchbag --table "`dmsetup table --showkeys crypt-punchbag`"
dmsetup resume crypt-punchbag
sha1sum /mnt/punchbag/junk

The problem is now fully understood, but the solution is not so simple. It can happen with any arbitrary stack of block device mappings.

For the long term, one possible solution is that block devices should be responsible for splitting requests (currently the upper device must not send too big a bio). Some patches for this already exist. But perhaps some *workaround* can be used, at least so that stacked device-mapper block devices work correctly in this situation.

Milan
--
mbroz@redhat.com
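For what it's worth, the same reload workaround can be applied to every device-mapper device at once. The loop below is just a rough illustration, not an official tool; it assumes dm device names without whitespace and that no other dm operations are in flight:

# Re-read and reload each dm device's table so the queue limits of whatever is
# now underneath get picked up (same reload/suspend/resume cycle as above).
# The table is piped rather than written to a file so crypt keys never hit disk.
for dev in $(dmsetup ls | awk '{print $1}'); do
    dmsetup table --showkeys "$dev" | dmsetup reload "$dev"
    dmsetup suspend "$dev"
    dmsetup resume "$dev"
done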
(In reply to comment #3)
> Problem is now fully understood but solution is not so simple.
> This problem can happen with arbitrary block devices stacked mapping.
>
> For long-term - one possible solution is that block devices should be
> responsible for splitting requests (currently upper device should not send
> too big bio). Some patches already exist for this.

Disclaimer: I probably don't know what I'm talking about.

Has this been thoroughly discussed on the LKML? Can you point to the discussions? I'm not convinced that this should be the final solution in the long run, because in this case a 256-block request would be split into 240 and 16 -- the second request is almost wasted. Though I realize that it could be viable for fixing spurious race conditions where max_hw_sectors changes after a request has been submitted but not yet serviced.
*** Bug 13252 has been marked as a duplicate of this bug. ***
Has any progress been made on this? I see this error when resyncing an LVM-on-RAID setup to a USB disk.

Step 1: Plug in a USB disk
Step 2: Give it a compatible partition table
Step 3: Add the device to the LVM-on-RAID setup (mdadm --manage /dev/md1 --add /dev/sdc6)

I then see the bio errors. This is on Ubuntu 10.04.3 using kernel 2.6.32-33.

Cheers,
Michael
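For concreteness, a hedged sketch of those three steps with example device names (sda is assumed to be an existing array member and sdc the USB disk; not taken from the report above):

sfdisk -d /dev/sda | sfdisk /dev/sdc     # copy the existing partition layout to the USB disk
mdadm --manage /dev/md1 --add /dev/sdc6  # add the matching partition to the RAID array
dmesg | tail                             # the "bio too big" errors show up once the resync starts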
I get the impression that this will never be fixed. LVM was intended mostly for server use. Non-server requirements (such as proper USB disk support, boot time reduction, etc.) aren't a priority for LVM developers. Sad, but that's the way things are. Desktop distros, including Fedora, have already dropped LVM from their default install.
This problem also affects md raid1 devices. I do not see how to set up a desktop machine without any RAID1/5/6 so that it works unattended for the user for some time, given the unreliable drives we have today.
Yes, it does affect raid1. In my case, my laptop only has 2 eSATA interfaces, so when I want to sync my RAID backup disk to alternate with offsite storage, I have to make use of the USB port.
There's hope yet... https://www.redhat.com/archives/dm-devel/2012-May/msg00200.html
Looks like the fix is called "immutable bio vecs": https://lkml.org/lkml/2012/10/15/398
Alasdair, you marked this bug as resolved. Could you point to the fix? I just ran into it with 3.14.15 after adding a USB disk to a RAID1 array + LVM, but 3.14 included Kent Overstreet's immutable biovecs patch. Is there something else that's needed?
I'm also getting this problem. Currently I'm using kernel 4.1.10 on NixOS. I have a single SSD in my notebook that's set up as raid1, with dm-crypt running on top of that.

Rationale: attach an external USB 3.0 drive, grow the raid to 2 devices, do a hot sync, and detach again.

Here's some dmesg output in case it's useful: https://paste.simplylinux.ch/view/d0d6899f
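For reference, a rough sketch of the sync cycle described above, assuming an example array /dev/md0 and an example USB partition /dev/sdb1 (neither taken from the report; adjust to the actual setup):

mdadm --grow /dev/md0 --raid-devices=2            # allow a second mirror
mdadm --manage /dev/md0 --add /dev/sdb1           # add the USB disk; resync starts
mdadm --wait /dev/md0                             # block until the resync finishes
mdadm --manage /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
mdadm --grow /dev/md0 --raid-devices=1 --force    # shrink back to a single device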