Bug 200439

Summary: LVM snapshotting broke in 4.16 (Failed to lock logical volume)
Product: IO/Storage
Reporter: WGH (wgh)
Component: LVM2/DM
Assignee: Alasdair G Kergon (agk)
Status: RESOLVED INVALID    
Severity: normal    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.16.18-gentoo
Subsystem:
Regression: Yes
Bisected commit-id:

Description WGH 2018-07-07 15:53:09 UTC
When I updated from kernel 4.14 to 4.16, my LVM snapshotting script broke for no apparent reason. When I boot the system with the older 4.14 kernel, the script works again.

I don't have much expertise with LVM, so it's quite possible that I'm doing something completely wrong.

# sudo vgdisplay 
  --- Volume group ---
  VG Name               vg0
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  61
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               476.81 GiB
  PE Size               4.00 MiB
  Total PE              122063
  Alloc PE / Size       120055 / 468.96 GiB
  Free  PE / Size       2008 / 7.84 GiB
  VG UUID               sDdIeh-9cec-WdaN-yRfZ-C31m-xgfw-Ta4sOe

# lvdisplay 
  --- Logical volume ---
  LV Path                /dev/vg0/lvol_rootfs
  LV Name                lvol_rootfs
  VG Name                vg0
  LV UUID                2MEjBM-SN2d-OQzM-NpzD-loPK-zFMf-hfeWSY
  LV Write Access        read/write
  LV Creation host, time ubuntu, 2017-08-04 20:39:45 +0300
  LV Status              available
  # open                 1
  LV Size                452.96 GiB
  Current LE             115959
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
   
  --- Logical volume ---
  LV Path                /dev/vg0/lvol_swap
  LV Name                lvol_swap
  VG Name                vg0
  LV UUID                h1cR1Z-WMQj-BfQY-yKyC-fRHN-QIs7-JNyEPe
  LV Write Access        read/write
  LV Creation host, time lenovo-gentoo, 2017-08-16 14:11:56 +0300
  LV Status              available
  # open                 2
  LV Size                16.00 GiB
  Current LE             4096
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

My script has the following line, and it fails like this:
+ lvcreate --size 5G --snapshot --name snap0 --permission r /dev/mapper/vg0-lvol_rootfs
  device-mapper: create ioctl on vg0-snap0-cowLVM-sDdIeh9cecWdaNyRfZC31mxgfwTa4sOeHMJXVOykGVRtfP6Aii7IHvwS066AOLOM-cow failed: Device or resource busy
  Failed to lock logical volume vg0/lvol_rootfs.
  Aborting. Manual intervention required.

At the same time, some errors appear in dmesg as well:
[   26.145279] generic_make_request: Trying to write to read-only block-device dm-3 (partno 0)
[   26.145288] device-mapper: persistent snapshot: write_header failed
[   26.145847] device-mapper: table: 253:4: snapshot: Failed to read snapshot metadata
[   26.145851] device-mapper: ioctl: error adding target to table
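
The first dmesg line above says the write was rejected because dm-3 is marked read-only. A minimal sketch for checking that flag through sysfs, assuming the standard /sys/block/<name>/ro attribute. The function name is_block_ro and its optional sysfs-root argument are illustrative and not part of the original script:

```shell
#!/bin/sh
# is_block_ro NAME [SYSFS_ROOT]
# Prints "ro" if the kernel marks block device NAME read-only, "rw" if not,
# and "unknown" (with a non-zero exit status) if the attribute is unreadable.
# SYSFS_ROOT defaults to /sys; it is a hypothetical parameter that exists
# only so the function can be tried against a fake directory tree.
is_block_ro() {
    name=$1
    root=${2:-/sys}
    flag_file="$root/block/$name/ro"

    if [ ! -r "$flag_file" ]; then
        echo "unknown"
        return 1
    fi

    # The attribute contains "1" for read-only devices and "0" otherwise.
    if [ "$(cat "$flag_file")" = "1" ]; then
        echo "ro"
    else
        echo "rw"
    fi
}
```

Running `is_block_ro dm-3` right after the failed lvcreate should show whether the device named in the log is flagged read-only. `blockdev --getro /dev/dm-3` reports the same flag as 1 or 0.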

Comment 1 WGH 2018-07-25 18:30:10 UTC
Still happens to me as of 4.17.9.

Comment 2 WGH 2018-08-02 12:20:23 UTC
So I bisected the bug on a vanilla kernel.

The first bad commit is:

commit 721c7fc701c71f693307d274d2b346a1ecd4a534 (HEAD, refs/bisect/bad)
Author: Ilya Dryomov <idryomov@gmail.com>
Date:   Thu Jan 11 14:09:11 2018 +0100

    block: fail op_is_write() requests to read-only partitions
    
    Regular block device writes go through blkdev_write_iter(), which does
    bdev_read_only(), while zeroout/discard/etc requests are never checked,
    both userspace- and kernel-triggered.  Add a generic catch-all check to
    generic_make_request_checks() to actually enforce ioctl(BLKROSET) and
    set_disk_ro(), which is used by quite a few drivers for things like
    snapshots, read-only backing files/images, etc.
    
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
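
The catch-all check described in the commit message boils down to this: a request is rejected when its operation counts as a write and the target device is read-only. Below is a rough userspace model of that rule in shell, not the kernel code. It assumes the kernel convention that write-type operations have the low bit of the op code set. The function names check_request and its argument layout are illustrative, and the op numbers used in the usage note are assumptions for demonstration.

```shell
#!/bin/sh
# op_is_write OP
# Succeeds (exit status 0) when the low bit of the numeric op code is set,
# which is how this model decides that an operation modifies the device.
op_is_write() {
    [ $(( $1 & 1 )) -eq 1 ]
}

# check_request OP READ_ONLY
# OP is a numeric op code; READ_ONLY is 1 for a read-only device, 0 otherwise.
# Prints "rejected" and returns non-zero for a write-type op on a read-only
# device; prints "ok" in every other case.
check_request() {
    op=$1
    read_only=$2

    if [ "$read_only" -eq 1 ] && op_is_write "$op"; then
        echo "rejected"
        return 1
    fi
    echo "ok"
}
```

Taking 0 as a read, 1 as a write and 3 as a discard, `check_request 0 1` prints "ok", while `check_request 1 1` and `check_request 3 1` both print "rejected". This matches the point of the commit: operations other than plain writes are also caught once the device is read-only, which is what the snapshot header write in the dmesg output above ran into.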

Comment 3 WGH 2018-08-02 15:39:04 UTC
This was fixed on the lvm2 (userspace) side in commit https://sourceware.org/git/?p=lvm2.git;a=commit;h=a6fdb9d9d70f51c49ad11a87ab4243344e6701a3.