btrfs-progs v4.5.2
kernel 4.5.3-300.fc24.x86_64

mkfs.btrfs -draid1 -mraid1 /dev/VG/b1 /dev/VG/b2
## where b1/b2 are LVM thinp LVs

Summary: When a node containing an extent csum item is corrupt, Btrfs still allows reading the file that the csums apply to, and the data returned is bad (corrupt somehow). The example uses raid1, where it's unlikely that both copies of the same node would be corrupt, but this is concerning behavior for single-copy metadata, on an SSD for example.

Expectation: Either the file data passed to user space should be what's in the data extents, or I should get some failure to read the file: ENOENT, or maybe EPERM?

Reproduce steps:

[root@f23s ~]# mount /dev/VG/b1 /mnt/0
[root@f23s ~]# sha256sum /mnt/0/openSUSE-Tumbleweed-NET-x86_64-Current.iso
5427573e30b49df3cede6608e1d5483aa7ddd60318f0c84d9d8965d74787416a  /mnt/0/openSUSE-Tumbleweed-NET-x86_64-Current.iso
## that matches the published sha256 hash for this file

[root@f23s ~]# umount /mnt/0
[root@f23s ~]# btrfs-debug-tree /dev/VG/b1
## snipped, complete btrfs-debug-tree attached
leaf 3317858304 items 1 free space 30 generation 74 owner 7
fs uuid f25ca737-51e5-459a-b461-6a92f1909803
chunk uuid defe7881-e3d2-4ddd-b3b3-128bbfc4d351
	item 0 key (EXTENT_CSUM EXTENT_CSUM 2354839552) itemoff 55 itemsize 16228
		extent csum item

## grab the logical address for the first node containing extent csum
[root@f23s ~]# btrfs-map-logical -l 3317858304 /dev/VG/b1
mirror 1 logical 3317858304 physical 3297935360 device /dev/mapper/VG-b2
mirror 2 logical 3317858304 physical 3317858304 device /dev/mapper/VG-b1

## identically corrupt both copies of the node checksum, first byte
[root@f23s ~]# printf '\xa1' | dd conv=notrunc of=/dev/VG/b2 bs=1 seek=3297935360
1+0 records in
1+0 records out
1 byte (1 B) copied, 0.0285455 s, 0.0 kB/s
[root@f23s ~]# printf '\xa1' | dd conv=notrunc of=/dev/VG/b1 bs=1 seek=3317858304
1+0 records in
1+0 records out
1 byte (1 B) copied, 0.0348683 s, 0.0 kB/s

## confirm the corruption
[root@f23s ~]# btrfs check /dev/VG/b1
Checking filesystem on /dev/VG/b1
UUID: f25ca737-51e5-459a-b461-6a92f1909803
checking extents
checksum verify failed on 3317858304 found EE51EFE0 wanted EE51EFA1
checksum verify failed on 3317858304 found EE51EFE0 wanted EE51EFA1
checksum verify failed on 3317858304 found EE51EFE0 wanted EE51EFA1
checksum verify failed on 3317858304 found EE51EFE0 wanted EE51EFA1
Csum didn't match
owner ref check failed [3317858304 16384]
checking free space cache
checking fs roots
checksum verify failed on 3317858304 found EE51EFE0 wanted EE51EFA1
checksum verify failed on 3317858304 found EE51EFE0 wanted EE51EFA1
checksum verify failed on 3317858304 found EE51EFE0 wanted EE51EFA1
checksum verify failed on 3317858304 found EE51EFE0 wanted EE51EFA1
Csum didn't match
found 110346242 bytes used err is 1
total csum bytes: 91292
total tree bytes: 229376
total fs tree bytes: 32768
total extent tree bytes: 16384
btree space waste bytes: 128945
file data blocks allocated: 110624768
 referenced 110624768

[root@f23s ~]# mount /dev/VG/b1 /mnt/0
[root@f23s ~]# sha256sum /mnt/0/openSUSE-Tumbleweed-NET-x86_64-Current.iso
455198311931e67f1fde04848f0d1d91978c4a7cc88e07c799d3fd67a218f5a6  /mnt/0/openSUSE-Tumbleweed-NET-x86_64-Current.iso

Why is the data in this file returned at all? And why is it returned incorrectly? Since the node is corrupt, the data block checksums in it are not reliable, so I think it's better to fail the read with an error so the user is aware of the problem.
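
For anyone who wants to rehearse the one-byte overwrite from the reproduce steps before pointing dd at a real device, here is a minimal sketch against a scratch file. It only illustrates the `dd conv=notrunc bs=1 seek=...` technique; it does not involve Btrfs at all. The temp file and the offset 4096 are hypothetical stand-ins for /dev/VG/b1 and the physical address reported by btrfs-map-logical, and the octal escape `\241` is the same byte as `\xa1`, chosen because some sh printf builtins do not accept `\x`.

```shell
#!/bin/sh
# Scratch-file rehearsal of the single-byte overwrite; no block device is touched.
set -eu

img=$(mktemp)    # hypothetical stand-in for /dev/VG/b1
offset=4096      # hypothetical stand-in for the physical offset

# 16 KiB of zeros, the same size as the node in the report.
dd if=/dev/zero of="$img" bs=16384 count=1 2>/dev/null
before=$(sha256sum "$img" | cut -d' ' -f1)

# Overwrite exactly one byte in place; conv=notrunc keeps the rest of the file.
printf '\241' | dd conv=notrunc of="$img" bs=1 seek="$offset" 2>/dev/null

# Read the byte back, and check that the file size did not change.
byte=$(dd if="$img" bs=1 skip="$offset" count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
size=$(wc -c < "$img" | tr -d ' ')
after=$(sha256sum "$img" | cut -d' ' -f1)

echo "byte at $offset: $byte, size: $size"
[ "$before" != "$after" ] && echo "sha256 changed"
rm -f "$img"
```

With bs=1, seek counts in bytes, which is why the physical addresses from btrfs-map-logical can be passed to dd unchanged in the steps above.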
Created attachment 216361: btrfsdebugtree_bug118321.txt