Hi, This is related to earlier bugs that I have filed, where corrupting a certain specific data structure in SSD mode BTRFS renders the filesystem unusable. Please see related bugs list for your reference. I tried corrupting root-tree node of btrfs in SSD mode to check if mount operation would work on a corrupted root tree node. Unlike superblock data structure or chunk-tree data structure corruptions, which are recoverable using btrfs recover utility, I could not find any tool to recover root tree node, which is essential to mount the file system. I tried running btrfsck, but it couldn't correct the corruption. For btrfs scrub, the prerequisite is that the file system be mounted before starting the scrub. but mount operation does not succeed after root tree node corruption so I am unable to run btrfs scrub. Steps to replicate this issue: 1. create 2 disks, /dev/sdb and /dev/sdc of 1.5 GB each. 2. attach them to QEMU. make both disks SSD by doing echo 0 > /sys/dev/block/<dev_id>/queue/rotational where dev_id is the dev id for /dev/sdb or /dev/sdc as obtained using lsblk -l command. for my case, it is 8:16 for sdb and 8:32 for sdc. 3. create btrfs file system on /dev/sdc, mount it, run a workload on it (attached for your reference), unmount it. sudo mkfs.btrfs -f /dev/sdc sudo mount /dev/sdc /mnt ./cmd2000 umount -l /mnt copy /dev/sdc to /dev/sdb using pv command (faster than dd). sudo sh -c 'pv < /dev/sdc > /dev/sdb' we are essentially keeping a backup disk (sdc), because I noticed that everytime I mount the filesystem, the roottree node location gets changed due to btrfs cow property. keeping backup in /dev/sdc helps get the block number of roottree deterministic. 4. we now note the sequence of blocks that are read from btrfs file system during mount operation. for this, use a device mapper (dm-track.c, Makefile attached) that will print both the block number and the node type that gets accessed. create folder dm-track at your linux source code folder linux/fs/dm-track mv dm-track.c Makefile linux/fs/dm-track cd linux/fs/dm-track make ( creates and links dm-track.ko to linux kernel ) sudo dmsetup create dmtarget --table "0 3145728 track /dev/sdb 0" --readahead none #at this point some garbage gets printed in /var/log/syslog, please ignore #however, next instruction (mount) is important, as this is where we get block numbers and block types: sudo mount -t btrfs -o nospace_cache,nobarrier,flushoncommit,noinode_cache /dev/mapper/dmtarget /mnt Note I am removing caches because it leads to too many extrenaous reads and makes readability of logs difficult. 5. If everything works so far, this should print a sequence of log statements, similar to the following output: Feb 16 20:44:47 vm-guest kernel: [17417.344398] track_end_io():block = 17 Feb 16 20:44:47 vm-guest kernel: [17417.344401] XXX generation = 28229632, owner = 1064960:default blkNo=17 nritems= 0 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.345023] track_end_io():block = 17 Feb 16 20:44:47 vm-guest kernel: [17417.345025] XXX generation = 28229632, owner = 1064960:default blkNo=17 nritems= 0 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.346457] track_end_io():block = 17 Feb 16 20:44:47 vm-guest kernel: [17417.346460] XXX generation = 28229632, owner = 1064960:default blkNo=17 nritems= 0 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.346546] BTRFS info (device dm-0): disabling disk space caching Feb 16 20:44:47 vm-guest kernel: [17417.346549] BTRFS info (device dm-0): turning off barriers Feb 16 20:44:47 vm-guest kernel: [17417.346553] BTRFS info (device dm-0): turning on flush-on-commit Feb 16 20:44:47 vm-guest kernel: [17417.346557] BTRFS info (device dm-0): has skinny extents Feb 16 20:44:47 vm-guest kernel: [17417.352241] track_end_io():block = 264 Feb 16 20:44:47 vm-guest kernel: [17417.352244] XXX generation = 8, owner = 3:chunkTree blkNo=264 nritems= 7 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.352815] track_end_io():block = 6896 Feb 16 20:44:47 vm-guest kernel: [17417.352818] XXX generation = 10, owner = 1:rootTree blkNo=6896 nritems= 19 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.353399] track_end_io():block = 6900 Feb 16 20:44:47 vm-guest kernel: [17417.353402] XXX generation = 10, owner = 2:extentTree blkNo=6900 nritems= 7 level 1 Feb 16 20:44:47 vm-guest kernel: [17417.353979] track_end_io():block = 5456 Feb 16 20:44:47 vm-guest kernel: [17417.353982] XXX generation = 8, owner = 4:devTree blkNo=5456 nritems= 7 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.354499] track_end_io():block = 6916 Feb 16 20:44:47 vm-guest kernel: [17417.354509] XXX generation = 10, owner = 7:csumTree blkNo=6916 nritems= 14 level 1 Feb 16 20:44:47 vm-guest kernel: [17417.354991] track_end_io():block = 5392 Feb 16 20:44:47 vm-guest kernel: [17417.354992] XXX generation = 7, owner = 9:uuidTree blkNo=5392 nritems= 0 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.355389] track_end_io():block = 6908 Feb 16 20:44:47 vm-guest kernel: [17417.355391] XXX generation = 10, owner = 2:extentTree blkNo=6908 nritems= 208 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.355837] track_end_io():block = 6928 Feb 16 20:44:47 vm-guest kernel: [17417.355838] XXX generation = 10, owner = 2:extentTree blkNo=6928 nritems= 209 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.356116] track_end_io():block = 6924 Feb 16 20:44:47 vm-guest kernel: [17417.356118] XXX generation = 10, owner = 2:extentTree blkNo=6924 nritems= 124 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.356407] track_end_io():block = 6912 Feb 16 20:44:47 vm-guest kernel: [17417.356409] XXX generation = 10, owner = 2:extentTree blkNo=6912 nritems= 125 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.356763] track_end_io():block = 6904 Feb 16 20:44:47 vm-guest kernel: [17417.356765] XXX generation = 10, owner = 2:extentTree blkNo=6904 nritems= 259 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.357420] track_end_io():block = 1320 Feb 16 20:44:47 vm-guest kernel: [17417.357423] XXX generation = 4, owner = 18446744073709551607:default blkNo=1320 nritems= 2 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.357807] track_end_io():block = 6240 Feb 16 20:44:47 vm-guest kernel: [17417.357809] XXX generation = 9, owner = 5:fsTree blkNo=6240 nritems= 253 level 1 Feb 16 20:44:47 vm-guest kernel: [17417.358169] track_end_io():block = 6776 Feb 16 20:44:47 vm-guest kernel: [17417.358171] XXX generation = 9, owner = 5:fsTree blkNo=6776 nritems= 127 level 0 Feb 16 20:44:47 vm-guest kernel: [17417.358537] track_end_io():block = 6264 Feb 16 20:44:47 vm-guest kernel: [17417.358539] XXX generation = 9, owner = 5:fsTree blkNo=6264 nritems= 169 level 0 6. note the line printing the root tree log: owner = 1:rootTree blkNo=6896 nritems= 19 level 0 from this, we can determine what the offset of root tree would be in our backup file system. we compute the offset as follows: 6896 * 4096 = 28246016 7. unmount the filesystem. delete dmtarget. make a copy from backup file system (which would have the same root-tree location as the first run). umount -l /mnt sudo dmsetup remove dmtarget sudo sh -c 'pv < /dev/sdc > /dev/sdb' 8. corrupt the rootTree block using corruptoffset.c program at the offset where root-tree exists (attached corruptoffset.c) sudo ./corruptoffset /dev/sdb 28246016 4096 9. try mounting the file system /dev/sdb: sudo mount -t btrfs -o nospace_cache,nobarrier,flushoncommit,noinode_cache /dev/sdb /mnt mount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so. 10. btrfsck output sudo btrfsck /dev/sdb checksum verify failed on 28246016 found D842F9FA wanted 58585858 checksum verify failed on 28246016 found D842F9FA wanted 58585858 bytenr mismatch, want=28246016, have=6365935209750747224 Couldn't setup extent tree ERROR: cannot open file system 11. btrfs restore <no option to recover roottree node> $ sudo btrfs restore -l /dev/sdb checksum verify failed on 28246016 found D842F9FA wanted 58585858 checksum verify failed on 28246016 found D842F9FA wanted 58585858 bytenr mismatch, want=28246016, have=6365935209750747224 Couldn't setup extent tree tree key (EXTENT_TREE ROOT_ITEM 0) 28246016 level 1 tree key (DEV_TREE ROOT_ITEM 0) 22331392 level 0 tree key (FS_TREE ROOT_ITEM 0) 25542656 level 1 tree key (CSUM_TREE ROOT_ITEM 0) 28311552 level 1 tree key (UUID_TREE ROOT_ITEM 0) 22069248 level 0 tree key (DATA_RELOC_TREE ROOT_ITEM 0) 5390336 level 0 12. btrfs scrub requires file system to be mounted HDD mode: ======== In HDD mode, the corruption can be corrected from a backup copy of root-tree block, because metadata replication is enabled. block number for root-tree in HDD mode - 10888 sudo ./corruptoffset /dev/sdb 44597248 4096 #block 10888 * 4096 = 44597248. sudo mount /dev/sdb /mnt/ Feb 16 21:16:50 vm-guest kernel: [19340.189229] BTRFS info (device sdb): disk space caching is enabled Feb 16 21:16:50 vm-guest kernel: [19340.189232] BTRFS info (device sdb): has skinny extents Feb 16 21:16:50 vm-guest kernel: [19340.193918] BTRFS error (device sdb): bad tree block start 6365935209750747224 36208640 Feb 16 21:16:50 vm-guest kernel: [19340.194794] BTRFS info (device sdb): read error corrected: ino 0 off 36208640 (dev /dev/sdb sector 87104) Feb 16 21:16:50 vm-guest kernel: [19340.195258] BTRFS info (device sdb): read error corrected: ino 0 off 36212736 (dev /dev/sdb sector 87112) Feb 16 21:16:50 vm-guest kernel: [19340.195717] BTRFS info (device sdb): read error corrected: ino 0 off 36216832 (dev /dev/sdb sector 87120) Feb 16 21:16:50 vm-guest kernel: [19340.196189] BTRFS info (device sdb): read error corrected: ino 0 off 36220928 (dev /dev/sdb sector 87128) (36208640 / 512) * 4096 = 10888 block number which was corrected during read.
Created attachment 274209 [details] workload
Created attachment 274211 [details] dm-track.c
Created attachment 274213 [details] corruptoffset.c
Created attachment 274215 [details] Makefile for dm-track please create dm-track folder at location your_linux_source_code/fs/dm-track place dm-track.c and Makefile in this folder.