I hit a corruption when I use "XFS_IOC_GOINGDOWN" ioctl to simulate powerfailure: Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan (but don't clear) agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 mismatch between format (2) and size (0) in symlink ino 50358026 bad data fork in symlink 50358026 would have cleared inode 50358026 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 3 - agno = 2 mismatch between format (2) and size (0) in symlink ino 50358026 bad data fork in symlink 50358026 would have cleared inode 50358026 No modify flag set, skipping phase 5 Phase 6 - check inode connectivity... - traversing filesystem ... - traversal finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify link counts... No modify flag set, skipping filesystem flush and exiting. Version-Release number of selected component (if applicable): xfsprogs 4.16.1 kernel 4.17 How reproducible: about 1% Steps to Reproduce: 1. prepare a 30G local device 2. mkfs.xfs -f $dev 3. mount $dev $mnt 4. fs_mark -d $mnt -n 10000 -N 10 -D 1000 -s 409600 -r 8 -t 10 -k & 5. -d $SCRATCH_MNT/ -n 1000 -p 100 -c -l 0 & 6. wait 1~15min randomly 7. xfs_io -xc "shutdown" $mnt 8. kill fs_mark and fssstress processes 9. umount $dev 10. mount $dev $mnt 11. umount $dev 12. xfs_repair -n $dev 13. loop run above 12 steps until reproduce this bug. Additional info: XFS info: meta-data=/dev/mapper/xxxx-xfscratch isize=512 agcount=4, agsize=1966080 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=0 = reflink=0 data = bsize=4096 blocks=7864320, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1 log =internal log bsize=4096 blocks=3840, version=2 = sectsz=512 sunit=0 blks, lazy-co
dmesg log: [586904.373660] XFS (dm-3): Starting recovery (logdev: internal) [586914.907824] XFS (dm-3): Metadata corruption detected at xfs_dinode_verify.part.5+0x27c/0x450 [xfs], inode 0x300670a dinode [586914.920278] XFS (dm-3): Unmount and run xfs_repair [586914.925734] XFS (dm-3): First 128 bytes of corrupted metadata buffer: [586914.933034] 00000000bfd2c617: 49 4e a1 ff 03 02 00 00 00 00 00 00 00 00 00 00 IN.............. [586914.942844] 000000008bc40e32: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [586914.952644] 0000000047723d40: 5b 20 a7 8a 1e be a2 95 5b 20 a7 8a 1e be a2 95 [ ......[ ...... [586914.962452] 00000000a7866534: 5b 20 a7 cc 19 76 5e 4a 00 00 00 00 00 00 00 00 [ ...v^J........ [586914.972261] 00000000d84c3ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [586914.982058] 000000009e465a94: 00 00 00 02 00 00 00 00 00 00 00 00 07 43 52 07 .............CR. [586914.991866] 000000006db3fac1: ff ff ff ff d2 03 50 9f 00 00 00 00 00 00 00 09 ......P......... [586915.001675] 00000000c42b6120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [586915.011498] XFS (dm-3): Ending recovery (logdev: internal)
dchinner's looking at this one
[PATCH] xfs: symlinks can be zero length during log recovery
I still can reproduce this bug on linux v4.20-rc1. So we haven't merged this patch (in comment 3), right? Phase 1 - find and verify superblock... - reporting progress in intervals of 15 minutes Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - 14:10:29: scanning filesystem freespace - 16 of 16 allocation groups done - found root inode chunk Phase 3 - for each AG... - scan (but don't clear) agi unlinked lists... - 14:10:29: scanning agi unlinked lists - 16 of 16 allocation groups done - process known inodes and perform inode discovery... - agno = 15 - agno = 0 - agno = 1 mismatch between format (2) and size (0) in symlink ino 125888517 bad data fork in symlink 125888517 would have cleared inode 125888517 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - 14:10:32: process known inodes and inode discovery - 50048 of 50048 inodes done - process newly discovered inodes... - 14:10:32: process newly discovered inodes - 16 of 16 allocation groups done Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - 14:10:32: setting up duplicate extent list - 16 of 16 allocation groups done - check for inodes claiming duplicate blocks... - agno = 0 - agno = 13 - agno = 2 - agno = 3 - agno = 5 - agno = 4 - agno = 6 - agno = 7 - agno = 9 - agno = 11 - agno = 10 - agno = 8 - agno = 14 - agno = 12 - agno = 15 - agno = 1 mismatch between format (2) and size (0) in symlink ino 125888517 bad data fork in symlink 125888517 would have cleared inode 125888517 - 14:10:32: check for inodes claiming duplicate blocks - 50048 of 50048 inodes done No modify flag set, skipping phase 5 Phase 6 - check inode connectivity... - traversing filesystem ... - traversal finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify link counts... - 14:10:33: verify and correct link counts - 16 of 16 allocation groups done
On Mon, Nov 12, 2018 at 01:31:26PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=200063 > > --- Comment #4 from Zorro Lang (zlang@redhat.com) --- > I still can reproduce this bug on linux v4.20-rc1. So we haven't merged this > patch (in comment 3), right? Patch was superceded, and I never got around to posting the new version. It's still sitting in my stack, I'll try to get it out in the next coupel of days. Cheers, Dave.