Bug 208325
Summary: | f2fs inconsistent node block | | |
---|---|---|---|
Product: | File System | Reporter: | zKqri0 |
Component: | f2fs | Assignee: | Default virtual assignee for f2fs (filesystem_f2fs) |
Status: | NEEDINFO --- | | |
Severity: | normal | CC: | chao |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 5.7.2-arch1-1 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Description
zKqri0
2020-06-26 10:22:46 UTC
Hi, thanks for the report.

What are your mkfs/mount options?

I'm not sure whether this is an f2fs bug or not. Since you said the device can be trusted, it is most likely a software bug.

One case I can imagine is an application bypassing the filesystem and writing data directly via LBA, which could corrupt data.

If possible, could you please apply the three patches below and recompile the kernel:

https://lore.kernel.org/linux-f2fs-devel/20200628122940.29665-1-yuchao0@huawei.com/T/#t

[f2fs-dev] [PATCH 1/3] f2fs: fix wrong return value of f2fs_bmap_compress()
[f2fs-dev] [PATCH 2/3] f2fs: support to trace f2fs_bmap()
[f2fs-dev] [PATCH 3/3] f2fs: support to trace f2fs_fiemap()

Then use the commands below to see whether any application is looking up LBAs:

echo 1 > /sys/kernel/debug/tracing/events/f2fs/f2fs_bmap/enable
echo 1 > /sys/kernel/debug/tracing/events/f2fs/f2fs_fiemap/enable
cat /sys/kernel/debug/tracing/trace_pipe |grep f2fs

(In reply to Chao Yu from comment #1)
> Hi, thanks for the report.
>
> What are your mkfs/mount options?
>
> I'm not sure whether this is an f2fs bug or not. Since you said the device
> can be trusted, it is most likely a software bug.
>
> One case I can imagine is an application bypassing the filesystem and
> writing data directly via LBA, which could corrupt data.
>
> If possible, could you please apply the three patches below and recompile
> the kernel:
>
> https://lore.kernel.org/linux-f2fs-devel/20200628122940.29665-1-yuchao0@huawei.com/T/#t
>
> [f2fs-dev] [PATCH 1/3] f2fs: fix wrong return value of f2fs_bmap_compress()
> [f2fs-dev] [PATCH 2/3] f2fs: support to trace f2fs_bmap()
> [f2fs-dev] [PATCH 3/3] f2fs: support to trace f2fs_fiemap()
>
> Then use the commands below to see whether any application is looking up
> LBAs:
>
> echo 1 > /sys/kernel/debug/tracing/events/f2fs/f2fs_bmap/enable
> echo 1 > /sys/kernel/debug/tracing/events/f2fs/f2fs_fiemap/enable
> cat /sys/kernel/debug/tracing/trace_pipe |grep f2fs

Mount options are "/dev/sda2 on / type f2fs (rw,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,mode=adaptive,active_logs=6,alloc_mode=default,fsync_mode=posix)".

I patched my laptop's kernel with those patches, but I don't see anything in "trace_pipe" while I'm getting invalid argument errors. I also noticed that the "nid" and "node_footer" in the error are always the same, so only one node block is messed up.

Maybe a raw copy of that node block would help find what caused it?

(In reply to zKqri0 from comment #2)
> Mount options are "/dev/sda2 on / type f2fs
> (rw,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,
> inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,
> mode=adaptive,active_logs=6,alloc_mode=default,fsync_mode=posix)".

These look like the default mount options.

Did you use any special mkfs options, like -O [feature_name]?

> I patched my laptop's kernel with those patches, but I don't see anything in
> "trace_pipe" while I'm getting invalid argument errors. I also noticed that
> the "nid" and "node_footer" in the error are always the same, so only one
> node block is messed up.
>
> Maybe a raw copy of that node block would help find what caused it?
Yes, please. I can parse it as a dentry_block, inode, or dnode structure to see what it looks like and which fields are corrupted.

(In reply to Chao Yu from comment #3)
> (In reply to zKqri0 from comment #2)
> > Mount options are "/dev/sda2 on / type f2fs
> > (rw,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,
> > inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,
> > mode=adaptive,active_logs=6,alloc_mode=default,fsync_mode=posix)".
>
> These look like the default mount options.
>
> Did you use any special mkfs options, like -O [feature_name]?
>
> > I patched my laptop's kernel with those patches, but I don't see anything in
> > "trace_pipe" while I'm getting invalid argument errors. I also noticed that
> > the "nid" and "node_footer" in the error are always the same, so only one
> > node block is messed up.
> >
> > Maybe a raw copy of that node block would help find what caused it?
>
> Yes, please. I can parse it as a dentry_block, inode, or dnode structure to
> see what it looks like and which fields are corrupted.

I used default mkfs options.
Here is the output of dump.f2fs on that inode:

>sudo dump.f2fs -i 1761978 /dev/sda2
>Info: [/dev/sda2] Disk Model: Samsung SSD 850
>Info: Segments per section = 1
>Info: Sections per zone = 1
>Info: sector size = 512
>Info: total sectors = 102539264 (50068 MB)
>Info: MKFS version
>  "Linux version 4.20.0-arch1-1-ARCH (builduser@heftig-29859) (gcc version 8.2.1 20181127 (GCC)) #1 SMP PREEMPT Mon Dec 24 03:00:40 UTC 2018"
>Info: FSCK version
>  from "Linux version 5.7.2-arch1-1 (linux@archlinux) (gcc version 10.1.0 (GCC), GNU ld (GNU Binutils) 2.34.0) #1 SMP PREEMPT Wed, 10 Jun 2020 20:36:24 +0000"
>    to "Linux version 5.7.2-arch1-1 (linux@archlinux) (gcc version 10.1.0 (GCC), GNU ld (GNU Binutils) 2.34.0) #1 SMP PREEMPT Wed, 10 Jun 2020 20:36:24 +0000"
>Info: superblock features = 0 :
>Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
>Info: total FS sectors = 102539264 (50068 MB)
>Info: CKPT version = 64d05005
>[print_node_info: 271] Node ID [0x203a7962:540703074] is direct node or indirect node.
>[0] [0x5452202c : 1414668332]
>[1] [0x45445f4d : 1162108749]
>[2] [0x4444414c : 1145323852]
>[3] [0x202c2952 : 539765074]
>[4] [0x6e657665 : 1852143205]
>[5] [0x6f687420 : 1869116448]
>[6] [0x20686775 : 543713141]
>[7] [0x69207469 : 1763734633]
>[8] [0x6f6e2073 : 1869488243]
>[9] [0x46202e74 : 1176514164]
>[10] [0x69207869 : 1763735657]
>Invalid (i)node block
>Info: checkpoint state = 51 : crc fsck unmount
>Done: 0.063481 secs

(In reply to zKqri0 from comment #4)
> >[print_node_info: 271] Node ID [0x203a7962:540703074] is direct node or indirect node.
> >[0] [0x5452202c : 1414668332]
> >[1] [0x45445f4d : 1162108749]
> >[2] [0x4444414c : 1145323852]
> >[3] [0x202c2952 : 539765074]
> >[4] [0x6e657665 : 1852143205]
> >[5] [0x6f687420 : 1869116448]
> >[6] [0x20686775 : 543713141]
> >[7] [0x69207469 : 1763734633]
> >[8] [0x6f6e2073 : 1869488243]
> >[9] [0x46202e74 : 1176514164]
> >[10] [0x69207869 : 1763735657]
> >Invalid (i)node block

I don't see any valid information in this. Could you please upload the raw block if possible?

(In reply to Chao Yu from comment #5)
> (In reply to zKqri0 from comment #4)
> > >[print_node_info: 271] Node ID [0x203a7962:540703074] is direct node or indirect node.
> > >[0] [0x5452202c : 1414668332]
> > >[1] [0x45445f4d : 1162108749]
> > >[2] [0x4444414c : 1145323852]
> > >[3] [0x202c2952 : 539765074]
> > >[4] [0x6e657665 : 1852143205]
> > >[5] [0x6f687420 : 1869116448]
> > >[6] [0x20686775 : 543713141]
> > >[7] [0x69207469 : 1763734633]
> > >[8] [0x6f6e2073 : 1869488243]
> > >[9] [0x46202e74 : 1176514164]
> > >[10] [0x69207869 : 1763735657]
> > >Invalid (i)node block
>
> I don't see any valid information in this. Could you please upload the raw
> block if possible?

Yeah, there probably isn't, because it seems like blk_addr is pointing to an invalid address. I took a dump of the node with "dd if=/dev/sda2 of=./node.bin bs=4096 skip=2900324352 count=4096 iflag=skip_bytes,count_bytes" with 2900324352 being "blk_addr >> 12", and it was part of a random git commit message and not a node block.

Is there anything else that would be useful to dump?
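
Editorial sketch: the observation that the block holds text rather than a node can be checked against the dump.f2fs values from comment #4. The Python below is a minimal illustration, not part of the original thread. It assumes the printed values are little-endian 32-bit words (f2fs on-disk fields are little-endian) and that the block size is 4 KiB, matching bs=4096 in the dd command. The function names are hypothetical helpers invented for this example.

```python
import struct

# 32-bit values printed by dump.f2fs in comment #4 (copied from the thread).
NODE_ID = 0x203A7962
DUMPED_WORDS = [
    0x5452202C, 0x45445F4D, 0x4444414C, 0x202C2952,
    0x6E657665, 0x6F687420, 0x20686775, 0x69207469,
    0x6F6E2073, 0x46202E74, 0x69207869,
]

# Assumption: 4 KiB blocks, matching bs=4096 in the dd command.
F2FS_BLOCK_SIZE = 4096


def words_to_text(words):
    """Pack each value as a little-endian 32-bit word and decode it as ASCII."""
    raw = b"".join(struct.pack("<I", w) for w in words)
    return raw.decode("ascii", errors="replace")


def byte_offset_to_block(offset, block_size=F2FS_BLOCK_SIZE):
    """Split a byte offset into (block address, remainder inside the block)."""
    return divmod(offset, block_size)


if __name__ == "__main__":
    print(repr(words_to_text([NODE_ID])))
    print(repr(words_to_text(DUMPED_WORDS)))
    # Byte offset passed to dd with iflag=skip_bytes in the last comment.
    print(byte_offset_to_block(2900324352))
```

Under these assumptions the eleven words decode to ", RTM_DELADDR), even though it is not. Fix i" and the reported node ID decodes to "by: ", which is consistent with the block containing part of a commit message. The dd byte offset 2900324352 is an exact multiple of 4096 and corresponds to block address 708087, that is, the block address shifted left by 12 bits.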