Bug 210745
Summary: | kernel crash during umounting a partition with f2fs filesystem | | |
---|---|---|---|
Product: | File System | Reporter: | Zhiguo.Niu (Zhiguo.Niu) |
Component: | f2fs | Assignee: | Default virtual assignee for f2fs (filesystem_f2fs) |
Status: | NEEDINFO --- | | |
Severity: | high | CC: | chao |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 4.14.193 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Description
Zhiguo.Niu
2020-12-17 06:43:10 UTC
Hi,

I checked the code of 4.14.193, and I have no clue why this can happen. I don't remember any such corruption having occurred on the nid list, because all of its updates are done under nat_tree_lock. Let me know if I missed something.

Did you apply any private patches on top of 4.14.193?

(In reply to Chao Yu from comment #1)
> Hi,
>
> I checked the code of 4.14.193, and I have no clue why this can happen. I
> don't remember any such corruption having occurred on the nid list, because
> all of its updates are done under nat_tree_lock. Let me know if I missed
> something.
>
> Did you apply any private patches on top of 4.14.193?

Hi Chao,

Thanks for your reply. I have checked my codebase, and there are no other private patches in the current version.

I found that the local variables natvec and setvec in f2fs_destroy_node_manager may be initialized to 0xaa and 0xaaaaaaaaaaaaaaaa, like this:

    void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
    {
        struct f2fs_nm_info *nm_i = NM_I(sbi);
        struct free_nid *i, *next_i;
        struct nat_entry *natvec[NATVEC_SIZE];
        struct nat_entry_set *setvec[SETVEC_SIZE];

dis:

    crash_arm64> dis f2fs_destroy_node_manager
    0xffffff800842e2a8 <f2fs_destroy_node_manager>: stp x29, x30, [sp,#-96]!
    0xffffff800842e2ac <f2fs_destroy_node_manager+4>: stp x28, x27, [sp,#16]
    0xffffff800842e2b0 <f2fs_destroy_node_manager+8>: stp x26, x25, [sp,#32]
    0xffffff800842e2b4 <f2fs_destroy_node_manager+12>: stp x24, x23, [sp,#48]
    0xffffff800842e2b8 <f2fs_destroy_node_manager+16>: stp x22, x21, [sp,#64]
    0xffffff800842e2bc <f2fs_destroy_node_manager+20>: stp x20, x19, [sp,#80]
    0xffffff800842e2c0 <f2fs_destroy_node_manager+24>: mov x29, sp
    0xffffff800842e2c4 <f2fs_destroy_node_manager+28>: sub sp, sp, #0x320
    0xffffff800842e2c8 <f2fs_destroy_node_manager+32>: adrp x8, 0xffffff800947e000 <xt_connlimit_locks+768>
    0xffffff800842e2cc <f2fs_destroy_node_manager+36>: ldr x8, [x8,#264]
    0xffffff800842e2d0 <f2fs_destroy_node_manager+40>: mov x27, x0
    0xffffff800842e2d4 <f2fs_destroy_node_manager+44>: str x8, [x29,#-16]
    0xffffff800842e2d8 <f2fs_destroy_node_manager+48>: nop
    0xffffff800842e2dc <f2fs_destroy_node_manager+52>: ldr x20, [x27,#112]
    0xffffff800842e2e0 <f2fs_destroy_node_manager+56>: add x0, sp, #0x110
    0xffffff800842e2e4 <f2fs_destroy_node_manager+60>: mov w1, #0xaa // #170
    0xffffff800842e2e8 <f2fs_destroy_node_manager+64>: mov w2, #0x200 // #512
    0xffffff800842e2ec <f2fs_destroy_node_manager+68>: bl 0xffffff8008be6b80 <__memset>
    0xffffff800842e2f0 <f2fs_destroy_node_manager+72>: mov x8, #0xaaaaaaaaaaaaaaaa // #-6148914691236517206
    0xffffff800842e2f4 <f2fs_destroy_node_manager+76>: stp x8, x8, [sp,#256]
    0xffffff800842e2f8 <f2fs_destroy_node_manager+80>: stp x8, x8, [sp,#240]
    0xffffff800842e2fc <f2fs_destroy_node_manager+84>: stp x8, x8, [sp,#224]
    0xffffff800842e300 <f2fs_destroy_node_manager+88>: stp x8, x8, [sp,#208]
    0xffffff800842e304 <f2fs_destroy_node_manager+92>: stp x8, x8, [sp,#192]
    0xffffff800842e308 <f2fs_destroy_node_manager+96>: stp x8, x8, [sp,#176]
    0xffffff800842e30c <f2fs_destroy_node_manager+100>: stp x8, x8, [sp,#160]
    0xffffff800842e310 <f2fs_destroy_node_manager+104>: stp x8, x8, [sp,#144]
    0xffffff800842e314 <f2fs_destroy_node_manager+108>: stp x8, x8, [sp,#128]
    0xffffff800842e318 <f2fs_destroy_node_manager+112>: stp x8, x8, [sp,#112]
    0xffffff800842e31c <f2fs_destroy_node_manager+116>: stp x8, x8, [sp,#96]
    0xffffff800842e320 <f2fs_destroy_node_manager+120>: stp x8, x8, [sp,#80]
    0xffffff800842e324 <f2fs_destroy_node_manager+124>: stp x8, x8, [sp,#64]
    0xffffff800842e328 <f2fs_destroy_node_manager+128>: stp x8, x8, [sp,#48]
    0xffffff800842e32c <f2fs_destroy_node_manager+132>: stp x8, x8, [sp,#32]
    0xffffff800842e330 <f2fs_destroy_node_manager+136>: stp x8, x8, [sp,#16]

I am not sure whether this is the root cause of this issue, because these invalid entries can be found in the nat_root radix tree of f2fs_nm_info.

Thanks!

nm_i->nat_list_lock was introduced in 4.19. Are you sure your codebase is 4.14.193?

(In reply to Zhiguo.Niu from comment #2)
> Hi Chao,
>
> Thanks for your reply. I have checked my codebase, and there are no other
> private patches in the current version.
>
> I found that the local variables natvec and setvec in
> f2fs_destroy_node_manager may be initialized to 0xaa and 0xaaaaaaaaaaaaaaaa,
> like this:
>
> void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
> {
> struct f2fs_nm_info *nm_i = NM_I(sbi);
> struct free_nid *i, *next_i;
> struct nat_entry *natvec[NATVEC_SIZE];
> struct nat_entry_set *setvec[SETVEC_SIZE];
>

I don't think so. The natvec array is filled in by __gang_lookup_nat_cache(), so natvec[0..found - 1] will be valid, and the "destroy nat cache" loop never accesses the natvec array out of range.

Can you please check whether @found is valid (@found should be less than or equal to NATVEC_SIZE)?

BTW, one possible cause could be a stack overflow, but would that really happen during umount()?
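
The argument about natvec[0..found - 1] can be tried out in user space. The sketch below is not the f2fs code: gang_lookup(), drain(), slot_is_poison(), struct entry and VEC_SIZE are hypothetical stand-ins, and the 0xaa fill only imitates what the disassembly shows (it looks like compiler-inserted pattern initialization of stack variables, which is an assumption since the thread does not state the build options). It shows that a pre-poisoned array is harmless as long as the lookup overwrites the first found slots and the loop stops at found.

```c
#include <string.h>

/* Hypothetical size: 0x200 bytes of 8-byte pointers, as in the memset above. */
#define VEC_SIZE 64

struct entry {
	unsigned int nid;
};

/*
 * Hypothetical stand-in for a gang lookup: writes up to max pointers,
 * starting at index start, into out[] and returns how many were written.
 */
static unsigned int gang_lookup(struct entry *pool, unsigned int pool_len,
				unsigned int start, struct entry **out,
				unsigned int max)
{
	unsigned int found = 0;

	while (found < max && start + found < pool_len) {
		out[found] = &pool[start + found];
		found++;
	}
	return found;
}

/* Returns 1 if the slot still holds the all-0xaa byte pattern. */
static int slot_is_poison(struct entry *const *slot)
{
	struct entry *poison;

	memset(&poison, 0xaa, sizeof(poison));
	return memcmp(slot, &poison, sizeof(poison)) == 0;
}

/*
 * Walks the pool in batches. Returns the number of entries visited, or -1
 * if found is out of range or a poisoned slot shows up inside [0, found).
 */
static int drain(struct entry *pool, unsigned int pool_len)
{
	struct entry *vec[VEC_SIZE];
	unsigned int start = 0, found, idx;
	int visited = 0;

	/* Imitate the pattern fill seen in the disassembly. */
	memset(vec, 0xaa, sizeof(vec));

	while ((found = gang_lookup(pool, pool_len, start, vec, VEC_SIZE))) {
		if (found > VEC_SIZE)
			return -1;	/* the check requested for @found */
		for (idx = 0; idx < found; idx++) {
			if (slot_is_poison(&vec[idx]))
				return -1;
			visited++;
		}
		start += found;
		/* Re-poison so stale pointers cannot hide a short batch. */
		memset(vec, 0xaa, sizeof(vec));
	}
	return visited;
}
```

With a 100-entry pool, drain() visits all 100 entries in two batches (64, then 36) and never reads a poisoned slot. A -1 result in an equivalent kernel-side check would instead point at an invalid found or at memory being overwritten after the lookup, such as the stack overflow mentioned above.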