Bug 210745

Summary: kernel crash during umounting a partition with f2fs filesystem
Product: File System
Reporter: Zhiguo.Niu (Zhiguo.Niu)
Component: f2fs
Assignee: Default virtual assignee for f2fs (filesystem_f2fs)
Status: NEEDINFO
Severity: high
CC: chao
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.14.193
Subsystem:
Regression: No
Bisected commit-id:

Description Zhiguo.Niu 2020-12-17 06:43:10 UTC
Hi,
When we run a reboot stress test on a device, we occasionally encounter the following kernel crash.


[   42.035226] c6 Unable to handle kernel NULL pointer dereference at virtual address 0000000a
[   43.437464] c6  __list_del_entry_valid+0xc/0xd8
[   43.441962] c6  f2fs_destroy_node_manager+0x218/0x398
[   43.446984] c6  f2fs_put_super+0x19c/0x2b8
[   43.451052] c6  generic_shutdown_super+0x70/0xf8
[   43.455635] c6  kill_block_super+0x2c/0x5c
[   43.459702] c6  kill_f2fs_super+0xac/0xd8
[   43.463684] c6  deactivate_locked_super+0x5c/0x124
[   43.468442] c6  deactivate_super+0x5c/0x68
[   43.472512] c6  cleanup_mnt+0x9c/0x118
[   43.476231] c6  __cleanup_mnt+0x1c/0x28
[   43.480043] c6  task_work_run+0x88/0xa8
[   43.483850] c6  do_notify_resume+0x39c/0x1c88
[   43.488174] c6  work_pending+0x8/0x14

the code of crash point is:
f2fs/node.c

void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)

	while ((found = __gang_lookup_nat_cache(nm_i,
					nid, NATVEC_SIZE, natvec))) {
		unsigned idx;

		nid = nat_get_nid(natvec[found - 1]) + 1;
		for (idx = 0; idx < found; idx++) {
			spin_lock(&nm_i->nat_list_lock);
>                       list_del(&natvec[idx]->list);
			spin_unlock(&nm_i->nat_list_lock);

			__del_from_nat_cache(nm_i, natvec[idx]);
		}
	}

This happens because the nat entry in natvec[idx] is an invalid pointer, or its 'list' member has a NULL 'next' pointer.
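
For reference, here is a simplified view of what list_del() dereferences here
(roughly based on include/linux/list.h with CONFIG_DEBUG_LIST; the exact checks
vary by kernel version), which is why a garbage entry pointer faults inside
__list_del_entry_valid():

	/* simplified sketch, not the exact 4.14 source */
	static inline void __list_del(struct list_head *prev, struct list_head *next)
	{
		next->prev = prev;		/* faults if 'next' is garbage */
		WRITE_ONCE(prev->next, next);	/* faults if 'prev' is garbage */
	}

	static inline void list_del(struct list_head *entry)
	{
		/* with CONFIG_DEBUG_LIST this also reads entry->prev->next and
		 * entry->next->prev to validate the links before unlinking */
		if (!__list_del_entry_valid(entry))
			return;
		__list_del(entry->prev, entry->next);
		entry->next = LIST_POISON1;
		entry->prev = LIST_POISON2;
	}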

We have encountered this issue several times on both Android Q and R versions.

My analysis of the issue is as follows:

1. The offending nat entry pointer can be found on the stack, e.g. the value 000000000000000a on the line marked '>' below, which matches the faulting address:
ffffff800806b8d0:  ffffffc0af33cbc0 ffffffc0af4869a0 
> ffffff800806b8e0:  ffffffc0f49baa00 000000000000000a 
ffffff800806b8f0:  ffffffc0af33c040 ffffffc0c69f0e20 
ffffff800806b900:  ffffffc0c695abc0 ffffffc01e2a4460 

2. These invalid entries can also be found in the nat_root radix tree of f2fs_nm_info (see the lookup sketch after this list).

3. I have reviewed the code around nat_tree_lock and have not found any clues so far.
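
For context, the gang lookup that fills natvec is roughly the following thin
wrapper around the radix tree (simplified from f2fs/node.c), so whatever
pointer is stored in nat_root ends up in natvec unchecked:

	static unsigned int __gang_lookup_nat_cache(struct f2fs_nm_info *nm_i,
					nid_t start, unsigned int nr, struct nat_entry **ep)
	{
		return radix_tree_gang_lookup(&nm_i->nat_root, (void **)ep, start, nr);
	}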

Please let me know if you need any other information.
Thanks a lot.
Comment 1 Chao Yu 2020-12-18 10:27:08 UTC
Hi,

I checked the 4.14.193 code and I don't have any clue why this could happen.
I don't remember any such corruption occurring on the nid list, because all of its updates are under nat_tree_lock; let me know if I missed something.

Do you apply any private patches on 4.14.193?
Comment 2 Zhiguo.Niu 2020-12-21 08:09:23 UTC
(In reply to Chao Yu from comment #1)
> Hi,
> 
> I checked the 4.14.193 code and I don't have any clue why this could happen.
> I don't remember any such corruption occurring on the nid list, because all
> of its updates are under nat_tree_lock; let me know if I missed something.
> 
> Do you apply any private patches on 4.14.193?


hi Chao, 

Thanks for your reply. I have checked my codebase; there are no private patches in the current version.

I find that the local variables natvec & setvec in f2fs_destroy_node_manager may be initialized to 0xaa / 0xaaaaaaaaaaaaaaaa, like this:

void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
{
	struct f2fs_nm_info *nm_i = NM_I(sbi);
	struct free_nid *i, *next_i;
	struct nat_entry *natvec[NATVEC_SIZE];
	struct nat_entry_set *setvec[SETVEC_SIZE];

Disassembly:
crash_arm64> dis f2fs_destroy_node_manager
0xffffff800842e2a8 <f2fs_destroy_node_manager>: stp     x29, x30, [sp,#-96]!
0xffffff800842e2ac <f2fs_destroy_node_manager+4>:       stp     x28, x27, [sp,#16]
0xffffff800842e2b0 <f2fs_destroy_node_manager+8>:       stp     x26, x25, [sp,#32]
0xffffff800842e2b4 <f2fs_destroy_node_manager+12>:      stp     x24, x23, [sp,#48]
0xffffff800842e2b8 <f2fs_destroy_node_manager+16>:      stp     x22, x21, [sp,#64]
0xffffff800842e2bc <f2fs_destroy_node_manager+20>:      stp     x20, x19, [sp,#80]
0xffffff800842e2c0 <f2fs_destroy_node_manager+24>:      mov     x29, sp
0xffffff800842e2c4 <f2fs_destroy_node_manager+28>:      sub     sp, sp, #0x320
0xffffff800842e2c8 <f2fs_destroy_node_manager+32>:      adrp    x8, 0xffffff800947e000 <xt_connlimit_locks+768>
0xffffff800842e2cc <f2fs_destroy_node_manager+36>:      ldr     x8, [x8,#264]
0xffffff800842e2d0 <f2fs_destroy_node_manager+40>:      mov     x27, x0
0xffffff800842e2d4 <f2fs_destroy_node_manager+44>:      str     x8, [x29,#-16]
0xffffff800842e2d8 <f2fs_destroy_node_manager+48>:      nop
0xffffff800842e2dc <f2fs_destroy_node_manager+52>:      ldr     x20, [x27,#112]
0xffffff800842e2e0 <f2fs_destroy_node_manager+56>:      add     x0, sp, #0x110
0xffffff800842e2e4 <f2fs_destroy_node_manager+60>:      mov     w1, #0xaa                       // #170
0xffffff800842e2e8 <f2fs_destroy_node_manager+64>:      mov     w2, #0x200                      // #512
0xffffff800842e2ec <f2fs_destroy_node_manager+68>:      bl      0xffffff8008be6b80 <__memset>
0xffffff800842e2f0 <f2fs_destroy_node_manager+72>:      mov     x8, #0xaaaaaaaaaaaaaaaa         // #-6148914691236517206
0xffffff800842e2f4 <f2fs_destroy_node_manager+76>:      stp     x8, x8, [sp,#256]
0xffffff800842e2f8 <f2fs_destroy_node_manager+80>:      stp     x8, x8, [sp,#240]
0xffffff800842e2fc <f2fs_destroy_node_manager+84>:      stp     x8, x8, [sp,#224]
0xffffff800842e300 <f2fs_destroy_node_manager+88>:      stp     x8, x8, [sp,#208]
0xffffff800842e304 <f2fs_destroy_node_manager+92>:      stp     x8, x8, [sp,#192]
0xffffff800842e308 <f2fs_destroy_node_manager+96>:      stp     x8, x8, [sp,#176]
0xffffff800842e30c <f2fs_destroy_node_manager+100>:     stp     x8, x8, [sp,#160]
0xffffff800842e310 <f2fs_destroy_node_manager+104>:     stp     x8, x8, [sp,#144]
0xffffff800842e314 <f2fs_destroy_node_manager+108>:     stp     x8, x8, [sp,#128]
0xffffff800842e318 <f2fs_destroy_node_manager+112>:     stp     x8, x8, [sp,#112]
0xffffff800842e31c <f2fs_destroy_node_manager+116>:     stp     x8, x8, [sp,#96]
0xffffff800842e320 <f2fs_destroy_node_manager+120>:     stp     x8, x8, [sp,#80]
0xffffff800842e324 <f2fs_destroy_node_manager+124>:     stp     x8, x8, [sp,#64]
0xffffff800842e328 <f2fs_destroy_node_manager+128>:     stp     x8, x8, [sp,#48]
0xffffff800842e32c <f2fs_destroy_node_manager+132>:     stp     x8, x8, [sp,#32]
0xffffff800842e330 <f2fs_destroy_node_manager+136>:     stp     x8, x8, [sp,#16]
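
In C terms, the prologue above amounts to roughly the following (an
illustration only; it assumes the usual NATVEC_SIZE = 64 and SETVEC_SIZE = 32,
i.e. 0x200 and 0x100 bytes of pointers):

	struct nat_entry *natvec[NATVEC_SIZE];		/* 64 * 8 = 0x200 bytes: the __memset(..., 0xaa, 0x200) above */
	struct nat_entry_set *setvec[SETVEC_SIZE];	/* 32 * 8 = 0x100 bytes: the stp x8, x8 stores above */

	memset(natvec, 0xaa, sizeof(natvec));
	memset(setvec, 0xaa, sizeof(setvec));

Any natvec slot that __gang_lookup_nat_cache() never overwrites keeps this 0xaa pattern.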

I am not sure whether this is the root cause of the issue, because these invalid entries can be found in the nat_root radix tree of f2fs_nm_info as well.

thanks!
Comment 3 Chao Yu 2020-12-21 08:29:58 UTC
nm_i->nat_list_lock was introduced in 4.19; are you sure your codebase is 4.14.193?
Comment 4 Chao Yu 2020-12-21 08:44:13 UTC
(In reply to Zhiguo.Niu from comment #2)
> hi Chao, 
> 
> Thanks for your reply. I have checked my codebase; there are no private
> patches in the current version.
> 
> I find that the local variables natvec & setvec in f2fs_destroy_node_manager
> may be initialized to 0xaa / 0xaaaaaaaaaaaaaaaa, like this:
> 
> void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
> {
>       struct f2fs_nm_info *nm_i = NM_I(sbi);
>       struct free_nid *i, *next_i;
>       struct nat_entry *natvec[NATVEC_SIZE];
>       struct nat_entry_set *setvec[SETVEC_SIZE];
> 

I don't think so. The natvec array is filled by __gang_lookup_nat_cache(), and
natvec[0..found - 1] will be valid; in the "destroy nat cache" loop we never
access the natvec array out of range.

Can you please check whether @found is valid or not (@found should be less than
or equal to NATVEC_SIZE)?
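
A hypothetical diagnostic for this (an illustration only, not a proposed fix)
would be a temporary check in the destroy loop, for example:

	found = __gang_lookup_nat_cache(nm_i, nid, NATVEC_SIZE, natvec);
	/* debugging aid: the gang lookup must never over-fill natvec */
	WARN_ON(found > NATVEC_SIZE);
	for (idx = 0; idx < found; idx++)
		/* catch a garbage entry before list_del() faults on it */
		WARN_ON(!virt_addr_valid(natvec[idx]));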

BTW, one possible case could be stack overflow, but during umount(), would
that really happen?