Bug 218164 - ubifs: racing in grabbing LEB when making reservation, cannot reserve XX bytes in jhead, err = -28
Summary: ubifs: racing in grabbing LEB when making reservation, cannot reserve XX byte...
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P3 normal
Assignee: fs_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-20 01:40 UTC by Zhihao Cheng
Modified: 2024-01-22 04:14 UTC (History)
0 users

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
diff (2.96 KB, patch)
2023-11-20 01:44 UTC, Zhihao Cheng
Details | Diff
test.sh (1.06 KB, application/x-shellscript)
2023-11-20 01:45 UTC, Zhihao Cheng
Details
diff_fix (3.28 KB, patch)
2023-11-20 01:45 UTC, Zhihao Cheng
Details | Diff
diff_fix_v2 (4.04 KB, patch)
2024-01-22 04:14 UTC, Zhihao Cheng
Details | Diff

Description Zhihao Cheng 2023-11-20 01:40:22 UTC
Recently we catched ENOSPC returned by make_reservation() while doing fsstress on UBIFS:

UBIFS error (ubi0:0 pid 3665202): ubifs_init_security [ubifs]: cannot initialize security for inode 13638780, error -28
UBIFS error (ubi0:0 pid 3665191): ubifs_init_security [ubifs]: cannot initialize security for inode 13638781, error -28
rtos_fs_stress. (3667318): drop_caches: 2
UBIFS error (ubi0:0 pid 3640152): make_reservation [ubifs]: cannot reserve 112 bytes in jhead 2, error -28
CPU: 2 PID: 3640152 Comm: kworker/u16:2 Tainted: G    B   W  O      5.10.0 #1
Hardware name: Hisilicon PhosphorHi1230 EMU (DT)
Workqueue: writeback wb_workfn (flush-ubifs_0_0)
Call trace:
 dump_backtrace+0x0/0x2d0
 show_stack+0x18/0x28
 dump_stack+0x114/0x198
 make_reservation+0x564/0x610 [ubifs]
 ubifs_jnl_write_data+0x328/0x48c [ubifs]
 do_writepage+0x2a8/0x3e4 [ubifs]
 ubifs_writepage+0x16c/0x374 [ubifs]
 __writepage+0x58/0x124
 write_cache_pages+0x528/0x650
 generic_writepages+0xb4/0x114
 do_writepages+0xcc/0x11c
 __writeback_single_inode+0x58/0x240
 writeback_sb_inodes+0x2d0/0x564
 wb_writeback+0x20c/0x2b4
 wb_workfn+0x404/0x510
 process_one_work+0x304/0x4ac
 worker_thread+0x31c/0x4e4
 kthread+0x23c/0x290
 ret_from_fork+0x10/0x1c
(pid 3640152) Budgeting info: data budget sum 17576, total budget sum 17768
	budg_data_growth 4144, budg_dd_growth 13432, budg_idx_growth 192
	min_idx_lebs 13, old_idx_sz 988640, uncommitted_idx 0
	page_budget 4144, inode_budget 160, dent_budget 312
	nospace 0, nospace_rp 0
	dark_wm 8192, dead_wm 4096, max_idx_node_sz 192
	freeable_cnt 0, calc_idx_sz 988640, idx_gc_cnt 0
	dirty_pg_cnt 4, dirty_zn_cnt 0, clean_zn_cnt 4811
	gc_lnum 21, ihead_lnum 14
	jhead 0 (GC)	 LEB 16
	jhead 1 (base)	 LEB 34
	jhead 2 (data)	 LEB 23
	bud LEB 16
	bud LEB 23
	bud LEB 34
	old bud LEB 33
	old bud LEB 31
	old bud LEB 15
	commit state 4
Budgeting predictions:
	available: 33832, outstanding 17576, free 15356
(pid 3640152) start dumping LEB properties
(pid 3640152) Lprops statistics: empty_lebs 3, idx_lebs  11
	taken_empty_lebs 1, total_free 1253376, total_dirty 2445736
	total_used 3438712, total_dark 65536, total_dead 17248
LEB 10      free 0        dirty 88496    used 165456   free + dirty 88496    flags 0x22 (dirty index)
LEB 11      free 0        dirty 32       used 253920   free + dirty 32       dark 0    dead 32   nodes fit 0   flags 0x0  (not categorized)
LEB 12      free 0        dirty 253880   used 72       free + dirty 253880   flags 0x22 (dirty index)
LEB 13      free 0        dirty 4064     used 249888   free + dirty 4064     dark 0    dead 4064 nodes fit 0   flags 0x0  (not categorized)
LEB 14      free 0        dirty 193944   used 60008    free + dirty 193944   flags 0x22 (dirty index)
LEB 15      free 0        dirty 248000   used 5952     free + dirty 248000   dark 8192 dead 0    nodes fit 58  flags 0x10 (taken)
LEB 16      free 110592   dirty 896      used 142464   free + dirty 111488   dark 8192 dead 0    nodes fit 26  flags 0x10 (taken, jhead 0 (GC))
LEB 17      free 0        dirty 16       used 253936   free + dirty 16       dark 0    dead 16   nodes fit 0   flags 0x0  (not categorized)
LEB 18      free 0        dirty 3784     used 250168   free + dirty 3784     dark 0    dead 3784 nodes fit 0   flags 0x0  (not categorized)
LEB 19      free 0        dirty 3656     used 250296   free + dirty 3656     dark 0    dead 3656 nodes fit 0   flags 0x0  (not categorized)
LEB 20      free 0        dirty 84232    used 169720   free + dirty 84232    flags 0x22 (dirty index)
LEB 21      free 253952   dirty 0        used 0        free + dirty 253952   dark 8192 dead 0    nodes fit 59  flags 0x10 (taken, GC LEB)
LEB 22      free 0        dirty 1064     used 252888   free + dirty 1064     dark 0    dead 1064 nodes fit 0   flags 0x0  (not categorized)
LEB 23      free 0        dirty 248104   used 5848     free + dirty 248104   dark 8192 dead 0    nodes fit 58  flags 0x10 (taken, jhead 2 (data))
LEB 24      free 0        dirty 800      used 253152   free + dirty 800      dark 0    dead 800  nodes fit 0   flags 0x0  (not categorized)
LEB 25      free 0        dirty 48       used 253904   free + dirty 48       dark 0    dead 48   nodes fit 0   flags 0x0  (not categorized)
LEB 26      free 0        dirty 576      used 253376   free + dirty 576      dark 0    dead 576  nodes fit 0   flags 0x0  (not categorized)
LEB 27      free 0        dirty 183016   used 70936    free + dirty 183016   flags 0x22 (dirty index)
LEB 28      free 0        dirty 240320   used 13632    free + dirty 240320   flags 0x22 (dirty index)
LEB 29      free 253952   dirty 0        used 0        free + dirty 253952   dark 8192 dead 0    nodes fit 59  flags 0x4  (empty)
LEB 30      free 0        dirty 186368   used 67584    free + dirty 186368   flags 0x22 (dirty index)
LEB 31      free 0        dirty 2696     used 251256   free + dirty 2696     dark 0    dead 2696 nodes fit 0   flags 0x10 (taken)
LEB 32      free 0        dirty 448      used 253504   free + dirty 448      dark 0    dead 448  nodes fit 0   flags 0x0  (not categorized)
LEB 33      free 0        dirty 253952   used 0        free + dirty 253952   dark 8192 dead 0    nodes fit 59  flags 0x10 (taken)
LEB 34      free 217088   dirty 36544    used 320      free + dirty 253632   dark 8192 dead 0    nodes fit 59  flags 0x10 (taken, jhead 1 (base))
LEB 35      free 0        dirty 32       used 253920   free + dirty 32       dark 0    dead 32   nodes fit 0   flags 0x0  (not categorized)
LEB 36      free 0        dirty 105912   used 148040   free + dirty 105912   flags 0x22 (dirty index)
LEB 37      free 253952   dirty 0        used 0        free + dirty 253952   dark 8192 dead 0    nodes fit 59  flags 0x4  (empty)
LEB 38      free 0        dirty 102888   used 151064   free + dirty 102888   flags 0x22 (dirty index)
LEB 39      free 163840   dirty 2736     used 87376    free + dirty 166576   flags 0x30 (index, taken)
LEB 40      free 0        dirty 32       used 253920   free + dirty 32       dark 0    dead 32   nodes fit 0   flags 0x0  (not categorized)
LEB 41      free 0        dirty 199200   used 54752    free + dirty 199200   flags 0x22 (dirty index)
(pid 3640152) finish dumping LEB properties
UBIFS error (ubi0:0 pid 3640152): do_writepage [ubifs]: cannot write page 0 of inode 13636292, error -28
UBIFS warning (ubi0:0 pid 3640152): ubifs_ro_mode [ubifs]: switched to read-only mode, error -28
CPU: 2 PID: 3640152 Comm: kworker/u16:2 Tainted: G    B   W  O      5.10.0 #1
Hardware name: Hisilicon PhosphorHi1230 EMU (DT)
Workqueue: writeback wb_workfn (flush-ubifs_0_0)
Call trace:
 dump_backtrace+0x0/0x2d0
 show_stack+0x18/0x28
 dump_stack+0x114/0x198
 ubifs_ro_mode+0x114/0x154 [ubifs]
 do_writepage+0x324/0x3e4 [ubifs]
 ubifs_writepage+0x16c/0x374 [ubifs]
 __writepage+0x58/0x124
 write_cache_pages+0x528/0x650
 generic_writepages+0xb4/0x114
 do_writepages+0xcc/0x11c
 __writeback_single_inode+0x58/0x240
 writeback_sb_inodes+0x2d0/0x564
 wb_writeback+0x20c/0x2b4
 wb_workfn+0x404/0x510
 process_one_work+0x304/0x4ac
 worker_thread+0x31c/0x4e4
 kthread+0x23c/0x290
 ret_from_fork+0x10/0x1c
UBIFS error (ubi0:0 pid 3668050): do_commit [ubifs]: commit failed, error -30
UBIFS_ERR do_commit:267 commit failed, error line 175
UBIFS error (ubi0:0 pid 3668050): do_writepage [ubifs]: cannot write page 360 of inode 13640850, error -30
UBIFS error (ubi0:0 pid 3668050): make_reservation [ubifs]: cannot reserve 87 bytes in jhead 2, error -30
UBIFS error (ubi0:0 pid 3668050): do_writepage [ubifs]: cannot write page 361 of inode 13640850, error -30
Comment 1 Zhihao Cheng 2023-11-20 01:41:55 UTC
After analyzing, The dirty LEBs could be grabbed by other writeback threads, which fails finding dirty LEBs of GC in current thread, so make_reservation() should try many times to invoke GC&&committing.

Here is a reproducer:
1. Apply diff and compile kernel(CONFIG_MTD_NAND_NANDSIM=m  CONFIG_PREEMPTION=y, CONFIG_PREEMPT_DYNAMIC=y, CONFIG_PREEMPT_VOLUNTARY=y, CONFIG_PREEMPT_COUNT=y    CONFIG_DEBUG_KMEMLEAK=n,CONFIG_KASAN=n)
2. Config qemu(x86, -smp 4, -m 4096)
3. Start qemu && ./test.sh
4. Several hours later(3h~)

[ 6771.139766] UBIFS error (ubi0:0 pid 1113): make_reservation [ubifs]: wait nospc_retries 1
[ 6771.139766] 
[ 6780.522204] UBIFS error (ubi0:0 pid 1112): make_reservation [ubifs]: stuck
[ 6780.522204] 
[ 6780.522221] UBIFS error (ubi0:0 pid 1111): make_reservation [ubifs]: stuck
[ 6780.522221] 
[ 6780.606098] UBIFS error (ubi0:0 pid 1113): make_reservation [ubifs]: wait done, err -28
[ 6780.606098] 
[ 6780.822298] UBIFS error (ubi0:0 pid 1113): make_reservation [ubifs]: wait nospc_retries 2
[ 6780.822298] 
[ 6780.827718] UBIFS error (ubi0:0 pid 1115): ubifs_commit_required [ubifs]: delay commit 1
[ 6780.827718] 
[ 6782.534601] UBIFS error (ubi0:0 pid 1115): make_reservation [ubifs]: stuck
[ 6782.534601] 
[ 6782.535557] UBIFS error (ubi0:0 pid 1110): ubifs_commit_required [ubifs]: delay commit 1
[ 6782.535557] 
[ 6782.736785] UBIFS error (ubi0:0 pid 1113): make_reservation [ubifs]: wait done, err -11
[ 6782.736785] 
[ 6782.950270] UBIFS error (ubi0:0 pid 1113): make_reservation [ubifs]: wait nospc_retries 2
[ 6782.950270] 
[ 6788.901728] UBIFS error (ubi0:0 pid 1110): make_reservation [ubifs]: stuck
[ 6788.901728] 
[ 6788.901741] UBIFS error (ubi0:0 pid 1111): make_reservation [ubifs]: stuck
[ 6788.901741] 
[ 6788.901746] UBIFS error (ubi0:0 pid 1115): make_reservation [ubifs]: stuck
[ 6788.901746] 
[ 6788.901744] UBIFS error (ubi0:0 pid 1112): make_reservation [ubifs]: stuck
[ 6788.901744] 
[ 6788.982114] UBIFS error (ubi0:0 pid 1113): make_reservation [ubifs]: wait done, err -28
[ 6788.982114] 
[ 6789.198047] UBIFS error (ubi0:0 pid 1113): make_reservation [ubifs]: cannot reserve 216 bytes in jhead 1, error -28
[ 6789.201011] CPU: 3 PID: 1113 Comm: test.sh Not tainted 6.7.0-rc1-00026-g01552fb03014-dirty #328
[ 6789.203588] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
[ 6789.205955] Call Trace:
[ 6789.206685]  <TASK>
[ 6789.207294]  dump_stack_lvl+0x86/0xc0
[ 6789.208338]  dump_stack+0x18/0x30
[ 6789.209257]  make_reservation+0x5ec/0xb90 [ubifs]
[ 6789.210620]  ubifs_jnl_truncate+0x184/0xcb0 [ubifs]
[ 6789.211724]  ? ubifs_xattr_get+0x143/0x310 [ubifs]
[ 6789.212769]  ? down_read+0x1a/0x180
[ 6789.213492]  ? unmap_mapping_pages+0x85/0x180
[ 6789.214410]  ubifs_setattr+0x4fa/0x770 [ubifs]
[ 6789.215374]  notify_change+0x478/0x790
[ 6789.216162]  ? do_truncate+0xa5/0x120
[ 6789.216948]  do_truncate+0xa5/0x120
[ 6789.217683]  ? fscrypt_symlink_getattr+0xe0/0xe0
[ 6789.218659]  path_openat+0x2d0/0x1a60
[ 6789.219428]  do_filp_open+0xd2/0x180
[ 6789.220175]  ? kmem_cache_alloc+0x2ef/0x550
[ 6789.221017]  ? alloc_fd+0x137/0x2a0
[ 6789.221727]  do_sys_openat2+0xc9/0x130
[ 6789.222294]  do_sys_open+0x5c/0xa0
[ 6789.222812]  __x64_sys_open+0x29/0x40
[ 6789.223376]  do_syscall_64+0x6f/0x140
[ 6789.223916]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
[ 6789.224669] RIP: 0033:0x7f427a8ffe80
[ 6789.225225] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 69 48 2d 00 00 75 10 b8 02 00 004
[ 6789.227907] RSP: 002b:00007ffd0a044b98 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[ 6789.229017] RAX: ffffffffffffffda RBX: 000055cb20f53c40 RCX: 00007f427a8ffe80
[ 6789.230064] RDX: 00000000000001b6 RSI: 0000000000000241 RDI: 000055cb20f54030
[ 6789.231126] RBP: 00007ffd0a044c30 R08: 0000000000000020 R09: fefeff2f312e6f6c
[ 6789.232193] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000000000
[ 6789.232958] R13: 0000000000000003 R14: 0000000000000001 R15: 000055cb20f54030
[ 6789.233717]  </TASK>
[ 6789.234005] (pid 1113) Budgeting info: data budget sum 13184, total budget sum 13760
[ 6789.235105] 	budg_data_growth 12432, budg_dd_growth 752, budg_idx_growth 576
[ 6789.235865] 	min_idx_lebs 20, old_idx_sz 87472, uncommitted_idx 4608
[ 6789.236569] 	page_budget 4144, inode_budget 160, dent_budget 312
[ 6789.237316] 	nospace 0, nospace_rp 0
[ 6789.237712] 	dark_wm 4608, dead_wm 512, max_idx_node_sz 192
[ 6789.238337] 	freeable_cnt 0, calc_idx_sz 85464, idx_gc_cnt 0
[ 6789.238955] 	dirty_pg_cnt 3, dirty_zn_cnt 17, clean_zn_cnt 761
[ 6789.239622] 	gc_lnum 44, ihead_lnum 47
[ 6789.240053] 	jhead 0 (GC)	 LEB 36
[ 6789.240431] 	jhead 1 (base)	 LEB 45
[ 6789.240820] 	jhead 2 (data)	 LEB 10
[ 6789.241230] 	bud LEB 10
[ 6789.241510] 	bud LEB 36
[ 6789.241786] 	bud LEB 45
[ 6789.242088] 	bud LEB 46
[ 6789.242370] 	commit state 0
[ 6789.242654] Budgeting predictions:
[ 6789.242949] 	available: 23088, outstanding 13184, free 10942
[ 6789.243453] (pid 1113) start dumping LEB properties
[ 6789.243870] (pid 1113) Lprops statistics: empty_lebs 2, idx_lebs  20
[ 6789.244431] 	taken_empty_lebs 1, total_free 58368, total_dirty 312328
[ 6789.245004] 	total_used 219680, total_dark 46080, total_dead 2992
[ 6789.245511] LEB 10      free 1024     dirty 13888    used 448      free + dirty 14912    dark 4608 dead 0    nodes fit 3   flags 0x10 )
[ 6789.246728] LEB 11      free 0        dirty 48       used 15312    free + dirty 48       dark 0    dead 48   nodes fit 0   flags 0x0  )
[ 6789.247864] LEB 12      free 0        dirty 12896    used 2464     free + dirty 12896    flags 0x22 (dirty index)
[ 6789.248759] LEB 13      free 0        dirty 160      used 15200    free + dirty 160      dark 0    dead 160  nodes fit 0   flags 0x0  )
[ 6789.249908] LEB 14      free 0        dirty 8752     used 6608     free + dirty 8752     flags 0x22 (dirty index)
[ 6789.250780] LEB 15      free 0        dirty 7296     used 8064     free + dirty 7296     flags 0x22 (dirty index)
[ 6789.251661] LEB 16      free 0        dirty 14880    used 480      free + dirty 14880    dark 4608 dead 0    nodes fit 3   flags 0x1  )
[ 6789.252752] LEB 17      free 0        dirty 32       used 15328    free + dirty 32       dark 0    dead 32   nodes fit 0   flags 0x0  )
[ 6789.253920] LEB 18      free 0        dirty 112      used 15248    free + dirty 112      dark 0    dead 112  nodes fit 0   flags 0x0  )
[ 6789.255057] LEB 19      free 0        dirty 496      used 14864    free + dirty 496      dark 0    dead 496  nodes fit 0   flags 0x0  )
[ 6789.256190] LEB 20      free 0        dirty 40       used 15320    free + dirty 40       dark 0    dead 40   nodes fit 0   flags 0x0  )
[ 6789.257322] LEB 21      free 6144     dirty 7088     used 2128     free + dirty 13232    flags 0x22 (dirty index)
[ 6789.258210] LEB 22      free 0        dirty 13008    used 2352     free + dirty 13008    flags 0x22 (dirty index)
[ 6789.259091] LEB 23      free 15360    dirty 0        used 0        free + dirty 15360    dark 4608 dead 0    nodes fit 3   flags 0x4  )
[ 6789.260176] LEB 24      free 0        dirty 15304    used 56       free + dirty 15304    dark 4608 dead 0    nodes fit 3   flags 0x1  )
[ 6789.261247] LEB 25      free 0        dirty 7520     used 7840     free + dirty 7520     flags 0x22 (dirty index)
[ 6789.262125] LEB 26      free 0        dirty 240      used 15120    free + dirty 240      dark 0    dead 240  nodes fit 0   flags 0x0  )
[ 6789.263254] LEB 27      free 0        dirty 7968     used 7392     free + dirty 7968     flags 0x22 (dirty index)
[ 6789.264126] LEB 28      free 0        dirty 15040    used 320      free + dirty 15040    dark 4608 dead 0    nodes fit 3   flags 0x1  )
[ 6789.265196] LEB 29      free 0        dirty 48       used 15312    free + dirty 48       dark 0    dead 48   nodes fit 0   flags 0x0  )
[ 6789.266341] LEB 30      free 0        dirty 8080     used 7280     free + dirty 8080     flags 0x22 (dirty index)
[ 6789.267219] LEB 31      free 0        dirty 416      used 14944    free + dirty 416      dark 0    dead 416  nodes fit 0   flags 0x0  )
[ 6789.268355] LEB 32      free 0        dirty 11776    used 3584     free + dirty 11776    flags 0x22 (dirty index)
[ 6789.269215] LEB 33      free 0        dirty 32       used 15328    free + dirty 32       dark 0    dead 32   nodes fit 0   flags 0x0  )
[ 6789.270366] LEB 34      free 0        dirty 400      used 14960    free + dirty 400      dark 0    dead 400  nodes fit 0   flags 0x0  )
[ 6789.271498] LEB 35      free 0        dirty 9240     used 6120     free + dirty 9240     flags 0x22 (dirty index)
[ 6789.272363] LEB 36      free 0        dirty 10744    used 4616     free + dirty 10744    dark 4608 dead 0    nodes fit 2   flags 0x10 )
[ 6789.273506] LEB 37      free 0        dirty 104      used 15256    free + dirty 104      dark 0    dead 104  nodes fit 0   flags 0x0  )
[ 6789.274636] LEB 38      free 0        dirty 9200     used 6160     free + dirty 9200     flags 0x22 (dirty index)
[ 6789.275493] LEB 39      free 0        dirty 368      used 14992    free + dirty 368      dark 0    dead 368  nodes fit 0   flags 0x0  )
[ 6789.276624] LEB 40      free 0        dirty 13120    used 2240     free + dirty 13120    flags 0x22 (dirty index)
[ 6789.277478] LEB 41      free 0        dirty 5504     used 9856     free + dirty 5504     flags 0x22 (dirty index)
[ 6789.278324] LEB 42      free 0        dirty 10656    used 4704     free + dirty 10656    flags 0x22 (dirty index)
[ 6789.279184] LEB 43      free 0        dirty 13680    used 1680     free + dirty 13680    flags 0x22 (dirty index)
[ 6789.280039] LEB 44      free 15360    dirty 0        used 0        free + dirty 15360    dark 4608 dead 0    nodes fit 3   flags 0x10 )
[ 6789.281161] LEB 45      free 3584     dirty 10176    used 1600     free + dirty 13760    dark 4608 dead 0    nodes fit 3   flags 0x10 )
[ 6789.282319] LEB 46      free 0        dirty 15360    used 0        free + dirty 15360    dark 4608 dead 0    nodes fit 3   flags 0x10 )
[ 6789.283515] LEB 47      free 7168     dirty 7968     used 224      free + dirty 15136    flags 0x30 (index, taken)
[ 6789.284366] LEB 48      free 0        dirty 13904    used 1456     free + dirty 13904    flags 0x22 (dirty index)
[ 6789.285224] LEB 49      free 0        dirty 13976    used 1384     free + dirty 13976    flags 0x22 (dirty index)
[ 6789.286070] LEB 50      free 0        dirty 13672    used 1688     free + dirty 13672    flags 0x22 (dirty index)
[ 6789.286902] LEB 51      free 9728     dirty 3392     used 2240     free + dirty 13120    flags 0x22 (dirty index)
[ 6789.287753] LEB 52      free 0        dirty 496      used 14864    free + dirty 496      dark 0    dead 496  nodes fit 0   flags 0x0  )
[ 6789.288850] LEB 53      free 0        dirty 15248    used 112      free + dirty 15248    dark 4608 dead 0    nodes fit 3   flags 0x1  )
[ 6789.289897] (pid 1113) finish dumping LEB properties
[ 6789.290389] ------------[ cut here ]------------
[ 6789.290764] kernel BUG at fs/ubifs/journal.c:423!
[ 6789.291186] invalid opcode: 0000 [#1] PREEMPT SMP
[ 6789.291159] UBIFS error (ubi0:0 pid 1110): ubifs_commit_required [ubifs]: delay commit 1
[ 6789.291159] 
[ 6789.291574] CPU: 3 PID: 1113 Comm: test.sh Not tainted 6.7.0-rc1-00026-g01552fb03014-dirty #328
[ 6789.294725] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
[ 6789.295436] RIP: 0010:make_reservation+0x63b/0xb90 [ubifs]
[ 6789.295910] Code: 04 00 48 89 df 48 83 05 c2 bb 06 00 01 e8 7d 01 04 00 4c 89 ff 48 83 05 ba bb 06 00 01 e8 dd 7b f4 e0 48 83 05 b5 bb0
[ 6789.297410] RSP: 0018:ffffc90000627a18 EFLAGS: 00010202
[ 6789.297857] RAX: 0000000000000000 RBX: ffff888102318000 RCX: ffffffff830fd4d0
[ 6789.298443] RDX: 0000000000000000 RSI: 0000000000000206 RDI: ffff888106174e28
[ 6789.299009] RBP: 0000000000000002 R08: ffff888101ac0010 R09: ffff88813bd2d900
[ 6789.299591] R10: 0000000000000000 R11: ffffffffffb6d221 R12: ffff888107d774f8
[ 6789.300153] R13: 00000000ffffffe4 R14: 00000000000000d8 R15: ffff8881023180c8
[ 6789.300742] FS:  00007f427b7ae700(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
[ 6789.301403] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6789.301872] CR2: 00007f71146db000 CR3: 000000010e07f000 CR4: 00000000000006f0
[ 6789.302435] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6789.303009] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 6789.303604] Call Trace:
[ 6789.303806]  <TASK>
[ 6789.303988]  ? show_regs+0x88/0xa0
[ 6789.304286]  ? __die_body+0x26/0x90
[ 6789.304568]  ? die+0x5c/0xa0
[ 6789.304822]  ? do_trap+0x10e/0x140
[ 6789.305106]  ? do_error_trap+0x85/0xd0
[ 6789.305412]  ? make_reservation+0x63b/0xb90 [ubifs]
[ 6789.305834]  ? exc_invalid_op+0x68/0x80
[ 6789.306150]  ? make_reservation+0x63b/0xb90 [ubifs]
[ 6789.306564]  ? asm_exc_invalid_op+0x1a/0x20
[ 6789.306921]  ? make_reservation+0x63b/0xb90 [ubifs]
[ 6789.307348]  ? make_reservation+0x633/0xb90 [ubifs]
[ 6789.307742]  ubifs_jnl_truncate+0x184/0xcb0 [ubifs]
[ 6789.308162]  ? ubifs_xattr_get+0x143/0x310 [ubifs]
[ 6789.308572]  ? down_read+0x1a/0x180
[ 6789.308867]  ? unmap_mapping_pages+0x85/0x180
[ 6789.309228]  ubifs_setattr+0x4fa/0x770 [ubifs]
[ 6789.309622]  notify_change+0x478/0x790
[ 6789.309946]  ? do_truncate+0xa5/0x120
[ 6789.310246]  do_truncate+0xa5/0x120
[ 6789.310541]  ? fscrypt_symlink_getattr+0xe0/0xe0
[ 6789.310929]  path_openat+0x2d0/0x1a60
[ 6789.311247]  do_filp_open+0xd2/0x180
[ 6789.311558]  ? kmem_cache_alloc+0x2ef/0x550
[ 6789.311899]  ? alloc_fd+0x137/0x2a0
[ 6789.312191]  do_sys_openat2+0xc9/0x130
[ 6789.312513]  do_sys_open+0x5c/0xa0
[ 6789.312797]  __x64_sys_open+0x29/0x40
[ 6789.313109]  do_syscall_64+0x6f/0x140
[ 6789.313422]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
[ 6789.313838] RIP: 0033:0x7f427a8ffe80
[ 6789.314136] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 69 48 2d 00 00 75 10 b8 02 00 004
[ 6789.315598] RSP: 002b:00007ffd0a044b98 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[ 6789.316203] RAX: ffffffffffffffda RBX: 000055cb20f53c40 RCX: 00007f427a8ffe80
[ 6789.316787] RDX: 00000000000001b6 RSI: 0000000000000241 RDI: 000055cb20f54030
[ 6789.317369] RBP: 00007ffd0a044c30 R08: 0000000000000020 R09: fefeff2f312e6f6c
[ 6789.317934] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000000000
[ 6789.318507] R13: 0000000000000003 R14: 0000000000000001 R15: 000055cb20f54030
[ 6789.319071]  </TASK>
[ 6789.319253] Modules linked in: zstd ubifs ubi nandsim
[ 6789.319663] ---[ end trace 0000000000000000 ]---
[ 6789.320032] RIP: 0010:make_reservation+0x63b/0xb90 [ubifs]
[ 6789.320469] Code: 04 00 48 89 df 48 83 05 c2 bb 06 00 01 e8 7d 01 04 00 4c 89 ff 48 83 05 ba bb 06 00 01 e8 dd 7b f4 e0 48 83 05 b5 bb0
[ 6789.321939] RSP: 0018:ffffc90000627a18 EFLAGS: 00010202
[ 6789.322368] RAX: 0000000000000000 RBX: ffff888102318000 RCX: ffffffff830fd4d0
[ 6789.322951] RDX: 0000000000000000 RSI: 0000000000000206 RDI: ffff888106174e28
[ 6789.323559] RBP: 0000000000000002 R08: ffff888101ac0010 R09: ffff88813bd2d900
[ 6789.324140] R10: 0000000000000000 R11: ffffffffffb6d221 R12: ffff888107d774f8
[ 6789.324717] R13: 00000000ffffffe4 R14: 00000000000000d8 R15: ffff8881023180c8
[ 6789.325311] FS:  00007f427b7ae700(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
[ 6789.325948] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6789.326428] CR2: 00007f71146db000 CR3: 000000010e07f000 CR4: 00000000000006f0
[ 6789.327015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6789.327584] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 6789.328162] Kernel panic - not syncing: Fatal exception
[ 6789.329099] Kernel Offset: disabled
[ 6789.329411] ---[ end Kernel panic - not syncing: Fatal exception ]---
Comment 2 Zhihao Cheng 2023-11-20 01:44:12 UTC
Tips:
1. It's hardly to reproduce the problem by fsstress in my environment.
2. It's hardly to reproduce the problem without 'diff' patch, we only met once in real physical machine.
3. Reproducing concurrently in multi machines is recommended.
4. After fixing the problem and applying diff_fix, running for 4 days, the problem never occurred(fsstress && reproducer), the max value of 'cmt_retries' is 7, the max value of 'nospc_retries' is 4.
Comment 3 Zhihao Cheng 2023-11-20 01:44:53 UTC
Created attachment 305432 [details]
diff
Comment 4 Zhihao Cheng 2023-11-20 01:45:07 UTC
Created attachment 305433 [details]
test.sh
Comment 5 Zhihao Cheng 2023-11-20 01:45:36 UTC
Created attachment 305434 [details]
diff_fix
Comment 6 Zhihao Cheng 2024-01-22 04:14:11 UTC
Created attachment 305744 [details]
diff_fix_v2

Note You need to log in before you can comment on or make changes to this bug.