Bug 215837

Summary: A potential ABBA deadlock between the writeback process and the fsync process
Product: File System
Reporter: Zhihao Cheng (chengzhihao1)
Component: VFS
Assignee: fs_vfs
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.18-rc1
Subsystem:
Regression: No
Bisected commit-id:
Attachments: reproduce.diff, b.c

Description Zhihao Cheng 2022-04-14 13:33:10 UTC
1. Apply reproduce.diff and compile the kernel with CONFIG_VFAT_FS and CONFIG_DETECT_HUNG_TASK enabled.
2. Compile and run b.c.

Result: the hung-task detector reports the task calling fsync as blocked:

[  399.044861] INFO: task bb:2607 blocked for more than 30 seconds.
[  399.046824]       Not tainted 5.18.0-rc1-00005-gefae4d9eb6a2-dirty #394
[  399.048941] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  399.051539] task:bb              state:D stack:    0 pid: 2607 ppid:  2426 flags:0x00004000
[  399.051556] Call Trace:
[  399.051561]  <TASK>
[  399.051570]  __schedule+0x480/0x1050
[  399.051592]  schedule+0x92/0x1a0
[  399.051602]  io_schedule+0x22/0x50
[  399.051613]  blk_mq_get_tag+0x1d3/0x3c0
[  399.051624]  ? ipi_rseq+0x70/0x70
[  399.051640]  __blk_mq_alloc_requests+0x21d/0x3f0
[  399.051657]  blk_mq_submit_bio+0x68d/0xca0
[  399.051674]  __submit_bio+0x1b5/0x2d0
[  399.051687]  submit_bio_noacct_nocheck+0x346/0x3e0
[  399.051697]  ? kvm_clock_get_cycles+0xd/0x20
[  399.051708]  submit_bio_noacct+0x34e/0x720
[  399.051718]  submit_bio+0x3b/0x150
[  399.051725]  submit_bh_wbc+0x161/0x230
[  399.051734]  __sync_dirty_buffer+0xd1/0x420
[  399.051744]  sync_dirty_buffer+0x17/0x20
[  399.051750]  __fat_write_inode+0x289/0x310
[  399.051766]  fat_write_inode+0x2a/0xa0
[  399.051774]  ? _raw_spin_lock+0x1b/0x90
[  399.051783]  __writeback_single_inode+0x53c/0x6f0
[  399.051795]  writeback_single_inode+0x145/0x200
[  399.051803]  sync_inode_metadata+0x45/0x70
[  399.051856]  __generic_file_fsync+0xa3/0x150
[  399.051880]  fat_file_fsync+0x1d/0x80
[  399.051895]  vfs_fsync_range+0x40/0xb0
[  399.051912]  do_fsync+0x48/0xa0
[  399.051929]  __x64_sys_fsync+0x18/0x30
[  399.051940]  do_syscall_64+0x35/0x80
[  399.051957]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  399.051968] RIP: 0033:0x7fb271f06e40
[  399.051980] RSP: 002b:00007fffcbc55a28 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[  399.051996] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb271f06e40
[  399.052004] RDX: 0000000000000000 RSI: 00000000009fe010 RDI: 0000000000000003
[  399.052012] RBP: 00007fffcbceba40 R08: 0000000000000000 R09: 0000000000000000
[  399.052020] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400520
[  399.052027] R13: 00007fffcbcebb20 R14: 0000000000000000 R15: 0000000000000000
Comment 1 Zhihao Cheng 2022-04-14 13:34:08 UTC
Created attachment 300757
reproduce.diff
Comment 2 Zhihao Cheng 2022-04-14 13:34:18 UTC
Created attachment 300758
b.c
Comment 3 Zhihao Cheng 2022-04-14 13:41:07 UTC
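Running sync breaks the deadlock. sync queues new work on wb->work_list, so a background or kupdate writeback pass breaks out of its loop in wb_writeback():

wb_writeback
  /* sync adds new work to wb->work_list */
  if ((work->for_background || work->for_kupdate) &&
      !list_empty(&wb->work_list))
     break;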
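
For readers who want to reason about that exit check outside the kernel, here is a minimal userspace sketch of the condition. It is not kernel code. The struct names (work_model, wb_model), the pending_works counter that stands in for !list_empty(&wb->work_list), and the helper should_yield_to_queued_work() are all hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two flags tested in wb_writeback().
 * Field names mirror the ones quoted in the comment above; the
 * struct itself is an assumption made for this sketch. */
struct work_model {
    bool for_background;
    bool for_kupdate;
};

/* Hypothetical model of the writeback context. pending_works > 0
 * stands in for !list_empty(&wb->work_list). */
struct wb_model {
    size_t pending_works;
};

/* Returns true when the modeled loop would hit the "break":
 * the current pass is background or kupdate writeback AND some
 * other work (for example, work queued by sync) is waiting. */
static bool should_yield_to_queued_work(const struct work_model *work,
                                        const struct wb_model *wb)
{
    return (work->for_background || work->for_kupdate) &&
           wb->pending_works > 0;
}
```

In this model, a background pass keeps going while pending_works is 0 and yields as soon as it becomes nonzero, which matches the observation that issuing sync lets the stuck writeback pass exit. A pass that is neither background nor kupdate never takes this early exit, regardless of queued work.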