Bug 214733

Summary: ubifs: deadlock in concurrent rename whiteout and inode writeback
Product: File System Reporter: Zhihao Cheng (chengzhihao1)
Component: OtherAssignee: fs_other
Status: NEW ---    
Severity: normal    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.15-rc4 Tree: Mainline
Regression: No
Attachments: test.sh
delay.patch
a.c

Description Zhihao Cheng 2021-10-16 09:00:22 UTC
Following hung tasks:
    [   77.028764] task:kworker/u8:4    state:D stack:    0 pid:  132
    [   77.028820] Call Trace:
    [   77.029027]  schedule+0x8c/0x1b0
    [   77.029067]  mutex_lock+0x50/0x60
    [   77.029074]  ubifs_write_inode+0x68/0x1f0 [ubifs]
    [   77.029117]  __writeback_single_inode+0x43c/0x570
    [   77.029128]  writeback_sb_inodes+0x259/0x740
    [   77.029148]  wb_writeback+0x107/0x4d0
    [   77.029163]  wb_workfn+0x162/0x7b0
    
    [   92.390442] task:aa              state:D stack:    0 pid: 1506
    [   92.390448] Call Trace:
    [   92.390458]  schedule+0x8c/0x1b0
    [   92.390461]  wb_wait_for_completion+0x82/0xd0
    [   92.390469]  __writeback_inodes_sb_nr+0xb2/0x110
    [   92.390472]  writeback_inodes_sb_nr+0x14/0x20
    [   92.390476]  ubifs_budget_space+0x705/0xdd0 [ubifs]
    [   92.390503]  do_rename.cold+0x7f/0x187 [ubifs]
    [   92.390549]  ubifs_rename+0x8b/0x180 [ubifs]
    [   92.390571]  vfs_rename+0xdb2/0x1170
    [   92.390580]  do_renameat2+0x554/0x770
    
    , are caused by concurrent rename whiteout and inode writeback processes:
            rename_whiteout(Thread 1)               wb_workfn(Thread2)
    ubifs_rename
      do_rename
        lock_4_inodes (Hold ui_mutex)
        ubifs_budget_space
          make_free_space
            shrink_liability
              __writeback_inodes_sb_nr
                bdi_split_work_to_wbs (Queue new wb work)
                                                  wb_do_writeback(wb work)
                                                    __writeback_single_inode
                                                      ubifs_write_inode
                                                        LOCK(ui_mutex)
                                                               ↑
                  wb_wait_for_completion (Wait wb work) <-- deadlock!
Comment 1 Zhihao Cheng 2021-10-16 09:03:14 UTC
Reproducer:
1. Apply delay.patch and compile kernel (UBIFS/UBI/NANDSIM = m)
2. ./test.sh
3. Compile and run a.c(according print message from aa and kernel)

You may get:
[   57.813124] ---- Wait for consume space ---
[   59.448929] [dd] wait shrink ...
[   61.079974] [touch] wait shrink ...
[   61.956535] [chmod] wait shrink ...
[   66.269908] ~~~~~~ Begin budget space ~~~~~
[   66.277915] [aa] wait shrink ...
[   77.021941] INFO: task kworker/u8:4:132 blocked for more than 15 seconds.
[   77.024295]       Not tainted 5.15.0-rc5-00126-gec681c53f8d2-dirty #106
[   77.026348] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   77.028764] task:kworker/u8:4    state:D stack:    0 pid:  132 ppid:     2 flags:0x00004000
[   77.028782] Workqueue: writeback wb_workfn (flush-ubifs_0_0)
[   77.028820] Call Trace:
[   77.028826]  __schedule+0x482/0x1060
[   77.028875]  ? ubifs_jnl_write_inode+0x68c/0x730 [ubifs]
[   77.029027]  schedule+0x8c/0x1b0
[   77.029036]  schedule_preempt_disabled+0x1a/0x30
[   77.029045]  __mutex_lock.isra.0+0x47b/0x860
[   77.029060]  __mutex_lock_slowpath+0x1f/0x30
[   77.029067]  mutex_lock+0x50/0x60
[   77.029074]  ubifs_write_inode+0x68/0x1f0 [ubifs]
[   77.029117]  __writeback_single_inode+0x43c/0x570
[   77.029128]  writeback_sb_inodes+0x259/0x740
[   77.029148]  wb_writeback+0x107/0x4d0
[   77.029163]  wb_workfn+0x162/0x7b0
[   77.029172]  ? update_load_avg+0xbd/0x520
[   77.029180]  ? set_next_entity+0x15b/0x3b0
[   77.029187]  ? pick_next_task_fair+0x258/0x640
[   77.029194]  process_one_work+0x291/0x650
[   77.029201]  worker_thread+0x2b7/0x700
[   77.029206]  ? rescuer_thread+0x580/0x580
[   77.029215]  kthread+0x18c/0x1e0
[   77.029224]  ? set_kthread_struct+0x70/0x70
[   77.029233]  ret_from_fork+0x1f/0x30


[   77.029257] INFO: task dd:1508 blocked for more than 15 seconds.
[   77.031142]       Not tainted 5.15.0-rc5-00126-gec681c53f8d2-dirty #106
[   77.032074] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   77.033129] task:dd              state:D stack:    0 pid: 1508 ppid:  1507 flags:0x00000000
[   77.033133] Call Trace:
[   77.033136]  __schedule+0x482/0x1060
[   77.033142]  schedule+0x8c/0x1b0
[   77.033146]  wb_wait_for_completion+0x82/0xd0
[   77.033150]  ? woken_wake_function+0x30/0x30
[   77.033154]  __writeback_inodes_sb_nr+0xb2/0x110
[   77.033161]  writeback_inodes_sb_nr+0x14/0x20
[   77.033164]  ubifs_budget_space+0x705/0xdd0 [ubifs]
[   77.033187]  ubifs_write_begin+0x2e8/0xb80 [ubifs]
[   77.033209]  generic_perform_write+0xca/0x280
[   77.033215]  __generic_file_write_iter+0x159/0x2d0
[   77.033219]  generic_file_write_iter+0x85/0x110
[   77.033224]  ubifs_write_iter+0x170/0x1f0 [ubifs]
[   77.033243]  ? _crng_backtrack_protect+0x83/0xb0
[   77.033249]  new_sync_write+0x15a/0x230
[   77.033254]  vfs_write+0x2c2/0x440
[   77.033259]  ksys_write+0x73/0x160
[   77.033264]  __x64_sys_write+0x1e/0x30
[   77.033267]  do_syscall_64+0x35/0x80
[   77.033271]  entry_SYSCALL_64_after_hwframe+0x44/0xae

[   92.386366] INFO: task aa:1506 blocked for more than 15 seconds.
[   92.387602]       Not tainted 5.15.0-rc5-00126-gec681c53f8d2-dirty #106
[   92.388900] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   92.390442] task:aa              state:D stack:    0 pid: 1506 ppid:  1502 flags:0x00004000
[   92.390448] Call Trace:
[   92.390453]  __schedule+0x482/0x1060
[   92.390458]  schedule+0x8c/0x1b0
[   92.390461]  wb_wait_for_completion+0x82/0xd0
[   92.390465]  ? woken_wake_function+0x30/0x30
[   92.390469]  __writeback_inodes_sb_nr+0xb2/0x110
[   92.390472]  writeback_inodes_sb_nr+0x14/0x20
[   92.390476]  ubifs_budget_space+0x705/0xdd0 [ubifs]
[   92.390503]  do_rename.cold+0x7f/0x187 [ubifs]
[   92.390529]  ? ubifs_rename+0x8b/0x180 [ubifs]
[   92.390549]  ubifs_rename+0x8b/0x180 [ubifs]
[   92.390569]  ? __cond_resched+0x1d/0x50
[   92.390571]  vfs_rename+0xdb2/0x1170
[   92.390577]  ? do_renameat2+0x554/0x770
[   92.390580]  do_renameat2+0x554/0x770
[   92.390584]  __x64_sys_renameat2+0x6e/0x90
[   92.390587]  do_syscall_64+0x35/0x80
[   92.390591]  entry_SYSCALL_64_after_hwframe+0x44/0xae
Comment 2 Zhihao Cheng 2021-10-16 09:22:54 UTC
Created attachment 299221 [details]
test.sh
Comment 3 Zhihao Cheng 2021-10-16 09:23:09 UTC
Created attachment 299223 [details]
delay.patch
Comment 4 Zhihao Cheng 2021-10-16 09:23:19 UTC
Created attachment 299225 [details]
a.c