Bug 206153
Summary: | ubifs: Deadlock in concurrent bulk-read and writepage | ||
---|---|---|---|
Product: | File System | Reporter: | Zhihao Cheng (chengzhihao1) |
Component: | Other | Assignee: | fs_other |
Status: | NEW --- | ||
Severity: | normal | CC: | alexander198961 |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.5-rc5 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
reproduce source code in C
setup ubifs from a dumped flash file flash reproduce delay and print |
Description
Zhihao Cheng
2020-01-10 11:16:41 UTC
A. Process of reading|writing pages in ubifs 1. vfs_write -> __vfs_write -> new_sync_write -> ubifs_write_iter -> generic_file_write_iter -> __generic_file_write_iter -> generic_perform_write: ubifs_write_begin page_alloc -> LOCK(page) do_readpage allocate_budget LOCK(ui->ui_mutex) [For append write only] ubifs_write_end UNLOCK(ui->ui_mutex) [For append write only] UNLOCK(page) 2. Write page:wb_workfn -> writeback_inodes_wb -> __writeback_inodes_wb -> writeback_sb_inodes -> __writeback_single_inode -> do_writepages -> generic_writepages -> write_cache_pages: LOCK(page) ubifs_writepage ubifs_write_inode LOCK(ui->ui_mutex) UNLOCK(ui->ui_mutex) do_writepage UNLOCK(page) 3. Read page:vfs_read -> __vfs_read -> new_sync_read -> generic_file_read_iter -> generic_file_buffered_read: Situation ①:Read single page find_get_page Possibility I page not in memory: page_cache_alloc LOCK(page) ubifs_readpage ubifs_bulk_read LOCK(ui->ui_mutex) UNLOCK(ui->ui_mutex) do_readpage UNLOCK(page) Possibility II page in memory,but not update: LOCK(page) IF NOT update ubifs_readpage ubifs_bulk_read LOCK(ui->ui_mutex) UNLOCK(ui->ui_mutex) do_readpage UNLOCK(page) IF update UNLOCK(page) Situation ②:Bulk read find_get_page (First two pages): page_alloc LOCK(page) ubifs_readpage ubifs_bulk_read LOCK(ui->ui_mutex) UNLOCK(ui->ui_mutex) do_readpage UNLOCK(page) (Page 3~N): page_alloc LOCK(page3) ubifs_readpage ubifs_bulk_read LOCK(ui->ui_mutex) ubifs_do_bulk_read populate_page(page3) UNLOCK(page3) find_or_create_page pagecache_get_page LOCK(page4) UNLOCK(page4) UNLOCK(ui->ui_mutex) B. Reproduce problem: 1. Apply reproduce_delay_and_print.patch and compile kernel,build nandsim/mtd/ubi/ubifs in modules 2. ./setup.sh 3. Compile reproduce.c and run ./reproduce, after a period of time, hung task occurs C. 打印信息 [ 310.515130] flash size: 512 MiB [ 310.515660] page size: 4096 bytes [ 310.516198] OOB area size: 128 bytes [ 310.516764] sector size: 128 KiB [ 310.517287] pages number: 131072 [ 310.517802] pages per sector: 32 [ 310.518327] bus width: 8 [ 310.518735] bits in sector size: 17 [ 310.519312] bits in page size: 12 [ 310.519855] bits in OOB size: 7 [ 310.520366] flash size with OOB: 540672 KiB [ 310.521024] page address bytes: 5 [ 310.521558] sector address bytes: 3 [ 310.522108] options: 0x80 [ 310.541678] [nandsim] warning: CONFIG_MTD_PARTITIONED_MASTER must be enabled to expose debugfs stuff [ 317.947702] {WRITE_ITER} A generic_file_write_iter, ino=65 LOCK INODE == [ 317.947724] [READ] Step 1. ubifs_bulk_read ino=65 page=0xffffea0004e130c0 LOCK UI == [ 317.950557] {WRITE_ITER} B generic_file_write_iter, ino=65 LOCK INODE -- [ 317.950581] {WRITE_ITER} Step 1. Get page by ubifs_write_begin, ino=65 LOCK PAGE == [ 317.951876] [READ] Step 1. ubifs_bulk_read ino=65 page=0xffffea0004e130c0 LOCK UI -- [ 317.951878] [READ] Step 3. ubifs_bulk_read ino=65 page=0xffffea0004e130c0 bulked=0 UNLOCK UI ++ [ 317.951948] [READ] Step 1. ubifs_bulk_read ino=65 page=0xffffea0004ff8d80 LOCK UI == [ 317.953542] {WRITE_ITER} Step 1. Get page 0xffffea0004eb0140 by ubifs_write_begin, ino=65 LOCK PAGE -- [ 317.954245] [READ] Step 1. ubifs_bulk_read ino=65 page=0xffffea0004ff8d80 LOCK UI -- [ 317.955617] {WRITE_ITER} Step 2. Appending page 0xffffea0004eb0140 by allocate_budget, ino=65 LOCK UI == [ 317.956815] [READ] Step 3. ubifs_bulk_read ino=65 page=0xffffea0004ff8d80 bulked=0 UNLOCK UI ++ [ 317.956846] [READ] Step 1. ubifs_bulk_read ino=65 page=0xffffea0004e640c0 LOCK UI == [ 317.958029] {WRITE_ITER} Step 2. Appending page 0xffffea0004eb0140 by allocate_budget, ino=65 LOCK UI -- [ 317.966116] {WRITE_ITER} Step 3. ubifs_write_end appending write page=0xffffea0004eb0140, ino=65 UNLOCK UI ++ [ 317.967644] {WRITE_ITER} Step 3. Fill page 0xffffea0004eb0140 by ubifs_write_end, ino=65 UNLOCK PAGE ++ [ 317.969117] {WRITE_ITER} C generic_file_write_iter, ino=65 UNLOCK INODE ++ [ 317.970200] <WRITE> Step 1. write_cache_pages page=0xffffea0004eb0140 data=345 LOCK PAGE ==. [ 317.971593] <WRITE> Step 1. write_cache_pages page=0xffffea0004eb0140 data=345 LOCK PAGE --. [ 317.972906] <WRITE> Step 2. ubifs_write_inode ino=65 LOCK UI ==. [ 318.067190] [READ] Step 1. ubifs_bulk_read ino=65 page=0xffffea0004e640c0 LOCK UI -- [ 318.475195] [READ] Step 2. pagecache_get_page page=0xffffea0004eb0140 LOCK PAGE ==. [ 338.275252] INFO: task aa:7447 blocked for more than 10 seconds. [ 338.276806] Not tainted 5.5.0-rc5-dirty #104 [ 338.277582] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 338.278806] Call Trace: [ 338.279225] ? __schedule+0x5b8/0xb50 [ 338.280001] schedule+0x9a/0x1b0 [ 338.280756] schedule_preempt_disabled+0x1a/0x30 [ 338.281810] __mutex_lock.isra.8+0x4e8/0x900 [ 338.282510] ? __mutex_lock_slowpath+0x1f/0x30 [ 338.283220] __mutex_lock_slowpath+0x1f/0x30 [ 338.284120] mutex_lock+0x57/0x70 [ 338.284925] ubifs_write_inode+0xb5/0x220 [ubifs] [ 338.285927] ubifs_writepage+0x299/0x330 [ubifs] [ 338.286663] __writepage+0x1b/0xa0 [ 338.287215] write_cache_pages+0x353/0x8a0 [ 338.288073] ? __wb_calc_thresh+0x160/0x160 [ 338.289035] ? kvm_sched_clock_read+0x25/0x40 [ 338.289941] ? sched_clock+0xd/0x20 [ 338.290550] ? wake_up_process+0x19/0x30 [ 338.291207] generic_writepages+0x74/0xc0 [ 338.292043] ? remove_wait_queue+0x7a/0x90 [ 338.292763] ? do_writepages+0x8d/0xb0 [ 338.293432] do_writepages+0x8d/0xb0 [ 338.294088] __filemap_fdatawrite_range+0x123/0x190 [ 338.294987] file_write_and_wait_range+0xa5/0x110 [ 338.295890] ubifs_fsync+0xc4/0x170 [ubifs] [ 338.296636] vfs_fsync_range+0x59/0xb0 [ 338.297323] ? fput+0x17/0x20 [ 338.297872] ? ksys_write+0xf8/0x150 [ 338.298485] do_fsync+0x50/0x90 [ 338.299083] __x64_sys_fsync+0x18/0x30 [ 338.299773] do_syscall_64+0xbf/0x440 [ 338.300402] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 338.301268] RIP: 0033:0x7f21ef32889d [ 338.301926] Code: Bad RIP value. [ 338.302515] RSP: 002b:00007f21eef3cf00 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 338.303948] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f21ef32889d [ 338.305244] RDX: 00007f21ef313740 RSI: 00007f21e80008c0 RDI: 0000000000000003 [ 338.306511] RBP: 00007f21eef40f40 R08: 00007f21eef41700 R09: 0000000000000077 [ 338.307747] R10: 0000000000000004 R11: 0000000000000293 R12: 00007f21eef41000 [ 338.308987] R13: 0000000000000000 R14: 00007ffcf71b1ba0 R15: 00007f21eef419c0 [ 338.310296] INFO: task aa:7448 blocked for more than 10 seconds. [ 338.311347] Not tainted 5.5.0-rc5-dirty #104 [ 338.312213] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 338.313575] Call Trace: [ 338.313981] ? __schedule+0x5b8/0xb50 [ 338.314583] schedule+0x9a/0x1b0 [ 338.315108] io_schedule+0x22/0x50 [ 338.315680] __lock_page+0x1be/0x390 [ 338.316279] ? add_to_page_cache_lru+0x170/0x170 [ 338.317023] pagecache_get_page+0x393/0x4b0 [ 338.317714] ubifs_readpage+0x4dc/0x770 [ubifs] [ 338.318464] generic_file_read_iter+0x663/0xc10 [ 338.319213] new_sync_read+0x195/0x240 [ 338.320012] __vfs_read+0x41/0x60 [ 338.320801] vfs_read+0xce/0x1f0 [ 338.321573] ksys_read+0x88/0x150 [ 338.322285] __x64_sys_read+0x1e/0x30 [ 338.322892] do_syscall_64+0xbf/0x440 [ 338.323550] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 338.324740] RIP: 0033:0x7f21ef3282fd [ 338.325594] Code: Bad RIP value. [ 338.326189] RSP: 002b:00007f21ee73bf00 EFLAGS: 00000293 ORIG_RAX: 0000000000000000 [ 338.327411] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f21ef3282fd [ 338.329064] RDX: 0000000000004006 RSI: 00007f21ee73bf20 RDI: 0000000000000004 [ 338.330395] RBP: 00007f21ee73ff40 R08: 00007f21ee740700 R09: 00007f21ee740700 [ 338.331633] R10: 0000000000000004 R11: 0000000000000293 R12: 00007f21ee740000 [ 338.333281] R13: 0000000000000000 R14: 00007ffcf71b1ba0 R15: 00007f21ee7409c0 [ 348.515255] INFO: task aa:7447 blocked for more than 20 seconds. [ 348.516699] Not tainted 5.5.0-rc5-dirty #104 [ 348.517797] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 348.519042] Call Trace: [ 348.519477] ? __schedule+0x5b8/0xb50 [ 348.520335] schedule+0x9a/0x1b0 [ 348.521081] schedule_preempt_disabled+0x1a/0x30 [ 348.522005] __mutex_lock.isra.8+0x4e8/0x900 [ 348.522685] ? __mutex_lock_slowpath+0x1f/0x30 [ 348.523393] __mutex_lock_slowpath+0x1f/0x30 [ 348.524381] mutex_lock+0x57/0x70 [ 348.525185] ubifs_write_inode+0xb5/0x220 [ubifs] [ 348.526173] ubifs_writepage+0x299/0x330 [ubifs] [ 348.526916] __writepage+0x1b/0xa0 [ 348.527496] write_cache_pages+0x353/0x8a0 [ 348.528452] ? __wb_calc_thresh+0x160/0x160 [ 348.529429] ? kvm_sched_clock_read+0x25/0x40 [ 348.530206] ? sched_clock+0xd/0x20 [ 348.530762] ? wake_up_process+0x19/0x30 [ 348.531388] generic_writepages+0x74/0xc0 [ 348.532321] ? remove_wait_queue+0x7a/0x90 [ 348.533277] ? do_writepages+0x8d/0xb0 [ 348.533981] do_writepages+0x8d/0xb0 [ 348.534559] __filemap_fdatawrite_range+0x123/0x190 [ 348.535342] file_write_and_wait_range+0xa5/0x110 [ 348.536426] ubifs_fsync+0xc4/0x170 [ubifs] [ 348.537402] vfs_fsync_range+0x59/0xb0 [ 348.538126] ? fput+0x17/0x20 [ 348.538613] ? ksys_write+0xf8/0x150 [ 348.539191] do_fsync+0x50/0x90 [ 348.539827] __x64_sys_fsync+0x18/0x30 [ 348.540705] do_syscall_64+0xbf/0x440 [ 348.541561] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 348.542385] RIP: 0033:0x7f21ef32889d [ 348.542953] Code: Bad RIP value. [ 348.543507] RSP: 002b:00007f21eef3cf00 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 348.544914] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f21ef32889d [ 348.546027] RDX: 00007f21ef313740 RSI: 00007f21e80008c0 RDI: 0000000000000003 [ 348.547143] RBP: 00007f21eef40f40 R08: 00007f21eef41700 R09: 0000000000000077 [ 348.548631] R10: 0000000000000004 R11: 0000000000000293 R12: 00007f21eef41000 [ 348.550133] R13: 0000000000000000 R14: 00007ffcf71b1ba0 R15: 00007f21eef419c0 [ 348.551245] INFO: task aa:7448 blocked for more than 20 seconds. [ 348.552539] Not tainted 5.5.0-rc5-dirty #104 [ 348.553639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 348.554872] Call Trace: [ 348.555280] ? __schedule+0x5b8/0xb50 [ 348.556077] schedule+0x9a/0x1b0 [ 348.556831] io_schedule+0x22/0x50 [ 348.557635] __lock_page+0x1be/0x390 [ 348.558223] ? add_to_page_cache_lru+0x170/0x170 [ 348.558945] pagecache_get_page+0x393/0x4b0 [ 348.559727] ubifs_readpage+0x4dc/0x770 [ubifs] [ 348.560780] generic_file_read_iter+0x663/0xc10 [ 348.561796] new_sync_read+0x195/0x240 [ 348.562402] __vfs_read+0x41/0x60 [ 348.562924] vfs_read+0xce/0x1f0 [ 348.563464] ksys_read+0x88/0x150 [ 348.564255] __x64_sys_read+0x1e/0x30 [ 348.565091] do_syscall_64+0xbf/0x440 [ 348.565861] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 348.566661] RIP: 0033:0x7f21ef3282fd [ 348.567237] Code: Bad RIP value. [ 348.567911] RSP: 002b:00007f21ee73bf00 EFLAGS: 00000293 ORIG_RAX: 0000000000000000 [ 348.569611] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f21ef3282fd [ 348.570742] RDX: 0000000000004006 RSI: 00007f21ee73bf20 RDI: 0000000000000004 [ 348.572064] RBP: 00007f21ee73ff40 R08: 00007f21ee740700 R09: 00007f21ee740700 [ 348.573677] R10: 0000000000000004 R11: 0000000000000293 R12: 00007f21ee740000 [ 348.574794] R13: 0000000000000000 R14: 00007ffcf71b1ba0 R15: 00007f21ee7409c0 D. Some explanations The lock sequence of several reading and writing page processes of UBIFS mentioned in A is LOCK(page) first, then ui->mutex. Except for the content after page3 of case A-3-②(Locks ui->mutex first), it is assumed to be case T. If case T and case A-2 are executed simultaneously, ABBA deadlock will occur, and then the hung task call stack in C will appear. In addition, ABBA deadlock occurs when T and A-1 are executed simultaneously. Created attachment 286733 [details]
reproduce source code in C
Created attachment 286735 [details]
setup ubifs from a dumped flash file
Created attachment 286737 [details]
flash
Created attachment 286747 [details]
reproduce delay and print
|