I have a laptop with a 500gb btrfs filesystem on an SSD. I am using snapper to generate hourly/daily snapshots - roughly 10 snaps stored over the past week. When the snapshots get deleted and the btrfs-cleaner process runs, the system periodically freezes for long periods of time with the disk reading at ~100mb/sec and kworker or btrfs-cleaner running at 100% OR one of my userspace processes running at 100% of system CPU usage. I reduced the hung task blocker to 20sec to produce the following dump - the system is responsive but hangs the process the moment it tries to IO to the disk during this period. This period of ~20-30sec IO hangs lasts for about 10 minutes. When snapshot deletion is not occurring the system runs really smoothly; no IO issues or anything. I recently upgraded from ubuntu 14.04 lts on a 4.2.6 kernel which did not have these hangs. This issue occurs on ubuntu 16.04 lts standard kernel which I believe is 4.4.8, and also with this kernel build: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4.11-xenial/ Mark [27914.006195] INFO: task btrfs-transacti:317 blocked for more than 20 seconds. [27914.006199] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006199] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006200] btrfs-transacti D ffff880270383da8 0 317 2 0x00000000 [27914.006203] ffff880270383da8 ffff880270383df0 ffffffff81e11500 ffff8800358e3b00 [27914.006205] ffff880270384000 ffff880277a4d1f0 ffff880277a4d000 ffff880277a4d1f0 [27914.006206] 0000000000000000 ffff880270383dc0 ffffffff81804245 ffff88023e009b00 [27914.006208] Call Trace: [27914.006214] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006245] [<ffffffffc0327fb3>] wait_current_trans.isra.21+0xd3/0x120 [btrfs] [27914.006248] [<ffffffff810c2860>] ? wake_atomic_t_function+0x60/0x60 [27914.006257] [<ffffffffc032988b>] start_transaction+0x27b/0x4c0 [btrfs] [27914.006265] [<ffffffffc0329b87>] btrfs_attach_transaction+0x17/0x20 [btrfs] [27914.006274] [<ffffffffc0324082>] transaction_kthread+0x82/0x240 [btrfs] [27914.006281] [<ffffffffc0324000>] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs] [27914.006284] [<ffffffff8109f428>] kthread+0xd8/0xf0 [27914.006286] [<ffffffff8109f350>] ? kthread_create_on_node+0x1a0/0x1a0 [27914.006288] [<ffffffff818086cf>] ret_from_fork+0x3f/0x70 [27914.006289] [<ffffffff8109f350>] ? kthread_create_on_node+0x1a0/0x1a0 [27914.006301] INFO: task cups-browsed:1472 blocked for more than 20 seconds. [27914.006302] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006302] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006303] cups-browsed D ffff880075a2fb40 0 1472 1 0x00000000 [27914.006305] ffff880075a2fb40 0001880000000001 ffffffff81e11500 ffff88027744bb00 [27914.006306] ffff880075a30000 ffff880277f7a9f0 ffff880277f7a800 ffff880277f7a9f0 [27914.006307] 0000000000000001 ffff880075a2fb58 ffffffff81804245 ffff88023e009b00 [27914.006309] Call Trace: [27914.006310] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006318] [<ffffffffc0327fb3>] wait_current_trans.isra.21+0xd3/0x120 [btrfs] [27914.006320] [<ffffffff810c2860>] ? wake_atomic_t_function+0x60/0x60 [27914.006327] [<ffffffffc03298db>] start_transaction+0x2cb/0x4c0 [btrfs] [27914.006334] [<ffffffffc0329ae8>] btrfs_start_transaction+0x18/0x20 [btrfs] [27914.006342] [<ffffffffc0334f12>] btrfs_setattr+0x182/0x360 [btrfs] [27914.006344] [<ffffffff81226bf5>] notify_change+0x235/0x360 [27914.006346] [<ffffffff81207c05>] do_truncate+0x75/0xc0 [27914.006348] [<ffffffff8121811a>] path_openat+0x27a/0x1330 [27914.006349] [<ffffffff8121a3c1>] do_filp_open+0x91/0x100 [27914.006351] [<ffffffff81227b36>] ? __alloc_fd+0x46/0x190 [27914.006352] [<ffffffff8120917a>] do_sys_open+0x13a/0x230 [27914.006353] [<ffffffff8120928e>] SyS_open+0x1e/0x20 [27914.006355] [<ffffffff81808332>] entry_SYSCALL_64_fastpath+0x16/0x71 [27914.006413] INFO: task kded5:3570 blocked for more than 20 seconds. [27914.006414] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006415] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006415] kded5 D ffff880244203b40 0 3570 3560 0x00000000 [27914.006417] ffff880244203b40 0001880200000001 ffff880277da1d80 ffff88024fcb2c40 [27914.006418] ffff880244204000 ffff880075f889f0 ffff880075f88800 ffff880075f889f0 [27914.006419] 0000000000000001 ffff880244203b58 ffffffff81804245 ffff88023e009b00 [27914.006421] Call Trace: [27914.006422] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006430] [<ffffffffc0327fb3>] wait_current_trans.isra.21+0xd3/0x120 [btrfs] [27914.006431] [<ffffffff810c2860>] ? wake_atomic_t_function+0x60/0x60 [27914.006438] [<ffffffffc03298db>] start_transaction+0x2cb/0x4c0 [btrfs] [27914.006445] [<ffffffffc0329ae8>] btrfs_start_transaction+0x18/0x20 [btrfs] [27914.006453] [<ffffffffc0334f12>] btrfs_setattr+0x182/0x360 [btrfs] [27914.006454] [<ffffffff81226bf5>] notify_change+0x235/0x360 [27914.006455] [<ffffffff81207c05>] do_truncate+0x75/0xc0 [27914.006456] [<ffffffff81207942>] ? do_dentry_open+0x272/0x310 [27914.006458] [<ffffffff8121811a>] path_openat+0x27a/0x1330 [27914.006459] [<ffffffff812197e1>] ? filename_lookup+0xf1/0x180 [27914.006461] [<ffffffff8121a3c1>] do_filp_open+0x91/0x100 [27914.006462] [<ffffffff81227b36>] ? __alloc_fd+0x46/0x190 [27914.006463] [<ffffffff8120917a>] do_sys_open+0x13a/0x230 [27914.006465] [<ffffffff8120928e>] SyS_open+0x1e/0x20 [27914.006466] [<ffffffff81808332>] entry_SYSCALL_64_fastpath+0x16/0x71 [27914.006477] INFO: task Thread (pooled):32343 blocked for more than 20 seconds. [27914.006478] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006478] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006479] Thread (pooled) D ffff88022ac37b98 0 32343 3560 0x00000000 [27914.006480] ffff88022ac37b98 0001880200000001 ffff880277da2c40 ffff8801438b49c0 [27914.006482] ffff88022ac38000 ffff880075f889f0 ffff880075f88800 ffff880075f889f0 [27914.006483] 0000000000000001 ffff88022ac37bb0 ffffffff81804245 ffff88023e009b00 [27914.006484] Call Trace: [27914.006485] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006493] [<ffffffffc0327fb3>] wait_current_trans.isra.21+0xd3/0x120 [btrfs] [27914.006495] [<ffffffff810c2860>] ? wake_atomic_t_function+0x60/0x60 [27914.006501] [<ffffffffc03298db>] start_transaction+0x2cb/0x4c0 [btrfs] [27914.006508] [<ffffffffc0329ae8>] btrfs_start_transaction+0x18/0x20 [btrfs] [27914.006516] [<ffffffffc03385ca>] btrfs_create+0x4a/0x210 [btrfs] [27914.006517] [<ffffffff812158e5>] vfs_create+0xc5/0x120 [27914.006519] [<ffffffff812190ac>] path_openat+0x120c/0x1330 [27914.006534] [<ffffffff8118b7b9>] ? unlock_page+0x69/0x70 [27914.006536] [<ffffffff8118ba34>] ? filemap_map_pages+0x224/0x230 [27914.006538] [<ffffffff8121a3c1>] do_filp_open+0x91/0x100 [27914.006540] [<ffffffff81227b36>] ? __alloc_fd+0x46/0x190 [27914.006541] [<ffffffff8120917a>] do_sys_open+0x13a/0x230 [27914.006543] [<ffffffff8120928e>] SyS_open+0x1e/0x20 [27914.006545] [<ffffffff81808332>] entry_SYSCALL_64_fastpath+0x16/0x71 [27914.006612] INFO: task BrowserBlocking:27381 blocked for more than 20 seconds. [27914.006613] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006614] BrowserBlocking D ffff8801cb17fb98 0 27381 1 0x00000000 [27914.006616] ffff8801cb17fb98 0001880100000001 ffffffff81e11500 ffff880276783b00 [27914.006617] ffff8801cb180000 ffff880075f889f0 ffff880075f88800 ffff880075f889f0 [27914.006618] 0000000000000001 ffff8801cb17fbb0 ffffffff81804245 ffff88023e009b00 [27914.006620] Call Trace: [27914.006621] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006629] [<ffffffffc0327fb3>] wait_current_trans.isra.21+0xd3/0x120 [btrfs] [27914.006631] [<ffffffff810c2860>] ? wake_atomic_t_function+0x60/0x60 [27914.006637] [<ffffffffc03298db>] start_transaction+0x2cb/0x4c0 [btrfs] [27914.006643] [<ffffffffc0329ae8>] btrfs_start_transaction+0x18/0x20 [btrfs] [27914.006651] [<ffffffffc03385ca>] btrfs_create+0x4a/0x210 [btrfs] [27914.006652] [<ffffffff812158e5>] vfs_create+0xc5/0x120 [27914.006654] [<ffffffff812190ac>] path_openat+0x120c/0x1330 [27914.006655] [<ffffffff810aac12>] ? wake_up_q+0x32/0x70 [27914.006657] [<ffffffff8121a3c1>] do_filp_open+0x91/0x100 [27914.006659] [<ffffffff81101607>] ? do_futex+0x307/0x540 [27914.006660] [<ffffffff81209ab9>] ? __vfs_write+0xc9/0x110 [27914.006662] [<ffffffff81227b36>] ? __alloc_fd+0x46/0x190 [27914.006663] [<ffffffff8120917a>] do_sys_open+0x13a/0x230 [27914.006678] [<ffffffff8120928e>] SyS_open+0x1e/0x20 [27914.006679] [<ffffffff81808332>] entry_SYSCALL_64_fastpath+0x16/0x71 [27914.006681] INFO: task Chrome_CacheThr:27386 blocked for more than 20 seconds. [27914.006682] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006682] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006683] Chrome_CacheThr D ffff88024af03b20 0 27386 1 0x00000000 [27914.006684] ffff88024af03b20 ffff88024af03b70 ffff880277da49c0 ffff88006a9f6740 [27914.006686] ffff88024af04000 ffff880075f889f0 ffff880075f88800 ffff880075f889f0 [27914.006687] 0000000000000001 ffff88024af03b38 ffffffff81804245 ffff88023e009b00 [27914.006688] Call Trace: [27914.006689] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006696] [<ffffffffc0327fb3>] wait_current_trans.isra.21+0xd3/0x120 [btrfs] [27914.006711] [<ffffffff810c2860>] ? wake_atomic_t_function+0x60/0x60 [27914.006717] [<ffffffffc032988b>] start_transaction+0x27b/0x4c0 [btrfs] [27914.006737] [<ffffffffc0329b27>] btrfs_join_transaction+0x17/0x20 [btrfs] [27914.006747] [<ffffffffc032f1a5>] btrfs_dirty_inode+0x35/0xd0 [btrfs] [27914.006756] [<ffffffffc032f2bf>] btrfs_update_time+0x7f/0xc0 [btrfs] [27914.006771] [<ffffffff81224c3d>] file_update_time+0xbd/0x110 [27914.006778] [<ffffffffc0339759>] btrfs_page_mkwrite+0xc9/0x380 [btrfs] [27914.006780] [<ffffffff810c21a2>] ? __wake_up_common+0x52/0x90 [27914.006782] [<ffffffff811b9b46>] do_page_mkwrite+0x56/0xb0 [27914.006783] [<ffffffff811bbef8>] do_wp_page+0x158/0x6c0 [27914.006785] [<ffffffff8118b8de>] ? filemap_map_pages+0xce/0x230 [27914.006787] [<ffffffff811bd6ed>] handle_mm_fault+0xcdd/0x1820 [27914.006789] [<ffffffff8106a537>] __do_page_fault+0x197/0x400 [27914.006790] [<ffffffff8106a7c2>] do_page_fault+0x22/0x30 [27914.006791] [<ffffffff8180a4b8>] page_fault+0x28/0x30 [27914.006792] INFO: task Chrome_IOThread:27387 blocked for more than 20 seconds. [27914.006793] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006794] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006794] Chrome_IOThread D ffff8801dc353de8 0 27387 1 0x00000000 [27914.006796] ffff8801dc353de8 00000000214ad60e ffff880277da49c0 ffff88006f4149c0 [27914.006797] ffff8801dc354000 ffff880118b9d2e8 ffff880118b9d300 ffff8801dc353f58 [27914.006798] ffff88006f4149c0 ffff8801dc353e00 ffffffff81804245 ffff88006f4149c0 [27914.006799] Call Trace: [27914.006801] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006802] [<ffffffff818070c0>] rwsem_down_read_failed+0xe0/0x140 [27914.006804] [<ffffffff813de924>] call_rwsem_down_read_failed+0x14/0x30 [27914.006805] [<ffffffff81806690>] ? down_read+0x20/0x30 [27914.006807] [<ffffffff8106a715>] __do_page_fault+0x375/0x400 [27914.006808] [<ffffffff8106a7c2>] do_page_fault+0x22/0x30 [27914.006808] [<ffffffff8180a4b8>] page_fault+0x28/0x30 [27914.006810] INFO: task CompositorTileW:27389 blocked for more than 20 seconds. [27914.006811] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006811] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006812] CompositorTileW D ffff88017e677df8 0 27389 1 0x00000000 [27914.006813] ffff88017e677df8 0000000000000001 ffff880277da2c40 ffff880277449d80 [27914.006814] ffff88017e678000 ffff880118b9d300 ffff880118b9d2e8 ffffffff00000000 [27914.006815] ffffffff00000001 ffff88017e677e10 ffffffff81804245 ffff880277449d80 [27914.006817] Call Trace: [27914.006818] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006819] [<ffffffff81806e92>] rwsem_down_write_failed+0x202/0x350 [27914.006821] [<ffffffff813de953>] call_rwsem_down_write_failed+0x13/0x20 [27914.006822] [<ffffffff818066cd>] ? down_write+0x2d/0x40 [27914.006823] [<ffffffff811c4703>] vm_munmap+0x33/0x60 [27914.006824] [<ffffffff811c56f2>] SyS_munmap+0x22/0x30 [27914.006826] [<ffffffff81808332>] entry_SYSCALL_64_fastpath+0x16/0x71 [27914.006827] INFO: task Chrome_HistoryT:27427 blocked for more than 20 seconds. [27914.006828] Tainted: P W OE 4.4.11-040411-generic #201605182255 [27914.006828] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [27914.006829] Chrome_HistoryT D ffff8801cb2fbde8 0 27427 1 0x00000000 [27914.006830] ffff8801cb2fbde8 0000000000000001 ffff880277da6740 ffff880016570ec0 [27914.006832] ffff8801cb2fc000 ffff880118b9d2e8 ffff880118b9d300 ffff8801cb2fbf58 [27914.006833] ffff880016570ec0 ffff8801cb2fbe00 ffffffff81804245 ffff880016570ec0 [27914.006834] Call Trace: [27914.006835] [<ffffffff81804245>] schedule+0x35/0x80 [27914.006836] [<ffffffff818070c0>] rwsem_down_read_failed+0xe0/0x140 [27914.006838] [<ffffffff81209ab9>] ? __vfs_write+0xc9/0x110 [27914.006839] [<ffffffff813de924>] call_rwsem_down_read_failed+0x14/0x30 [27914.006841] [<ffffffff81806690>] ? down_read+0x20/0x30 [27914.006842] [<ffffffff8106a715>] __do_page_fault+0x375/0x400 [27914.006843] [<ffffffff8106a7c2>] do_page_fault+0x22/0x30 [27914.006844] [<ffffffff8180a4b8>] page_fault+0x28/0x30
Created attachment 219061 [details] a few sysreq w's during lockups Here is the kernel log during the lockups - it was mostly chrome or soffice.bin that got stuck at 100% system CPU, system was frozen for ~40-50sec during each freeze period and I ran sysreq w a few times during those periods
Appears this was caused by quotas - disabling quotas caused this hang issue to stop on the 4.4 series kernels.
This is a semi-automated bugzilla cleanup, report is against an old kernel version. If the problem still happens, please open a new bug. Thanks.