Bug 74761 - raid6, different devices with smaller ones full - kworker runs at 100% for 10+ minutes on writes
Summary: raid6, different devices with smaller ones full - kworker runs at 100% for 10...
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-24 17:13 UTC by kernel
Modified: 2016-03-20 11:27 UTC (History)
3 users (show)

See Also:
Kernel Version: 3.15.0-rc2
Tree: Mainline
Regression: No


Attachments
Attempt at reproducer (889 bytes, application/x-shellscript)
2015-05-28 21:26 UTC, Omar Sandoval
Details

Description kernel 2014-04-24 17:13:39 UTC
Here's the state of the filesystem at the time:

        Total devices 4 FS bytes used 924.64GiB
        devid    1 size 1.82TiB used 474.00GiB path /dev/sdd
        devid    2 size 465.76GiB used 465.76GiB path /dev/sda
        devid    3 size 465.76GiB used 465.76GiB path /dev/sdb
        devid    4 size 465.76GiB used 465.73GiB path /dev/sdc

Data, RAID6: total=924.00GiB, used=923.42GiB
System, RAID1: total=32.00MiB, used=208.00KiB
Metadata, RAID1: total=1.70GiB, used=1.28GiB
Metadata, DUP: total=384.00MiB, used=252.13MiB
unknown, single: total=512.00MiB, used=0.00

Attempting to write, say, 100MiB in this case results in a kworker process
pegged at 100% CPU for a very long time (10s of minutes), at which point the writes proceed for some time, and the process repeats until the eventual
"No space left on device" error. 

This is on an armv7l system, in case it matters.

Here's a stack trace I grabbed via SysRq at the time:

Exception stack(0xab4699c8 to 0xab469a10)
99c0:                   aec7c870 00000000 00000000 aec7c841 08000000
aec7c870
99e0: ab469ad0 bd51e880 00003000 00000000 0006c000 00000000 00000005
ab469a10
9a00: 80299c8c 80310098 200e0013 ffffffff
[<80011e80>] (__irq_svc) from [<80310098>] (rb_next+0x14/0x5c)
[<80310098>] (rb_next) from [<80299c8c>]
(btrfs_find_space_for_alloc+0x138/0x344)
[<80299c8c>] (btrfs_find_space_for_alloc) from [<80240020>]
(find_free_extent+0x378/0xabc)
[<80240020>] (find_free_extent) from [<80240840>]
(btrfs_reserve_extent+0xdc/0x164)
[<80240840>] (btrfs_reserve_extent) from [<8025aef4>]
(cow_file_range+0x17c/0x5bc)
[<8025aef4>] (cow_file_range) from [<8025c1e0>]
(run_delalloc_range+0x34c/0x380)
[<8025c1e0>] (run_delalloc_range) from [<80274d6c>]
(__extent_writepage+0x708/0x940)
[<80274d6c>] (__extent_writepage) from [<802754b4>]
(extent_writepages+0x238/0x368)
[<802754b4>] (extent_writepages) from [<8009b190>] (do_writepages+0x24/0x38)
[<8009b190>] (do_writepages) from [<800ef59c>]
(__writeback_single_inode+0x28/0x110)
[<800ef59c>] (__writeback_single_inode) from [<800f04c8>]
(writeback_sb_inodes+0x184/0x38c)
[<800f04c8>] (writeback_sb_inodes) from [<800f0740>]
(__writeback_inodes_wb+0x70/0xac)
[<800f0740>] (__writeback_inodes_wb) from [<800f0978>]
(wb_writeback+0x1fc/0x20c)
[<800f0978>] (wb_writeback) from [<800f0b78>]
(bdi_writeback_workfn+0x144/0x338)
[<800f0b78>] (bdi_writeback_workfn) from [<80037cfc>]
(process_one_work+0x110/0x368)
[<80037cfc>] (process_one_work) from [<800383c8>]
(worker_thread+0x138/0x3e8)
[<800383c8>] (worker_thread) from [<8003de90>] (kthread+0xcc/0xe8)
[<8003de90>] (kthread) from [<8000e238>] (ret_from_fork+0x14/0x3c)
Comment 1 Vladislav Yarmak 2015-05-22 17:50:58 UTC
I can confirm this bug on x86_64. Backtrace is below:

[112168.302885] CPU: 5 PID: 16044 Comm: kworker/u49:6 Not tainted 3.16.0-0.bpo.4-amd64 #1 Debian 3.16.7-ckt9-3~deb8u1~bpo70+1
[112168.302887] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0b 05/27/2014
[112168.302892] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[112168.302894] task: ffff88085e452a20 ti: ffff8801443ac000 task.ti: ffff8801443ac000
[112168.302896] RIP: 0010:[<ffffffff812dc04c>]  [<ffffffff812dc04c>] find_next_bit+0x9c/0xd0
[112168.302901] RSP: 0018:ffff8801443af5c0  EFLAGS: 00000283
[112168.302902] RAX: fffffffffffffffe RBX: ffff8804048ecae8 RCX: 0000000000000001
[112168.302904] RDX: 0000000000000001 RSI: 0000000000008000 RDI: 0000000000001680
[112168.302905] RBP: ffff8801e81edea4 R08: ffff88051fe63d30 R09: 0000000000006980
[112168.302907] R10: ffff8802f3d57950 R11: 0000000000000000 R12: ffff8801443af638
[112168.302908] R13: 0000000000006920 R14: 000000000000006a R15: 000000000000011d
[112168.302910] FS:  0000000000000000(0000) GS:ffff88087fca0000(0000) knlGS:0000000000000000
[112168.302912] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[112168.302913] CR2: 00000000021a5000 CR3: 0000000001812000 CR4: 00000000001407e0
[112168.302915] Stack:
[112168.302916]  ffffffffa02f92a0 ffff8801443af640 ffff8804048ecad8 ffff88010bbf4b80
[112168.302919]  ffff8804048ecac0 000000000011d000 ffff8801e81ede80 ffff8801443af730
[112168.302921]  00000000000f0000 ffff8801e81edea4 ffffffffa02fc1b6 0000000000012ec0
[112168.302924] Call Trace:
[112168.302953]  [<ffffffffa02f92a0>] ? search_bitmap.isra.24+0x90/0x110 [btrfs]
[112168.302963]  [<ffffffffa02fc1b6>] ? btrfs_find_space_for_alloc+0x126/0x270 [btrfs]
[112168.302972]  [<ffffffffa02a17a0>] ? find_free_extent+0x3b0/0xd20 [btrfs]
[112168.302982]  [<ffffffffa02a21d0>] ? btrfs_reserve_extent+0xb0/0x140 [btrfs]
[112168.303000]  [<ffffffffa02bd696>] ? cow_file_range+0x136/0x420 [btrfs]
[112168.303017]  [<ffffffffa02be765>] ? run_delalloc_range+0x315/0x350 [btrfs]
[112168.303036]  [<ffffffffa02d4a2e>] ? writepage_delalloc.isra.34+0xfe/0x170 [btrfs]
[112168.303053]  [<ffffffffa02d6f26>] ? __extent_writepage+0xf6/0x2d0 [btrfs]
[112168.303065]  [<ffffffff8114da0f>] ? find_get_pages_tag+0xcf/0x180
[112168.303074]  [<ffffffffa02d736e>] ? extent_write_cache_pages.isra.26.constprop.41+0x26e/0x3c0 [btrfs]
[112168.303084]  [<ffffffffa02d7991>] ? extent_writepages+0x51/0x70 [btrfs]
[112168.303094]  [<ffffffffa02bb650>] ? btrfs_submit_direct+0x6a0/0x6a0 [btrfs]
[112168.303097]  [<ffffffff811e4b52>] ? __writeback_single_inode+0x42/0x280
[112168.303100]  [<ffffffff811d80c0>] ? inode_add_lru+0x10/0x60
[112168.303103]  [<ffffffff811e5f93>] ? writeback_sb_inodes+0x213/0x460
[112168.303106]  [<ffffffff810b5ae0>] ? down_write_trylock+0x40/0x40
[112168.303109]  [<ffffffff811e627e>] ? __writeback_inodes_wb+0x9e/0xd0
[112168.303111]  [<ffffffff811e653b>] ? wb_writeback+0x28b/0x340
[112168.303114]  [<ffffffff811e668f>] ? wb_check_old_data_flush+0x9f/0xb0
[112168.303117]  [<ffffffff811e8d6a>] ? bdi_writeback_workfn+0x1ea/0x420
[112168.303120]  [<ffffffff810887dc>] ? process_one_work+0x15c/0x450
[112168.303122]  [<ffffffff81089ce2>] ? worker_thread+0x112/0x510
[112168.303125]  [<ffffffff81089bd0>] ? create_and_start_worker+0x60/0x60
[112168.303128]  [<ffffffff81090681>] ? kthread+0xc1/0xe0
[112168.303131]  [<ffffffff810905c0>] ? flush_kthread_worker+0xb0/0xb0
Comment 2 Omar Sandoval 2015-05-28 21:26:10 UTC
Created attachment 178201 [details]
Attempt at reproducer

I can't reproduce this on 4.1-rc5 with the attached script. Do you see this in more recent kernel versions?
Comment 3 Vladislav Yarmak 2015-05-28 22:12:42 UTC
Bug seen in 4.0.4 (current latest stable kernel from kernel.org).
Comment 4 Vladislav Yarmak 2015-05-28 22:16:23 UTC
Bug appears under concurrent write load, mostly.

Note You need to log in before you can comment on or make changes to this bug.