There seems to be an issue with btrfs in Kernel 6 (first seen in at least 6.0.12-300.fc37.x86_64, possibly earlier) and still observed in 6.1 (6.1.5-200.fc37.x86_64). It manifests as prolonged user space hangs. I first noticed it in internal tooling for my employer, where simple IO operations like trivial cp commands that typically take under 100ms (when run with many other concurrent IO operations of a similar ilk) suffer a kind of "scheduling starvation", stalling in disk sleep (D) state for nearly a minute. It also seems to manifest as hangs in KWin that correlate with btrfs usage and with things as routine and trivial (even on high performance systems) as system updates: https://bugs.kde.org/show_bug.cgi?id=463353

I would also not be surprised if this behavior is responsible for reports like this reddit thread: https://www.reddit.com/r/btrfs/comments/102or0t/btrfs_on_luks_whole_system_freezing_under_high_io/
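For anyone trying to confirm they're hitting the same thing, here is a minimal sketch (not from the original investigation, just standard procps usage) for catching tasks stuck in uninterruptible disk sleep while a stall is happening:

  # List tasks in uninterruptible (D) sleep along with the kernel
  # function they are blocked in; run this repeatedly during a stall.
  ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'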
As additional context, here are some sample kernel traces (echo w > /proc/sysrq-trigger):

Dec 16 14:10:02 localhost kernel: sysrq: Show Blocked State
Dec 16 14:10:02 localhost kernel: task:kworker/u64:6 state:D stack: 0 pid:2673507 ppid: 2 flags:0x00004000
Dec 16 14:10:02 localhost kernel: Workqueue: btrfs-worker btrfs_work_helper
Dec 16 14:10:02 localhost kernel: Call Trace:
Dec 16 14:10:02 localhost kernel: <TASK>
Dec 16 14:10:02 localhost kernel: __schedule+0x332/0x12c0
Dec 16 14:10:02 localhost kernel: ? __sbitmap_get_word+0x37/0x80
Dec 16 14:10:02 localhost kernel: schedule+0x5d/0xe0
Dec 16 14:10:02 localhost kernel: io_schedule+0x42/0x70
Dec 16 14:10:02 localhost kernel: blk_mq_get_tag+0x10c/0x290
Dec 16 14:10:02 localhost kernel: ? dequeue_task_stop+0x70/0x70
Dec 16 14:10:02 localhost kernel: __blk_mq_alloc_requests+0x183/0x2b0
Dec 16 14:10:02 localhost kernel: blk_mq_submit_bio+0x2b7/0x5b0
Dec 16 14:10:02 localhost kernel: __submit_bio+0xf5/0x180
Dec 16 14:10:02 localhost kernel: submit_bio_noacct_nocheck+0x20d/0x2a0
Dec 16 14:10:02 localhost kernel: btrfs_submit_bio+0x17a/0x3d0
Dec 16 14:10:02 localhost kernel: btrfs_work_helper+0x14b/0x380
Dec 16 14:10:02 localhost kernel: process_one_work+0x1c7/0x380
Dec 16 14:10:02 localhost kernel: worker_thread+0x4d/0x380
Dec 16 14:10:02 localhost kernel: ? _raw_spin_lock_irqsave+0x23/0x50
Dec 16 14:10:02 localhost kernel: ? rescuer_thread+0x380/0x380
Dec 16 14:10:02 localhost kernel: kthread+0xe9/0x110
Dec 16 14:10:02 localhost kernel: ? kthread_complete_and_exit+0x20/0x20
Dec 16 14:10:02 localhost kernel: ret_from_fork+0x22/0x30
Dec 16 14:10:02 localhost kernel: </TASK>
Dec 16 14:10:02 localhost kernel: task:cp state:D stack: 0 pid:2422766 ppid:2410777 flags:0x00000002
Dec 16 14:10:02 localhost kernel: Call Trace:
Dec 16 14:10:02 localhost kernel: <TASK>
Dec 16 14:10:02 localhost kernel: __schedule+0x332/0x12c0
Dec 16 14:10:02 localhost kernel: ? btrfs_delayed_update_inode+0x102/0x1f0
Dec 16 14:10:02 localhost kernel: schedule+0x5d/0xe0
Dec 16 14:10:02 localhost kernel: io_schedule+0x42/0x70
Dec 16 14:10:02 localhost kernel: folio_wait_bit_common+0x12d/0x3a0
Dec 16 14:10:02 localhost kernel: ? filemap_alloc_folio+0xc0/0xc0
Dec 16 14:10:02 localhost kernel: folio_wait_writeback+0x28/0x80
Dec 16 14:10:02 localhost kernel: __filemap_fdatawait_range+0x7f/0x100
Dec 16 14:10:02 localhost kernel: filemap_fdatawait_range+0xe/0x20
Dec 16 14:10:02 localhost kernel: btrfs_wait_ordered_range+0x7a/0x120
Dec 16 14:10:02 localhost kernel: btrfs_remap_file_range+0x139/0x550
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ovl_copyfile+0x15d/0x180 [overlay]
Dec 16 14:10:02 localhost kernel: ovl_remap_file_range+0x6e/0xa0 [overlay]
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ioctl_file_clone+0x49/0xb0
Dec 16 14:10:02 localhost kernel: do_vfs_ioctl+0x77/0x950
Dec 16 14:10:02 localhost kernel: ? security_file_ioctl+0x3c/0x60
Dec 16 14:10:02 localhost kernel: __x64_sys_ioctl+0x6e/0xd0
Dec 16 14:10:02 localhost kernel: do_syscall_64+0x5b/0x80
Dec 16 14:10:02 localhost kernel: ? syscall_exit_to_user_mode+0x17/0x40
Dec 16 14:10:02 localhost kernel: ? do_syscall_64+0x67/0x80
Dec 16 14:10:02 localhost kernel: ? syscall_exit_to_user_mode+0x17/0x40
Dec 16 14:10:02 localhost kernel: ? do_syscall_64+0x67/0x80
Dec 16 14:10:02 localhost kernel: ? do_user_addr_fault+0x1ef/0x690
Dec 16 14:10:02 localhost kernel: ? do_syscall_64+0x67/0x80
Dec 16 14:10:02 localhost kernel: ? exc_page_fault+0x70/0x170
Dec 16 14:10:02 localhost kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7f07d2d7dbaf
Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffc4ba5bf40 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f07d2d7dbaf
Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
Dec 16 14:10:02 localhost kernel: RBP: 00007ffc4ba5d8cc R08: 0000000000000001 R09: 00000000000001a4
Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc4ba5c3e0
Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffc4ba5c7b0
Dec 16 14:10:02 localhost kernel: </TASK>
Dec 16 14:10:02 localhost kernel: task:cp state:D stack: 0 pid:2422815 ppid:2419287 flags:0x00000002
Dec 16 14:10:02 localhost kernel: Call Trace:
Dec 16 14:10:02 localhost kernel: <TASK>
Dec 16 14:10:02 localhost kernel: __schedule+0x332/0x12c0
Dec 16 14:10:02 localhost kernel: ? btrfs_dirty_inode+0x91/0xe0
Dec 16 14:10:02 localhost kernel: schedule+0x5d/0xe0
Dec 16 14:10:02 localhost kernel: io_schedule+0x42/0x70
Dec 16 14:10:02 localhost kernel: folio_wait_bit_common+0x12d/0x3a0
Dec 16 14:10:02 localhost kernel: ? filemap_alloc_folio+0xc0/0xc0
Dec 16 14:10:02 localhost kernel: folio_wait_writeback+0x28/0x80
Dec 16 14:10:02 localhost kernel: __filemap_fdatawait_range+0x7f/0x100
Dec 16 14:10:02 localhost kernel: filemap_fdatawait_range+0xe/0x20
Dec 16 14:10:02 localhost kernel: btrfs_wait_ordered_range+0x7a/0x120
Dec 16 14:10:02 localhost kernel: btrfs_remap_file_range+0x139/0x550
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ovl_copyfile+0x15d/0x180 [overlay]
Dec 16 14:10:02 localhost kernel: ovl_remap_file_range+0x6e/0xa0 [overlay]
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ioctl_file_clone+0x49/0xb0
Dec 16 14:10:02 localhost kernel: do_vfs_ioctl+0x77/0x950
Dec 16 14:10:02 localhost kernel: __x64_sys_ioctl+0x6e/0xd0
Dec 16 14:10:02 localhost kernel: do_syscall_64+0x5b/0x80
Dec 16 14:10:02 localhost kernel: ? exc_page_fault+0x70/0x170
Dec 16 14:10:02 localhost kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7f1e8dc22baf
Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffd65479d50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f1e8dc22baf
Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
Dec 16 14:10:02 localhost kernel: RBP: 00007ffd6547b8cc R08: 0000000000000001 R09: 00000000000001a4
Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd6547a1f0
Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffd6547a5c0
Dec 16 14:10:02 localhost kernel: </TASK>
Dec 16 14:10:02 localhost kernel: task:cp state:D stack: 0 pid:2422818 ppid:2421995 flags:0x00000002
Dec 16 14:10:02 localhost kernel: Call Trace:
Dec 16 14:10:02 localhost kernel: <TASK>
Dec 16 14:10:02 localhost kernel: __schedule+0x332/0x12c0
Dec 16 14:10:02 localhost kernel: schedule+0x5d/0xe0
Dec 16 14:10:02 localhost kernel: io_schedule+0x42/0x70
Dec 16 14:10:02 localhost kernel: folio_wait_bit_common+0x12d/0x3a0
Dec 16 14:10:02 localhost kernel: ? filemap_alloc_folio+0xc0/0xc0
Dec 16 14:10:02 localhost kernel: folio_wait_writeback+0x28/0x80
Dec 16 14:10:02 localhost kernel: __filemap_fdatawait_range+0x7f/0x100
Dec 16 14:10:02 localhost kernel: filemap_fdatawait_range+0xe/0x20
Dec 16 14:10:02 localhost kernel: btrfs_wait_ordered_range+0x7a/0x120
Dec 16 14:10:02 localhost kernel: btrfs_remap_file_range+0x139/0x550
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ovl_copyfile+0x15d/0x180 [overlay]
Dec 16 14:10:02 localhost kernel: ovl_remap_file_range+0x6e/0xa0 [overlay]
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ioctl_file_clone+0x49/0xb0
Dec 16 14:10:02 localhost kernel: do_vfs_ioctl+0x77/0x950
Dec 16 14:10:02 localhost kernel: __x64_sys_ioctl+0x6e/0xd0
Dec 16 14:10:02 localhost kernel: do_syscall_64+0x5b/0x80
Dec 16 14:10:02 localhost kernel: ? syscall_exit_to_user_mode+0x17/0x40
Dec 16 14:10:02 localhost kernel: ? do_syscall_64+0x67/0x80
Dec 16 14:10:02 localhost kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7ffa2a678baf
Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffe77c03de0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007ffa2a678baf
Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
Dec 16 14:10:02 localhost kernel: RBP: 00007ffe77c058cc R08: 0000000000000001 R09: 00000000000001a4
Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe77c04280
Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffe77c04650
Dec 16 14:10:02 localhost kernel: </TASK>
Dec 16 14:10:02 localhost kernel: task:cp state:D stack: 0 pid:2422821 ppid:2420167 flags:0x00000002
Dec 16 14:10:02 localhost kernel: Call Trace:
Dec 16 14:10:02 localhost kernel: <TASK>
Dec 16 14:10:02 localhost kernel: __schedule+0x332/0x12c0
Dec 16 14:10:02 localhost kernel: ? _raw_spin_unlock_irqrestore+0x23/0x40
Dec 16 14:10:02 localhost kernel: schedule+0x5d/0xe0
Dec 16 14:10:02 localhost kernel: io_schedule+0x42/0x70
Dec 16 14:10:02 localhost kernel: folio_wait_bit_common+0x12d/0x3a0
Dec 16 14:10:02 localhost kernel: ? filemap_alloc_folio+0xc0/0xc0
Dec 16 14:10:02 localhost kernel: folio_wait_writeback+0x28/0x80
Dec 16 14:10:02 localhost kernel: __filemap_fdatawait_range+0x7f/0x100
Dec 16 14:10:02 localhost kernel: filemap_fdatawait_range+0xe/0x20
Dec 16 14:10:02 localhost kernel: btrfs_wait_ordered_range+0x7a/0x120
Dec 16 14:10:02 localhost kernel: btrfs_remap_file_range+0x139/0x550
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ovl_copyfile+0x15d/0x180 [overlay]
Dec 16 14:10:02 localhost kernel: ovl_remap_file_range+0x6e/0xa0 [overlay]
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ioctl_file_clone+0x49/0xb0
Dec 16 14:10:02 localhost kernel: do_vfs_ioctl+0x77/0x950
Dec 16 14:10:02 localhost kernel: ? security_file_ioctl+0x3c/0x60
Dec 16 14:10:02 localhost kernel: __x64_sys_ioctl+0x6e/0xd0
Dec 16 14:10:02 localhost kernel: do_syscall_64+0x5b/0x80
Dec 16 14:10:02 localhost kernel: ? do_syscall_64+0x67/0x80
Dec 16 14:10:02 localhost kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7fafe34febaf
Dec 16 14:10:02 localhost kernel: RSP: 002b:00007fff79bee930 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fafe34febaf
Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
Dec 16 14:10:02 localhost kernel: RBP: 00007fff79bef8f6 R08: 0000000000000001 R09: 00000000000001a4
Dec 16 14:10:02 localhost kernel: R10: 0000000000000004 R11: 0000000000000246 R12: 00007fff79beedd0
Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000002 R15: 00007fff79bef1a0
Dec 16 14:10:02 localhost kernel: </TASK>
Dec 16 14:10:02 localhost kernel: task:cp state:D stack: 0 pid:2422911 ppid:2401445 flags:0x00000002
Dec 16 14:10:02 localhost kernel: Call Trace:
Dec 16 14:10:02 localhost kernel: <TASK>
Dec 16 14:10:02 localhost kernel: __schedule+0x332/0x12c0
Dec 16 14:10:02 localhost kernel: ? __btrfs_end_transaction+0xf8/0x240
Dec 16 14:10:02 localhost kernel: schedule+0x5d/0xe0
Dec 16 14:10:02 localhost kernel: io_schedule+0x42/0x70
Dec 16 14:10:02 localhost kernel: folio_wait_bit_common+0x12d/0x3a0
Dec 16 14:10:02 localhost kernel: ? filemap_alloc_folio+0xc0/0xc0
Dec 16 14:10:02 localhost kernel: folio_wait_writeback+0x28/0x80
Dec 16 14:10:02 localhost kernel: __filemap_fdatawait_range+0x7f/0x100
Dec 16 14:10:02 localhost kernel: filemap_fdatawait_range+0xe/0x20
Dec 16 14:10:02 localhost kernel: btrfs_wait_ordered_range+0x7a/0x120
Dec 16 14:10:02 localhost kernel: btrfs_remap_file_range+0x139/0x550
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ovl_copyfile+0x15d/0x180 [overlay]
Dec 16 14:10:02 localhost kernel: ovl_remap_file_range+0x6e/0xa0 [overlay]
Dec 16 14:10:02 localhost kernel: do_clone_file_range+0xe9/0x230
Dec 16 14:10:02 localhost kernel: vfs_clone_file_range+0x4d/0x140
Dec 16 14:10:02 localhost kernel: ioctl_file_clone+0x49/0xb0
Dec 16 14:10:02 localhost kernel: do_vfs_ioctl+0x77/0x950
Dec 16 14:10:02 localhost kernel: __x64_sys_ioctl+0x6e/0xd0
Dec 16 14:10:02 localhost kernel: do_syscall_64+0x5b/0x80
Dec 16 14:10:02 localhost kernel: ? exc_page_fault+0x70/0x170
Dec 16 14:10:02 localhost kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Dec 16 14:10:02 localhost kernel: RIP: 0033:0x7ff280aeebaf
Dec 16 14:10:02 localhost kernel: RSP: 002b:00007ffdbfa595c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 16 14:10:02 localhost kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007ff280aeebaf
Dec 16 14:10:02 localhost kernel: RDX: 0000000000000003 RSI: 0000000040049409 RDI: 0000000000000004
Dec 16 14:10:02 localhost kernel: RBP: 00007ffdbfa5a8cc R08: 0000000000000001 R09: 00000000000001a4
Dec 16 14:10:02 localhost kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdbfa59a60
Dec 16 14:10:02 localhost kernel: R13: 0000000000000001 R14: 0000000000000014 R15: 00007ffdbfa59e30
Dec 16 14:10:02 localhost kernel: </TASK>
(In reply to Wyatt Childers from comment #0)
> There seems to be an issue with btrfs in Kernel 6 (first seen in at least
> 6.0.12-300.fc37.x86_64, possibly earlier) and still observed in 6.1
> (6.1.5-200.fc37.x86_64).

Does that mean 5.19.y worked fine, or did the early 6.0.y versions work fine, too?
> Does that mean 5.19.y worked fine, or did the early 6.0.y versions work
> fine, too?

I unfortunately don't have that information. There was a delay between moving to Kernel 6.0 and the investigation that led to discovering this issue. I'm quite confident the problem was not present in the 5.19.y series, though that's based on memory, not hard data and testing.
This matches my experience on Debian with the kernel from their unstable repository (6.1.4-1). I rolled back to the current stable kernel from Bullseye (5.10.0-20-amd64, which according to the changelog is based on 5.10.158-2) about a week ago after finding this bug, and the long IO stalls have disappeared. It looks like Debian has 6.0 in backports, so I'll give it a try and report back.
From my testing, this seems to have regressed between 5.19 and 6.0: on 6.0 I've seen mkdir take up to 40 seconds, while on 5.19 it doesn't seem to ever go above half a second. My setup is raid6 for data and raid1c3 for metadata. In my case the stalls seem to happen more frequently after a large number of deletions, and btrfs-transaction appears to be very busy according to iotop while everything else just stalls.
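(Not the exact methodology behind the numbers above, just a minimal sketch for measuring mkdir latency under load; /mnt/test is a placeholder path, and GNU time is assumed to be installed at /usr/bin/time:)

  # Create 100 directories and print the wall-clock time for each;
  # run while the filesystem is busy to catch multi-second stalls.
  for i in $(seq 1 100); do
      /usr/bin/time -f "mkdir d$i: %es" mkdir /mnt/test/d$i
  done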
Can I get the output from the following commands?

  btrfs filesystem show <mountpoint>
  lsblk -o name,sched,mountpoints
For the affected fs, filtering out unrelated filesystems:

root@zeus:~# btrfs filesystem show /gigante/
Label: 'gigante'  uuid: a767e5e1-ee9e-4769-ac9b-c14fdc5b7502
        Total devices 13 FS bytes used 104.61TiB
        devid    1 size 16.37TiB used 13.23TiB path /dev/sdp
        devid    2 size 16.37TiB used 13.23TiB path /dev/sdq
        devid    3 size 16.37TiB used 13.24TiB path /dev/sda
        devid    4 size 16.37TiB used 13.20TiB path /dev/sdf
        devid    5 size 9.10TiB used 9.09TiB path /dev/sdn
        devid    6 size 9.10TiB used 9.09TiB path /dev/sdk
        devid    7 size 9.10TiB used 9.09TiB path /dev/sdo
        devid    8 size 9.10TiB used 9.09TiB path /dev/sdj
        devid    9 size 9.10TiB used 9.09TiB path /dev/sdl
        devid   10 size 9.10TiB used 9.09TiB path /dev/sdd
        devid   11 size 9.10TiB used 9.09TiB path /dev/sdm
        devid   12 size 9.10TiB used 9.09TiB path /dev/sdb
        devid   13 size 9.10TiB used 9.09TiB path /dev/sdi

root@zeus:~# lsblk -o name,sched,mountpoints | egrep -i "sdp|sdq|sda|sdf|sdn|sdk|sdo|sdj|sdl|sdd|sdm|sdb|sdi"
sda  bfq
sdb  bfq
sdd  bfq
sdf  bfq
sdi  bfq
sdj  bfq
sdk  bfq
sdl  bfq
sdm  bfq
sdn  bfq
sdo  bfq
sdp  bfq  /gigante
sdq  bfq
root@zeus:~#
OK, this matches other reports I've seen. Please change the IO scheduler to mq-deadline and see if the problem goes away.
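For reference, a minimal sketch of switching the scheduler at runtime through sysfs (sdX is a placeholder for each device backing the affected filesystem; the change does not persist across reboots):

  # Show the available schedulers; the active one is in brackets.
  cat /sys/block/sdX/queue/scheduler
  # Switch this device to mq-deadline; takes effect immediately.
  echo mq-deadline > /sys/block/sdX/queue/scheduler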
$> btrfs filesystem show /
Label: none  uuid: d9ab45ee-2acf-40eb-961a-4098ef36aa4e
        Total devices 1 FS bytes used 396.21GiB
        devid    1 size 476.85GiB used 440.02GiB path /dev/sdc2

$> btrfs filesystem show /a
Label: none  uuid: 1015de5d-d62d-4719-8dc0-858126326045
        Total devices 1 FS bytes used 199.28GiB
        devid    1 size 307.40GiB used 202.02GiB path /dev/sdb2

$> btrfs filesystem show /b
Label: none  uuid: cc9a3fc6-0f75-4966-969e-2678c04ada0d
        Total devices 1 FS bytes used 147.55GiB
        devid    1 size 472.53GiB used 150.01GiB path /dev/sda2

$> btrfs filesystem show /c
Label: none  uuid: e341a7af-43ee-4c55-81bb-79fe2604658d
        Total devices 1 FS bytes used 94.13GiB
        devid    1 size 624.11GiB used 353.01GiB path /dev/sdb1

----

NAME   SCHED MOUNTPOINTS
sda    bfq
├─sda1 bfq   /d
└─sda2 bfq   /b
sdb    bfq
├─sdb1 bfq   /c
└─sdb2 bfq   /a
sdc    bfq
├─sdc1 bfq   /boot/efi
└─sdc2 bfq   /home /
sr0    bfq
zram0        [SWAP]

Following some testing of different IO schedulers, I can confirm the issue goes away when leaving bfq. For others that might find themselves here: none and kyber seem to be the best options for my SSDs. None gives me the best workload throughput, but I'm going to go with kyber for the moment; it seems only marginally worse, and Pop!_OS has chosen it as their default to reduce latency. I'm going to assume the Pop!_OS folks know what they're doing a bit more than I do, and switch to that, at least for now.

I'll additionally note that I did notice music stuttering on kyber, noop, and mq-deadline. I don't remember experiencing that with bfq, before or after this regression.
For what it's worth, Fedora defaults to mq-deadline for everything except drives that use sdX notation; there's currently a udev rule override that sets bfq for those by default[1]. There are two options here:

a. Request that Fedora drop the override. There's a bug report for it already[2].
b. Work with the maintainer of bfq to fix the problems we're seeing here.

I'd obviously prefer to see the problems fixed, but barring that, it'd be worth opening a discussion downstream to drop the bfq default for sdX drives.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1738828
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1851783
Err, here's the actual udev rule override patch in place: https://src.fedoraproject.org/rpms/systemd/blob/rawhide/f/use-bfq-scheduler.patch
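For anyone who wants to override that default locally in the meantime, here's a sketch of a udev rule; the file name is arbitrary and the match is an assumption modeled on rules like the patch above, so adjust it for your devices:

  # /etc/udev/rules.d/60-io-scheduler.rules (hypothetical local override)
  # Force mq-deadline instead of the distribution's bfq default for sdX disks.
  ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="mq-deadline"

Then run "udevadm control --reload && udevadm trigger" (or reboot) for it to take effect.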
So what's the way forward here? Changing the IO scheduler is a nice workaround, but from the reports it sounds like this didn't happen with earlier kernels -- and thus is a regression that should be fixed. (Who actually needs to fix it obviously depends on which change caused it, but nobody has bisected this yet, right?) Or am I missing something?
The problem isn't in btrfs, it's in bfq. So from a perf/regression perspective, this bug belongs to bfq. From the downstream Fedora perspective, if nothing gets fixed, it's worth having that conversation again about the bfq enablement patch.
(In reply to Neal Gompa from comment #13)
> The problem isn't in btrfs, it's in bfq. So from a perf/regression
> perspective, this bug belongs to bfq.

That's what I suspected when skimming this (sorry, I can't follow each regression report closely). So: did anyone tell the bfq developers about this? Or, ideally, bisect it?
Hmm, this looks stalled.

(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #14)
> Did anyone tell the bfq developers about this?

I did that a bit later through the mailing list:
https://lore.kernel.org/lkml/d7fe697d-09eb-3a59-6cb3-8eec480d1407@leemhuis.info/
No reply; from a quick lore search it seems Paolo is a bit MIA currently. Happens.

> Or ideally bisect this?

I guess I have to assume silence means "no". Does anyone still care? If yes: please check whether the latest mainline is still affected, and ideally confirm that early 6.0 versions or 5.19 work fine, to ensure this is really a regression.
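In case someone does pick this up, a minimal sketch of the usual kernel bisect workflow between the suspected good and bad releases (the build/boot/test step is a placeholder for your own reproducer):

  git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  cd linux
  git bisect start
  git bisect bad v6.0     # first release where the stalls were observed
  git bisect good v5.19   # last release believed to be fine
  # At each step: build and boot this kernel, run the IO-stall reproducer,
  # then mark the result until git names the offending commit:
  git bisect good         # or: git bisect bad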
I just wanted to note that even with things switched over to kyber, I'm still seeing the occasional prolonged stall... I'm not entirely sure that a change made to btrfs didn't play a significant role in this.