Hi. I've observed a suspiciously demanding kworker/u$i:$j+events_unbound. On closer inspection, it spends most of its time in btrfs_preempt_reclaim_metadata_space:

> [10481.108813] CPU: 2 PID: 22662 Comm: kworker/u8:4 Kdump: loaded Not tainted 5.12.0-rc2-1.gd9388fc-default #1 openSUSE Tumbleweed (unreleased)
> [10481.108816] Workqueue: events_unbound btrfs_preempt_reclaim_metadata_space [btrfs]
> [10481.108818] RIP: 0010:_find_next_bit.constprop.0+0x3/0x80
> [10481.108818] Code: ff ff 48 83 f8 ff 74 12 48 23 14 c7 74 e9 48 c1 e0 06 48 0f bd d2 4c 8d 04 10 4c 89 c0 c3 66 0f 1f 84 00 00 00 00 00 49 89 d1 <48> 39 d1 73 51 48 89 ca 48 c1 ea 06 48 8b 04 d7 48 85 f6 74 04 48
> [10481.108819] RSP: 0018:ffffb92f0380fe08 EFLAGS: 00000046
> [10481.108820] RAX: 0000000000117000 RBX: ffff99d2438f6000 RCX: 0000000000000002
> [10481.108820] RDX: 0000000000000200 RSI: 0000000000000000 RDI: ffffffffb2283aa0
> [10481.108821] RBP: 0000000000000206 R08: 0000000000000000 R09: 0000000000000200
> [10481.108821] R10: 0000000000000004 R11: 0000000000000013 R12: fffffffffffe1000
> [10481.108822] R13: ffff99d24dfef630 R14: 0000000000000000 R15: ffff99d24dfefc50
> [10481.108822] FS: 0000000000000000(0000) GS:ffff99d356500000(0000) knlGS:0000000000000000
> [10481.108823] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [10481.108823] CR2: 00007f098c360008 CR3: 0000000215210003 CR4: 00000000003706e0
> [10481.108824] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [10481.108825] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [10481.108825] Call Trace:
> [10481.108826]  cpumask_next+0x16/0x20
> [10481.108826]  __percpu_counter_sum+0x3d/0x60
> [10481.108827]  need_preemptive_reclaim.part.0+0x88/0x110 [btrfs]
> [10481.108828]  btrfs_preempt_reclaim_metadata_space+0xbc/0x210 [btrfs]
> [10481.108828]  process_one_work+0x1df/0x370
> [10481.108829]  worker_thread+0x50/0x400
> [10481.108830]  ? process_one_work+0x370/0x370
> [10481.108830]  kthread+0x11b/0x140
> [10481.108831]  ? __kthread_bind_mask+0x60/0x60
> [10481.108831]  ret_from_fork+0x22/0x30

The filesystem is reasonably full:

> Data, single: total=147.94GiB, used=132.44GiB
> System, single: total=32.00MiB, used=16.00KiB
> Metadata, single: total=2.00GiB, used=1.14GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B

I also checked with the `btrfs/btrfs_flush_space` tracepoint; an illustrative line (repeated a great many times) is:

> kworker/u8:12-28612 [002] .... 22227.461500: btrfs_flush_space: dc1d7f42-5155-4095-a514-751950f8accd: state=3(FLUSH_DELAYED_REFS_NR) flags=4(METADATA) num_bytes=851968 ret=0 for_preempt=1

It's not permanently spinning, but the overall (additional) load from the kworker spending time in btrfs_preempt_reclaim_metadata_space is around 1. I don't believe this much CPU work should be necessary just to maintain some free space, so I consider this a bug. (I didn't observe this with v5.11.)
Can I see the full btrfs fi usage so I have an idea of how full the fs is? I'm trying to reproduce the problem, but having trouble generating a workload that trips the preemptive reclaim at all. There's definitely some low hanging fruit, but I want to make sure I'm reproducing the problem instead of just throwing patches at it and hoping for the best.
Thanks for looking into this.

> # btrfs fi usage /vol/
> Overall:
>     Device size: 149.97GiB
>     Device allocated: 149.97GiB
>     Device unallocated: 1.00MiB
>     Device missing: 0.00B
>     Used: 133.32GiB
>     Free (estimated): 15.81GiB (min: 15.81GiB)
>     Free (statfs, df): 15.81GiB
>     Data ratio: 1.00
>     Metadata ratio: 1.00
>     Global reserve: 512.00MiB (used: 0.00B)
>     Multiple profiles: no
>
> Data,single: Size:147.94GiB, Used:132.12GiB (89.31%)
>    /dev/mapper/vol 147.94GiB
>
> Metadata,single: Size:2.00GiB, Used:1.20GiB (59.86%)
>    /dev/mapper/vol 2.00GiB
>
> System,single: Size:32.00MiB, Used:16.00KiB (0.05%)
>    /dev/mapper/vol 32.00MiB
>
> Unallocated:
>    /dev/mapper/vol 1.00MiB
I have a branch that has my current work around this problem, please check it and see if it resolves your problem. https://github.com/josefbacik/linux/tree/preemptsum
Thanks. I should get down to that in the next week. BTW, should I be cautious when testing it on my production FS?
That's up to you, I only ever test in production ;). That being said, these just change how we trigger preemptive flushing; at worst it'll make things slower, they shouldn't break anything.
So I compared a 5.12.0-rc8 that I had at hand against v5.12 + the following patches:

cherry-picked 2cdb3909c9e9 for applicability
- 0001-btrfs-use-percpu_read_positive-instead-of-sum_positi.patch

from the branch torvalds/master..josefbacik/preemptsum (9ccce092fc64..1a07c6b9cc88)
- 0001-btrfs-correct-try_lock_extent-usage-in-read_extent_b.patch
- 0002-btrfs-check-worker-before-need_preemptive_reclaim.patch
- 0003-btrfs-only-clamp-the-first-time-we-have-to-start-flu.patch
- 0004-btrfs-take-into-account-global-rsv-in-need_preemptiv.patch
- 0005-btrfs-use-the-global-rsv-size-in-the-preemptive-thre.patch
- 0006-btrfs-don-t-include-the-global-rsv-size-in-the-preem.patch
- 0007-btrfs-only-ignore-delalloc-if-delalloc-is-much-small.patch
- 0008-btrfs-handle-preemptive-delalloc-flushing-slightly-d.patch

I measured the sys time spent by any events_unbound kworker (during an ordinary run of the system):

kernel                        real time (s)  sys time (s)  ratio
5.12.0-rc8-1.ga71cb9a         16646          9797.83       0.588587
5.12.1-202105031446.gdc37b65  12622          55.46         0.00439391

So from the perspective of unjustified CPU consumption, the patchset resolves the issue for me.
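For reference, one way to approximate that kind of measurement is a sketch like the following. This is only my sketch, not necessarily the method used above; it only sees kworkers that currently exist and it assumes the usual CLK_TCK of 100:

# Sum kernel-mode CPU time (stime, field 15 of /proc/<pid>/stat) of all
# events_unbound kworkers currently alive; clock ticks are divided by CLK_TCK (100).
awk '$2 ~ /events_unbound/ { ticks += $15 } END { printf "sys time: %.2f s\n", ticks / 100 }' /proc/[0-9]*/stat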
(In reply to Michal Koutný from comment #6)
> So from the perspective of unjustified CPU consumption, the patchset
> resolves the issue for me.

Hm, I may not have stressed it properly yesterday. It looks like switching git branches on the FS triggers the preemptive reclaim excessively. With the same metric, I'm seeing today:

5.12.1-202105031446.gdc37b65  1587  638.64  0.402219

(The kworker consumes _minutes_ of CPU time after the switch but seems to settle eventually (unless I switch again).)

The recent state of the FS is:

> Overall:
>     Device size: 149.97GiB
>     Device allocated: 149.97GiB
>     Device unallocated: 1.00MiB
>     Device missing: 0.00B
>     Used: 141.97GiB
>     Free (estimated): 7.22GiB (min: 7.22GiB)
>     Free (statfs, df): 7.22GiB
>     Data ratio: 1.00
>     Metadata ratio: 1.00
>     Global reserve: 512.00MiB (used: 0.00B)
>     Multiple profiles: no
>
> Data,single: Size:147.94GiB, Used:140.72GiB (95.12%)
>    /dev/mapper/vol 147.94GiB
>
> Metadata,single: Size:2.00GiB, Used:1.25GiB (62.63%)
>    /dev/mapper/vol 2.00GiB
>
> System,single: Size:32.00MiB, Used:16.00KiB (0.05%)
>    /dev/mapper/vol 32.00MiB
>
> Unallocated:
>    /dev/mapper/vol 1.00MiB
Looks like I've been seeing this for the past week or so on Arch Linux with Linux 5.12.4. Are there any workarounds besides applying the changes in https://github.com/josefbacik/linux/tree/preemptsum?
I could mention my usage info:

> $ sudo btrfs filesystem usage /
> Overall:
>     Device size: 45.41GiB
>     Device allocated: 45.41GiB
>     Device unallocated: 1.00MiB
>     Device missing: 0.00B
>     Used: 42.24GiB
>     Free (estimated): 2.74GiB (min: 2.74GiB)
>     Free (statfs, df): 2.74GiB
>     Data ratio: 1.00
>     Metadata ratio: 1.00
>     Global reserve: 197.08MiB (used: 0.00B)
>     Multiple profiles: no
>
> Data,single: Size:43.37GiB, Used:40.62GiB (93.67%)
>    /dev/nvme0n1p7 43.37GiB
>
> Metadata,single: Size:2.01GiB, Used:1.62GiB (80.49%)
>    /dev/nvme0n1p7 2.01GiB
>
> System,single: Size:32.00MiB, Used:16.00KiB (0.05%)
>    /dev/nvme0n1p7 32.00MiB
>
> Unallocated:
>    /dev/nvme0n1p7 1.00MiB

Should I expect this problem to disappear if I delete some files to reach some magic free-space threshold?
I'm seeing the same with kernel 5.12.5-200.fc33.x86_64. With RAID1, currently one of my CPUs is working at least half of the time, and the laptop fan knows no peace:

/dev/sda1  40G  37G  2.1G  95%  /home

When I was at something around 5GiB free, the worker thread didn't trigger noticeably. I haven't observed any particular event that leads to the bouts of CPU onslaught, but on the other hand I have several instances of Firefox idling at all times.
Quick update: it seems like a btrfs balance made the CPU symptoms go away for the time being.
Also on my system (Arch Linux, kernel >= 5.12.9), on a reasonably full btrfs partition (~90%), a kworker process indefinitely consumes 100% CPU. perf report reveals btrfs_preempt_reclaim_metadata_space as the culprit for me too. Downgrading to kernel 5.11.x fixes the issue (<= 5.12.8 might have been okay too; unfortunately I cannot confirm this). Same goes for a full btrfs rebalance: CPU usage goes back to normal.
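For anyone wanting to do the same check, a minimal sketch of that kind of perf run (the PID is a placeholder for the busy kworker, and the 10 second window is arbitrary):

# Sample the spinning kworker with call graphs for 10 seconds, then inspect the report.
sudo perf record -g -p <kworker_pid> -- sleep 10
sudo perf report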
I started to suffer from this bug on the latest 5.12.x releases using Arch Linux, probably because my root partition had low free space at that time, around 2 GiB. I can confirm that the 5.13.4 kernel is still affected by this bug.

The solution was freeing some space by moving a couple of big files to another partition. This is my root partition now:

sudo btrfs filesystem usage /
Overall:
    Device size: 60.00GiB
    Device allocated: 57.97GiB
    Device unallocated: 2.03GiB
    Device missing: 0.00B
    Used: 53.45GiB
    Free (estimated): 5.76GiB (min: 5.76GiB)
    Free (statfs, df): 5.75GiB
    Data ratio: 1.00
    Metadata ratio: 1.00
    Global reserve: 512.00MiB (used: 0.00B)
    Multiple profiles: no
I can confirm that kernel 5.13.4 suffers from this bug. I'm on Fedora 34 and the specific kernel version is 5.13.4-200.fc34.x86_64. I had 7GiB free on the root, which caused the CPU workload, so I had to remove some files to stop the CPU and fan usage. Now I have 20GiB free and the CPU and fan usage is mitigated. If you need more information, I'll be glad to provide it.
(In reply to agapito from comment #13)
> I started to suffer from this bug on the latest 5.12.x releases using Arch
> Linux, probably because my root partition had low free space at that time,
> around 2 GiB. I can confirm that the 5.13.4 kernel is still affected by this
> bug.
>
> The solution was freeing some space by moving a couple of big files to
> another partition. This is my root partition now:
>
> sudo btrfs filesystem usage /
> Overall:
>     Device size: 60.00GiB
>     Device allocated: 57.97GiB
>     Device unallocated: 2.03GiB
>     Device missing: 0.00B
>     Used: 53.45GiB
>     Free (estimated): 5.76GiB (min: 5.76GiB)
>     Free (statfs, df): 5.75GiB
>     Data ratio: 1.00
>     Metadata ratio: 1.00
>     Global reserve: 512.00MiB (used: 0.00B)
>     Multiple profiles: no

Forget it, still happening.
Happens to me with Fedora 34, 5.13.5-200.fc34.x86_64. I have plenty of free space (54G free, partition 74% full).
Same here with Arch Linux and kernel 5.13.7.
Is there a way to stop the CPU usage without rebooting?
It turned out to be VirtualBox in my case. After a reboot of the Windows guest in it, the events_unbound processes went away.
Actually no, it happened again, and rebooting or shutting down the VM didn't help...
(In reply to Bráulio Bhavamitra from comment #18)
> Is there a way to stop the CPU usage without rebooting?

In my case, launching a btrfs balance helped:

sudo btrfs balance start -musage=35 -dusage=35 /
I had just rebooted and it started again with 100% events_unbound usage... ValdikSS: thank you, it was resolved after about 1 min of running this command! FYI: this might be triggered by Chia farming, as it uses a lot of disk space and files.
Had to use `--full-balance` after another events_unbound episode; `-musage=35 -dusage=35` wasn't enough.
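For reference, the full-balance form of the command from comment #21 is along these lines (note that a full balance rewrites every chunk and can take a long time and a lot of I/O):

sudo btrfs balance start --full-balance /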
Sorry everybody, I didn't get notifications on this thing; I'll dig into it now.
While I'm trying to reproduce, could somebody collect some trace data while the problem is happening? Something like

echo "for_preempt == 1" > /sys/kernel/debug/tracing/events/btrfs/btrfs_flush_space/filter
echo 1 > /sys/kernel/debug/tracing/events/btrfs/btrfs_flush_space/enable
echo 1 > /sys/kernel/debug/tracing/events/btrfs/btrfs_done_preemptive_reclaim/enable
cat /sys/kernel/debug/tracing/trace_pipe > out.txt

then count to 5, ctrl+c, and upload out.txt so I can see wtf it's doing.
5s of that gave me a 730MiB file! Here are the first few lines of it: https://bpa.st/raw/RD5Q Is that enough?
Wow ok, can you give me the output between two 'btrfs_done_preemptive_reclaim' events, including the 'btrfs_done_preemptive_reclaim' events themselves? If that's too big just snip the duplicate lines and tell me how much you snipped please.
Created attachment 298251 [details]
some trace data

out.txt (in a tar.xz archive) as requested in https://bugzilla.kernel.org/show_bug.cgi?id=212185#c25
That attachment is the whole thing. Turns out a bunch of duplicated text compresses pretty well!
Same for me, it's just endless repeats of:

kworker/u64:4-344762 [004] .... 167816.462311: btrfs_flush_space: 0e658495-8f2b-46f7-a2d7-6144825d1c5b: state=3(FLUSH_DELAYED_REFS_NR) flags=4(METADATA) num_bytes=262144 ret=0 for_preempt=1

Full file at https://drive.google.com/file/d/1pSDKziJuPnCMaNipfTBhcyU9PSoHYaxB/view?usp=sharing
The issue is getting worse: now even a --full-balance doesn't help anymore to get rid of the events_unbound CPU usage. I'm running it on 2 other slow HDDs as well, but I suspect they are not the cause.
Found another backup btrfs partition that needed a rebalance, and that got rid of the events_unbound load :P
Holy hell, alright can you just enable 'btrfs_done_preemptive_reclaim' and capture the trace file until you get a few events from that?
(In reply to Josef Bacik from comment #33)
> Holy hell, alright can you just enable 'btrfs_done_preemptive_reclaim' and
> capture the trace file until you get a few events from that?

Could you be more explicit (like last time) about what commands I should run to get what you're looking for here?
echo 1 > /sys/kernel/debug/tracing/events/btrfs/btrfs_done_preemptive_reclaim/enable
cat /sys/kernel/debug/tracing/trace_pipe

You don't need to put it into a file, it shouldn't happen that often; I just need a couple of lines of the output to see what it says.
I've been experiencing this issue intermittently as well (Arch Linux, 5.13.7-arch1-1). I triggered the btrfs_done_preemptive_reclaim debug trace and got one output from it, after which the CPU usage went back to normal.

kworker/u16:1-43744 [001] ...1 101704.405332: btrfs_done_preemptive_reclaim: 445d8f25-d7c1-4d52-83e3-9b1c37823f50: flags=METADATA total_bytes=1073741824 bytes_used=446365696 bytes_pinned=0 bytes_reserved=0 bytes_may_use=536870912 bytes_readonly=0 reclaim_size=0 clamp=6 global_reserved=536870912 trans_reserved=0 delayed_refs_reserved=0 delayed_reserved=0 chunk_free_space=1048576
It came back and generated a few more lines:

kworker/u16:10-47445 [000] ...1 102617.533533: btrfs_done_preemptive_reclaim: 445d8f25-d7c1-4d52-83e3-9b1c37823f50: flags=METADATA total_bytes=1073741824 bytes_used=446185472 bytes_pinned=180224 bytes_reserved=294912 bytes_may_use=546701312 bytes_readonly=0 reclaim_size=0 clamp=6 global_reserved=536870912 trans_reserved=0 delayed_refs_reserved=7077888 delayed_reserved=2097152 chunk_free_space=1048576

kworker/u16:5-44518 [000] ...1 102696.944009: btrfs_done_preemptive_reclaim: 445d8f25-d7c1-4d52-83e3-9b1c37823f50: flags=METADATA total_bytes=1073741824 bytes_used=446152704 bytes_pinned=1605632 bytes_reserved=393216 bytes_may_use=625606656 bytes_readonly=0 reclaim_size=1310720 clamp=6 global_reserved=536870912 trans_reserved=0 delayed_refs_reserved=88735744 delayed_reserved=0 chunk_free_space=1048576

kworker/u16:5-44518 [002] ...1 102696.954654: btrfs_done_preemptive_reclaim: 445d8f25-d7c1-4d52-83e3-9b1c37823f50: flags=METADATA total_bytes=1073741824 bytes_used=446152704 bytes_pinned=1605632 bytes_reserved=1048576 bytes_may_use=624558080 bytes_readonly=0 reclaim_size=1310720 clamp=6 global_reserved=536870912 trans_reserved=0 delayed_refs_reserved=87293952 delayed_reserved=393216 chunk_free_space=1048576

kworker/u16:5-44518 [003] ...1 102720.052036: btrfs_done_preemptive_reclaim: 445d8f25-d7c1-4d52-83e3-9b1c37823f50: flags=METADATA total_bytes=1073741824 bytes_used=446251008 bytes_pinned=0 bytes_reserved=0 bytes_may_use=536870912 bytes_readonly=0 reclaim_size=0 clamp=6 global_reserved=536870912 trans_reserved=0 delayed_refs_reserved=0 delayed_reserved=0 chunk_free_space=1048576
Created attachment 298285 [details]
more trace data

Here's what I got for the less verbose trace request. The arrival of these messages correlates strongly with my CPU usage; I often see a core unpeg from the constant 100% it normally sits at when one of these trace messages arrives. I'll try to put together something that shows this. It seems we're definitely on the right track here.
I hope this is a good demo of how my CPU usage is related to these trace messages: https://asciinema.org/a/429966

Essentially, this issue turns a machine with an X-thread CPU into an X-1-thread one! In recording this example, I noticed that a single problematic reclaim event can take a loooong time to actually complete, like tens of minutes in some cases. I don't understand enough to be able to trigger this myself. Creating (and then deleting) large files from /dev/random doesn't do it. Maybe I should try that with many small files...
Alright, I think it's doing the "right" thing here, just stupidly. You've got 89MiB of "free" metadata space, which isn't a lot in the grand scheme of things, and we have a threshold of 98% full before we decide it's too full to do anything. Clearly 98% is too high a threshold, because you're just spinning, flushing stuff out constantly for no reason. I'll reduce this to 90% and that'll make it less silly.
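Presumably that "free" figure comes from the metadata space_info counters (roughly total_bytes minus everything already used or reserved) rather than from what btrfs fi usage prints; for example, plugging in the numbers from the trace line in comment #36 lands in the same ballpark:

# My reading of the trace fields, not an exact formula from the btrfs code:
# "free" metadata ~= total_bytes - bytes_used - bytes_may_use
echo $(( (1073741824 - 446365696 - 536870912) / 1048576 ))MiB   # prints 86MiB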
https://lore.kernel.org/linux-btrfs/cover.1628706812.git.josef@toxicpanda.com/
FWIW the problem isn't with full file systems only; it's happening with the following stats too:

Data, single: total=12.00GiB, used=11.07GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=256.00MiB, used=102.78MiB
GlobalReserve, single: total=23.00MiB, used=0.00B

It's a 20G btrfs filesystem on Linux 5.13.7, so about 7.8G is free space (on fast storage too). Cloning the systemd repo and copying it to a new directory seems to have triggered this for me now. CPU has been spinning in sys time for about 15 minutes so far, I'm waiting for the
* I'm waiting for the btrfs_done_preemptive_reclaim event once it's done (assuming it prints it at the end of the operation).
Got this after about 35 minutes of 100% CPU usage (the kworker/u2:7+events_unbound process is still going though!):

kworker/u2:7-79786 [000] ...1 179939.368121: btrfs_done_preemptive_reclaim: 30924304-7ce1-4c07-b328-f81d310846f7: flags=METADATA total_bytes=268435456 bytes_used=108052480 bytes_pinned=835584 bytes_reserved=32768 bytes_may_use=27918336 bytes_readonly=0 reclaim_size=0 clamp=8 global_reserved=24117248 trans_reserved=0 delayed_refs_reserved=1048576 delayed_reserved=917504 chunk_free_space=7282339840
One more after the 1 hour mark:

kworker/u2:7-79786 [000] ...1 181770.406475: btrfs_done_preemptive_reclaim: 30924304-7ce1-4c07-b328-f81d310846f7: flags=METADATA total_bytes=268435456 bytes_used=108265472 bytes_pinned=770048 bytes_reserved=0 bytes_may_use=27246592 bytes_readonly=0 reclaim_size=0 clamp=8 global_reserved=24117248 trans_reserved=0 delayed_refs_reserved=1048576 delayed_reserved=786432 chunk_free_space=7282339840

kworker/u2:7+events_unbound is showing 1h10m of CPU time now. I think I may need to reboot to stop it. 🐱
I see, in the patch email you sent to linux-btrfs@vger.kernel.org, Josef, that you wrote "The problem is for very full file systems on slower disks...", but that does not seem to describe my situation. Yeah, my btrfs partition on this drive is pretty full, but the drive is a "Sandisk Corp WD Black SN750 / PC SN730 NVMe SSD"; does that count as slower? I guess I'm just thinking that if this is an issue on an M.2 NVMe SSD and it's worsened/caused by drive speed, imagine what it would be like on spinning rust!

If I understand the message you sent a few comments up, the actual root of my problem is that I only have 89MiB of free metadata space. Have I done something wrong here? Am I somehow in control of this allocation? It also doesn't match what `btrfs filesystem usage /` shows me today:

```
$ sudo btrfs filesystem usage /
Overall:
    Device size: 45.41GiB
    Device allocated: 45.41GiB
    Device unallocated: 1.00MiB
    Device missing: 0.00B
    Used: 41.96GiB
    Free (estimated): 2.76GiB (min: 2.76GiB)
    Free (statfs, df): 2.76GiB
    Data ratio: 1.00
    Metadata ratio: 1.00
    Global reserve: 214.39MiB (used: 0.00B)
    Multiple profiles: no

Data,single: Size:43.12GiB, Used:40.35GiB (93.59%)
   /dev/nvme0n1p7 43.12GiB

Metadata,single: Size:2.26GiB, Used:1.61GiB (71.12%)
   /dev/nvme0n1p7 2.26GiB

System,single: Size:32.00MiB, Used:16.00KiB (0.05%)
   /dev/nvme0n1p7 32.00MiB

Unallocated:
   /dev/nvme0n1p7 1.00MiB
```

That's 650GiB free, or only 71% metadata space used, right? Is that consistent with everything you've seen? Why does it disagree so much with 89MiB?
*~670MiB free
I'm using an NVMe device too. The filesystem is ZSTD-compressed. I'm not sure, but I don't seem to suffer from this bug on a normal hard disk, only when I write some files to the NVMe partition, e.g. during a system update.
I have compression enabled too. My mount is like this:

/dev/nvme0n1p6 on / type btrfs (rw,relatime,compress=lzo,ssd,space_cache,subvolid=257,subvol=/root)
I don't have compression.
I think I have the same problem. I have a lot of free space available, but when most of the *allocated* space is used (data used > ~95%, not sure) I start seeing high CPU usage from kworker/u16:2+events_unbound. The problem disappears after creation and removal of a large (8GB) file with random data (to expand the allocated space).

/dev/sdb4 on /home type btrfs (rw,noatime,compress-force=zstd:3,ssd,space_cache,subvolid=299,subvol=/@home)

```
❯ btrfs fi usage /home
Overall:
    Device size: 351.00GiB
    Device allocated: 46.02GiB
    Device unallocated: 304.98GiB
    Device missing: 0.00B
    Used: 43.33GiB
    Free (estimated): 305.28GiB (min: 152.79GiB)
    Free (statfs, df): 305.28GiB
    Data ratio: 1.00
    Metadata ratio: 2.00
    Global reserve: 179.39MiB (used: 0.00B)
    Multiple profiles: no

Data,single: Size:42.01GiB, Used:41.70GiB (99.27%)
   /dev/sdb4 42.01GiB

Metadata,DUP: Size:2.00GiB, Used:833.70MiB (40.71%)
   /dev/sdb4 4.00GiB

System,DUP: Size:8.00MiB, Used:16.00KiB (0.20%)
   /dev/sdb4 16.00MiB

Unallocated:
   /dev/sdb4 304.98GiB
```
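A minimal sketch of that workaround, assuming /home is the affected mount, roughly 8 GiB of scratch space is available, and the file name is just an example:

# Write ~8 GiB of incompressible data to force btrfs to allocate new data chunks,
# then remove it again; adjust the size and path to your setup.
dd if=/dev/urandom of=/home/.force-chunk-alloc bs=1M count=8192 status=progress
sync
rm /home/.force-chunk-alloc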
I've been trying to test Josef's patches, but my machine that exhibits this issue is crashing on the latest git master kernel (unpatched) and the Arch Linux release kernel (5.13.9) is missing some btrfs patches that Josef's patches rely on. I'll try bisecting master to see if I can back out whatever's crashing my system.
Got 2 events_unbound kworkers :P one for each unbalanced partition (BTW, also having this issue on an NVMe SSD).
Was the patch applied to stable?
This problem is happening almost every day for me, randomly, on one of the 3 SSD partitions I have. It has never happened with either of the 2 HDD btrfs partitions, though.
@Bráulio: It happens when you write to disk, and you will probably write more to an SSD if you have the operating system there, although it also happens on normal disks. Honestly, I'm tired of this. I am thinking of migrating to the old and stable ext4, especially after reading that the hard disk where I store my data appears on this list and is prone to possible corruption:

https://btrfs.wiki.kernel.org/index.php/Hardware_bugs
The patches are now in the development queue and on their way to Linus' tree and then to stable. The ETA for that could be next week, once 5.14 is released and stable picks it up.

(In reply to agapito from comment #56)
> Honestly, I'm tired of this. I am thinking of migrating to the old and
> stable ext4, especially after reading that the hard disk where I store my
> data appears on this list and is prone to possible corruption:
>
> https://btrfs.wiki.kernel.org/index.php/Hardware_bugs

I hope you understand that no filesystem will save your data on buggy hardware, not to mention that ext4 won't even detect the corruption in most cases.
(In reply to David Sterba from comment #57)
> The patches are now in the development queue and on their way to Linus' tree
> and then to stable. The ETA for that could be next week, once 5.14 is
> released and stable picks it up.
>
> (In reply to agapito from comment #56)
> > Honestly, I'm tired of this. I am thinking of migrating to the old and
> > stable ext4, especially after reading that the hard disk where I store my
> > data appears on this list and is prone to possible corruption:
> >
> > https://btrfs.wiki.kernel.org/index.php/Hardware_bugs
>
> I hope you understand that no filesystem will save your data on buggy
> hardware, not to mention that ext4 won't even detect the corruption in most
> cases.

I have been using ext4 since its release and NEVER had any problems. In over a decade I never lost a single file or saw any sign of file corruption. I have never had to format the root partition because of any ext4-related problem, even though I suffer constant power outages, and, very importantly, I have never had any performance problems with it.

8 months ago I bought a WD Blue HDD as my main data disk and decided to format it as btrfs, seduced by the idea of ZSTD compression; a few days later I converted all my partitions to btrfs. To be fair, during these last months I have also had quite a few power outages and no problems either, but suffering this bug and waiting almost 2 months for the fix, plus the fact that my hard drive is listed on the wiki as buggy, was too much for me. This past weekend I reorganized my hard drives and reformatted all my partitions, and none of them is btrfs now.

The most important thing about a filesystem, for me, is being able to forget that it exists, and that's what happened for over a decade, until I tried btrfs. Maybe I'll try btrfs again in the future, because I admit that on paper it's superior to the rest, but what's the point if bugs like this appear in the stable kernel branches? Debian and Red Hat can't be wrong, so maybe I'll give it another chance when it's the default filesystem in Debian. I am not trying to be disrespectful, I am just relating my experience.
First, I want to say I appreciate the work being done to fix this; it has been quite an annoying bug for me as well. I wanted to point out that I've seen zstd compression mentioned in several posts above, and that I am also using zstd and suffering from this bug. My drive overall is 70% used, with 71% metadata usage.
I just got this on a newly created partition on a new drive, a Seagate FireCuda 510 2TB. After rsyncing 30GB of the root fs, events_unbound was triggered!
Got this again after a simple system update with `pacman -Syu`. Maybe this is due to a high `commit` mount option? It is set to 600 here.
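If you want to rule that out, the commit interval can be changed on a mounted filesystem without rebooting; a sketch (30 seconds is the btrfs default, and / is just an example mount point):

sudo mount -o remount,commit=30 /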
I'm seeing this happening and blocking I/O for up to a minute just after timeshift does its hourly snapshots. Admittedly, I have quite a lot of them.
This issue might be gone with kernel 5.14.0; it's been a few days now and it hasn't happened on any of my 6 btrfs SSD partitions (BTW, it never happened on my HDDs).
I never managed to get an RC kernel to work on my machine (for reasons unrelated to this issue), but I've been running a 5.14.2 release kernel for about 3 days now and the issue has not returned.
I'm on 5.14.3 and I still have the issue. Did the patch make it into 5.14?
(In reply to Greyson Christoforo from comment #65)
> Did the patch make it into 5.14?

Looks like the answer to that is no, but I can see a derivative of the patch from https://bugzilla.kernel.org/show_bug.cgi?id=212185#c41 in 5.15-rc1.
For some reason the patch hasn't been picked up from the pull request branch, so I have to ask for it to be included manually.
Released in 5.14.7. I've been using it for some time and haven't observed the runaway kthread anymore. The fix took longer than it should have, sorry; something to improve next time. If anybody sees the problem in the supposedly fixed versions, please reopen.
I haven't noticed it yet running on 5.14.9. Thank you very much for the fix, David!