Bug 202559
| Summary: | BTRFS is very slow on large filesystems | | |
|---|---|---|---|
| Product: | File System | Reporter: | NoName_Nr_1 (thorsten.brandau) |
| Component: | btrfs | Assignee: | BTRFS virtual assignee (fs_btrfs) |
| Status: | NEW | | |
| Severity: | high | CC: | stf_xl, thorsten.brandau |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 4.20 or anything before | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
NoName_Nr_1
2019-02-11 12:30:36 UTC
(In reply to NoName_Nr_1 from comment #0)
> - The disk is often stuck at 100% busy (ATOP) but no file transfer is going on

When this happens, could you run "perf top -a --stdio" to print which functions eat CPU power?

> My Raids are:
>
> 68 TB BTRFS
> 19 TB BTRFS
> 30 TB BTRFS
> 13 TB BTRFS

This looks pretty enterprise to me. I would consider using a commercial Linux offering and getting paid support.

Seriously? I file a bug report (which is not made easy anyway) and what I get is "go somewhere else"? It seems Linux is no longer what it was when I was younger and started using it. I am not sure which enterprise you are referring to, but a couple of arrays do not make an enterprise. This is operated at a small company, which is nowhere close to an enterprise. My home RAID is similar and getting pretty crowded.

I have found a page claiming that a "larger" number of snapshots (>10!) affects performance. Judging from the text, the subject was not really related to my problem; however, as I reduced the number of snapshots dramatically, the problem got smaller. That means I now more seldom get loads >15 during no-load operation (i.e. no serious file transfer). So my impression is that snapshots are in fact part of the problem.

I was using snapshots to be able to roll back my data partition: hourly snapshots (8 days back), weekly snapshots (5 weeks back), monthly snapshots (13 months back) and annual snapshots (3 years back) for each of the BTRFS volumes. I would not consider this extensive, but a reasonable structure for a file system offering snapshots. Anyhow, when I reduced the snapshots to a total of 10 per volume, the system lockups were reduced dramatically (in the past 4 weeks). It does not make too much sense backup-wise, though. And no, that is not my only way of backing up the system, but it lets me reconstruct user-deleted files, which unfortunately happens from time to time.
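For scale, the retention policy described above can be turned into a snapshot count with a quick back-of-the-envelope script. This is only a sketch of the arithmetic implied by the comment (the mount point in the trailing hint is illustrative; `btrfs subvolume list -s` is the standard read-only way to list snapshots):

```shell
#!/bin/sh
# Snapshots implied by the reporter's retention policy, per BTRFS volume:
# hourly for 8 days, weekly for 5 weeks, monthly for 13 months,
# annual for 3 years.
hourly=$((24 * 8))   # 192 hourly snapshots
weekly=5             # 5 weekly snapshots
monthly=13           # 13 monthly snapshots
annual=3             # 3 annual snapshots
count=$((hourly + weekly + monthly + annual))
echo "snapshots per volume: $count"
# To count the actual snapshots on a mounted volume (read-only;
# /mnt/data is a placeholder):
#   btrfs subvolume list -s /mnt/data | wc -l
```

That is 213 snapshots per volume, so the ">10 snapshots" threshold mentioned above was exceeded roughly twenty-fold before the pruning.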
"perf" gives the following reading, but I need to provoke a full lockdown to make it more informational. I will add this ASAP. (top/atop/iotop always shows btrfs-transaction eating up all resources, and the disk is at 100% busy with low read/write values.)

```
 2.76%  [kernel]       [k] get_page_from_freelist
 2.71%  [kernel]       [k] svc_tcp_recvfrom
 2.68%  [kernel]       [k] module_get_kallsym
 1.93%  [kernel]       [k] vsnprintf
 1.82%  [kernel]       [k] memcpy_erms
 1.76%  [kernel]       [k] vmx_vmexit
 1.68%  [kernel]       [k] free_pcppages_bulk
 1.66%  [kernel]       [k] format_decode
 1.46%  [kernel]       [k] number
 1.42%  [kernel]       [k] kallsyms_expand_symbol.constprop.1
 1.25%  perf           [.] rb_next
 1.15%  [kernel]       [k] menu_select
 1.12%  [kernel]       [k] cpuidle_enter_state
 1.08%  perf           [.] __symbols__insert
 1.03%  libc-2.29.so   [.] __GI_____strtoull_l_internal
 1.02%  [kernel]       [k] __schedule
 0.95%  [kernel]       [k] string
 0.94%  [kernel]       [k] trace_hardirqs_off
 0.93%  [kernel]       [k] native_queued_spin_lock_slowpath
 0.93%  [kernel]       [k] __alloc_pages_nodemask
 0.87%  [kernel]       [k] trace_hardirqs_on
 0.78%  [kernel]       [k] free_unref_page
 0.74%  [kernel]       [k] ipt_do_table
 0.70%  [kernel]       [k] cache_reap
 0.70%  [kernel]       [k] svc_recv
 0.69%  [kernel]       [k] __x86_indirect_thunk_rax
 0.62%  [kernel]       [k] __page_cache_release
```

Hi, I could not get it to full load, but with defrag it is pretty much under load (however, the disk is only at 33-70% busy instead of the typical 100%).
The perf gives:

```
24.97%  [kernel]  [k] native_queued_spin_lock_slowpath
 2.99%  [kernel]  [k] queued_write_lock_slowpath
 2.82%  [kernel]  [k] _raw_spin_lock_irqsave
 2.70%  [kernel]  [k] __schedule
 2.44%  [kernel]  [k] prepare_to_wait_event
 2.29%  [kernel]  [k] btrfs_tree_read_lock
 2.24%  [kernel]  [k] menu_select
 1.92%  [kernel]  [k] btrfs_get_token_32
 1.86%  [kernel]  [k] map_private_extent_buffer
 1.47%  [kernel]  [k] btrfs_tree_read_unlock
 1.40%  [kernel]  [k] queued_read_lock_slowpath
 1.12%  [kernel]  [k] btrfs_tree_lock
 1.11%  [kernel]  [k] try_to_wake_up
 1.08%  [kernel]  [k] btrfs_set_token_32
 1.05%  [kernel]  [k] _raw_read_lock
 0.93%  [kernel]  [k] btrfs_search_slot
 0.92%  [kernel]  [k] generic_bin_search.constprop.40
 0.88%  [kernel]  [k] do_idle
 0.86%  [kernel]  [k] update_load_avg
 0.86%  [kernel]  [k] native_sched_clock
 0.84%  [kernel]  [k] btrfs_set_lock_blocking_rw
 0.81%  [kernel]  [k] select_task_rq_fair
 0.77%  [kernel]  [k] find_extent_buffer
 0.77%  [kernel]  [k] add_delayed_ref_head
 0.76%  [kernel]  [k] __update_load_avg_cfs_rq
 0.74%  [kernel]  [k] trace_hardirqs_off
 0.73%  [kernel]  [k] _raw_write_lock
 0.72%  [kernel]  [k] _raw_spin_lock
 0.71%  [kernel]  [k] __switch_to_asm
 0.68%  [kernel]  [k] __switch_to
 0.64%  [kernel]  [k] __wake_up_common
 0.64%  [kernel]  [k] update_cfs_rq_h_load
 0.59%  [kernel]  [k] __radix_tree_lookup
 0.59%  [kernel]  [k] btrfs_tree_unlock
 0.57%  [kernel]  [k] module_get_kallsym
 0.56%  [kernel]  [k] free_extent_buffer
 0.54%  [kernel]  [k] pick_next_task_fair
```

I use a rolling release, so the kernel is now at:

```
Linux server06 5.0.3-1-default #1 SMP Fri Mar 22 17:30:35 UTC 2019 (2a31831) x86_64 x86_64 x86_64 GNU/Linux
```

Hope that helps. It seems to get slowly better, but compared to XFS the file system is unfortunately still a lot slower. It is a pity, as I really like the features.

This looks like lock contention. What is the name of the program / process that causes this?

btrfs-transaction is typically the leader in top, especially when something copies file trees via NFS.
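To back up the lock-contention reading with a number, the percentages of lock-related symbols in a saved `perf top` capture can simply be summed. A minimal sketch, assuming the output format shown above (percent, DSO, symbol type, symbol name); the capture file name is illustrative, and matching `lock` in the symbol name is only a heuristic that catches the spinlock, rwlock and btrfs tree-lock entries:

```shell
#!/bin/sh
# Sum the CPU share of lock-related kernel symbols in a saved
# "perf top -a --stdio" capture.
lock_share() {
    # Field 1 is the percentage ("24.97%"), field 4 the symbol name.
    awk '$4 ~ /lock/ { gsub(/%/, "", $1); s += $1 }
         END { printf "%.2f\n", s }' "$1"
}
# Usage sketch (run the capture during a stall, then Ctrl-C):
#   perf top -a --stdio > perf-stall.txt
#   lock_share perf-stall.txt
```

Applied to the listing above, this sums to roughly 41% of samples spent in locking primitives, which is consistent with the contention diagnosis.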
The whole system is practically at a standstill for disk access.