Kernel Bug Tracker – Bug 60860
creation of small files in quick succession results in *huge* loadavg spike
Last modified: 2013-11-04 11:31:07 UTC
i'm using btrfs on a live (fool! :) system - hey, somebody has to - and
it works very well with the exception of massive load spikes when running
say "apt-get update" when pdiff is enabled.
pdiff very quickly downloads a large number of small files - 5 files of size
15 to 50k in one second is not unusual. at an indeterminate but quite small
(under 50) number of files apt-get freezes for no apparent reason. investigation
shows that in fact *everything's* locked up.
when running "top" (if it's possible to run top at this point) the loadavg
can be seen climbing rapidly. within 30 seconds it's passed 5. within a
minute it's up to at least 8. processes which don't normally reach the
top of the list start to show up with names such as "btrfs_transaction"
and so on.
killing apt-get (if it's possible to kill it) doesn't *immediately* solve
the problem - it takes several minutes for the loadavg to climb back down.
as this has happened more than once, i get a distinct "feel" that once this
has occurred, subsequent file usage is much less responsive (even
just file-writing), as if btrfs was somehow compromised by the overload and
has yet to recover. however this could be my imagination. the loadavg
jumping to 8 or above definitely is *not* imagination.
version of the kernel is that which is current in debian/testing.
cf debian bugreport
oh. sorry. forgot to add: on 3.9.0 the exact same issue was present.
Can you reproduce on btrfs-next and give me sysrq+w at a few intervals when you are seeing these spikes?
ahh i'm a long-time free software developer and linux kernel hacker, but i must
apologise i need a little more context. where's btrfs-next available from, and
is it stable enough to use on a live system; and secondly, where do i get
sysrq+w from, it's not something i've encountered before? (it's not a command,
i tried that already and the +w is a give-away that it's unlikely to be one!)
but really for this sort of thing it's not imperative you be on btrfs-next. I will rebase it onto 3.11 proper soon but right now it's on 3.10; it only has btrfs patches that have passed my tests (this doesn't necessarily mean they won't break stuff, but they aren't immediately horrible).
For sysrq-w you just
echo w > /proc/sysrq-trigger
do that and it will dump all of the waiting tasks to dmesg. So you'll want to do that while you are seeing problems, wait a few seconds, do it again, and do that 2 or 3 times so I can see the pattern.
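The "trigger, wait a few seconds, trigger again" routine above could be wrapped in a small helper. This is only a sketch: the function name dump_blocked_tasks and the SYSRQ_TRIGGER override variable are made up for illustration, and the only real interface used is the /proc/sysrq-trigger file already mentioned.

```shell
#!/bin/sh
# Sketch: fire sysrq-w several times, a few seconds apart.
# SYSRQ_TRIGGER is a hypothetical override; the default is the real
# trigger file from the comment above.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

# dump_blocked_tasks COUNT INTERVAL_SECONDS
dump_blocked_tasks() {
    count="$1"
    interval="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        # Same as: echo w > /proc/sysrq-trigger (needs root).
        echo w > "$SYSRQ_TRIGGER" || return 1
        i=$((i + 1))
        # Pause between dumps, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Run as root while the load is climbing, something like "dump_blocked_tasks 3 5" followed by "dmesg > sysrq-w.txt" would give three dumps five seconds apart in one file to attach to the bug.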
No reproducer?
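A rough starting point for a reproducer, going only by the description in the original report (a burst of small files, roughly 15 to 50k each, fewer than 50 of them). It does not imitate what apt-get or pdiff actually do internally, and it is not confirmed to trigger the load spike. The function name make_small_files and the file naming are made up for illustration.

```shell
#!/bin/sh
# Sketch: create COUNT small files in DIR as fast as possible.
# make_small_files DIR COUNT
make_small_files() {
    dir="$1"
    count="$2"
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        # Cycle file sizes through 15..50 KiB, per the report.
        kb=$((15 + (i * 7) % 36))
        # Random data, so the files are not trivially compressible.
        dd if=/dev/urandom of="$dir/file-$i" bs=1024 count="$kb" \
            2>/dev/null || return 1
        i=$((i + 1))
    done
}
```

For example "make_small_files /path/on/btrfs/repro 50" (the path is a placeholder) while watching top in another terminal. Note that this writes as fast as it can rather than pacing itself to about 5 files per second as in the report.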