Bug 60860

Summary: creation of small files in quick succession results in *huge* loadavg spike
Product: File System
Reporter: Luke Kenneth Casson Leighton (lkcl)
Component: btrfs
Assignee: Josef Bacik (josef)
Status: RESOLVED OBSOLETE
Severity: normal
CC: alan, dsterba
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 3.9.0, 3.10.2
Subsystem:
Regression: No
Bisected commit-id:

Description Luke Kenneth Casson Leighton 2013-09-05 22:35:48 UTC
i'm using btrfs on a live (fool! :) system - hey, somebody has to - and
it works very well with the exception of massive load spikes when running
say "apt-get update" when pdiff is enabled.

pdiff downloads a large number of small files very quickly - 5 files of size
15 to 50k in one second is not unusual.  after an indeterminate but quite small
number of files (under 50) apt-get freezes for no apparent reason.  investigation
shows that in fact *everything's* locked up.

when running "top" (if it's possible to run top at this point) the loadavg
can be seen climbing rapidly.  within 30 seconds it's passed 5.  within a
minute it's up to at least 8.  processes which don't normally reach the
top of the list start to show up with names such as "btrfs_transaction"
and so on.

killing apt-get (if it's possible to kill it) doesn't *immediately* solve
the problem - it takes several minutes for the loadavg to climb back down.

as this has happened more than once, i get a distinct "feel" that, once this
has occurred, subsequent file usage is much less responsive (even just
file-writing), as if btrfs was somehow compromised by the overload and has
yet to recover.  however this could be my imagination.  the loadavg
jumping to 8 or above definitely is *not* imagination.

version of the kernel is that which is current in debian/testing.
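
The workload described above (a burst of small files, each 15 to 50k) can be approximated with a short shell script. This is only a sketch of a possible reproducer, not something confirmed in this report to trigger the spike. The function names, the file naming, and TARGET_DIR are assumptions made for illustration, and the script writes files as fast as it can rather than at apt-get's download pace. TARGET_DIR should point at a directory on the btrfs volume; it falls back to a temporary directory, which may not be on btrfs.

```shell
#!/bin/sh
# Hypothetical reproducer sketch: write many small files in quick succession
# and print the 1-minute load average before and after.

# make_small_files DIR COUNT: create COUNT files of 15 to 50 KiB in DIR.
make_small_files() {
    dir=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # File size in KiB, cycling through values between 15 and 50.
        size=$(( 15 + (i * 7) % 36 ))
        dd if=/dev/urandom of="$dir/file_$i" bs=1024 count="$size" 2>/dev/null
        i=$(( i + 1 ))
    done
}

# loadavg_1min: print the first field of /proc/loadavg, or "unknown".
loadavg_1min() {
    cut -d' ' -f1 /proc/loadavg 2>/dev/null || echo unknown
}

# TARGET_DIR is an assumed name; set it to a directory on the btrfs volume.
TARGET_DIR=${TARGET_DIR:-$(mktemp -d)}

echo "loadavg before: $(loadavg_1min)"
make_small_files "$TARGET_DIR" 50
sync
echo "loadavg after:  $(loadavg_1min)"
```

The load average is a moving average, so a single before/after reading may understate a spike; watching "top" in another terminal while the script runs is closer to what the report describes.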
Comment 1 Luke Kenneth Casson Leighton 2013-09-05 22:37:17 UTC
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=721924

cf debian bugreport
Comment 2 Luke Kenneth Casson Leighton 2013-09-05 22:38:41 UTC
oh.  sorry.  forgot to add: on 3.9.0 the exact same issue was present.
Comment 3 Josef Bacik 2013-09-09 18:12:59 UTC
Can you reproduce on btrfs-next and give me sysrq+w at a few intervals when you are seeing these spikes?
Comment 4 Luke Kenneth Casson Leighton 2013-09-09 19:43:01 UTC
hi josef,

ahh i'm a long-time free software developer and linux kernel hacker, but i must
apologise: i need a little more context.  where's btrfs-next available from, and
is it stable enough to use on a live system; and secondly, where do i get
sysrq+w from, it's not something i've encountered before? (it's not a command,
i tried that already and the +w is a give-away that it's unlikely to be one!)

l.
Comment 5 Josef Bacik 2013-09-09 20:57:04 UTC
git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git

but really for this sort of thing it's not imperative that you be on btrfs-next.  I will rebase it onto 3.11 proper soon, but right now it's on 3.10. It only has btrfs patches that have passed my tests (this doesn't necessarily mean they won't break stuff, but they aren't immediately horrible).

For sysrq-w you just

echo w > /proc/sysrq-trigger

Do that and it will dump all of the blocked (uninterruptible) tasks to dmesg.  So you'll want to do that while you are seeing problems, wait a few seconds, do it again, and repeat that 2 or 3 times so I can see the pattern.
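
The repeated sampling described above could be scripted roughly as follows. This is a sketch under assumptions, not a tool referenced in this thread: the function name capture_blocked_tasks, its arguments, and the output file naming are made up for illustration. The trigger path is passed in as an argument; on a real system it would be /proc/sysrq-trigger, and writing to it normally requires root.

```shell
#!/bin/sh
# Sketch: trigger sysrq-w several times and save the kernel log after each one.
# capture_blocked_tasks TRIGGER OUT_DIR SAMPLES INTERVAL
#   TRIGGER   file to write "w" to (normally /proc/sysrq-trigger)
#   OUT_DIR   directory that receives one log file per sample
#   SAMPLES   number of dumps to take
#   INTERVAL  seconds to wait between dumps
capture_blocked_tasks() {
    trigger=$1
    out_dir=$2
    samples=$3
    interval=$4

    mkdir -p "$out_dir"
    n=1
    while [ "$n" -le "$samples" ]; do
        # Ask the kernel to dump blocked tasks into its log buffer.
        echo w > "$trigger"
        # Save the whole kernel log; the newest dump is at the end.
        dmesg > "$out_dir/sysrq-w-$n.txt"
        if [ "$n" -lt "$samples" ]; then
            sleep "$interval"
        fi
        n=$(( n + 1 ))
    done
}

# Example invocation (as root, while the load spike is happening):
# capture_blocked_tasks /proc/sysrq-trigger ./sysrq-dumps 3 5
```

Each saved file contains the full kernel log at that moment, so later files also include the earlier dumps; comparing the tail of each file shows how the set of blocked tasks changes over time.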
Comment 6 Alan 2013-11-04 11:31:07 UTC
No reproducer?
Comment 7 David Sterba 2022-10-03 10:34:01 UTC
This is a semi-automated bugzilla cleanup, report is against an old kernel version. If the problem still happens, please open a new bug. Thanks.