Bug 63071

Summary: 40,000-extent file can't be deleted
Product: File System
Reporter: John Goerzen (jgoerzen)
Component: btrfs
Assignee: Josef Bacik (josef)
Status: RESOLVED OBSOLETE
Severity: normal
CC: dsterba, jim
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.10 (debian backports)
Subsystem:
Regression: No
Bisected commit-id:

Description John Goerzen 2013-10-15 16:24:42 UTC
I made the mistake of storing a VirtualBox 40G .vdi file on a btrfs partition (kernel 3.10).  I have managed to copy it to a file marked chattr +C (I read the wiki *NOW*), but I can't delete the original file -- rm hangs the system.

(I actually have the original file and another copy made with cp --reflink while trying to diagnose the issue.  I need to rm both of them.)

rm starts out with lots of hard disk activity, then that stops.  I eventually get the attached kernel BUG, then a little bit later it panics due to out of memory.

Photo: https://www.dropbox.com/s/8oxdaw8ux53y08g/PA150006.JPG
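For anyone hitting the same problem, a minimal sketch of the +C workaround mentioned above (file names are only examples; on btrfs the NOCOW attribute has to be set while the destination file is still empty, before any data is written):

touch new.vdi
chattr +C new.vdi          # set NOCOW on the still-empty file
cat old.vdi > new.vdi      # plain copy of the data, not a reflink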
Comment 1 David Sterba 2013-10-15 16:37:32 UTC
4149         for (i = 0; i < num_pages; i++) {
4150                 p = alloc_page(GFP_ATOMIC);
4151                 BUG_ON(!p);

no free page for you

4152                 attach_extent_buffer_page(new, p);
4153                 WARN_ON(PageDirty(p));
4154                 SetPageUptodate(p);
4155                 new->pages[i] = p;
4156         }

The number of extents itself should not be a problem; I'm testing with million-extent files (qcow2 images).

Probably the file deletion tries to grab a lot of memory that is not available on the system (qgroups are turned on, so it grabs even more memory), so you may want to delete the file in pieces, i.e. truncate it backwards in small steps. This should limit the amount of memory needed and still delete the files.
---
#!/bin/sh

if [ -z "$1" ]; then
        echo "no file"
        exit 1
fi
file="$1"

# current file size in bytes
size=`stat --format='%s' "$file"`
# shrink by roughly 128 MB per step
step=$((128*1023*1024))
next=$(($size - $step))

while [ $next -gt 0 ]; do
        echo "trunc to $next"
        truncate -s$next -- "$file"
        next=$(($next - $step))
done
# remove whatever is left once the file is smaller than one step
rm -- "$file"
Comment 2 John Goerzen 2013-10-15 16:42:48 UTC
I will try, David.  I began working on this because VirtualBox locked up and I/O to this file was very slow.  The last time I could get a reliable number, there were over 40,000 extents.  https://btrfs.wiki.kernel.org/index.php/Gotchas implies that 10,000 extents is a big problem.  Is that outdated?  Am I looking at a problem that copying this to a file made with chattr +C will NOT fix?
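(As a reference for checking this, the extent counts quoted here can be read with filefrag from e2fsprogs; the path below is only an example.)

filefrag /path/to/image.vdi     # prints "<file>: N extents found"; add -v for a per-extent listing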
Comment 3 John Goerzen 2013-10-15 16:47:14 UTC
FWIW the machine this was on has 8GB RAM and was in single-user mode.
Comment 4 John Goerzen 2013-10-15 17:18:14 UTC
That script slowly worked.  It would trim a few GB and then the system would hang again.  Eventually I just tar'd up what I needed, deleted the subvolume, and restored from tar.  I might have gotten there eventually, but I had already sat through a dozen reboots.
Comment 5 John Goerzen 2013-10-19 22:44:43 UTC
I just ran into a very similar situation, on the same kernel version but on i386 instead of amd64.  The file in question was far smaller.  I tried to rm it.  Since there was an active snapshot, the rm should have returned almost immediately, but instead it took down the entire server.

How can I help?
Comment 6 John Goerzen 2013-10-19 22:49:30 UTC
This file was only 5GB.  It had 27475 extents, and was created by restore (from the dump/restore program - a restoresymtable file)
Comment 7 John Goerzen 2013-10-21 14:10:05 UTC
Per the thread on the mailing list, others have seen this as well.  Perhaps it is related to qgroups, perhaps not, but for the moment that seems to be a possible culprit.

http://thread.gmane.org/gmane.comp.file-systems.btrfs/29252/focus=29259
Comment 8 John Goerzen 2013-10-25 13:38:22 UTC
On a system where this had caused a hang once, doing the same action after btrfs quota disable and a reboot did not cause any problem at all.  It seems increasingly likely to be a quota problem.

This is still flagged NEEDINFO, but I'm not sure what additional information I could supply.
Comment 9 Josef Bacik 2013-10-31 20:58:24 UTC
How do you have qgroups set up?  I'd like to try to reproduce this locally, and to do that I need to know your qgroup setup.
Comment 10 John Goerzen 2013-10-31 21:05:50 UTC
I did nothing other than basically:

btrfs sub cre a
btrfs sub cre b
btrfs sub cre c
btrfs quota enable

I only turned them on so I could get disk-usage data for each subvolume via btrfs qg show.  I did not set any limits or create anything else.  I believe I did turn on quotas before loading data on the FS.  All data was loaded into subvolumes.

Does that help?
Comment 11 Josef Bacik 2013-11-01 21:06:10 UTC
I can't seem to reproduce this; everything looks fine to me.  Can you try to rm the file and run

slabtop

in another window and see what floats to the top?  Use the 'c' command when it loads up to sort by cache size, and then just copy+paste whatever comes to the top for a long time.
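If it is easier to capture, a single non-interactive snapshot should show the same thing, assuming the procps slabtop (the output file name is just an example):

slabtop --once --sort=c > slabtop-during-rm.txt    # one dump, sorted by cache size, while rm runs in the other window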
Comment 12 John Goerzen 2013-11-03 13:36:40 UTC
Hi Josef,

I will try to write some code on Monday to duplicate this and see what I can find for you.

John
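No reproducer was recorded in this report, but a rough sketch of what one might look like, assuming a scratch device and a write pattern that forces many small extents (the device path, sizes, and loop count are all illustrative):

#!/bin/sh
# Hypothetical reproducer sketch -- /dev/sdX and all sizes are placeholders.
mkfs.btrfs -f /dev/sdX
mount /dev/sdX /mnt/scratch
btrfs subvolume create /mnt/scratch/a
btrfs quota enable /mnt/scratch

# Non-contiguous synchronous 4K writes tend to end up in separate extents,
# so this loop should produce a heavily fragmented file.
i=0
while [ $i -lt 40000 ]; do
        dd if=/dev/urandom of=/mnt/scratch/a/big.img bs=4k count=1 \
           seek=$((i * 2)) conv=notrunc oflag=sync 2>/dev/null
        i=$((i + 1))
done

filefrag /mnt/scratch/a/big.img   # confirm the extent count
rm /mnt/scratch/a/big.img         # the operation that hung for the reporter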
Comment 13 John Goerzen 2014-05-28 16:16:34 UTC
FYI - I eventually changed to zfs and cannot write the code to duplicate this.  I am curious whether this was ever resolved.  FWIW, I did hear from others with similar issues.
Comment 14 David Sterba 2022-10-03 10:41:23 UTC
This is a semi-automated bugzilla cleanup, report is against an old kernel version. If the problem still happens, please open a new bug. Thanks.