Bug 15967

Summary: XFS is insanely slow at deleting files on a highly fragmented FS
Product:    File System
Component:  XFS
Status:     RESOLVED INVALID
Severity:   normal
Priority:   P1
Hardware:   All
OS:         Linux
Reporter:   Artem S. Tashkinov (aros)
Assignee:   Dave Chinner (david)
CC:         david
Regression: No

Description Artem S. Tashkinov 2010-05-13 11:59:16 UTC
A month ago I created an XFS filesystem and downloaded hundreds of 100 MB files onto it in an interleaved fashion (e.g. wget -b file1; wget -b file2; wget -b file3). Now when I try to delete those files, it takes a whopping 1.8 to 3.0 seconds on average to delete any of them - I believe that's unacceptable.

$ time /bin/rm file121.bin

real    0m2.322s
user    0m0.000s
sys     0m0.040s

I haven't specified any special mount options, except noatime:

/dev/sda5 on /mnt/ext type xfs (rw,noatime)

FS was created using default options, except -i:

# mkfs.xfs -i size=512 /dev/sda5

using xfsprogs-3.1.1

I'm running a fast, modern 1TB 7200RPM HDD.

On an ext4 filesystem on the same HDD, deleting the same file takes less than 0.1 seconds regardless of fragmentation:

$ time /bin/rm 100MB.bin

real    0m0.069s
user    0m0.000s
sys     0m0.007s

I'm running a vanilla 32-bit 2.6.33.3 kernel.
Comment 1 Artem S. Tashkinov 2010-05-13 12:07:41 UTC
To give you a picture of how fragmented this FS is:

xfs_db -r /dev/sda5
xfs_db> frag
actual 63950, ideal 27, fragmentation factor 99.96%
Comment 2 Artem S. Tashkinov 2010-05-13 12:14:42 UTC
Some of the files were *preallocated* by creating files of a specified size filled with zeros (probably the truncate() syscall can do that).

So this has the following implication: XFS fails to allocate such files properly and suffers from excessive fragmentation.
Comment 3 Dave Chinner 2010-05-14 01:37:20 UTC
Artem, if you had done any reading at all, you would have found out all about the options XFS provides for preventing fragmentation in your use case. If you have a simple problem like this, ask on the xfs mailing list before raising a bug. Workarounds:

1. speed up unlink: mount option "logbsize=262144"
2. prevent fragmentation with large files written simultaneously: mount option "allocsize=64m"
3. preallocation should be done by posix_fallocate() or ioctl(XFS_IOC_RESVSP)

That being said, there is a known bug in 2.6.33 (at least; I'm not sure when it was introduced) in the generic writeback code, added to help prevent _ext4_ fragmentation in exactly this workload. Unintended side effects of that change have caused regressions in _XFS_ writeback that result in excessive fragmentation; fixes are still in progress. The tunings above will work around the fragmentation problems the bug causes in the meantime.
Comment 4 Artem S. Tashkinov 2010-05-14 10:43:43 UTC
Dave, I don't want to sound like an insensitive clod, but in my opinion a *modern* FS should guard the user from such unpleasant peculiarities and work reliably in day-to-day scenarios with *default* settings.

E.g. http://en.wikipedia.org/wiki/XFS doesn't mention anywhere that XFS is only suitable for this or that usage scenario and one shouldn't stress it by accomplishing quite mundane tasks.
Comment 5 Dave Chinner 2010-05-17 03:01:11 UTC
(In reply to comment #4)
> Dave, I don't want to sound like an insensitive clod, but in my opinion a
> *modern* FS should guard the user from such unpleasant peculiarities and work
> reliably in day-to-day scenarios with *default* settings.

XFS normally works perfectly well in these workloads with the default settings.
However, I can't do anything to protect XFS against ext4 developers breaking the generic writeback code in subtle ways.

You've got workarounds you can use until the bug is fixed - if it were any other filesystem you wouldn't even have those. So you should be thankful that XFS lets you tweak the default settings when somebody breaks something outside XFS, rather than complaining about it.

Cheers,

Dave.