Since I started using kernel 2.6.35-rc5, I've noticed that it behaves strangely with loop devices.

Steps to reproduce:
1) On top of an ext4 fs: $ dd if=/dev/zero of=ext2.img bs=100M count=20
2) mke2fs ext2.img
3) mount -o loop,noatime ext2.img /mnt/loop
4) Now start copying files (10-100MB in size) from any media to /mnt/loop using cp -a

Result: at times the computer stalls for 5-15 seconds with zero I/O activity, although the load average stays above 1.0. I estimate I/O activity by watching the GKRellM utility.

Expected result: a steady copying process.
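To quantify the stalls without a GUI, one option (just a sketch; the source path below is a placeholder) is to time each copy individually:

$ for f in /media/src/*; do time cp -a "$f" /mnt/loop/; done

An affected copy then shows up as a cp whose "real" time is several seconds longer than that of comparable files, while "user" and "sys" stay small.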
Can you bisect the issue?
Did you have any success debugging this issue further?
I have no idea how to debug it; however, this test case is 100% reproducible. http://www.mediafire.com/?ah63babkaabbfn2
Look at the manual page for git-bisect to see how to find the patch that introduced this regression; otherwise, I fear we may not know what or where to look. Jens, do you have any idea?
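A minimal bisect session, assuming v2.6.34 is the last known-good release (adjust the good/bad tags to whatever you actually tested), would look like this:

$ git bisect start
$ git bisect bad v2.6.35-rc5
$ git bisect good v2.6.34
  ... build and boot the kernel git checks out, run the loop-copy test, then:
$ git bisect good        # or: git bisect bad, depending on the result
  ... repeat until git names the first bad commit, then:
$ git bisect reset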
In 2.6.36 this issue has become even worse. After creating a 3GB loop-mounted ext2 partition in a file on top of an ext4 filesystem (using dd, so the file is fully allocated), I can only copy files in 300MB chunks, and after each chunk the system "hangs" for up to 20 seconds, doing nothing (everything else keeps working; only the copying process completely stalls, with a load average of 1.0). Filling this 3GB loop-mounted ext2 partition took roughly 5 minutes, even though my HDD is capable of 100MB/sec throughput. Something is terribly broken in the Linux kernel. The "host" filesystem's fragmentation is quite normal (~5%). All the filesystems in question are driven by the ext4 module.
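For reference, the backing file was created with a plain dd from /dev/zero, along these lines (the exact block size does not matter; the point is that every block gets written, so the file is fully allocated rather than sparse):

$ dd if=/dev/zero of=ext2.img bs=1M count=3072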
Checking whether the file is heavily fragmented:

$ time dd if=ext2.img of=/dev/zero
6144000+0 records in
6144000+0 records out
3145728000 bytes (3.1 GB) copied, 25.5556 s, 123 MB/s

real    0m25.676s
user    0m1.630s
sys     0m7.596s

It's obviously not fragmented at all. (Maximum HDD throughput here is 135MB/sec at the beginning of the disk.)
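As an additional check, filefrag (from e2fsprogs) reports the image's extent count directly; a small number of extents for a 3GB file confirms it is essentially contiguous:

$ filefrag ext2.img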
Did you have any luck using git-bisect?
This is an ext2 fs bug.
It's reproducible with ext4 as well under Linux 3.5.
I didn't face this issue with 3.2.0-23-generic (64-bit).
(In reply to comment #10)
> I didn't face this issue with 3.2.0-23-generic (64-bit).

Sorry, I was in fact able to reproduce it under 3.2.0-23-generic (64-bit).
I wasn't able to duplicate this with an underlying ext4 file system, with the loop file system being ext2, using a 3.5 kernel on a system with 16GB of memory.

I tried a variety of small and medium-sized files, as well as one very large file, and I didn't see any unexplained stuttering in write bandwidth. (There were times when cp -r was clearly writing to the in-memory page cache and the writeback hadn't begun yet. There were also times when we couldn't do any more writing, because we were busy reading from the source drive. But it all looked pretty normal to me.)
BTW, I was using iostat 1 to measure I/O activity. Call me old-school; I don't exactly trust GUI tools. :-)
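For a per-device breakdown (request queue size, utilization, and so on), the extended form is also handy:

$ iostat -x 1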
(In reply to comment #12)
> I wasn't able to duplicate this with an underlying ext4 file system, with
> the loop file system being ext2, using a 3.5 kernel on a system with 16GB
> of memory.
>
> I tried a variety of small and medium-sized files, as well as one very
> large file, and I didn't see any unexplained stuttering in write bandwidth.
> (There were times when cp -r was clearly writing to the in-memory page
> cache and the writeback hadn't begun yet. There were also times when we
> couldn't do any more writing, because we were busy reading from the source
> drive. But it all looked pretty normal to me.)

Hm, I cannot reproduce the problem with a loop device on top of tmpfs either, but I've just checked my 250GB ext4 partition and the bug is still there: when I try to copy a 3GB file (residing in tmpfs, so it's all cached) to it, there is a 4 (four) second delay before a single byte gets written to the destination partition. After that, all subsequent files get written without any delays.
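One way to see whether that 4-second gap is just deferred writeback (my own suggestion, not part of the original test) is to watch the dirty/writeback counters while the copy runs:

$ watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'

If Dirty grows immediately while Writeback stays near zero for a few seconds, the data is simply sitting in the page cache until the writeback thresholds or timers kick in.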
The delay before the write starts is normal; that's encoded in how the writeback code handles dirty pages. We don't start writes the instant that pages are dirtied in the buffer cache.

In any case, if you have arguments about how the writeback code makes its choices, it's kind of pointless to complain on a bug targeted at the ext4 developers, since we start the writeback when the VM system asks us to start writing pages belonging to a particular inode....

See /usr/src/linux/Documentation/sysctl/vm.txt for more information.
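The relevant knobs are the vm.dirty_* sysctls described in that file; roughly speaking, background writeback starts once dirty memory exceeds vm.dirty_background_ratio (or pages have been dirty longer than vm.dirty_expire_centisecs), and writers get throttled at vm.dirty_ratio. Their current values can be inspected with:

$ sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_expire_centisecs vm.dirty_writeback_centisecs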
(In reply to comment #15)
> The delay before the write starts is normal; that's encoded in how the
> writeback code handles dirty pages. We don't start writes the instant that
> pages are dirtied in the buffer cache.
>
> In any case, if you have arguments about how the writeback code makes its
> choices, it's kind of pointless to complain on a bug targeted at the ext4
> developers, since we start the writeback when the VM system asks us to
> start writing pages belonging to a particular inode....
>
> See /usr/src/linux/Documentation/sysctl/vm.txt for more information.

Thanks for letting us know; what I perceived as a regression in ext4 was probably in fact a change in the VM writeback code. It's still weird and terribly counterintuitive, since no other OS exhibits this behavior, but now that I know why it happens, I need to file a different bug report.
Created attachment 166031 [details]
Video demonstration

I'm NOT reopening this bug report. I'm merely adding a video clip which pertains to it.