Bug 14972
| Summary: | [regression] msync() call on ext4 causes disk thrashing | | |
|---|---|---|---|
| Product: | File System | Reporter: | Artem S. Tashkinov (aros) |
| Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | high | CC: | brian, s.zuban, sandeen, tytso |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.32.2 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | run this way ./test test.txt 100 | | |
| | .config; dmesg; lspci; hdparm -I | | |
| | trace pipe output | | |
| | blktrace for /dev/sda while running the test application | | |
Created attachment 24399 [details]
.config; dmesg; lspci; hdparm -I
Regression from what? Ext3? Or some earlier kernel version?
What arguments are you giving to this test program of yours? It looks like it does some number of reads and/or writes to the file, and then calls 10,000 msyncs with a 50ms wait between each msync.
I'm not seeing any disk activity as a result. Was anything else reading or writing to the file or to the file system at the same time?

(In reply to comment #2)
> Regression from what? Ext3? Or some earlier kernel version?
Regression from ext3.
> What arguments are you giving to this test program of yours? It looks like it
> does some number of reads and/or writes to the file, and then calls 10,000
> msyncs with 50ms wait between each msync.
While those msync()s are being issued the mmap'ed file is unchanged, but ...
> I'm not seeing any disk activity as a result. Was anything else reading or
> writing to the file or to the file system at the same time?
... my HDD LED keeps flashing continuously, which certainly means some disk activity. Should I attach a video showing this abnormality on kernel 2.6.32.2 in runlevel 1 with no application running except bash and this application?

OK, so that's not technically a regression as far as the kernel bugzilla "Regression" field is concerned.
>... my HDD led keeps flashing continuously which certainly means some disk
>activity. Should I attach a video showing this abnormality on kernel 2.6.32.2
>on runlevel 1 with no application running except bash and this application?
What would be much more useful would be to install blktrace, and then attach the output of "btrace /dev/sdXX" while this application is running.
When I do the test, I am seeing some excess write barriers which we can optimize away:
254,3 0 1148 31.613064584 10904 Q WB [test]
254,3 0 1149 31.664270330 10904 Q WB [test]
254,3 0 1150 31.715259078 10904 Q WB [test]
254,3 0 1151 31.772662156 10904 Q WB [test]
254,3 0 1152 31.827932269 10904 Q WB [test]
254,3 0 1153 31.883122551 10904 Q WB [test]
but that's not a disaster. (If there are no pending writes from other applications, this won't cause any extra hard drive activity.)
I'd like to confirm whether you are seeing anything more, since at least on my system, empty write barriers don't cause the hard drive activity light to go on. Maybe they do on your system, though. I'd like to confirm this because if it's just a matter of extra (unnecessary) write barriers, we would treat this as a much lower priority bug than if there's something else going on.
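For anyone comparing their own btrace output against the lines quoted above, the following is a minimal Python sketch that counts queued write-barrier events and reports their average spacing. It assumes blkparse's default column order (device, CPU, sequence, timestamp, PID, action, RWBS), which is what the quoted lines appear to use; the function names `queued_barrier_times` and `mean_interval` are hypothetical helpers and are not part of blktrace.

```python
import re

# Assumed layout (blkparse default, as in the lines quoted in this report):
# device  cpu  sequence  timestamp  pid  action  RWBS  ...
LINE_RE = re.compile(
    r"^\s*(?P<dev>\d+,\d+)\s+(?P<cpu>\d+)\s+(?P<seq>\d+)\s+"
    r"(?P<time>\d+\.\d+)\s+(?P<pid>\d+)\s+(?P<action>\S+)\s+(?P<rwbs>\S+)"
)


def queued_barrier_times(lines, process=None):
    """Return timestamps (seconds) of queued (Q) write-barrier (WB) events.

    If `process` is given, keep only lines tagged with "[process]".
    """
    times = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m["action"] != "Q" or m["rwbs"] != "WB":
            continue
        if process is not None and f"[{process}]" not in line:
            continue
        times.append(float(m["time"]))
    return times


def mean_interval(times):
    """Average gap in seconds between consecutive timestamps, or None."""
    if len(times) < 2:
        return None
    return (times[-1] - times[0]) / (len(times) - 1)


# The six trace lines quoted in this report.
SAMPLE = """\
254,3 0 1148 31.613064584 10904 Q WB [test]
254,3 0 1149 31.664270330 10904 Q WB [test]
254,3 0 1150 31.715259078 10904 Q WB [test]
254,3 0 1151 31.772662156 10904 Q WB [test]
254,3 0 1152 31.827932269 10904 Q WB [test]
254,3 0 1153 31.883122551 10904 Q WB [test]
"""

if __name__ == "__main__":
    times = queued_barrier_times(SAMPLE.splitlines(), process="test")
    print(len(times), "queued write barriers")
    print(f"mean spacing: {mean_interval(times) * 1000:.1f} ms")
```

On the six quoted lines this reports six queued barriers roughly 54 ms apart, which would be consistent with one barrier per msync() call followed by the 50ms wait described earlier. Whether those barriers reach the platters is a separate question that this sketch does not answer.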
What debug options are needed for blktrace?
blktrace /dev/sda
Invalid debug path /sys/kernel/debug: 2/No such file or directory

You need to compile the kernel with CONFIG_BLK_DEV_IO_TRACE, and then you need to make sure that the debugfs file system is mounted on /sys/kernel/debug, i.e.:
mount -t debugfs none /sys/kernel/debug
The following documentation is from 2007, so there are some newer features (and the underlying implementation has moved to using ftrace), but the basic user interface hasn't changed much:
http://pdfedit.petricek.net/bt/file_download.php?file_id=17&type=bug
More information:
http://www.gelato.org/pdf/apr2006/gelato_ICE06apr_blktrace_brunelle_hp.pdf

Created attachment 24417 [details]
trace pipe output
blktrace produces zero output, so I'm attaching trace pipe output while running this test application.
I tried using blktrace this way:
# mount -t debugfs nul /sys/kernel/debug
# echo blk > /sys/kernel/debug/tracing/current_tracer
# echo 1 > /sys/block/sda/sda3/trace/enable
# blktrace /dev/sda3
BLKTRACESETUP(2) /dev/sda3 failed: 16/Device or resource busy
Thread 0 failed open /sys/kernel/debug/block/(null)/trace0: 2/No such file or directory
Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
FAILED to start thread on CPU 0: 1/Operation not permitted
FAILED to start thread on CPU 1: 1/Operation not permitted
I'm not sure why you are getting the "Device or resource busy" error from the BLKTRACESETUP ioctl. Are you sure this is a kernel with blktrace support configured in? And I assume the '#' prompt indicates that you are running as root. You don't have SELinux or some other LSM configured, do you?
Blktrace running alone does not produce output; it produces trace files for each CPU that can be parsed using the blkparse program. If you look back, you'll see that I suggested running the btrace program after you install the blktrace package. The btrace program is a convenience script which runs blktrace and blkparse to produce immediate output on standard out. There are man pages for blktrace, blkparse, and btrace which should have been installed when you installed the binaries for blktrace.
Still, it shouldn't have spit out those error messages....

grep BLK.*TRACE /usr/src/linux/.config
CONFIG_BLK_DEV_IO_TRACE=y
Yes, I'm running as root. SELinux is disabled in the kernel. I've no idea what to do :(

(In reply to comment #9)
> grep BLK.*TRACE /usr/src/linux/.config
>
> CONFIG_BLK_DEV_IO_TRACE=y
>
> Yes, I'm running under root. SeLinux is disabled in the kernel. I've no idea
> what to do :(
Drop the echos. Just mounting debugfs & running blktrace will suffice:
[root@inode ~]# mount -t debugfs none /sys/kernel/debug/
[root@inode ~]# blktrace /dev/sda
^C=== sda ===
CPU 0: 0 events, 0 KiB data
CPU 1: 1 events, 1 KiB data
Total: 1 events (dropped 0), 1 KiB data
but if you do this first:
[root@inode ~]# echo blk > /sys/kernel/debug/tracing/current_tracer
[root@inode ~]# echo 1 > /sys/block/sda/sda3/trace/enable
you get the -EBUSY:
[root@inode ~]# blktrace /dev/sda
BLKTRACESETUP(2) /dev/sda failed: 16/Device or resource busy
Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
Thread 0 failed open /sys/kernel/debug/block/(null)/trace0: 2/No such file or directory
FAILED to start thread on CPU 0: 1/Operation not permitted
FAILED to start thread on CPU 1: 1/Operation not permitted
It's not related to selinux.
-Eric

Created attachment 24427 [details]
blktrace for /dev/sda while running the test application
Thank you very much, Eric!
It seems like a lot of documentation needs to be updated.
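The advice in this exchange comes down to two things to check before running blktrace or btrace: debugfs should be mounted on /sys/kernel/debug, and the manual echo steps should not have been applied. Below is a minimal Python sketch of such a check. The function name `blktrace_preflight` is hypothetical, the file paths are the ones used in the comments above, and the sketch only inspects state. The thread shows -EBUSY after both echo commands were issued together; which of the two settings actually triggers it is not established here, so the sketch simply flags each one.

```python
def blktrace_preflight(mounts_text, current_tracer, trace_enable):
    """Return a list of likely problems to fix before running blktrace.

    mounts_text    -- contents of /proc/mounts
    current_tracer -- contents of /sys/kernel/debug/tracing/current_tracer
    trace_enable   -- contents of the partition's sysfs trace/enable file
    """
    problems = []
    # /proc/mounts fields: device, mount point, file system type, options, ...
    fields = (line.split() for line in mounts_text.splitlines())
    mounted = any(
        len(f) >= 3 and f[1] == "/sys/kernel/debug" and f[2] == "debugfs"
        for f in fields
    )
    if not mounted:
        problems.append("debugfs is not mounted on /sys/kernel/debug")
    # The two settings below are the "echos" the thread says to drop.
    if current_tracer.strip() == "blk":
        problems.append("current_tracer is set to 'blk'")
    if trace_enable.strip() == "1":
        problems.append("sysfs trace/enable is set to 1")
    return problems


def read_or_empty(path):
    """Read a small text file, returning '' if it is missing or unreadable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    found = blktrace_preflight(
        read_or_empty("/proc/mounts"),
        read_or_empty("/sys/kernel/debug/tracing/current_tracer"),
        read_or_empty("/sys/block/sda/sda3/trace/enable"),
    )
    for problem in found:
        print("warning:", problem)
    if not found:
        print("no obvious blktrace setup problems found")
```

Keeping the check as a pure function over file contents makes it usable on a saved copy of /proc/mounts from another machine; only the small `__main__` part touches the live system.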
Due to the bad implementation of some apps/libs, this bug causes continual disk access (the HDD LED is always on) and possible disk damage. Is there a chance to somehow speed up the release of a fix for this bug (it was reported over a year ago)? I believe the first step is to change the status to CONFIRMED. Many people noticed unusual disk activity after upgrading to ext4:
https://bbs.archlinux.org/viewtopic.php?pid=692073
http://bugs.winehq.org/show_bug.cgi?id=24044#c13
http://www.justlinux.com/forum/showthread.php?p=891737

In one of the recent kernels (somewhere around 2.6.38) this issue was fixed.

Can someone else confirm it was really fixed? I'm still getting continual HDD activity on Linux laptop 3.0.0-16-generic #28-Ubuntu SMP Fri Jan 27 17:50:54 UTC 2012 i686 i686 i386 GNU/Linux in some apps (e.g. Miranda IM http://bugs.winehq.org/show_bug.cgi?id=24044#c13)

(In reply to comment #14)
> can someone else confirm it was really fixed? I'm still getting continual HDD
> activity on Linux laptop 3.0.0-16-generic #28-Ubuntu SMP Fri Jan 27 17:50:54
> UTC 2012 i686 i686 i386 GNU/Linux in some apps (e.g. Miranda IM
> http://bugs.winehq.org/show_bug.cgi?id=24044#c13)
Please try the attachment from comment 1. If you run it, your HDD LED shouldn't light up; if it does, then this bug is certainly not fixed.
Like I said, "it works for me":
/dev/sdaX on / type ext4 (rw,noatime,nobarrier)
/dev/sdaX on /home type ext4 (rw,noatime,nobarrier)
Created attachment 24398 [details]
run this way ./test test.txt 100
The attached application causes useless disk thrashing on ext4, but works normally on ext3 (and possibly other file systems). msync() on unchanged mmap'ed files shouldn't cause any disk/HDD activity, yet that is what I observe.
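The attached test program is not reproduced in this report. As a rough illustration of the pattern being described (map a file, then call msync() repeatedly without touching the mapping), here is a minimal Python sketch. It is an assumption-laden stand-in, not the attachment: the function name `msync_unchanged`, its parameters, and the command-line arguments are hypothetical, and the 50 ms default delay is taken from the description in the comments. On Unix, CPython's `mmap.flush()` issues msync() over the mapping.

```python
import mmap
import os
import sys
import time


def msync_unchanged(path, iterations=100, delay_s=0.05):
    """Map an existing file and flush the untouched mapping repeatedly.

    Returns the number of flush (msync) calls made. The mapping is never
    written to, so ideally none of these calls should need to hit the disk.
    """
    with open(path, "r+b") as f:
        if os.fstat(f.fileno()).st_size == 0:
            # An empty file cannot be mapped, so give it one page of zeros.
            f.write(b"\0" * mmap.PAGESIZE)
            f.flush()
        with mmap.mmap(f.fileno(), 0) as mapping:  # length 0 maps the whole file
            calls = 0
            for _ in range(iterations):
                mapping.flush()  # msync() on the whole, unmodified mapping
                calls += 1
                time.sleep(delay_s)
            return calls


if __name__ == "__main__":
    # Hypothetical usage: python msync_loop.py <existing-file> <iterations>
    if len(sys.argv) == 3:
        done = msync_unchanged(sys.argv[1], int(sys.argv[2]))
        print(done, "msync() calls issued on an unchanged mapping")
    else:
        print("usage: msync_loop.py <existing-file> <iterations>")
```

Running this against a file on the file system under test while watching btrace output for the underlying device is one way to check whether msync() on an unchanged mapping still generates block-layer requests on a given kernel.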