Bug 13909
Summary: | md5sum on 3GB file is constantly hanging | ||
---|---|---|---|
Product: | File System | Reporter: | Rus (harbour) |
Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
Status: | CLOSED OBSOLETE | ||
Severity: | blocking | CC: | alan, devzero, mike, sandeen, tytso |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.30.4 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
Dmesg for 2.6.30.4
Config for 2.6.30.4 |
Description
Rus
2009-08-04 10:44:45 UTC
As an FYI, it doesn't seem to happen on a 2.6.31 kernel (not sure of the exact revision, not currently running an upstream kernel). I have a 2.6.30.4 installed, I'll check there as I'm sure that it'd be helpful to have a confirmation. Arrgh. I seem to have borked my access to my other system... I may be able to figure out what I did to not be able to boot it tomorrow, but until then, I can't yet confirm... if you got your access back - does "dd if=3gbfile of=/dev/null" show same behaviour (unkillable process) ? ànything in syslog/dmesg when this happens? i`d guess it`s a harddisk problem, but that doesn`t explain why you don`t have it with 2.6.31. I can't reproduce the problem in any of my installed 2.6.30.y stable kernels... The output from dmesg would likely be needed here. As for my inability to boot my 2.6.30 system, it seems there's a bug in Karmic (or grub2) that fails to generate the proper boot entries for my old system. Oops. Hello, this is not harddisk problem - long SMART test is successfull, so I do not see this behaviour on 2.6.29 kernel. Tried dd - the same result. Attached dmesg and config files. Created attachment 22639 [details]
Dmesg for 2.6.30.4
Dmesg for 2.6.30.4
Created attachment 22640 [details]
Config for 2.6.30.4
Config for 2.6.30.4
Does just the md5sum/dd hang or the whole system? If you can, run ps and see what state is the dd and/or md5sum in. If it's in "D" state try # echo w > /proc/sysrq-trigger and see what shows up in dmesg. This dumps processes in wait state. Otherwise # echo t > /proc/sysrq-trigger will dump all processes. give that a shot & attach the resulting dmesg. Thanks, -Eric No, all the system do not hang. All VFS operations except this particular 3GB file are normal. dd/md5sum process is in R+ state, accessing this 3GB file with any othr program (xxd) put it in D+ state. For ex. first I've run dd on this file, next xxd - ps output : ............... 3882 tty12 R+ 51:45 dd if=ZverDVD2009.iso of=/dev/null 3929 tty11 D+ 0:00 xxd /Arhiv/Soft/winz/ZverDVD2009.iso ................. blocked process (xxd) output : ................................................... SysRq : Show Blocked State task PC stack pid father xxd D 0000007b 0 3929 3429 f4d8cdf0 00000046 0000007b 0000007b 000000d8 000000e0 ffffff10 f1a631e0 c0b60080 c0b60080 c0b5c78c f1a631e0 c0b60080 000d1dcf f4d8cdf0 00000001 0000007b f1a63020 f1a19180 f1aec000 c27e8080 f1ae77f0 f4d8cdf0 c0142c69 Call Trace: [<c0142c69>] ? mutex_spin_on_owner+0xe9/0x110 [<c0742482>] __mutex_lock_slowpath+0xe2/0x250 [<c0741b7a>] mutex_lock+0x2a/0x50 [<c0387fb5>] ima_path_check+0xb5/0x370 [<c03793f4>] ? security_inode_permission+0x64/0x90 [<c0256263>] may_open+0xe3/0x440 [<c025a43b>] ? path_walk+0x9b/0xc0 [<c025bfde>] do_filp_open+0x2be/0xe70 [<c01f1224>] ? unlock_page+0x54/0x70 [<c0216ff8>] ? __do_fault+0x408/0x710 [<c026b5b9>] ? alloc_fd+0x49/0x1b0 [<c0246575>] do_sys_open+0xd5/0x1a0 [<c02466d9>] sys_open+0x39/0x60 [<c0104ed8>] syscall_call+0x7/0xb ............ Running (but hanged dd) trace: dd R running 0 3882 3430 00000003 a949c236 f56dbcb8 f1aec000 c03cd800 00000000 f1b15ea0 00000000 c0104f1e 00000003 0000000c 00000000 a949c236 f56dbcb8 f1aecc18 f620ba04 0000007b 0000007b 000000d8 000000e0 ffffff10 c9c20c55 f1aecc28 c026eba2 Call Trace: [<c03cd800>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c0104f1e>] ? restore_nocheck_notrace+0x0/0x10 [<c026eba2>] ? mnt_drop_write+0x32/0x200 [<c01f2b1a>] ? find_get_page+0x1aa/0x200 [<c026719f>] ? touch_atime+0x6f/0x310 [<c01f45dd>] ? generic_file_aio_read+0xb6d/0xb80 [<c02471dc>] ? do_sync_read+0xdc/0x150 [<c0158bf7>] ? irq_exit+0x57/0xd0 [<c0173f50>] ? autoremove_wake_function+0x0/0x70 [<c024737f>] ? rw_verify_area+0x12f/0x1e0 [<c02477fa>] ? vfs_read+0x16a/0x250 [<c0247100>] ? do_sync_read+0x0/0x150 [<c0251286>] ? kernel_read+0x46/0x70 [<c0388b74>] ? ima_calc_hash+0x1f4/0x260 [<c0388c67>] ? ima_collect_measurement+0x87/0xc0 [<c03881a6>] ? ima_path_check+0x2a6/0x370 [<c0256263>] ? may_open+0xe3/0x440 [<c025a43b>] ? path_walk+0x9b/0xc0 [<c025bfde>] ? do_filp_open+0x2be/0xe70 [<c01f1224>] ? unlock_page+0x54/0x70 [<c0216ff8>] ? __do_fault+0x408/0x710 [<c026b5b9>] ? alloc_fd+0x49/0x1b0 [<c0246575>] ? do_sys_open+0xd5/0x1a0 [<c02466d9>] ? sys_open+0x39/0x60 [<c0104ed8>] ? syscall_call+0x7/0xb ............................................... So it doesn't matter what file I'm selecting, only needs it it be big enough. this file is on an ext4-fs, correct ? is there a chance that you put it on another fs (e.g. ext3, reiser) and compare the behaviour? if there is no other partition for free use, you may try attaching an usb disk or use a remote share via nfs or cifs. Yes, you are right - the fs is ext4. Checked on USB connected harddrive (WD Raptor RAID Edition) : reiserfs - fail ext2 - fail ext3 - fail ntfs (ntfs-3g 1.5130) - fail What is good, that after disconnecting USB HDD, in several minutes - dd exits on I/O error, so hang is occured only while USB HDD is connected. You said that the md5sum is hanging in the R state, yes? If so, it's not hanging at all. It's still computing the MD5 sum. What sort of disks are you using (IDE/SCSI/SATA)? What sort of CPU (cat /proc/cpuinfo)? How much RAM (free)? mbt@zest:~$ time md5sum sample.3gb c698c87fb53058d493492b61f4c74189 sample.3gb real 0m12.017s user 0m8.561s sys 0m3.276s This is on a system that isn't heavily loaded, with a 3.0Gbps SATA II drive holding the file. Try this to see if it really is hanging or not. Install the "pv" utility. It is available from http://www.ivarch.com/programs/pv.shtml -- and in many distributions (I can say for certain that it is in Ubuntu's repositories at the very least, and am pretty sure that it's on Debian and RH repositories, too). Then, run: cat YOUR_3GB_FILE | pv | md5sum This will show you how much data is being pushed through the pipe, and at what rate of speed. My system is able to push about 200MB/s through pv. This will give you a better, visual indication of what is going on. When I said "hang" - its mean "hang", not "computing". Your program sais "0 bytes" in 2.6.30.4 and "114 Mb/s" in 2.6.29.x. Forgot to mention - my hardware is Acer Ferrari 1100, which has X2 CPU 2.3Ghz and 4GB RAM. As the prorams hangs, I use for the test the fresh booted system. You mentioned this happens in all filesystems so it's not an EXT4 problem. You also said it's a USB drive, which leads me to think it might be the following issue with the ehci_hcd kernel module, which I also have: http://bugzilla.kernel.org/show_bug.cgi?id=11159 You can verify this by tailing /var/log/kern.log and looking for USB related errors when pv's traffic stalls. The current lame fixup is to rmmod ehci_hcd, thus switching to USB 1.1 and limiting transfer rate to 1.5MB/s. I also suggest you try to md5 without ehci_hcd to see if the problem stops happening. so, does this ONLY happen on usb drive or does it ALSO happen there, i.e. does the same issue happen with internal ide/sata disk ? The bug does not related to USB. I found this bug file md5sum'ing ordinary local HDD file. The Roland Kletzing asked me few posts back to put this file on USB - so I did this. My original regression report was and is : Read access to any 3GB file on my local filesystem hangs the accessing process forever - it is not killable. This behaviour only found on 2.6.30.x kernels, particular on 2.6.30.4 which I'm running. Booting the same hardware to 2.6.29.x eliminates the problem. I just tried doing an md5sum on a 3GB file using a 2.6.31-rc5 kernel --- no problem. I'll have to try building a 2.6.30 kernel and try to replicate the problem. maybe running "strace md5sum bigfile 2>&1 | tail -n 25" "strace dd if=bigfile of=/dev/null 2>&1 |tail -n 25" gives some additional information where it gets stuck. sorry if this information already hides in the stacktrace - i`m not familar with reading those. also contents from /proc/sched_debug and checking /proc/$PID_OF_STUCK_TASK/stat for increasing values (utime/stime/eip) would perhaps give additional information. to see if this is scheduling/smp issue, you could either boot with 'maxcpus=1' into single cpu mode or start your tasks with cpu affinity "taskset -c 0 cmdname" or "taskset -c 1 cmdname". does that make any difference ? Something else useful would be to try grabbing the output of sysrq-l and sysrq-w. 1. Strace hangs in open(big_file). ....................................................................... open("/usr/lib/locale/ru_UA/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=57, ...}) = 0 mmap2(NULL, 57, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80cc000 close(3) = 0 open("/usr/lib/locale/ru_UA/LC_MONETARY", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=291, ...}) = 0 mmap2(NULL, 291, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80cb000 close(3) = 0 open("/usr/lib/locale/ru_UA/LC_COLLATE", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=21143, ...}) = 0 mmap2(NULL, 21143, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80c5000 close(3) = 0 open("/usr/lib/locale/ru_UA/LC_TIME", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=2359, ...}) = 0 mmap2(NULL, 2359, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80c4000 close(3) = 0 open("/usr/lib/locale/ru_UA/LC_NUMERIC", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=55, ...}) = 0 mmap2(NULL, 55, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80c3000 close(3) = 0 open("/usr/lib/locale/ru_UA/LC_CTYPE", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=222300, ...}) = 0 mmap2(NULL, 222300, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7ee8000 close(3) = 0 open("/Arhiv/Soft/winz/ZverDVD2009.iso", O_RDONLY|O_LARGEFILE ............................................................... 2. /proc/sched_debug - do not have such file. 3. As I see only one field is incrementing in /proc/$STUCK_PID/stat : cat /proc/3561/stat 3561 (dd) R 1 3560 3448 1036 3448 4194304 229 0 0 0 0 43525 0 0 20 0 1 0 14243 2289664 166 4294967295 134512640 134564808 0 0 0 0 0 0 0 4294967295 0 0 17 0 0 0 3494 0 0 cat /proc/3561/stat 3561 (dd) R 1 3560 3448 1036 3448 4194304 229 0 0 0 0 45054 0 0 20 0 1 0 14243 2289664 166 4294967295 134512640 134564808 0 0 0 0 0 0 0 4294967295 0 0 17 0 0 0 3494 0 0 4. taskset -c 0/1 is not influencing on hang. 5. sysrq-w already posted. sysrq-l : ............................................................... SysRq : Show backtrace of all active CPUs CPU0: f2d0dba0 ac4662f2 00000046 00000001 f2d0dbe8 f2d0dbc8 c01093ff 00000000 c08762a0 ac4662f2 f2d0dbe0 c049c46c c08a6750 00000000 ac4662f2 c27e4b6c f2d0dc00 c018e7ab f2d0dbe8 f2d0dbe8 ac4662f2 edd7f800 00000000 00000001 Call Trace: [<c01093ff>] ? show_stack+0x2f/0x50 [<c049c46c>] ? showacpu+0x7c/0x90 [<c018e7ab>] ? generic_smp_call_function_single_interrupt+0x7b/0xc0 [<c011d770>] ? smp_call_function_single_interrupt+0x30/0x60 [<c01055aa>] ? call_function_single_interrupt+0x36/0x3c [<c01f3b62>] ? generic_file_aio_read+0xf2/0xb80 [<c01789a7>] ? enqueue_hrtimer+0xa7/0xe0 [<c02471dc>] ? do_sync_read+0xdc/0x150 [<c03cd800>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c0173f50>] ? autoremove_wake_function+0x0/0x70 [<c024737f>] ? rw_verify_area+0x12f/0x1e0 [<c02477fa>] ? vfs_read+0x16a/0x250 [<c0247100>] ? do_sync_read+0x0/0x150 [<c0251286>] ? kernel_read+0x46/0x70 [<c0388b74>] ? ima_calc_hash+0x1f4/0x260 [<c0388c67>] ? ima_collect_measurement+0x87/0xc0 [<c03881a6>] ? ima_path_check+0x2a6/0x370 [<c0256263>] ? may_open+0xe3/0x440 [<c025a43b>] ? path_walk+0x9b/0xc0 [<c025bfde>] ? do_filp_open+0x2be/0xe70 [<c026b5b9>] ? alloc_fd+0x49/0x1b0 [<c0246575>] ? do_sys_open+0xd5/0x1a0 [<c02466d9>] ? sys_open+0x39/0x60 [<c0104ed8>] ? syscall_call+0x7/0xb ........................................................ Sorry, when I said sysrq-w already posted - I mean that there is empty output for it when hang occured : .................................................. SysRq : Show Blocked State task PC stack pid father ................................................. on 2009-02-05 there was a commit of IMA - see http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3323eec921efd815178a23107ab63588c605c0b2 that seems in your codepath and is also enabled in .config: CONFIG_IMA=y CONFIG_IMA_MEASURE_PCR_IDX=10 CONFIG_IMA_AUDIT=y as it is not in defconfig, it looks a little bit suspicious to me and i`m wondering why it is active on your system. i don`t really have a clue if this really is the culprit, but if this but was on my system, i would give it at try and disable ima to see if it makes a difference. don`t know how this can be enabled at runtime/boot. if a runtime switch doesn`t exist, you may bug the developer for providing one. ah, and there is http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1a62e958fa4aaeeb752311b4f5e16b2a86737b23 that also looks suspicious >so I do not see this behaviour on 2.6.29 kernel.
btw, 2.6.29 did not have IMA.
CONFIG_IMA is active on my system because I'm playing with TPM chip. So I wonder if all people will use defconfig. Anyway - disabling CONFIG_IMA solves the problem. Thanks. we got some feedback from the ima developers: >> > > if this is a bug fixed in .31 (perhaps >> http://git.kernel.org/?>p=linux/kernel/git/torvalds/linux->2.6.git;a=commit;h=1a62e958fa4aaeeb752311b4f5e16b2a86737b23 >> > > ???) , shouldn`t that go to -stable, too ? >> > >> > Yes, it's in 2.6.30.1. >> >> so, is the reported one perhaps another issue? >> the user with that problem is using 2.6.30.4 > >No, I don't think so. The default measurement policy in 2.6.30 measures >all executables and all files opened for read by root. For 2.6.31, this >policy is enabled only if the kernel command line option "ima_tcb=1" is >specified. > >You're probably opening a 3GB file for read as root, which causes the >file to be measured, not really something you'd want to do as it takes a >real long time. Depending on what is causing the file to be measured, >you could run the app as user. The other option is to modify the IMA >measurement policy. (Refer to >linux-2.6.30.X/Documentation/ABI/testing/ima_policy.) > >Mimi so, before closing, it would make sense if you could do the following (with ima enabled): 1. do the tests (md5sum/dd) as non-root user. that should make a difference 2. do the test with a 1gb, 2gb and 3gb file and check the time needed to complete. does that scale linear? does the 2gb complete in reasonable time, but not the 3gb file ? 1. The test from non-root user passes. 2. Testing as root on IMA enabled kernel passes only files up to some size, seems like 1GB is limit. All files that are bigger - hangs forever. The below 1GB file test on IMA kernel - it completed in 4 seconds. But files more than 1GB (I've tested 2GB and 3GB) hangs forever - I was waiting for 10 minuites - no success. [Hors]:rus:~ # time -p md5sum 1GB 5430f5bdcde8f7c3c222df24eed64741 1GB real 4.39 user 2.84 sys 1.45 It is very strange that such size limit is present and it is not configurable. Or this is IMA bug. rus,any chance to try the attached patch and confirm if it is the real fix for the problem ? We're now able to duplicate this bug. Not sure why it only affects ext4. ima: kernel_read parm integer overflow bugfix Reading large files >2G causes process to loop. (http://bugzilla.kernel.org/show_bug.cgi?id=13909) Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Index: security-testing-2.6/security/integrity/ima/ima_crypto.c =================================================================== --- security-testing-2.6.orig/security/integrity/ima/ima_crypto.c +++ security-testing-2.6/security/integrity/ima/ima_crypto.c @@ -47,7 +47,8 @@ int ima_calc_hash(struct file *file, cha struct scatterlist sg[1]; loff_t i_size; char *rbuf; - int rc, offset = 0; + unsigned long offset = 0; + int rc; rc = init_desc(&desc); if (rc != 0) Yes, this fixed the problem for 3GB file. But I've tried to cat 11GB file - full HDD activity for 3 hours - nothing happens. But dd exits with Ctrl-C at least. May be unsigned long is too small for max file size limit that exists in Linux filesystems ? Reply-To: rwheeler@redhat.com On 08/15/2009 09:54 AM, bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=13909 > > > > > > --- Comment #28 from Roland Kletzing<devzero@web.de> 2009-08-15 13:54:43 > --- > rus,any chance to try the attached patch and confirm if it is the real fix > for > the problem ? > > > > We're now able to duplicate this bug. Not sure why it only affects ext4. > > ima: kernel_read parm integer overflow bugfix > > Reading large files>2G causes process to loop. > (http://bugzilla.kernel.org/show_bug.cgi?id=13909) > > Signed-off-by: Mimi Zohar<zohar@us.ibm.com> > > Index: security-testing-2.6/security/integrity/ima/ima_crypto.c > =================================================================== > --- security-testing-2.6.orig/security/integrity/ima/ima_crypto.c > +++ security-testing-2.6/security/integrity/ima/ima_crypto.c > @@ -47,7 +47,8 @@ int ima_calc_hash(struct file *file, cha > struct scatterlist sg[1]; > loff_t i_size; > char *rbuf; > - int rc, offset = 0; > + unsigned long offset = 0; > + int rc; > > rc = init_desc(&desc); > if (rc != 0) > > I suspect that you will need to use the official offset type to avoid overflow at 4GB - loff_t should be more consistent, Ric |