Bug 13909 - md5sum on 3GB file is constantly hanging
Summary: md5sum on 3GB file is constantly hanging
Status: CLOSED OBSOLETE
Alias: None
Product: File System
Classification: Unclassified
Component: ext4 (show other bugs)
Hardware: All Linux
: P1 blocking
Assignee: fs_ext4@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-08-04 10:44 UTC by Rus
Modified: 2012-06-13 14:35 UTC (History)
5 users (show)

See Also:
Kernel Version: 2.6.30.4
Tree: Mainline
Regression: Yes


Attachments
Dmesg for 2.6.30.4 (31.25 KB, text/plain)
2009-08-08 19:41 UTC, Rus
Details
Config for 2.6.30.4 (99.79 KB, text/plain)
2009-08-08 19:42 UTC, Rus
Details

Description Rus 2009-08-04 10:44:45 UTC
Hi,

Simply pickup any 3GB file and try to md5sum it. Process is not killable. Plain Linus kernel, latest gcc and binutils. Laptop AMD X2 Acer Ferrari 1100 on SB600. Can supply any additional info.
Comment 1 Michael B. Trausch 2009-08-04 17:10:18 UTC
As an FYI, it doesn't seem to happen on a 2.6.31 kernel (not sure of the exact revision, not currently running an upstream kernel).  I have a 2.6.30.4 installed, I'll check there as I'm sure that it'd be helpful to have a confirmation.
Comment 2 Michael B. Trausch 2009-08-05 04:27:49 UTC
Arrgh.  I seem to have borked my access to my other system... I may be able to figure out what I did to not be able to boot it tomorrow, but until then, I can't yet confirm...
Comment 3 Roland Kletzing 2009-08-08 10:03:08 UTC
if you got your access back - does "dd if=3gbfile of=/dev/null" show same behaviour (unkillable process) ? 
ànything in syslog/dmesg when this happens?
i`d guess it`s a harddisk problem, but that doesn`t explain why you don`t have it with 2.6.31.
Comment 4 Michael B. Trausch 2009-08-08 17:10:05 UTC
I can't reproduce the problem in any of my installed 2.6.30.y stable kernels... The output from dmesg would likely be needed here.

As for my inability to boot my 2.6.30 system, it seems there's a bug in Karmic (or grub2) that fails to generate the proper boot entries for my old system.  Oops.
Comment 5 Rus 2009-08-08 19:41:05 UTC
Hello, this is not harddisk problem - long SMART test is successfull, so I do not see this behaviour on 2.6.29 kernel. Tried dd - the same result. Attached dmesg and config files.
Comment 6 Rus 2009-08-08 19:41:49 UTC
Created attachment 22639 [details]
Dmesg for 2.6.30.4

Dmesg for 2.6.30.4
Comment 7 Rus 2009-08-08 19:42:23 UTC
Created attachment 22640 [details]
Config for 2.6.30.4

Config for 2.6.30.4
Comment 8 Eric Sandeen 2009-08-08 19:54:49 UTC
Does just the md5sum/dd hang or the whole system?  If you can, run ps and see what state is the dd and/or md5sum in.  If it's in "D" state try 

# echo w > /proc/sysrq-trigger

and see what shows up in dmesg.  This dumps processes in wait state.

Otherwise

# echo t > /proc/sysrq-trigger

will dump all processes.

give that a shot & attach the resulting dmesg.

Thanks,
-Eric
Comment 9 Rus 2009-08-08 20:38:38 UTC
No, all the system do not hang. All VFS operations except this particular 3GB file are normal. dd/md5sum process is in R+ state, accessing this 3GB file with any othr program (xxd) put it in D+ state. For ex. first I've run dd on this file, next xxd - ps output :


...............
3882 tty12    R+    51:45 dd if=ZverDVD2009.iso of=/dev/null
3929 tty11    D+     0:00 xxd /Arhiv/Soft/winz/ZverDVD2009.iso
.................

blocked process (xxd) output :

...................................................
SysRq : Show Blocked State
  task                PC stack   pid father
xxd           D 0000007b     0  3929   3429
 f4d8cdf0 00000046 0000007b 0000007b 000000d8 000000e0 ffffff10 f1a631e0
 c0b60080 c0b60080 c0b5c78c f1a631e0 c0b60080 000d1dcf f4d8cdf0 00000001
 0000007b f1a63020 f1a19180 f1aec000 c27e8080 f1ae77f0 f4d8cdf0 c0142c69
Call Trace:
 [<c0142c69>] ? mutex_spin_on_owner+0xe9/0x110
 [<c0742482>] __mutex_lock_slowpath+0xe2/0x250
 [<c0741b7a>] mutex_lock+0x2a/0x50
 [<c0387fb5>] ima_path_check+0xb5/0x370
 [<c03793f4>] ? security_inode_permission+0x64/0x90
 [<c0256263>] may_open+0xe3/0x440
 [<c025a43b>] ? path_walk+0x9b/0xc0
 [<c025bfde>] do_filp_open+0x2be/0xe70
 [<c01f1224>] ? unlock_page+0x54/0x70
 [<c0216ff8>] ? __do_fault+0x408/0x710
 [<c026b5b9>] ? alloc_fd+0x49/0x1b0
 [<c0246575>] do_sys_open+0xd5/0x1a0
 [<c02466d9>] sys_open+0x39/0x60
 [<c0104ed8>] syscall_call+0x7/0xb

............

Running (but hanged dd) trace:

dd            R running      0  3882   3430
 00000003 a949c236 f56dbcb8 f1aec000 c03cd800 00000000 f1b15ea0 00000000
 c0104f1e 00000003 0000000c 00000000 a949c236 f56dbcb8 f1aecc18 f620ba04
 0000007b 0000007b 000000d8 000000e0 ffffff10 c9c20c55 f1aecc28 c026eba2
Call Trace:
 [<c03cd800>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c0104f1e>] ? restore_nocheck_notrace+0x0/0x10
 [<c026eba2>] ? mnt_drop_write+0x32/0x200
 [<c01f2b1a>] ? find_get_page+0x1aa/0x200
 [<c026719f>] ? touch_atime+0x6f/0x310
 [<c01f45dd>] ? generic_file_aio_read+0xb6d/0xb80
 [<c02471dc>] ? do_sync_read+0xdc/0x150
 [<c0158bf7>] ? irq_exit+0x57/0xd0
 [<c0173f50>] ? autoremove_wake_function+0x0/0x70
 [<c024737f>] ? rw_verify_area+0x12f/0x1e0
 [<c02477fa>] ? vfs_read+0x16a/0x250
 [<c0247100>] ? do_sync_read+0x0/0x150
 [<c0251286>] ? kernel_read+0x46/0x70
 [<c0388b74>] ? ima_calc_hash+0x1f4/0x260
 [<c0388c67>] ? ima_collect_measurement+0x87/0xc0
 [<c03881a6>] ? ima_path_check+0x2a6/0x370
 [<c0256263>] ? may_open+0xe3/0x440
 [<c025a43b>] ? path_walk+0x9b/0xc0
 [<c025bfde>] ? do_filp_open+0x2be/0xe70
 [<c01f1224>] ? unlock_page+0x54/0x70
 [<c0216ff8>] ? __do_fault+0x408/0x710
 [<c026b5b9>] ? alloc_fd+0x49/0x1b0
 [<c0246575>] ? do_sys_open+0xd5/0x1a0
 [<c02466d9>] ? sys_open+0x39/0x60
 [<c0104ed8>] ? syscall_call+0x7/0xb
...............................................

So it doesn't matter what file I'm selecting, only needs it it be big enough.
Comment 10 Roland Kletzing 2009-08-10 09:31:41 UTC
this file is on an ext4-fs, correct ?

is there a chance that you put it on another fs (e.g. ext3, reiser) and compare the behaviour?

if there is no other partition for free use, you may try attaching an usb disk or use a remote share via nfs or cifs.
Comment 11 Rus 2009-08-10 13:33:32 UTC
Yes, you are right - the fs is ext4. Checked on USB connected harddrive (WD Raptor RAID Edition) :

reiserfs - fail
ext2 - fail
ext3 - fail
ntfs (ntfs-3g 1.5130) - fail

What is good, that after disconnecting USB HDD, in several minutes - dd exits on I/O error, so hang is occured only while USB HDD is connected.
Comment 12 Michael B. Trausch 2009-08-10 13:58:44 UTC
You said that the md5sum is hanging in the R state, yes?  If so, it's not hanging at all.  It's still computing the MD5 sum.  What sort of disks are you using (IDE/SCSI/SATA)?  What sort of CPU (cat /proc/cpuinfo)?  How much RAM (free)?

mbt@zest:~$ time md5sum sample.3gb 
c698c87fb53058d493492b61f4c74189  sample.3gb

real	0m12.017s
user	0m8.561s
sys	0m3.276s

This is on a system that isn't heavily loaded, with a 3.0Gbps SATA II drive holding the file.

Try this to see if it really is hanging or not.  Install the "pv" utility.  It is available from http://www.ivarch.com/programs/pv.shtml -- and in many distributions (I can say for certain that it is in Ubuntu's repositories at the very least, and am pretty sure that it's on Debian and RH repositories, too).

Then, run:

cat YOUR_3GB_FILE | pv | md5sum

This will show you how much data is being pushed through the pipe, and at what rate of speed.  My system is able to push about 200MB/s through pv.  This will give you a better, visual indication of what is going on.
Comment 13 Rus 2009-08-10 14:11:20 UTC
When I said "hang" - its mean "hang", not "computing". Your program sais "0 bytes" in 2.6.30.4 and "114 Mb/s" in 2.6.29.x.
Comment 14 Rus 2009-08-10 14:13:23 UTC
Forgot to mention - my hardware is Acer Ferrari 1100, which has X2 CPU 2.3Ghz and 4GB RAM. As the prorams hangs, I use for the test the fresh booted system.
Comment 15 Leonardo Mosquera 2009-08-10 14:34:40 UTC
You mentioned this happens in all filesystems so it's not an EXT4 problem.
You also said it's a USB drive, which leads me to think it might be the following issue with the ehci_hcd kernel module, which I also have: http://bugzilla.kernel.org/show_bug.cgi?id=11159

You can verify this by tailing /var/log/kern.log and looking for USB related errors when pv's traffic stalls.

The current lame fixup is to rmmod ehci_hcd, thus switching to USB 1.1 and limiting transfer rate to 1.5MB/s. I also suggest you try to md5 without ehci_hcd to see if the problem stops happening.
Comment 16 Roland Kletzing 2009-08-10 15:11:09 UTC
so, does this ONLY happen on usb drive or does it ALSO happen there, i.e. does the same issue happen with internal ide/sata disk ?
Comment 17 Rus 2009-08-10 15:58:54 UTC
The bug does not related to USB. I found this bug file md5sum'ing ordinary local HDD file. The Roland Kletzing asked me few posts back to put this file on USB - so I did this. My original regression report was and is :

Read access to any 3GB file on my local filesystem hangs the accessing process forever - it is not killable.

This behaviour only found on 2.6.30.x kernels, particular on 2.6.30.4 which I'm running. Booting the same hardware to 2.6.29.x eliminates the problem.
Comment 18 Theodore Tso 2009-08-10 17:41:20 UTC
I just tried doing an md5sum on a 3GB file using a 2.6.31-rc5 kernel --- no problem.  I'll have to try building a 2.6.30 kernel and try to replicate the problem.
Comment 19 Roland Kletzing 2009-08-10 18:21:23 UTC
maybe running 

"strace md5sum bigfile 2>&1 | tail -n 25"
"strace dd if=bigfile of=/dev/null 2>&1 |tail -n 25"

gives some additional information where it gets stuck.
sorry if this information already hides in the stacktrace - i`m not familar with reading those. 


also contents from 
/proc/sched_debug

and checking 
/proc/$PID_OF_STUCK_TASK/stat for increasing values (utime/stime/eip)

would perhaps give additional information.

to see if this is scheduling/smp issue, you could either boot with 'maxcpus=1' into single cpu mode or start your tasks with cpu affinity "taskset -c 0 cmdname" or "taskset -c 1 cmdname". does that make any difference ?
Comment 20 Theodore Tso 2009-08-10 18:39:38 UTC
Something else useful would be to try grabbing the output of sysrq-l and sysrq-w.
Comment 21 Rus 2009-08-10 20:11:15 UTC
1. Strace hangs in open(big_file).

.......................................................................
open("/usr/lib/locale/ru_UA/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=57, ...}) = 0
mmap2(NULL, 57, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80cc000
close(3)                                = 0
open("/usr/lib/locale/ru_UA/LC_MONETARY", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=291, ...}) = 0
mmap2(NULL, 291, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80cb000
close(3)                                = 0
open("/usr/lib/locale/ru_UA/LC_COLLATE", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=21143, ...}) = 0
mmap2(NULL, 21143, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80c5000
close(3)                                = 0
open("/usr/lib/locale/ru_UA/LC_TIME", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=2359, ...}) = 0
mmap2(NULL, 2359, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80c4000
close(3)                                = 0
open("/usr/lib/locale/ru_UA/LC_NUMERIC", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=55, ...}) = 0
mmap2(NULL, 55, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb80c3000
close(3)                                = 0
open("/usr/lib/locale/ru_UA/LC_CTYPE", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=222300, ...}) = 0
mmap2(NULL, 222300, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7ee8000
close(3)                                = 0
open("/Arhiv/Soft/winz/ZverDVD2009.iso", O_RDONLY|O_LARGEFILE
...............................................................

2. /proc/sched_debug - do not have such file.

3. As I see only one field is incrementing in /proc/$STUCK_PID/stat :

cat /proc/3561/stat
3561 (dd) R 1 3560 3448 1036 3448 4194304 229 0 0 0 0 43525 0 0 20 0 1 0 14243 2289664 166 4294967295 134512640 134564808 0 0 0 0 0 0 0 4294967295 0 0 17 0 0 0 3494 0 0

cat /proc/3561/stat
3561 (dd) R 1 3560 3448 1036 3448 4194304 229 0 0 0 0 45054 0 0 20 0 1 0 14243 2289664 166 4294967295 134512640 134564808 0 0 0 0 0 0 0 4294967295 0 0 17 0 0 0 3494 0 0

4. taskset -c 0/1 is not influencing on hang.

5. sysrq-w already posted. sysrq-l :

...............................................................
SysRq : Show backtrace of all active CPUs
CPU0:
 f2d0dba0 ac4662f2 00000046 00000001 f2d0dbe8 f2d0dbc8 c01093ff 00000000
 c08762a0 ac4662f2 f2d0dbe0 c049c46c c08a6750 00000000 ac4662f2 c27e4b6c
 f2d0dc00 c018e7ab f2d0dbe8 f2d0dbe8 ac4662f2 edd7f800 00000000 00000001
Call Trace:
 [<c01093ff>] ? show_stack+0x2f/0x50
 [<c049c46c>] ? showacpu+0x7c/0x90
 [<c018e7ab>] ? generic_smp_call_function_single_interrupt+0x7b/0xc0
 [<c011d770>] ? smp_call_function_single_interrupt+0x30/0x60
 [<c01055aa>] ? call_function_single_interrupt+0x36/0x3c
 [<c01f3b62>] ? generic_file_aio_read+0xf2/0xb80
 [<c01789a7>] ? enqueue_hrtimer+0xa7/0xe0
 [<c02471dc>] ? do_sync_read+0xdc/0x150
 [<c03cd800>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c0173f50>] ? autoremove_wake_function+0x0/0x70
 [<c024737f>] ? rw_verify_area+0x12f/0x1e0
 [<c02477fa>] ? vfs_read+0x16a/0x250
 [<c0247100>] ? do_sync_read+0x0/0x150
 [<c0251286>] ? kernel_read+0x46/0x70
 [<c0388b74>] ? ima_calc_hash+0x1f4/0x260
 [<c0388c67>] ? ima_collect_measurement+0x87/0xc0
 [<c03881a6>] ? ima_path_check+0x2a6/0x370
 [<c0256263>] ? may_open+0xe3/0x440
 [<c025a43b>] ? path_walk+0x9b/0xc0
 [<c025bfde>] ? do_filp_open+0x2be/0xe70
 [<c026b5b9>] ? alloc_fd+0x49/0x1b0
 [<c0246575>] ? do_sys_open+0xd5/0x1a0
 [<c02466d9>] ? sys_open+0x39/0x60
 [<c0104ed8>] ? syscall_call+0x7/0xb
........................................................
Comment 22 Rus 2009-08-10 20:13:54 UTC
Sorry, when I said sysrq-w already posted - I mean that there is empty output for it when hang occured :

..................................................
SysRq : Show Blocked State
  task                PC stack   pid father
.................................................
Comment 23 Roland Kletzing 2009-08-10 21:13:39 UTC
on 2009-02-05 there was a commit of IMA - see http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3323eec921efd815178a23107ab63588c605c0b2

that seems in your codepath and is also enabled in .config:

CONFIG_IMA=y
CONFIG_IMA_MEASURE_PCR_IDX=10
CONFIG_IMA_AUDIT=y

as it is not in defconfig, it looks a little bit suspicious to me and i`m wondering why it is active on your system.

i don`t really have a clue if this really is the culprit, but if this but was on my system, i would give it at try and disable ima to see if it makes a difference. don`t know how this can be enabled at runtime/boot. if a runtime switch doesn`t exist, you may bug the developer for providing one.

ah, and there is http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1a62e958fa4aaeeb752311b4f5e16b2a86737b23
that also looks suspicious
Comment 24 Roland Kletzing 2009-08-10 21:23:59 UTC
>so I do not see this behaviour on 2.6.29 kernel.
btw, 2.6.29 did not have IMA.
Comment 25 Rus 2009-08-10 21:59:53 UTC
CONFIG_IMA is active on my system because I'm playing with TPM chip. So I wonder if all people will use defconfig. Anyway - disabling CONFIG_IMA solves the problem. Thanks.
Comment 26 Roland Kletzing 2009-08-11 18:47:01 UTC
we got some feedback from the ima developers:

>> > > if this is a bug fixed in .31 (perhaps
>> http://git.kernel.org/?>p=linux/kernel/git/torvalds/linux->2.6.git;a=commit;h=1a62e958fa4aaeeb752311b4f5e16b2a86737b23
>> > > ???) , shouldn`t that go to -stable, too ?
>> > 
>> > Yes, it's in 2.6.30.1.
>> 
>> so, is the reported one perhaps another issue? 
>> the user with that problem is using 2.6.30.4
>
>No, I don't think so. The default measurement policy in 2.6.30 measures
>all executables and all files opened for read by root. For 2.6.31, this
>policy is enabled only if the kernel command line option "ima_tcb=1" is
>specified.
>
>You're probably opening a 3GB file for read as root, which causes the
>file to be measured, not really something you'd want to do as it takes a
>real long time. Depending on what is causing the file to be measured,
>you could run the app as user. The other option is to modify the IMA
>measurement policy. (Refer to
>linux-2.6.30.X/Documentation/ABI/testing/ima_policy.)
>
>Mimi


so, before closing, it would make sense if you could do the following (with ima enabled):

1. do the tests (md5sum/dd) as non-root user. that should make a difference
2. do the test with a 1gb, 2gb and 3gb file and check the time needed to complete. does that scale linear? does the 2gb complete in reasonable time, but not the 3gb file ?
Comment 27 Rus 2009-08-11 22:35:25 UTC
1. The test from non-root user passes.
2. Testing as root on IMA enabled kernel passes only files up to some size, seems like 1GB is limit. All files that are bigger - hangs forever. The below 1GB file test on IMA kernel - it completed in 4 seconds. But files more than 1GB (I've tested 2GB and 3GB) hangs forever - I was waiting for 10 minuites - no success. 

[Hors]:rus:~ # time -p md5sum 1GB 
5430f5bdcde8f7c3c222df24eed64741  1GB
real 4.39
user 2.84
sys 1.45

It is very strange that such size limit is present and it is not configurable. Or this is IMA bug.
Comment 28 Roland Kletzing 2009-08-15 13:54:43 UTC
rus,any chance to try the attached patch and confirm if it is the real fix for the problem ?



We're now able to duplicate this bug. Not sure why it only affects ext4.

ima: kernel_read parm integer overflow bugfix

Reading large files >2G causes process to loop.
(http://bugzilla.kernel.org/show_bug.cgi?id=13909)

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>

Index: security-testing-2.6/security/integrity/ima/ima_crypto.c
===================================================================
--- security-testing-2.6.orig/security/integrity/ima/ima_crypto.c
+++ security-testing-2.6/security/integrity/ima/ima_crypto.c
@@ -47,7 +47,8 @@ int ima_calc_hash(struct file *file, cha
struct scatterlist sg[1];
loff_t i_size;
char *rbuf;
- int rc, offset = 0;
+ unsigned long offset = 0;
+ int rc;

rc = init_desc(&desc);
if (rc != 0)
Comment 29 Rus 2009-08-17 10:23:00 UTC
Yes, this fixed the problem for 3GB file. But I've tried to cat 11GB file - full HDD activity for 3 hours - nothing happens. But dd exits with Ctrl-C at least. May be unsigned long is too small for max file size limit that exists in Linux filesystems ?
Comment 30 Anonymous Emailer 2009-08-17 14:29:19 UTC
Reply-To: rwheeler@redhat.com

On 08/15/2009 09:54 AM, bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=13909
>
>
>
>
>
> --- Comment #28 from Roland Kletzing<devzero@web.de>   2009-08-15 13:54:43
> ---
> rus,any chance to try the attached patch and confirm if it is the real fix
> for
> the problem ?
>
>
>
> We're now able to duplicate this bug. Not sure why it only affects ext4.
>
> ima: kernel_read parm integer overflow bugfix
>
> Reading large files>2G causes process to loop.
> (http://bugzilla.kernel.org/show_bug.cgi?id=13909)
>
> Signed-off-by: Mimi Zohar<zohar@us.ibm.com>
>
> Index: security-testing-2.6/security/integrity/ima/ima_crypto.c
> ===================================================================
> --- security-testing-2.6.orig/security/integrity/ima/ima_crypto.c
> +++ security-testing-2.6/security/integrity/ima/ima_crypto.c
> @@ -47,7 +47,8 @@ int ima_calc_hash(struct file *file, cha
> struct scatterlist sg[1];
> loff_t i_size;
> char *rbuf;
> - int rc, offset = 0;
> + unsigned long offset = 0;
> + int rc;
>
> rc = init_desc(&desc);
> if (rc != 0)
>
>    

I suspect that you will need to use the official offset type to avoid 
overflow at 4GB - loff_t should be more consistent,

Ric

Note You need to log in before you can comment on or make changes to this bug.