Bug 3608

Summary: Files get corrupted when read from a network fs (nfs and smbfs) unless they're already cached locally
Product: File System
Component: Other
Status: REJECTED UNREPRODUCIBLE
Severity: high
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.9
Subsystem:
Regression: ---
Bisected commit-id:
Reporter: Thomas Lenherr (thomas)
Assignee: fs_other
CC: otheus, protasnb
Attachments: Further informations

Description Thomas Lenherr 2004-10-21 05:24:02 UTC
Distribution:
gentoo with vanilla kernel 2.6.9

Hardware Environment:
x86: dual xeon 3.06ghz

Problem Description:
If I read large files (~180 MB) from a network fs (nfs _and_ smb) that are not
yet cached, I get corrupt files (checked with md5sum, cmp and diff). But if I
read the file again right after the first read (while it's still in the local
memory cache), I get the file without any corruption.
If I read the files from a local fs, everything is fine.

Steps to reproduce:
Let's say /nfs is a directory on an nfs-server and /nfs/file has a size of 180 MB.
Further, let's say /local is a local directory and /local/sum.md5 contains the
_correct_ md5-sum of /nfs/file (I built it directly on the server and
transferred it another way, so this file _is_ correct).
Now I do the following:
cd /local
cp /nfs/file .    # I see a lot of traffic on eth0 while the file gets transferred
md5sum -c sum.md5 # Result: FAILED
rm file
cp /nfs/file .    # There's almost no load on eth0 as the file is still in the local cache
md5sum -c sum.md5 # Result: OK

So it seems something is wrong with the local file buffer or something like
that...
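
To narrow down where the corruption happens, a failed copy can be compared with a good one page by page. The following is only a minimal sketch: it assumes both copies are available locally, and the function name diff_pages and the default page size of 4096 bytes are assumptions, not something taken from the report.

```sh
#!/bin/sh
# Summarise how two copies of the same file differ.
# usage: diff_pages GOOD BAD [PAGE_SIZE]
# diff_pages is a hypothetical helper; PAGE_SIZE defaults to 4096 (an assumption).
diff_pages() {
    good=$1
    bad=$2
    page=${3:-4096}
    # "cmp -l" prints one line per differing byte: 1-based offset, then both
    # byte values. Its non-zero exit status for differing files is ignored.
    { cmp -l "$good" "$bad" 2>/dev/null || true; } |
        awk -v page="$page" '
            {
                p = int(($1 - 1) / page)      # page index of this byte
                if (!(p in seen)) {
                    seen[p] = 1
                    pages++
                }
                bytes++
            }
            END {
                printf "%d differing bytes in %d pages of %d bytes\n", bytes, pages, page
            }'
}
```

Called as `diff_pages good_copy bad_copy`, it prints a single summary line. If the damage turns out to be confined to whole pages, that would point more towards the page cache than towards the network transfer, but this is a guess that the output alone cannot prove.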

OK, I kept this report short, but I'll attach a mail I sent to LKML which
contains many more details about what I checked...

Sincerely,
 Thomas
Comment 1 Thomas Lenherr 2004-10-21 05:28:14 UTC
Created attachment 3871
Further informations

Here's a mail about this bug that I sent to the LKML, containing many more details

 Thomas Lenherr
Comment 2 Thomas Lenherr 2004-10-28 07:07:26 UTC
It's weird: I'm using 2 procs (P4) with HT activated, and if I deactivate HT I
experience far fewer problems (fewer files get corrupted on average), and if I
deactivate SMP support entirely (so I use only one proc without HT) everything
works fine, all problems gone!
So this seems to be an SMP-related thing rather than fs-related...
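
The difference between the HT/SMP configurations could be put into numbers by repeating the copy several times per configuration and counting checksum failures. A rough sketch, using md5sum as in the steps above: count_bad_copies and the EVICT_CMD hook are hypothetical names, not from the report. Since cached reads came back clean, the hook has to make sure the file is no longer in the local cache before each copy, for instance by remounting the share (assumed here to discard the cached pages).

```sh
#!/bin/sh
# Count how many of RUNS fresh copies of SRC fail the checksum in SUMFILE.
# usage: count_bad_copies SRC SUMFILE RUNS
# SUMFILE is expected to hold one line of md5sum output for the correct file.
# EVICT_CMD is a hypothetical hook: if set, it is run before every copy and
# should ensure SRC is not served from the local cache.
count_bad_copies() {
    src=$1
    sumfile=$2
    runs=$3
    # The first field of an md5sum line is the hash itself.
    want=$(head -n 1 "$sumfile" | cut -d' ' -f1)
    tmp=$(mktemp) || return 1
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        if [ -n "${EVICT_CMD:-}" ]; then
            sh -c "$EVICT_CMD"
        fi
        cp "$src" "$tmp"
        got=$(md5sum < "$tmp" | cut -d' ' -f1)
        [ "$got" = "$want" ] || failed=$((failed + 1))
        i=$((i + 1))
    done
    rm -f "$tmp"
    echo "$failed of $runs copies corrupt"
}
```

For example, `count_bad_copies /nfs/file /local/sum.md5 20` could be run once with HT on, once with HT off and once on a non-SMP kernel, and the three counts compared.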
Comment 3 Otheus 2007-02-19 14:00:37 UTC
I could NOT reproduce the problem on 2.6.9 [SMP] (RHEL 4 AS) on x86_64 (Opteron 880,
both client and server) using files of 168M and 1.6G.
Comment 4 Natalie Protasevich 2007-09-22 21:28:46 UTC
Does anyone still have this problem with recent kernels? It's been a while, so it must have been fixed.
Thanks.
Comment 5 Natalie Protasevich 2008-03-04 01:07:54 UTC
Closing the bug. Please reopen if confirmed with latest kernel.