Bug 5085

Summary: NFS file corruption after cross-directory move and inode reuse on server
Product: File System Reporter: Hans P. Reiser (reiser)
Component: NFSAssignee: Trond Myklebust (trondmy)
Status: REJECTED UNREPRODUCIBLE    
Severity: high CC: akpm
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.11 Subsystem:
Regression: --- Bisected commit-id:
Attachments: Ensure we drop the inode after rename()

Description Hans P. Reiser 2005-08-17 16:43:41 UTC
Distribution: SuSE 9.3 professional
Hardware Environment: Intel Pentium 4, alternate Dual Xeon
Software Environment: SuSE 9.3 professional client with Debian/sarge
                      NFS server (user-space)
Problem Description:
Newly created files via NFS may contain content of larger, previously
deleted files with the same inode on the server. This breaks, e.g.,
svn checkouts to NFS mounted directories. Seems like some file content
is incorrectly cached on the client side.

Steps to reproduce (copy nfsbug.sh to empty direcotry on NFS)
--- nfsbug.sh ---
#!/bin/sh
mkdir tmp
echo "Some long line for nfs bug demonstration" > tmp/a
mv tmp/a a
echo "Some other text to be renamed to a" > tmp/a
mv tmp/a a
echo "short text" > tmp/b
cat tmp/b
--- end of nfsbug.sh ---

$./nfsbug.sh
short text
ine for nfs bug demonstration
$

Our NFS server runs user-level nfs on a ext3 file system. tcpdump on the
client reveals that the server uses the same NFS handle for the first
tmp/a (which is delete by the second mv operation) and the new tmp/b
file (this is due to the reuse of the inode of the deleted file).
Evidently, this causes the previous content of first tmp/a to be
written to tmp/b. Cross-directory move operations are essential to
reproduce this bug.

This bug breaks svn checkout to NFS.
This bug may have a serious security impacts if tmp/b can be created by a
different user than tmp/a, potentially leaking confidential content from
deleted tmp/a (unchecked)
Comment 1 Trond Myklebust 2005-08-17 16:57:11 UTC
Cannot reproduce with a sane NFS server. This looks like it is another
consequence of unfsd's broken handling of filehandles.
Comment 2 Trond Myklebust 2005-08-17 17:02:08 UTC
Against a netapp filer OnTap 7:
[trondmy@hammerfest gnurr]$ ~/nfsbug.sh
short text

Against a Linux kernel NFS 2.6.12
[trondmy@hammerfest trondmy]$ ~/nfsbug.sh
short text

Against Solaris 10
[trondmy@hammerfest trondmy]$ ~/nfsbug.sh
short text
Comment 3 Hans P. Reiser 2005-08-18 00:40:49 UTC
Yes, the problem only occurs if the nfs server re-uses a filehandle. While I agree
that this is better considered a broken nfs server than a broken client, I was
hoping that the interoperability with such broken servers could be increased. The
current behaviour breaks popular software, mainly Subversion.

I assume that a simple client-side change would remove this problem: If a nfs
*create* operation returns a (re-used) filehandle, all caches assosiated with
that handle should be invalidated. Of course, for "sane" servers, this is
unnecssary overhead. No chance for such a change?
Comment 4 Trond Myklebust 2005-08-18 07:37:07 UTC
If a (non-exclusive) create operation returns a filehandle that is in use by an
existing file, then that doesn't necessarily mean the filehandle has been
reused. An ordinary server may return an existing filehandle if a hard link to
that file exists.
The question here is rather why is the open() failing to truncate the file? I
have a feeling that we may be updating the attributes far too late in the case
when the file already exists.
Comment 5 Trond Myklebust 2005-08-18 14:16:45 UTC
Created attachment 5672 [details]
Ensure we drop the inode after rename()

Just out of interest. Does the following patch help fix your testcase? It will
never fix the case where someone does the same rename behind your back on
another client, but it might fix the single-client case.
Comment 6 Anonymous Emailer 2005-08-19 05:04:11 UTC
Reply-To: Hansi@i4.informatik.uni-erlangen.de


> Just out of interest. Does the following patch help fix your testcase? It will
> never fix the case where someone does the same rename behind your back on
> another client, but it might fix the single-client case.

Well, yes, it fixes my testcase. However, unfortunately it does not
fix the original problem with svn checkouts. I will look for another
testcase that reproduces the svn problem. Meanwhile, here is a short
strace excerpt of svn, which illustrates the problem:

7203  open("gctestenv/src/.svn/tmp/log", O_WRONLY|O_CREAT|O_EXCL, 0666) = 6
7203  write(6, "<modify-entry\n   committed-rev=\""..., 247) = 247
7203  close(6)                          = 0
7203  rename("gctestenv/src/.svn/tmp/log", "gctestenv/src/.svn/log") = 0
7203  stat64("gctestenv/src/.svn/log", {st_mode=S_IFREG|0644, st_size=305, ...}) = 0

There are 247 bytes written to the file. A stat directly after closing
it shows 305 bytes. These are the written bytes plus additional garbage,
apparently from a previously deleted file (deleted by a previous rename
operation).