Bug 9315 - Writing sparse files on nfs mounted filesystems produces wrong-placed holes
Summary: Writing sparse files on nfs mounted filesystems produces wrong-placed holes
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: File System
Classification: Unclassified
Component: NFS
Hardware: All Linux
Importance: P1 normal
Assignee: Trond Myklebust
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-11-06 04:01 UTC by Andreas Ley
Modified: 2007-11-16 11:02 UTC

See Also:
Kernel Version: 2.6.23.1
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
The 2007-10-20T13:49:53 version of security.debian.org::debian-security/dists/etch/updates/main/binary-i386/Packages.gz (188.50 KB, application/gzip)
2007-11-06 04:04 UTC, Andreas Ley
NFS: Fix a writeback race... (2.25 KB, patch)
2007-11-06 04:53 UTC, Trond Myklebust

Description Andreas Ley 2007-11-06 04:01:25 UTC
Most recent kernel where this bug did not occur: 2.6.22.11
Distribution: Debian Etch
Hardware Environment: Several x86 systems, Intel and AMD
Software Environment: Vanilla Debian Etch, self-compiled kernel from kernel.org
Problem Description: When sparse files are written to NFS-mounted filesystems, runs of NUL bytes appear at offsets where the source file has non-NUL data. The problem occurs with multiple applications, depending on the file written: for example, the current firefox-bin triggers the error with both rsync and cpio, but at different offsets, possibly because of different write sizes. It occurs only with the 2.6.23 and 2.6.23.1 kernels; the same setup works fine with any 2.6.22.x kernel, as do the same applications on non-Linux systems. Different NFS servers were tested as well (Linux, HP-UX, EMC), and all show the same symptoms. A local ext3 destination filesystem works fine.
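
For readers unfamiliar with sparse writes: cpio --sparse and rsync -S avoid writing blocks of zeros and seek past them instead, which leaves holes in the destination file. The following is only a minimal sketch of that write pattern, not the code of either tool; the function name sparse_copy and the 1024-byte block size are assumptions made for illustration, and the report does not establish that this pattern alone triggers the bug.

```python
import os


def sparse_copy(src_path, dst_path, block_size=1024):
    """Copy src to dst, seeking over all-NUL blocks instead of writing them.

    Hypothetical helper for illustration; block_size is an assumed value.
    """
    zero_block = bytes(block_size)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block == zero_block[:len(block)]:
                # Skip the block: the gap becomes a hole that reads back as NULs.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
        # Set the final length explicitly in case the file ends in a skipped block.
        dst.truncate(dst.tell())
```

On a correctly working filesystem the copy reads back byte-for-byte identical to the source; the symptom described here is that some blocks that were written come back as NULs.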

Steps to reproduce:

# $dst is on an NFS-mounted, writable filesystem

wget ftp://mozilla.ussg.indiana.edu/pub/mozilla.org/firefox/releases/2.0.0.9/linux-i686/en-US/firefox-2.0.0.9.tar.gz
tar xzf firefox-2.0.0.9.tar.gz
cd firefox

# Using cpio:

echo firefox-bin | cpio -pdum --sparse $dst
md5sum firefox-bin $dst/firefox-bin
8cf961ebaaff03db222bc01d913ab7cc  firefox-bin
08d7baf9351c682309ac9181b1057716  .../firefox-bin

# The written file contains only NULs from 0x9e7600 to 0x9e77ff, where the source file has some non-NUL data

# Using rsync:

rsync -S firefox-bin $dst
md5sum firefox-bin $dst/firefox-bin
8cf961ebaaff03db222bc01d913ab7cc  firefox-bin
cd0d218b489d32a7173faf4b0d901aa5  .../firefox-bin

# Here there are many blocks of bad NULs: from 0x000400 to 0x0007ff, from 0x004400 to 0x0047ff, from 0x007400 to 0x0077ff, from 0x00c400 to 0x00c7ff, from 0x010c00 to 0x010fff, from 0x019400 to 0x0197ff, and many more.

# The problem can also be reproduced by using rsync on the 2007-10-20T13:49:53 version of security.debian.org::debian-security/dists/etch/updates/main/binary-i386/Packages.gz
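
Offset ranges like the ones quoted above can be located by comparing the source and destination files block by block. The following is a rough sketch of one way to do that, not the method used to produce the figures in this report; the function name bad_nul_ranges and the 512-byte block size are assumptions. The block size was chosen because the reported ranges are 0x200 bytes long with cpio and 0x400 bytes long with rsync.

```python
def bad_nul_ranges(src_path, dst_path, block_size=512):
    """Return inclusive (start, end) offsets of blocks that are all NUL in
    dst but differ from src. Adjacent bad blocks are merged into one range.

    Hypothetical helper for illustration; block_size is an assumed value.
    """
    ranges = []
    zero_block = bytes(block_size)
    offset = 0
    with open(src_path, "rb") as src, open(dst_path, "rb") as dst:
        while True:
            a = src.read(block_size)
            b = dst.read(block_size)
            if not a or not b:
                break
            if a != b and b == zero_block[:len(b)]:
                end = offset + len(b) - 1
                if ranges and ranges[-1][1] == offset - 1:
                    # Extend the previous range when this block follows it directly.
                    ranges[-1] = (ranges[-1][0], end)
                else:
                    ranges.append((offset, end))
            offset += len(a)
    return ranges


if __name__ == "__main__":
    import sys

    # Usage: python script.py firefox-bin /path/to/dst/firefox-bin
    if len(sys.argv) == 3:
        for start, end in bad_nul_ranges(sys.argv[1], sys.argv[2]):
            print(f"from 0x{start:06x} to 0x{end:06x}")
```

Blocks that are NUL in both files are legitimate holes and are not reported; only blocks where the destination lost non-NUL data are listed.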
Comment 1 Andreas Ley 2007-11-06 04:04:45 UTC
Created attachment 13422
The 2007-10-20T13:49:53 version of security.debian.org::debian-security/dists/etch/updates/main/binary-i386/Packages.gz
Comment 2 Trond Myklebust 2007-11-06 04:53:43 UTC
Created attachment 13423
NFS: Fix a writeback race...

Known issue. The attached patch was sent to stable@kernel.org a couple of
weeks ago. I'll ping them on it.
Comment 3 Trond Myklebust 2007-11-16 11:02:20 UTC
The patch was finally accepted into 2.6.23.7, which was released today.

Closing bug...
