Bug 6127
Summary: | NFS updates file modification time on close() or after last write, breaks "cp -p", "cp -a" and "cp --preserve=timestamps" | ||
---|---|---|---|
Product: | File System | Reporter: | Bernhard Kaindl (bk) |
Component: | NFS | Assignee: | Trond Myklebust (trondmy) |
Status: | CLOSED CODE_FIX | ||
Severity: | high | CC: | bunk |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.16-rc4 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | NFS: writes should not clobber utimes() calls |
Description
Bernhard Kaindl
2006-02-24 08:40:12 UTC
Further explanation: cp versions prior to 5.92 closed the destination file before triggering the utime system call to set the modification time of the destination file, while 5.92 to 5.94 and the current CVS close the file _*_after_*_ setting the modfication time/ That's a sample of how the newer versions do it (from cp -p): fd 4 is the destination file here: > open("~", O_WRONLY|O_TRUNC|O_LARGEFILE) = 4 > fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > write(4, "foo\n", 4) = 4 > utimes("/proc/self/fd/4", {1134767841, 0}) = 0 > chmod("baz", 0100644) = 0 > close(4) = 0 It has also been mentioned: > 2. Timestamps of empty files are preserved, which reinforces the above > suspicion (no writes for empty file) So the cause of the update may indeed that some data is pending to be written slightly after the utimes call and in any case finally on close, but this should, according to POSIX not update the mtime/not be visible to the user. I've a possible better workaround for cp (avoids fsync) now: It creates a copy of the destination fd, closes dest_desc and continues the remaining operations (utime/utimes/futimesat and fchmod with the copy: --- src/copy.c +++ src/copy.c @@ -420,6 +420,12 @@ timespec[0] = get_stat_atime (src_sb); timespec[1] = get_stat_mtime (src_sb); + int dest_desc2 = dup (dest_desc); + + close(dest_desc); /* needed for NFS with Linux-2.6 */ + + dest_desc = dest_desc2; + if (futimens (dest_desc, dst_name, timespec) != 0) { error (0, errno, _("preserving times for %s"), quote (dst_name)); copying /etc/passwd then looks like this: read(3, "root:x:0:0:root:/root:/bin/bash\n"..., 32768) = 1254 write(4, "root:x:0:0:root:/root:/bin/bash\n"..., 1254) = 1254 read(3, "", 32768) = 0 dup(4) = 5 close(4) = 0 utimes("/proc/self/fd/5", {1140803734, 0}) = 0 fchown(5, 0, 0) = 0 close(5) = 0 This avoids the need for fsync which waits until the data is written, which would have a negative impact on the performance of cp, expecially when copy huge numbers of small files locally. About using "/proc/self/fd/5" here: It's simply symlink, so nothing special, but also suboptimal I think since it creates an extra indifection which the kernel has to resolve internally. Not a big one of course, only a few function calls in the VFS layer. I think it would be better to use the futimesat() system call, but that's offtopic here, I'll report that to the coreutils people. Still it is a workaround only. I found this, Paul Eggert of UCLA got this this approved by the Austin Group: > I asked the POSIX folks about this (see > <http://www.opengroup.org/austin/mailarchives/ag-review/msg01971.html>) > and the answer came back that the Cygwin behavior was indeed a bug; > see <http://www.opengroup.org/austin/mailarchives/ag/msg08912.html> > and look for "XBD ERN 61". ---------------------------------------------------------------------- XBD ERN 61 File times update Accept as marked Change XBD page 99 lines 3089-3091 (File Times Update) from: All fields that are marked for update shall be updated when the file ceases to be open by any process, or when a stat(), fstat(), or lstat() is performed on the file. to: All fields that are marked for update shall be updated when the file ceases to be open by any process or before a stat(), fstat(), lstat(), fsync(), utime(), or utimes() is successfully performed on the file. ---------------------------------------------------------------------- So the POSIX folks want to have the fields updated before the functions are called which makes sense, at least I, from a API user's viewpoint, would like to have defined, always consistant behaviour. I have also read that it works this way on OpenBSD 3.4 NFS client and a Solaris 8 NFS server in this post: http://lists.gnu.org/archive/html/bug-coreutils/2005-12/msg00219.html I don't know how utimes() is implemented in NFS, but if this is the way it's wanted then probably the implementation of utimes in NFS. I just looked at fs/open.c:do_utimes() which calls fs/attr.c:notify_change() which calls fs/nfs/inode.c:nfs_setattr(), which has this check: /* Write all dirty data if we're changing file permissions or size */ if ((attr->ia_valid & (ATTR_MODE|ATTR_UID|ATTR_GID|ATTR_SIZE)) != 0) { filemap_write_and_wait(inode->i_mapping); nfs_wb_all(inode); } Could we add ATTR_CTIME|ATTR_MTIME|ATTR_ATIME which are set by du_utimes() to this check to write all dirty data also if we are changing timestamps? Created attachment 7472 [details]
NFS: writes should not clobber utimes() calls
Ensure that we flush out writes in the case when someone calls utimes() in
order to set the file times.
Trond, what about getting this patch into 2.6.16? I'm sceptical about doing that as it isn't really a critical regression: in fact, this is the first complaint despite the fact that the issue has existed for all(?) 2.6 kernels. Unless someone complains loudly, I'd therefore prefer to wait until 2.6.17. Patch merged into mainline kernel prior to 2.6.17-rc1. |