Bug 217999 - File corruption over NFS when writing files using O_DIRECT
Summary: File corruption over NFS when writing files using O_DIRECT
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: NFS (show other bugs)
Hardware: All Linux
: P3 high
Assignee: Trond Myklebust
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-10-11 23:51 UTC by Jack
Modified: 2023-10-30 00:02 UTC (History)
5 users (show)

See Also:
Kernel Version: 6.1.56
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments

Description Jack 2023-10-11 23:51:56 UTC
When writing data using O_DIRECT on an nfs ver=3 mount on linux kernel 6.1.56 and 6.1.57, the resulting file is corrupted.

dd if=testfile of=testfile2 oflag=direct; md5sum testfile*
2048+0 records in
2048+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.529726 s, 2.0 MB/s
87f9f4d163fe86185ef8d3e32e049174  testfile
4636dda4d9ea2d890c408796c99e33f8  testfile2

The corruption does not occur on 6.1.55 and does not occur on 6.5.7

dd if=test of=test2 oflag=direct; md5sum test*
1351612+0 records in
1351612+0 records out
692025344 bytes (692 MB, 660 MiB) copied, 245.179 s, 2.8 MB/s
c6c9bbddff6fd21a0e725cc43dd87fd8  test
c6c9bbddff6fd21a0e725cc43dd87fd8  test2

The test files were created with dd if=/dev/urandom of=filename
Comment 1 Bagas Sanjaya 2023-10-13 12:45:03 UTC
(In reply to Jack from comment #0)
> When writing data using O_DIRECT on an nfs ver=3 mount on linux kernel
> 6.1.56 and 6.1.57, the resulting file is corrupted.
> 
> dd if=testfile of=testfile2 oflag=direct; md5sum testfile*
> 2048+0 records in
> 2048+0 records out
> 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.529726 s, 2.0 MB/s
> 87f9f4d163fe86185ef8d3e32e049174  testfile
> 4636dda4d9ea2d890c408796c99e33f8  testfile2
> 
> The corruption does not occur on 6.1.55 and does not occur on 6.5.7
> 
> dd if=test of=test2 oflag=direct; md5sum test*
> 1351612+0 records in
> 1351612+0 records out
> 692025344 bytes (692 MB, 660 MiB) copied, 245.179 s, 2.8 MB/s
> c6c9bbddff6fd21a0e725cc43dd87fd8  test
> c6c9bbddff6fd21a0e725cc43dd87fd8  test2
> 
> The test files were created with dd if=/dev/urandom of=filename

Do you have this issue on v6.1.0?
Comment 2 Jack 2023-10-13 20:02:09 UTC
(In reply to Bagas Sanjaya from comment #1)
> (In reply to Jack from comment #0)
> > When writing data using O_DIRECT on an nfs ver=3 mount on linux kernel
> > 6.1.56 and 6.1.57, the resulting file is corrupted.
> > 
> > dd if=testfile of=testfile2 oflag=direct; md5sum testfile*
> > 2048+0 records in
> > 2048+0 records out
> > 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.529726 s, 2.0 MB/s
> > 87f9f4d163fe86185ef8d3e32e049174  testfile
> > 4636dda4d9ea2d890c408796c99e33f8  testfile2
> > 
> > The corruption does not occur on 6.1.55 and does not occur on 6.5.7
> > 
> > dd if=test of=test2 oflag=direct; md5sum test*
> > 1351612+0 records in
> > 1351612+0 records out
> > 692025344 bytes (692 MB, 660 MiB) copied, 245.179 s, 2.8 MB/s
> > c6c9bbddff6fd21a0e725cc43dd87fd8  test
> > c6c9bbddff6fd21a0e725cc43dd87fd8  test2
> > 
> > The test files were created with dd if=/dev/urandom of=filename
> 
> Do you have this issue on v6.1.0?

I didn't notice any issues in the 6.1 series before 6.1.56 but it can easily be reproduced with the dd command. I don't have 6.1.0 to test with.
Comment 3 Andy 2023-10-15 14:08:48 UTC
I can add that I'm also seeing NFS corruption on a NFSv4 mount with 6.1.57 in alpine linux.

I've tested using the mariadb-install-db script and fails complaining about corrupt files.
Comment 4 Neil Brown 2023-10-20 07:20:23 UTC
FYI the cause of this corruption is that the  backport of
   NFS: Fix error handling for O_DIRECT write scheduling

had an error.
The backported patch f16fd0b11f0f4d41846b5102b1656ea1fc9ac7a0
moves "pos += req_len" in nfs_direct_write_schedule_iovec() from after
    req->wb_index = pos >> PAGE_SHIFT;
to before that statement.  This ->wb_index is wrong.
Possibly a better way to look at this is the use of "pos" is moved to *after* it is updated.

The upstream patch 954998b60caa8f2a3bf3abe490de6f08d283687a
doesn't move the use of pos because
Commit 70e9db69f927 ("NFS: Clean up O_DIRECT request allocation")

removes the use.

v6.1.56 can be fixed with

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 5a976fa343df..69134e11e7d0 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -864,6 +864,8 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
                                result = PTR_ERR(req);
                                break;
                        }
+                       req->wb_index = pos >> PAGE_SHIFT;
+                       req->wb_offset = pos & ~PAGE_MASK;
 
                        if (desc.pg_error < 0) {
                                nfs_free_request(req);
@@ -883,8 +885,6 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
                        }
 
                        nfs_lock_request(req);
-                       req->wb_index = pos >> PAGE_SHIFT;
-                       req->wb_offset = pos & ~PAGE_MASK;
                        if (nfs_pageio_add_request(&desc, req))
                                continue;

Note You need to log in before you can comment on or make changes to this bug.