Description of problem:

Our recent regression test on upstream Linux hit the failure below on NFS:

/usr/bin/fsx -Z -r 4096 -w 4096 -t 4096 -l 10485760 -S 0 /mnt/fsx-longterm/nfs-client/fsxtest_dio
skipping zero size read
skipping zero length punch hole
skipping zero length punch hole
truncating to largest ever: 0x25a000
copying to largest ever: 0x7d0000
truncating to largest ever: 0x99b000
copying to largest ever: 0x9ad000
copying to largest ever: 0x9b8000
truncating to largest ever: 0x9c4000
copying to largest ever: 0xa00000
READ BAD DATA: offset = 0x7000, size = 0xaff2, fname = /mnt/fsx-longterm/nfs-client/fsxtest_dio
OFFSET      GOOD    BAD     RANGE
0x0b000     0xda9e  0x0000  0x00000
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b001     0x9eda  0x0000  0x00001
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b002     0xda63  0x0000  0x00002
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b003     0x63da  0x0000  0x00003
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b004     0xdac6  0x0000  0x00004
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b005     0xc6da  0x0000  0x00005
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b006     0xdae0  0x0000  0x00006
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b007     0xe0da  0x0000  0x00007
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b008     0xda6e  0x0000  0x00008
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b009     0x6eda  0x0000  0x00009
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b00a     0xda34  0x0000  0x0000a
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b00b     0x34da  0x0000  0x0000b
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b00c     0xdaec  0x0000  0x0000c
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b00d     0xecda  0x0000  0x0000d
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b00e     0xdacd  0x0000  0x0000e
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x0b00f     0xcdda  0x0000  0x0000f
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
LOG DUMP (209358 total operations):
209359(207 mod 256): READ     0x4a1000 thru 0x4a5fff (0x5000 bytes)
209360(208 mod 256): WRITE    0x96d000 thru 0x972fff (0x6000 bytes) HOLE
209361(209 mod 256): SKIPPED (no operation)
209362(210 mod 256): COPY 0x66d000 thru 0x679fff (0xd000 bytes) to 0x8fe000 thru 0x90afff
209363(211 mod 256): SKIPPED (no operation)
209364(212 mod 256): MAPWRITE 0x2f6000 thru 0x2f8c51 (0x2c52 bytes)
209365(213 mod 256): SKIPPED (no operation)
209366(214 mod 256): SKIPPED (no operation)
209367(215 mod 256): MAPREAD  0x14000 thru 0x23ec6 (0xfec7 bytes)
209368(216 mod 256): SKIPPED (no operation)
...
...
209354(202 mod 256): PUNCH    0xb73c7 thru 0xbc3a1 (0x4fdb bytes)
209355(203 mod 256): SKIPPED (no operation)
209356(204 mod 256): SKIPPED (no operation)
209357(205 mod 256): TRUNCATE DOWN from 0x6102a7 to 0x3a6000
209358(206 mod 256): MAPREAD  0x7000 thru 0x11ff1 (0xaff2 bytes) ***RRRR***
Log of operations saved to "/mnt/fsx-longterm/nfs-client/fsxtest_dio.fsxops"; replay with --replay-ops
Correct content saved for comparison
(maybe hexdump "/mnt/fsx-longterm/nfs-client/fsxtest_dio" vs "/mnt/fsx-longterm/nfs-client/fsxtest_dio.fsxgood")

Version number of selected component:
Mainline Linux v5.17-rc6+, HEAD=ac84e82f78cb55ce8e760e0b5887d56efd78d6bc

How reproducible:
I tested on aarch64, x86_64, and ppc64le; aarch64 and x86_64 hit this failure. Only pure DIO testing failed; other modes, including buffered I/O and AIO, did not fail. I will give it more testing if needed.

Additional info:
The fsx binary comes from upstream xfstests (xfstests/ltp/fsx.c). I'll upload the full fsx test results later.
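The failing run can be replayed from the saved operation log. A minimal sketch, assuming fsx is built from upstream xfstests (ltp/fsx.c) and using the paths printed above:

# Original reproducer: -Z selects O_DIRECT, -r/-w/-t set 4096-byte
# read/write/truncate boundaries, -l caps the file length at 10485760.
/usr/bin/fsx -Z -r 4096 -w 4096 -t 4096 -l 10485760 -S 0 \
    /mnt/fsx-longterm/nfs-client/fsxtest_dio

# Replay the recorded operations deterministically from the .fsxops file:
/usr/bin/fsx --replay-ops /mnt/fsx-longterm/nfs-client/fsxtest_dio.fsxops \
    /mnt/fsx-longterm/nfs-client/fsxtest_dio

# Compare the corrupt file against the known-good copy fsx saved:
diff <(hexdump /mnt/fsx-longterm/nfs-client/fsxtest_dio) \
     <(hexdump /mnt/fsx-longterm/nfs-client/fsxtest_dio.fsxgood)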
Created attachment 300551 [details] fsx bad file dump
Created attachment 300552 [details] fsx good file dump
Created attachment 300553 [details] fsx output log
Created attachment 300554 [details] fsx operations running log
Created attachment 300555 [details] .config file
The underlying filesystem is xfs:

meta-data=/dev/vda2              isize=512    agcount=4, agsize=327680 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=1310720, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

NFS is mounted without any specified mount options, all by default.

nfs-utils 2.5.4
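Since no options were given, the options the client actually negotiated (the protocol version in particular) are worth confirming. A sketch, assuming the mount point from the fsx command above:

# Show the negotiated NFS mount options (vers=, proto=, etc.):
nfsstat -m

# Or read them straight from the mount table:
grep fsx-longterm /proc/mounts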
This sounds a lot like a regression, but it's not totally clear from the description. Am I right in assuming that 5.16 worked fine? Or did even some of the 5.17-rc kernels work fine?
I'm assuming this is against a Linux server? Do you know if the client is using NFS v4.2 and if it has CONFIG_NFS_V4_2_READ_PLUS enabled?

The fsx output looks a lot like the output we sometimes get when using v4.2 READ_PLUS, which is why we have it locked behind the config option with the message "This is intended for developers only" for now. (I'm working on patches to address this, but they're not ready yet.)
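One quick way to answer the config question on the client; which command applies depends on how the kernel was built:

# If the running kernel exposes its config (CONFIG_IKCONFIG_PROC):
zgrep NFS_V4_2_READ_PLUS /proc/config.gz

# Otherwise check the installed config for the running kernel:
grep NFS_V4_2_READ_PLUS /boot/config-"$(uname -r)"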
(In reply to Anna Schumaker from comment #8)
> I'm assuming this is against a Linux server? Do you know if the client is
> using NFS v4.2 and if it has CONFIG_NFS_V4_2_READ_PLUS enabled?
>
> The fsx output looks a lot like the output we sometimes get when using v4.2
> READ_PLUS, which is why we have it locked behind the config option with the
> message "This is intended for developers only" for now. (I'm working on
> patches to address this, but they're not ready yet.)

Hi Anna,

The .config file is in the attachments; you can check more details there. I just found "CONFIG_NFS_V4_2_READ_PLUS=y" in it, so it might be that issue. I'll give it a test with this configuration disabled.

Thanks,
Zorro
(In reply to Zorro Lang from comment #9)
> (In reply to Anna Schumaker from comment #8)
> > I'm assuming this is against a Linux server? Do you know if the client is
> > using NFS v4.2 and if it has CONFIG_NFS_V4_2_READ_PLUS enabled?
> >
> > The fsx output looks a lot like the output we sometimes get when using
> > v4.2 READ_PLUS, which is why we have it locked behind the config option
> > with the message "This is intended for developers only" for now. (I'm
> > working on patches to address this, but they're not ready yet.)
>
> Hi Anna,
>
> The .config file is in the attachments; you can check more details there. I
> just found "CONFIG_NFS_V4_2_READ_PLUS=y" in it, so it might be that issue.
> I'll give it a test with this configuration disabled.

Hi Anna,

I think you're right: after I disabled CONFIG_NFS_V4_2_READ_PLUS, I can't reproduce this bug anymore. So after you fix it, I'd like to verify the fix.

Thanks,
Zorro

>
> Thanks,
> Zorro
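For reference, one way to rebuild with the option turned off, using the in-tree scripts/config helper (a sketch; run from the top of the kernel source tree):

# Flip CONFIG_NFS_V4_2_READ_PLUS to disabled in .config:
scripts/config --disable NFS_V4_2_READ_PLUS

# Resolve any dependent options, then rebuild:
make olddefconfig
make -j"$(nproc)"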
I ran the fsx command above overnight.

....
skipping zero length punch hole
skipping zero length punch hole
skipping zero size read
^Csignal 2
testcalls = 124721256

[cel@morisot ~]$ mountstats --rpc
Stats for klimt:/export/fast mounted on /mnt:

RPC statistics:
  267015341 RPC requests sent, 267015341 RPC replies received (0 XIDs not found)
  average backlog queue length: 0

GETATTR:
        161259261 ops (60%)
        avg bytes sent per op: 199      avg bytes received per op: 240
        backlog wait: 0.003842  RTT: 0.152617   total execute time: 0.168519 (milliseconds)
READ_PLUS:
        46936965 ops (17%)
        avg bytes sent per op: 216      avg bytes received per op: 4465
        backlog wait: 0.012698  RTT: 0.200121   total execute time: 0.237943 (milliseconds)
WRITE:
        27575688 ops (10%)
        avg bytes sent per op: 33432    avg bytes received per op: 141
        backlog wait: 0.011856  RTT: 0.720103   total execute time: 0.739979 (milliseconds)
SETATTR:
        12061698 ops (4%)
        avg bytes sent per op: 247      avg bytes received per op: 264
        backlog wait: 0.004311  RTT: 0.223075   total execute time: 0.239609 (milliseconds)
ALLOCATE:
        9592654 ops (3%)
        avg bytes sent per op: 236      avg bytes received per op: 168
        backlog wait: 0.003911  RTT: 0.205160   total execute time: 0.220310 (milliseconds)
DEALLOCATE:
        9588990 ops (3%)
        avg bytes sent per op: 236      avg bytes received per op: 168
        backlog wait: 0.003670  RTT: 0.200074   total execute time: 0.214885 (milliseconds)
....

So... the mount is NFSv4.2, the backing filesystem is XFS, and the client is clearly generating READ_PLUS and hole punches. I haven't been able to reproduce the problem so far with a Linux 5.19.0-rc7-00042-g94d75868c355 NFS server (no changes to the current READ_PLUS code -- this is just what is ready for 5.20).
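mountstats parses /proc/self/mountstats, so the raw per-mount counters can also be checked directly; a sketch using the op name from the output above:

# Confirm the mount is nfs4 and is generating READ_PLUS at the RPC layer:
grep -E 'fstype|READ_PLUS' /proc/self/mountstats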