Bug 68391 - Cthon basic test 5 fails with NFS/RDMA
Summary: Cthon basic test 5 fails with NFS/RDMA
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: NFS (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Chuck Lever
URL:
Keywords:
: 68361 68371 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-01-09 19:39 UTC by Chuck Lever
Modified: 2014-04-11 14:28 UTC (History)
2 users (show)

See Also:
Kernel Version: 3.12
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Proposed fix for bz 68391 (1.62 KB, patch)
2014-01-17 19:27 UTC, Chuck Lever
Details | Diff

Description Chuck Lever 2014-01-09 19:39:15 UTC
This is NFSv3 with 3.12.0.  Same behavior with NFSv4 on 3.8.

[cel@manet cthon04]$ ./server -b

sh ./runtests  -b -t /mnt/10.0.0.1/manet.test

Starting BASIC tests: test directory /mnt/10.0.0.1/manet.test (arg: -t)

./test1: File and directory creation test
    created 155 files 62 directories 5 levels deep in 1.86 seconds
    ./test1 ok.

./test2: File and directory removal test
    removed 155 files 62 directories 5 levels deep in 1.78 seconds
    ./test2 ok.

./test3: lookups across mount point
    500 getcwd and stat calls in 0.0  seconds
    ./test3 ok.

./test4: setattr, getattr, and lookup
    1000 chmods and stats on 10 files in 8.19 seconds
    ./test4 ok.

./test5: read and write
    ./test5: (/mnt/10.0.0.1/manet.test) 'bigfile' read failed : Input/output
error
basic tests failed
Tests failed, leaving /mnt/10.0.0.1 mounted
[cel@manet cthon04]$
Comment 1 Chuck Lever 2014-01-16 18:26:56 UTC
RPC: 15170 __rpc_execute flags=0x1
RPC: 15170 call_start nfs4 proc READ (async)
RPC: 15170 call_reserve (status 0)
RPC: 15170 reserved req ffff88020ed0da00 xid c1d11a49
RPC: 15170 call_reserveresult (status 0)
RPC: 15170 call_refresh (status 0)
RPC: 15170 refreshing UNIX cred ffff88001f2fb9c0
RPC: 15170 call_refreshresult (status 0)
RPC: 15170 call_allocate (status 0)
RPC:       xprt_rdma_allocate: size 684, request 0xffff8800cd45e000
RPC: 15170 call_bind (status 0)
RPC: 15170 call_connect xprt ffff8800d0638000 is connected
RPC: 15170 call_transmit (status 0)
RPC: 15170 xprt_prepare_transmit
RPC: 15170 xprt_cwnd_limited cong = 0 cwnd = 8192
RPC: 15170 rpc_xdr_encode (status 0)
RPC: 15170 marshaling UNIX cred ffff88001f2fb9c0
RPC: 15170 using AUTH_UNIX cred ffff88001f2fb9c0 to wrap rpc data
RPC: 15170 xprt_transmit(176)
RPC:       rpcrdma_inline_pullup: pad 0 destp 0xffff8800cd45f810 len 176 hdrlen 176
RPC:       rpcrdma_register_frmr_external: Using frmr ffff880211a73760 to map 1 segments
RPC:       rpcrdma_create_chunks: write chunk elem 4096@0x205390000:0x7dd92 (more)
RPC:       rpcrdma_register_frmr_external: Using frmr ffff880211a74188 to map 1 segments
RPC:       rpcrdma_create_chunks: write chunk elem 152@0xcd45f974:0x81e29 (last)
RPC:       rpcrdma_marshal_req: write chunk: hdrlen 68 rpclen 176 padlen 0 headerp 0xffff8800cd45f100 base 0xffff8800cd45f760 lkey 0x500
RPC: 15170 xmit complete
RPC: 15170 sleep_on(queue "xprt_pending" time 4294782487)
RPC: 15170 added to queue ffff8800d0638258 "xprt_pending"
RPC: 15170 setting alarm for 60000 ms

 ....

RPC:       wake_up_first(ffff8800d0638190 "xprt_sending")
RPC:       rpcrdma_event_process: event rep ffff8800cd466000 status 0 opcode 80 length 128
RPC:       rpcrdma_event_process: event rep ffff880211a66920 status 0 opcode 8 length 128
RPC:       rpcrdma_event_process: event rep ffff880211a66948 status 0 opcode 8 length 128
RPC:       rpcrdma_event_process: event rep           (null) status 0 opcode 0 length 128
RPC:       rpcrdma_event_process: event rep ffff8800cd439000 status 0 opcode 80 length 128
RPC:       rpcrdma_reply_handler: reply 0xffff8800cd43b000 completes request 0xffff8800cd45e000
                  RPC request 0xffff88020ed0da00 xid 0x491ad1c1
RPC:       rpcrdma_count_chunks: chunk 4096@0x205390000:0x7dd92
RPC:       rpcrdma_count_chunks: chunk 0@0xcd45f974:0x81e29
RPC:       rpcrdma_inline_fixup: srcp 0xffff8800cd43b0a4 len 60 hdrlen 60
RPC:       rpcrdma_reply_handler: xprt_complete_rqst(0xffff8800d0638000, 0xffff88020ed0da00, 4156)
RPC: 15170 xid c1d11a49 complete (4156 bytes received)
RPC: 15170 __rpc_wake_up_task (now 4294782487)
RPC: 15170 disabling timer
RPC: 15170 removed from queue ffff8800d0638258 "xprt_pending"

 ....

RPC: 15170 __rpc_execute flags=0x801
RPC: 15170 call_status (status 4156)
RPC: 15170 call_decode (status 4156)
RPC: 15170 validating UNIX cred ffff88001f2fb9c0
RPC: 15170 using AUTH_UNIX cred ffff88001f2fb9c0 to unwrap rpc data
NFS: server cheating in read reply: count 4096 > recvd 0
RPC: 15170 call_decode result 0
 --> nfs4_read_done
Comment 2 Chuck Lever 2014-01-16 18:28:15 UTC
The server is returning 4156 bytes in the READ reply, but our client's XDR layer is not re-assembling the reply buffer so that the read XDR decoder can extract the returned data.
Comment 3 Chuck Lever 2014-01-16 18:32:45 UTC
*** Bug 68361 has been marked as a duplicate of this bug. ***
Comment 4 Chuck Lever 2014-01-17 19:27:26 UTC
Created attachment 122451 [details]
Proposed fix for bz 68391
Comment 5 Chuck Lever 2014-04-11 14:26:28 UTC
Commit 2b7bbc963da8d076f263574af4138b5df2e1581f was merged into 3.15.
Comment 6 Chuck Lever 2014-04-11 14:28:33 UTC
*** Bug 68371 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.