Bug 14053 - Kernel blocks during rsync to NFS-mounted directory exported from Sun OS machine
Alias: None
Product: File System
Classification: Unclassified
Component: NFS
Hardware: All Linux
Importance: P1 blocking
Assignee: Trond Myklebust
URL: http://kerneloops.org/submitresult.ph...
Depends on:
Reported: 2009-08-25 02:14 UTC by Matthew Breeze
Modified: 2010-02-03 20:42 UTC

See Also:
Kernel Version:
Tree: Fedora
Regression: No

Fedora 10 machine details (/proc/{cpuinfo,modules,ioports,iomem,scsi/scsi},lspci -vvv) (49.04 KB, text/plain)
2009-08-25 02:14 UTC, Matthew Breeze
SUNRPC: Fix rpc_task_force_reencode (744 bytes, patch)
2009-08-26 00:00 UTC, Trond Myklebust

Description Matthew Breeze 2009-08-25 02:14:26 UTC
Created attachment 22838 [details]
Fedora 10 machine details (/proc/{cpuinfo,modules,ioports,iomem,scsi/scsi},lspci -vvv)

We have a Fedora 10 machine and a Sun OS machine. The Sun OS machine exports a ZFS filesystem (pool) to the Fedora 10 machine, which mounts it over NFSv4 (/etc/fstab: 	melon:/melon1   /melon1 nfs4    rw,rsize=8192,wsize=8192,timeo=14,intr). When a large file (3 GB) is read from the NFS-mounted directory on the Fedora 10 machine (/melon1), a kernel panic frequently occurs. The same panic occurs when running rsync to this directory. At one point it was completely reproducible, crashing 3 times, each time after sending around 2 MB. However, the bug is not always so predictable. The Sun OS machine apparently suffers no ill effects.

I don't know whether this is a failure in the NFSv4 implementation in Linux or in Sun OS.

Sun OS machine information:

SunOS 5.10 Generic_137112-06 i86pc i386 i86pc

Fedora 10 machine information:

Client: Fedora 10 Linux
Client Kernel Version (/proc/version): Linux version (mockbuild@x86-4.fedora.phx.redhat.com) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #1 SMP Fri Aug 14 20:49:37 EDT 2009
Comment 1 Trond Myklebust 2009-08-26 00:00:27 UTC
Created attachment 22847 [details]
SUNRPC: Fix rpc_task_force_reencode

SUNRPC: Fix rpc_task_force_reencode

If we're in the case where we need to force a reencode and then resend of
the RPC request, due to xprt_transmit failing with a networking error, then
we _must_ retransmit the entire request.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
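
The idea in the patch description can be illustrated with a small userspace model. This is only a sketch, not the kernel code: it assumes the transport keeps a per-request count of bytes already sent, and that forcing a reencode must reset that count so the resend starts from the first byte. All names here (model_req, model_conn, model_transmit, model_reconnect, model_force_reencode) are hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an RPC request; NOT the kernel's structure. */
struct model_req {
    const char *buf;          /* encoded request bytes */
    size_t len;               /* total encoded length */
    size_t bytes_sent;        /* bytes already written to the transport */
    size_t reply_bytes_recvd; /* bytes of reply received so far */
};

/* Hypothetical model of a connection: records what the peer has seen. */
struct model_conn {
    char stream[64];
    size_t used;
};

/*
 * Write at most `limit` bytes, continuing from req->bytes_sent.
 * Returns 0 once the whole request is on the wire, or -1 if the
 * write was cut short (standing in for a networking error).
 */
int model_transmit(struct model_req *req, struct model_conn *conn, size_t limit)
{
    assert(req->bytes_sent <= req->len);
    size_t remaining = req->len - req->bytes_sent;
    size_t n = remaining < limit ? remaining : limit;

    assert(conn->used + n <= sizeof(conn->stream));
    memcpy(conn->stream + conn->used, req->buf + req->bytes_sent, n);
    conn->used += n;
    req->bytes_sent += n;
    return req->bytes_sent == req->len ? 0 : -1;
}

/* A new connection starts with an empty stream on the peer side. */
void model_reconnect(struct model_conn *conn)
{
    conn->used = 0;
}

/*
 * Force a reencode and resend. Resetting bytes_sent is the point of
 * the sketch: without it, the next transmit would continue mid-request
 * and the peer would see only the tail of the message.
 */
void model_force_reencode(struct model_req *req)
{
    req->reply_bytes_recvd = 0;
    req->bytes_sent = 0;
}
```

In this model, a transmit that stops partway, followed by model_reconnect and model_force_reencode, leads the next model_transmit to deliver the complete request again. If bytes_sent were left untouched, the peer would receive only the remaining fragment on the new connection.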
Comment 2 Trond Myklebust 2009-08-26 00:01:06 UTC

Does the above patch fix the Oops?

Comment 3 Matthew Breeze 2009-08-26 05:27:11 UTC
Hi Trond,

I'm testing now.

Comment 4 Matthew Breeze 2009-08-28 04:47:50 UTC
Hi Trond,

The patch fixed the bug. I haven't seen any oopses from either rsyncing or reading large files. Thanks for your help.

Comment 5 Trond Myklebust 2010-02-03 20:42:49 UTC
committed to mainline as 2574cc9f4ffc6c681c9177111357efe5b76f0e36 (SUNRPC: Fix rpc_task_force_reencode).
