Bug 16213

Summary: Some NFS/RPC file transfers fail and hang automounting
Product: File System Reporter: Philippe Dax (dax)
Component: NFSAssignee: Trond Myklebust (trondmy)
Status: CLOSED CODE_FIX    
Severity: normal CC: dax, kernel-org
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.33.5-112.fc13.x86_64 Subsystem:
Regression: No Bisected commit-id:
Attachments: SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()

Description Philippe Dax 2010-06-15 11:48:52 UTC
Given a file "foo" of 50Mb on a remote machine "remote". The command
<localmachine $> cp /remote_mount_point/foo bar
never finishes, and bar has a size lesser than foo.
and the automounting of the local machine is hanged.

The error message in /var/log/messages is:
kernel: Callback slot table overflowed

If foo is lesser than 10Mb, the problem does'nt occur.
After several tries with the source of 50Mb,
the different sizes of the target file "bar" are random.

This incident occurs with:
sunrpc.tcp_slot_table_entries = 16

but after tunning this parameter, all is fine with:
sunrpc.tcp_slot_table_entries = 32
Comment 1 Trond Myklebust 2010-06-15 13:05:12 UTC
Are you testing NFSv4.1? The above is an error message that I would never expect to see in NFSv2/3 or ordinary NFSv4.
Comment 2 Philippe Dax 2010-06-15 13:39:42 UTC
I don't know which is the version of NFS of the Fedora-13 (Goddard) distribution, however in the /etc/sysconfig/nfs config file all lines are commented and especially :

# Define which protocol versions mountd
# will advertise. The values are "no" or "yes"
# with yes being the default
#MOUNTD_NFS_V1="no"
#MOUNTD_NFS_V2="no"
#MOUNTD_NFS_V3="no"

# Turn off v2 and v3 protocol support
#RPCNFSDARGS="-N 2 -N 3"
# Turn off v4 protocol support
#RPCNFSDARGS="-N 4"
# Number of nfs server processes to be started.
# The default is 8. 
#RPCNFSDCOUNT=8
Comment 3 Trond Myklebust 2010-06-15 14:01:58 UTC
What does
    cat /proc/self/mountinfo
show on the client.

Also, what does
   cat /proc/fs/nfsd/versions
show on the server?

Finally, if you do the following on the server:
    service nfs stop
    echo "-4.1" >/proc/fs/nfsd/versions
    service nfs start
can you reproduce the bug?
Comment 4 Philippe Dax 2010-06-15 15:49:12 UTC
I don't know which is the version of NFS of the Fedora-13 (Goddard) distribution, however in the /etc/sysconfig/nfs config file all lines are commented and especially :

# Define which protocol versions mountd
# will advertise. The values are "no" or "yes"
# with yes being the default
#MOUNTD_NFS_V1="no"
#MOUNTD_NFS_V2="no"
#MOUNTD_NFS_V3="no"

# Turn off v2 and v3 protocol support
#RPCNFSDARGS="-N 2 -N 3"
# Turn off v4 protocol support
#RPCNFSDARGS="-N 4"
# Number of nfs server processes to be started.
# The default is 8. 
#RPCNFSDCOUNT=8
(In reply to comment #1)
> Are you testing NFSv4.1? The above is an error message that I would never
> expect to see in NFSv2/3 or ordinary NFSv4.

========================================================
cat /proc/self/mountinfo
17 22 0:3 / /proc rw,relatime - proc /proc rw
18 22 0:0 / /sys rw,relatime - sysfs /sys rw
19 22 0:5 / /dev rw,relatime - devtmpfs udev rw,size=2505744k,nr_inodes=626436,mode=755
20 19 0:11 / /dev/pts rw,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=000
21 19 0:17 / /dev/shm rw,relatime - tmpfs tmpfs rw
22 1 8:3 / / rw,relatime - ext4 /dev/sda3 rw,barrier=1,data=ordered
23 17 0:16 / /proc/bus/usb rw,relatime - usbfs /proc/bus/usb rw
24 22 8:1 / /boot rw,relatime - ext4 /dev/sda1 rw,barrier=1,data=ordered
25 17 0:18 / /proc/sys/fs/binfmt_misc rw,relatime - binfmt_misc none rw
26 22 0:19 / /var/lib/nfs/rpc_pipefs rw,relatime - rpc_pipefs sunrpc rw
27 22 0:20 / /misc rw,relatime - autofs /etc/auto.misc rw,fd=7,pgrp=1358,timeout=300,minproto=5,maxproto=5,indirect
28 22 0:21 / /net rw,relatime - autofs -hosts rw,fd=13,pgrp=1358,timeout=300,minproto=5,maxproto=5,indirect
29 22 0:22 / /usr/local rw,relatime - autofs /etc/auto_direct rw,fd=19,pgrp=1358,timeout=300,minproto=5,maxproto=5,direct
30 22 0:23 / /infres rw,relatime - autofs /etc/auto.indirect.infres rw,fd=24,pgrp=1358,timeout=300,minproto=5,maxproto=5,indirect
31 22 0:24 / /cal rw,relatime - autofs /etc/auto.indirect.cal rw,fd=30,pgrp=1358,timeout=300,minproto=5,maxproto=5,indirect
32 22 0:25 / /comelec rw,relatime - autofs /etc/auto.indirect.comelec rw,fd=36,pgrp=1358,timeout=300,minproto=5,maxproto=5,indirect
33 22 0:26 / /tsi rw,relatime - autofs /etc/auto.indirect.tsi rw,fd=42,pgrp=1358,timeout=300,minproto=5,maxproto=5,indirect
34 22 0:27 / /enst rw,relatime - autofs /etc/auto.indirect.enst rw,fd=48,pgrp=1358,timeout=300,minproto=5,maxproto=5,indirect
35 22 0:28 / /stud rw,relatime - autofs /etc/auto.indirect.stud rw,fd=54,pgrp=1358,timeout=300,minproto=5,maxproto=5,indirect
38 29 0:29 / /usr/local rw,relatime - nfs lame4.enst.fr:/usr/local rw,vers=3,rsize=32768,wsize=32768,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=137.194.192.104,mountvers=3,mountport=635,mountproto=udp,addr=137.194.192.104


For the Nfs server, I don't know, it is a NAS and these commands doesn't exist.
I have only nfsstat, nfsmount, nfsumount.
Comment 5 Philippe Dax 2010-06-16 10:05:20 UTC
nfsstat on the NAS server seems live with Nfs2 and Nfs3:

Server rpc:
  Connection oriented calls -1446287487, Connectionless calls 93776991
RPC errors:
  bad-prog   bad-ver    bad-proc   auth-err   garbage
  765056     67766      0          0          0          93776991  
Version 2: (108151472 calls)
  null       getattr    setattr    root       lookup     readlink
  7098       60982987   122213     0          46197218   125       
  read       wrcache    write      create     remove     rename
  311767     0          476180     7651       11141      4671      
  link       symlink    mkdir      rmdir      readdir    statfs
  1133       267        1890       56         6607       20468     
Version 3: (-229782239 calls)
  null       getattr    setattr    lookup     access     readlink
  2174031    -1988680268 40795922   1828978961 -1387801541 180174    
  read       write      create     mkdir      symlink    mknod
  610846355  604537531  14705319   1794849    47359      86        
  remove     rmdir      rename     link       readdir    readdirplus
  7652932    651056     1939331    148409     3457234    8980470   
  fsstat     fsinfo     pathconf   commit
  3801848    1282010    312497     14413164


If I use a classic server on an Unix machine the problem doesn't occur, for example a Solaris10 machine. Then I suspect the NAS, but the client should not crash if NFS versions are differents on both sides.
Comment 6 Philippe Dax 2010-06-16 10:10:06 UTC
Morever, the problem doesn't occur if the kernel version is 2.6.32 used by Fedora 12 distribution.
I was obliged to downgrade 30 machines from Fedora-13 to Fedora-12.
Comment 7 Trond Myklebust 2010-06-16 18:07:05 UTC
Created attachment 26815 [details]
SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()

If the attempt to read the calldir fails, then instead of storing the read
bytes, we currently discard them. This leads to a garbage final result when
upon re-entry to the same routine, we read the remaining bytes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Comment 8 Trond Myklebust 2010-06-16 18:12:16 UTC
Does the above attachment help in any way?
Comment 9 Philippe Dax 2010-06-16 19:28:09 UTC
(In reply to comment #8)
> Does the above attachment help in any way?

Yes obviously, for your fix, thanks a lot for your efficiency :-)
But, now what can I do ?
Change sunrpc.tcp_slot_table_entries = 32 ? or an other alternative ?

Many thanks for your work !

Philippe
Comment 10 Trond Myklebust 2010-06-19 02:43:36 UTC
Hi Philippe,

I'm not sure that I really understand your reply above. Are you saying that the hang is fixed, or that there is no change? I'm obviously interested in  chasing this regression down, so please give me any details that you think might be relevant.

To try to answer your question: the sunrpc.tcp_slot_table_entries value may have an effect on the write performance of the client, but it can not explain a hang. If you are still seeing a hang when copying the file, then there is still a bug out there for me to fix.

Cheers
  Trond
Comment 11 Philippe Dax 2010-06-19 13:37:11 UTC
(In reply to comment #10)
> Hi Philippe,
> 
> I'm not sure that I really understand your reply above. Are you saying that
> the
> hang is fixed, or that there is no change? I'm obviously interested in 
> chasing
> this regression down, so please give me any details that you think might be
> relevant.
> 
> To try to answer your question: the sunrpc.tcp_slot_table_entries value may
> have an effect on the write performance of the client, but it can not explain
> a
> hang. If you are still seeing a hang when copying the file, then there is
> still
> a bug out there for me to fix.
> 
> Cheers
>   Trond


I am sorry to have been confused.
When you said "SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()"
I beleived you have fixed the bug.
What should I do now ? Apply this patch by recompiling the kernel and try again to check if the bug is still there ?
Comment 12 Trond Myklebust 2010-06-19 15:22:38 UTC
Yes please. Apply the patch, recompile, and then try to reproduce the bug.
Comment 13 Philippe Dax 2010-06-22 16:14:06 UTC
I have recompiled the kernel with your patch.
Everything works finely ! No hangs.
Thanks,

Philippe
Comment 14 Trond Myklebust 2010-06-22 17:12:18 UTC
Cool! Thank you very much for testing!

I'll queue this up for Linus and will also Cc: the stable kernels.
Comment 15 Holger Herzog 2010-07-18 12:31:23 UTC
(In reply to comment #7)
> SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()
Sorry, I'm not familiar with the kernel source. Is this fix now related to NFS only? My system hangs when copying large amounts of data to USB or IDE.
Comment 16 Trond Myklebust 2010-07-23 23:17:03 UTC
Hi Holger.

What you are seeing would be an entirely different bug. Have you filed a bug report to the relevant maintainers of the filesystem on your USB+IDE drives?