Most recent kernel where this bug did not occur: vanilla 2.6.18.
All of 2.6.18-mm[1..3] have the bug.

I did a git-bisect to track it down to this commit:

51b6ded4d9a94a61035deba1d8f51a54e3a3dd86 is first bad commit
commit 51b6ded4d9a94a61035deba1d8f51a54e3a3dd86
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Fri Sep 15 16:31:56 2006 -0400

    NFSv4: When mounting with a port=0 argument, substitute port=2049

    RFC3530 states that the registered port 2049 for the NFS protocol should be
    the default configuration in order to allow clients not to use the RPC
    binding protocols.
    If the mount program sends us a port=0, we therefore substitute port=2049.

    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

:040000 040000 f093abbb7cb6a01155ee2d2a820839ca2e9edadb 4f90adecb2095bc921bcac83d9cff3c0fc1f5388 M fs

Distribution: Gentoo-2006.1
Hardware Environment: x86_64 server, i386 client
Software Environment:
NFS-Server: tested with 2.6.18-mm2, but the same error could be seen with 2.6.17-rc6-mm2 as server.
NFS-Client: 2.6.17 and 2.6.18 work, but 2.6.19-rc1 fails with the same configuration.

The fstab entry of the failing mount:
192.168.0.4:/portage /usr/portage nfs4 rw,noatime,intr,noauto 0 0

Problem Description:
When trying to mount this share, the mount process hangs for several minutes, then displays:
"mount: 192.168.0.4:/portage: can't read superblock"

The syslog will contain:
"nfs: server 192.168.0.4 not responding, timed out"

There is no firewall between the two systems, and the server does listen on port 2049.
rpcinfo -p localhost on the server:
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  32769  status
    100024    1   tcp  51226  status
    100021    1   udp  32770  nlockmgr
    100021    3   udp  32770  nlockmgr
    100021    4   udp  32770  nlockmgr
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    4   udp   2049  nfs
    100021    1   tcp  39360  nlockmgr
    100021    3   tcp  39360  nlockmgr
    100021    4   tcp  39360  nlockmgr
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100005    1   udp    739  mountd
    100005    1   tcp    742  mountd
    100005    2   udp    739  mountd
    100005    2   tcp    742  mountd
    100005    3   udp    739  mountd
    100005    3   tcp    742  mountd

netstat --listen --inet contains these lines:
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN
udp        0      0 0.0.0.0:2049            0.0.0.0:*

Steps to reproduce:
Export NFS:
/var/export 192.168.0.2(fsid=0,rw,no_root_squash,sync)
/var/export/portage 192.168.0.2(rw,nohide,no_root_squash,sync)

Try to mount it via the above fstab line:
mount /usr/portage
Created attachment 9174
NFSv4: Fix thinko in fs/nfs/super.c

This patch ought to fix it for you...

BTW: Why _is_ the gentoo mount command setting a 0 port value?
I can confirm that the attached patch fixes the problem for me.

Gentoo may set the port to 0 because I did not specify a port number in the fstab.

I have the following mount program installed: sys-apps/util-linux-2.12r-r4

Gentoo patches the original sources from http://www.kernel.org/pub/linux/utils/util-linux/ with support for Loop-AES and several other patches. One of them also enables NFSv4.

You can find the NFSv4 patch here:
http://sources.gentoo.org/viewcvs.py/gentoo-x86/sys-apps/util-linux/files/util-linux-2.12i-nfsv4.patch?rev=1.1&view=markup

But as far as I can see, it should default to 2049...
Ah.. I found the error in the Gentoo patch:
The port defaults correctly to 2049 and is also read from the command-line options. But the patch seems to fail to copy the 'int port' variable into the 'struct nfs4_mount_data' that is passed to the kernel, so the kernel sees port 0.
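To illustrate the two sides of this, here is a minimal userspace sketch in C. It is not the code from the Gentoo patch or from fs/nfs/super.c: the struct demo_mount_data and both demo_* functions are hypothetical stand-ins, assuming only that the server address reaches the kernel as a sockaddr_in with the port in network byte order. The first function shows the step the mount program apparently skips (storing the parsed port), the second shows the kind of port=0 fallback the commit message describes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

#define DEMO_NFS_PORT 2049 /* registered NFS port, per RFC 3530 */

/* Hypothetical, simplified stand-in for the data passed to the kernel. */
struct demo_mount_data {
    struct sockaddr_in host_addr; /* server address, port in network order */
};

/*
 * Mount-program side: store the parsed address and port in the struct.
 * Leaving out the sin_port assignment is the kind of omission described
 * above: the struct is zeroed, so the kernel would then see port 0.
 */
static int demo_fill_mount_data(struct demo_mount_data *data,
                                const char *ip, int port)
{
    assert(data != NULL);
    memset(data, 0, sizeof(*data));
    data->host_addr.sin_family = AF_INET;
    if (inet_pton(AF_INET, ip, &data->host_addr.sin_addr) != 1)
        return -1; /* not a valid dotted-quad IPv4 address */
    data->host_addr.sin_port = htons((uint16_t)port);
    return 0;
}

/* Kernel-side idea: a port of 0 means "use the registered NFS port". */
static void demo_apply_default_port(struct demo_mount_data *data)
{
    if (data->host_addr.sin_port == 0)
        data->host_addr.sin_port = htons(DEMO_NFS_PORT);
}
```

With both pieces in place, an explicit port from fstab is preserved, and a missing one ends up as 2049 rather than 0.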
The fix is now in Linus' tree.