Bug 14276
Summary: | nfsroot will not remount rw and claims illegal options | ||
---|---|---|---|
Product: | File System | Reporter: | Hans de Bruin (jmdebruin) |
Component: | NFS | Assignee: | Trond Myklebust (trondmy) |
Status: | CLOSED CODE_FIX | ||
Severity: | low | CC: | randrik, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.32-rc2 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 14230 | ||
Attachments: | NFS: Fix port initialisation in nfs_remount() |
Description
Hans de Bruin
2009-09-30 15:08:36 UTC
Have you tried applying the patch from http://patchwork.kernel.org/patch/50428/ ? Now I have. The problem does not go away. /proc/mount for root is the same as without the patch. I will do some bisecting tomorrow. -- Hans Notify-Also : Trond Myklebust <trond.myklebust@fys.uio.no> I found at least one strange bug-report on Ubuntu forums: http://ubuntuforums.org/showthread.php?p=8044358 " Re: 2.6.32-rc1 is out I have a problem with 2.6.32-rc1 kernel and NFS client on it. I can mount NFS directory that is on PC with 2.6.31 kernel, but I can't see files in it ( "ls" command says "input/output error"). I don't know how to make it work, Firs problem with NFS was a problem with starting nfs server on that kernel. I had to recompile it with nfs-server setup as part of kernel, and not as a module as it was until now." But I only found it because my own kernel (2.6.32-rc2-git, compiled for SGI O2 from sources up to 84d88d5d4efc37dfb8a93a4a58d8a227ee86ffa4 , "Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip") show same error when I trying to do any access to NFS shares from 2.6.31-rc9 machine. But strangely, it can mount NFS root (at least with "rw" options in kernel cmdline). Sorry if my observations unrelated to this bug, I can see nothing in dmesg both from server (2.6.31-rc9) and from client (2.6.32-rc1-git, reported itself as 2.6.32-rc2). (In reply to comment #1) > Have you tried applying the patch from > http://patchwork.kernel.org/patch/50428/ > ? Thank you, at least my problem goes away! hans@orion:/var/diskless/kernel/linux-2.6$ git bisect bad 53a0b9c4c99ab0085a06421f71592722e5b3fd5f is first bad commit commit 53a0b9c4c99ab0085a06421f71592722e5b3fd5f Author: Chuck Lever <chuck.lever@oracle.com> Date: Sun Aug 9 15:09:36 2009 -0400 NFS: Replace nfs_parse_ip_address() with rpc_pton() Clean up: Use the common routine now provided in sunrpc.ko for parsing mount addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> :040000 040000 31c3998309b2c325b7ea927498e1a7e78e0e7747 513edbc936111eda9d1646db17e7240c77a32c42 M fs hans@orion:/var/diskless/kernel/linux-2.6$ On Sun, 2009-10-04 at 17:48 +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=14276 > > > > > > --- Comment #6 from Hans de Bruin <bruinjm@xs4all.nl> 2009-10-04 17:48:57 > --- > hans@orion:/var/diskless/kernel/linux-2.6$ git bisect bad > 53a0b9c4c99ab0085a06421f71592722e5b3fd5f is first bad commit > commit 53a0b9c4c99ab0085a06421f71592722e5b3fd5f > Author: Chuck Lever <chuck.lever@oracle.com> > Date: Sun Aug 9 15:09:36 2009 -0400 > > NFS: Replace nfs_parse_ip_address() with rpc_pton() > > Clean up: Use the common routine now provided in sunrpc.ko for parsing > mount > addresses. > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> > > :040000 040000 31c3998309b2c325b7ea927498e1a7e78e0e7747 > 513edbc936111eda9d1646db17e7240c77a32c42 M fs > hans@orion:/var/diskless/kernel/linux-2.6$ OK. Switching over to email. Chuck, can you see how the rpc_pton() might be breaking nfsroot? The cmdline is of the form root=/dev/nfs nfsroot=10.10.0.2:/nfs/gemini,v3,tcp ro ip=::::::dhcp Cheers Trond On Sun, 2009-10-04 at 15:06 -0400, Trond Myklebust wrote: > Chuck, can you see how the rpc_pton() might be breaking nfsroot? > The cmdline is of the form > > root=/dev/nfs nfsroot=10.10.0.2:/nfs/gemini,v3,tcp ro ip=::::::dhcp I think I see it... The difference is that rpc_pton4() starts with memset(sap, 0, sizeof(struct sockaddr_in)); That clears the port number that was set in nfs_remount(), and so the comparison in nfs_compare_remount_data() fails. Does the following patch fix the problem? -------------------------------------------------------------------------- NFS: Fix port initialisation in nfs_remount() From: Trond Myklebust <Trond.Myklebust@netapp.com> The recent changeset 53a0b9c4c99ab0085a06421f71592722e5b3fd5f (NFS: Replace nfs_parse_ip_address() with rpc_pton()) broke nfs_remount, since the call to rpc_pton() will zero out the port number in data->nfs_server.address. This is actually due to a bug in nfs_remount: it should be looking at the port number in nfs_server.port instead... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> --- fs/nfs/super.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 0d14704..fb3b280 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -1846,9 +1846,10 @@ nfs_compare_remount_data(struct nfs_server *nfss, data->acdirmin != nfss->acdirmin / HZ || data->acdirmax != nfss->acdirmax / HZ || data->timeo != (10U * nfss->client->cl_timeout->to_initval / HZ) || + data->nfs_server.port != nfss->port || data->nfs_server.addrlen != nfss->nfs_client->cl_addrlen || - memcmp(&data->nfs_server.address, &nfss->nfs_client->cl_addr, - data->nfs_server.addrlen) != 0) + !rpc_cmp_addr(&data->nfs_server.address, + &nfss->nfs_client->cl_addr)) return -EINVAL; return 0; @@ -1891,6 +1892,7 @@ nfs_remount(struct super_block *sb, int *flags, char *raw_data) data->acdirmin = nfss->acdirmin / HZ; data->acdirmax = nfss->acdirmax / HZ; data->timeo = 10U * nfss->client->cl_timeout->to_initval / HZ; + data->nfs_server.port = nfss->port; data->nfs_server.addrlen = nfss->nfs_client->cl_addrlen; memcpy(&data->nfs_server.address, &nfss->nfs_client->cl_addr, data->nfs_server.addrlen); Created attachment 23267 [details]
NFS: Fix port initialisation in nfs_remount()
On Oct 4, 2009, at 5:59 PM, Trond Myklebust wrote: > On Sun, 2009-10-04 at 15:06 -0400, Trond Myklebust wrote: >> Chuck, can you see how the rpc_pton() might be breaking nfsroot? >> The cmdline is of the form >> >> root=/dev/nfs nfsroot=10.10.0.2:/nfs/gemini,v3,tcp ro >> ip=::::::dhcp > > I think I see it... > > The difference is that rpc_pton4() starts with > > memset(sap, 0, sizeof(struct sockaddr_in)); > > That clears the port number that was set in nfs_remount(), and so the > comparison in nfs_compare_remount_data() fails. > > Does the following patch fix the problem? > > -------------------------------------------------------------------------- > NFS: Fix port initialisation in nfs_remount() > From: Trond Myklebust <Trond.Myklebust@netapp.com> > > The recent changeset 53a0b9c4c99ab0085a06421f71592722e5b3fd5f (NFS: > Replace > nfs_parse_ip_address() with rpc_pton()) broke nfs_remount, since the > call > to rpc_pton() will zero out the port number in data- > >nfs_server.address. > > This is actually due to a bug in nfs_remount: it should be looking > at the > port number in nfs_server.port instead... The analysis sounds correct. After the options are parsed, nfs_server.port is the port number the user specified on the command line. The user-specified value is saved separately because later the socket address's port number can be changed via an rpcbind. In the case of a remount, if the port value isn't changed by the new options, the client needs to know what the originally specified value was. > Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> I don't know if this will address the presenting problem, but yes, nfs_remount() is definitely not correct. > --- > > fs/nfs/super.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > > > diff --git a/fs/nfs/super.c b/fs/nfs/super.c > index 0d14704..fb3b280 100644 > --- a/fs/nfs/super.c > +++ b/fs/nfs/super.c > @@ -1846,9 +1846,10 @@ nfs_compare_remount_data(struct nfs_server > *nfss, > data->acdirmin != nfss->acdirmin / HZ || > data->acdirmax != nfss->acdirmax / HZ || > data->timeo != (10U * nfss->client->cl_timeout->to_initval / > HZ) || > + data->nfs_server.port != nfss->port || > data->nfs_server.addrlen != nfss->nfs_client->cl_addrlen || > - memcmp(&data->nfs_server.address, &nfss->nfs_client->cl_addr, > - data->nfs_server.addrlen) != 0) > + !rpc_cmp_addr(&data->nfs_server.address, > + &nfss->nfs_client->cl_addr)) > return -EINVAL; > > return 0; > @@ -1891,6 +1892,7 @@ nfs_remount(struct super_block *sb, int > *flags, char *raw_data) > data->acdirmin = nfss->acdirmin / HZ; > data->acdirmax = nfss->acdirmax / HZ; > data->timeo = 10U * nfss->client->cl_timeout->to_initval / HZ; > + data->nfs_server.port = nfss->port; > data->nfs_server.addrlen = nfss->nfs_client->cl_addrlen; > memcpy(&data->nfs_server.address, &nfss->nfs_client->cl_addr, > data->nfs_server.addrlen); > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" > in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever chuck[dot]lever[at]oracle[dot]com I applied the patch over rc3 and it fixed the issue. -- Hans Fixed by commit bcd2ea17da6a329a7276cde7286d802f009af332 . |