Bug 13330

Summary: nfs4 NULL pointer dereference in _nfs4_do_setlk
Product: File System Reporter: Rich Ercolani (rercola)
Component: NFSAssignee: Trond Myklebust (trondmy)
Status: CLOSED CODE_FIX    
Severity: high CC: alan, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.30-rc4 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 13070    
Attachments: NFSv4 BUG ON log
NFSv4: Fix an Oops in the lock recovery code
NFSv4: Fix NFSv4 async renewal
A new, more different trace with RIP at _setlk
Debugging patch.
A BUG ON with trond's debugging patch
NFS: Ensure we always hold the BKL when dereferencing inode->i_flock
BUG ON with that patch applied
NFSv4: Fix the 'nolock' option regression

Description Rich Ercolani 2009-05-17 04:44:41 UTC
Created attachment 21380 [details]
NFSv4 BUG ON log

My NFS server rebooted.

The machine with the kernel in question, one of many clients, spit out the attached error in dmesg, and all NFS activity on the machine blocked forever, necessitating a reboot.

This is not true on any of the other NFS clients on the network, which vary between 2.6.18 and 2.6.27, so it may be A) 64-bit specific somehow (the rest are 32-bit), B) recently introduced, or C) recently exposed by some existing bad behavior in NFS recovery being removed.

Machine was "vanilla" 2.6.30-rc4 (with commits b827e496c893de0c0f142abfaeb8730a2fd6b37f and 7fdf523067666b0eaff330f362401ee50ce187c4 added), 64-bit. NFSv4 mounted with rw,nosuid,nodev,noatime,hard,intr,nolock,sloppy,rsize=8192,wsize=8192,tcp,timeo=600.

I'll try reproducing this on latest GIT shortly, but it's hard to reproduce (since it only occurs after the NFS server reboots, and not even consistently then), so I don't know when I'll be able to report back that it occurs or not.
Comment 1 Rich Ercolani 2009-05-17 04:47:54 UTC
http://article.gmane.org/gmane.linux.nfs/26502 is a previous trace/occurrence of this, which should demonstrate exactly how hard it is to reproduce.
Comment 2 Trond Myklebust 2009-05-21 21:03:30 UTC
On Sun, 2009-05-17 at 04:44 +0000, bugzilla-daemon@bugzilla.kernel.org
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=13330
> 
>            Summary: nfs4 NULL pointer dereference in _nfs4_do_setlk
>            Product: File System
>            Version: 2.5
>     Kernel Version: 2.6.30-rc4
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: NFS
>         AssignedTo: trond.myklebust@fys.uio.no
>         ReportedBy: rercola@acm.jhu.edu
>         Regression: No
> 
> 
> Created an attachment (id=21380)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=21380)
> NFSv4 BUG ON log
> 
> My NFS server rebooted.
> 
> The machine with the kernel in question, one of many clients, spit out the
> attached error in dmesg, and all NFS activity on the machine blocked forever,
> necessitating a reboot.
> 
> This is not true on any of the other NFS clients on the network, which vary
> between 2.6.18 and 2.6.27, so it may be A) 64-bit specific somehow (the rest
> are 32-bit), B) recently introduced, or C) recently exposed by some existing
> bad behavior in NFS recovery being removed.
> 
> Machine was "vanilla" 2.6.30-rc4 (with commits
> b827e496c893de0c0f142abfaeb8730a2fd6b37f and
> 7fdf523067666b0eaff330f362401ee50ce187c4 added), 64-bit. NFSv4 mounted with
>
> rw,nosuid,nodev,noatime,hard,intr,nolock,sloppy,rsize=8192,wsize=8192,tcp,timeo=600.
> 
> I'll try reproducing this on latest GIT shortly, but it's hard to reproduce
> (since it only occurs after the NFS server reboots, and not even consistently
> then), so I don't know when I'll be able to report back that it occurs or
> not.

Switching to email...

I'm having trouble reproducing this, and staring at the code itself
isn't helping (as far as I can see, the locking using nfsi->rwsem should
work).

Could you therefore please try the attached patch?

Cheers
  Trond
Comment 3 Trond Myklebust 2009-05-21 21:04:42 UTC
Created attachment 21474 [details]
NFSv4: Fix an Oops in the lock recovery code
Comment 4 Trond Myklebust 2009-05-21 21:27:55 UTC
On Thu, 2009-05-21 at 17:03 -0400, Trond Myklebust wrote:
> On Sun, 2009-05-17 at 04:44 +0000, bugzilla-daemon@bugzilla.kernel.org
> wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=13330
> > 
> >            Summary: nfs4 NULL pointer dereference in _nfs4_do_setlk
> >            Product: File System
> >            Version: 2.5
> >     Kernel Version: 2.6.30-rc4
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: NFS
> >         AssignedTo: trond.myklebust@fys.uio.no
> >         ReportedBy: rercola@acm.jhu.edu
> >         Regression: No
> > 
> > 
> > Created an attachment (id=21380)
> >  --> (http://bugzilla.kernel.org/attachment.cgi?id=21380)
> > NFSv4 BUG ON log
> > 
> > My NFS server rebooted.
> > 
> > The machine with the kernel in question, one of many clients, spit out the
> > attached error in dmesg, and all NFS activity on the machine blocked
> forever,
> > necessitating a reboot.
> > 
> > This is not true on any of the other NFS clients on the network, which vary
> > between 2.6.18 and 2.6.27, so it may be A) 64-bit specific somehow (the
> rest
> > are 32-bit), B) recently introduced, or C) recently exposed by some
> existing
> > bad behavior in NFS recovery being removed.
> > 
> > Machine was "vanilla" 2.6.30-rc4 (with commits
> > b827e496c893de0c0f142abfaeb8730a2fd6b37f and
> > 7fdf523067666b0eaff330f362401ee50ce187c4 added), 64-bit. NFSv4 mounted with
> >
> rw,nosuid,nodev,noatime,hard,intr,nolock,sloppy,rsize=8192,wsize=8192,tcp,timeo=600.
> > 
> > I'll try reproducing this on latest GIT shortly, but it's hard to reproduce
> > (since it only occurs after the NFS server reboots, and not even
> consistently
> > then), so I don't know when I'll be able to report back that it occurs or
> not.
> 
> Switching to email...
> 
> I'm having trouble reproducing this, and staring at the code itself
> isn't helping (as far as I can see, the locking using nfsi->rwsem should
> work).
> 
> Could you therefore please try the attached patch?

You might also want to apply the attached fix to the RENEW function. It
corrects a bug that can cause NFSv4 clients to fail to recover state
during the grace period...

Cheers
  Trond
Comment 5 Trond Myklebust 2009-05-21 21:28:18 UTC
Created attachment 21475 [details]
NFSv4: Fix NFSv4 async renewal
Comment 6 Rich Ercolani 2009-05-28 10:52:35 UTC
Created attachment 21600 [details]
A new, more different trace with RIP at _setlk

...well that's different.

NFS behavior appears unaffected - no stuck threads, nothing. Just...the BUG ON in the attachment.

Kernel git 0f6f49a8cd0163fdb1723ed29f01fc65177108dc with the patches in this bug applied.
Comment 7 Trond Myklebust 2009-05-28 12:28:31 UTC
Uuh... You appear to have a lock that contains a pointer to an invalid struct file. That is indeed a novelty.
Comment 8 Trond Myklebust 2009-05-28 19:52:34 UTC
Created attachment 21610 [details]
Debugging patch.
Comment 9 Trond Myklebust 2009-05-28 19:56:50 UTC
Could you try adding this debugging patch in order to see what is going on with the open context here? It should be impossible to trigger the BUG_ON(nfs_file_open_context(request->fl_file)->state == NULL) from nfs4_lock_delegation_recall(), since nfs_delegation_claim_opens() explicitly tests for that condition.
The only explanation I can see, therefore, is a reference leak of some sort.
Comment 10 Rich Ercolani 2009-05-29 20:10:33 UTC
Created attachment 21628 [details]
A BUG ON with trond's debugging patch

Oh look, a BUG ON for you. With your patch applied.
Comment 11 Trond Myklebust 2009-05-29 20:31:14 UTC
Sorry. That's a red herring... That's just the 'hung task' watchdog complaining
because some process was holding the directory mutex for more than 120 seconds.
It is quite normal for that to happen if the server was down for any extended
period of time.

Please just ignore those particular warnings, or turn off the hung task 
watchdog altogether...
Comment 12 Rich Ercolani 2009-05-29 20:34:42 UTC
I'm aware. I've seen them before, I was just including it to show that there was no other interesting printout before the BUG ON at the end.

[And there is, indeed, a BUG ON at the end.]
Comment 13 Trond Myklebust 2009-05-29 21:04:20 UTC
Oh... I missed the one at the very end.

Same apparently impossible state as before the last debugging patch. We pass the
if (nfs_file_open_context(fl->fl_file)->state != state) test in
nfs4_reclaim_locks, then apparently hit the
BUG_ON(nfs_file_open_context(fl->fl_file)->state == NULL);
in _nfs4_do_setlk. Could you please check that line 3472 in fs/nfs/nfs4proc.c
matches mine, and is indeed that particular BUG_ON?
Comment 14 Rich Ercolani 2009-05-29 21:43:41 UTC
line 3472:
	BUG_ON(fl->fl_ops == NULL);
in _do_setlk.
Comment 15 Trond Myklebust 2009-05-29 22:00:08 UTC
OK. That explains it. Your nfs4proc.c is shifted by 2 lines...
Comment 16 Rich Ercolani 2009-05-30 02:46:06 UTC
I'm reverting your patch, incidentally - the machine no longer stays up for longer than 30 minutes with it, and BUG ONs within 10 of boot.
Comment 17 Rafael J. Wysocki 2009-05-30 21:41:19 UTC
Handled-By : Trond Myklebust <trond.myklebust@fys.uio.no>
Comment 18 Trond Myklebust 2009-06-02 13:54:29 UTC
Created attachment 21710 [details]
NFS: Ensure we always hold the BKL when dereferencing inode->i_flock

This patch shouldn't really make a difference if you're just doing NFSv4, but it doesn't hurt to try it. If you are running a samba server on top of NFS (or some other program that uses leases), then it might explain what is going on...
Comment 19 Rich Ercolani 2009-06-04 17:27:39 UTC
Created attachment 21751 [details]
BUG ON with that patch applied

"Great."
Comment 20 Trond Myklebust 2009-06-04 18:28:22 UTC
Oh for crying out loud... I missed your 'nolock' mount option. Of course that's
going to fail...
Comment 21 Trond Myklebust 2009-06-04 18:45:19 UTC
Created attachment 21753 [details]
NFSv4: Fix the 'nolock' option regression

NFSv4 should just ignore the 'nolock' option. It is an NFSv2/v3 thing...
This fixes the Oops in http://bugzilla.kernel.org/show_bug.cgi?id=13330
Comment 22 Trond Myklebust 2009-06-05 14:33:04 UTC
Does the above patch indeed fix the Oops? It should turn off the 'nolock' option
and thus ensure that locks are managed by the NFSv4 code.
The alternative, if we want to support 'nolock' on NFSv4 (we haven't earlier) is
simply to turn off lock recovery for those mounts.
Comment 23 Rafael J. Wysocki 2009-06-07 21:03:52 UTC
On Sunday 07 June 2009, Trond Myklebust wrote:
> On Sun, 2009-06-07 at 11:52 +0200, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.29.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=13330
> > Subject             : nfs4 NULL pointer dereference in _nfs4_do_setlk
> > Submitter   : Rich Ercolani <rercola@acm.jhu.edu>
> > Date                : 2009-05-17 04:44 (22 days old)
> > Handled-By  : Trond Myklebust <trond.myklebust@fys.uio.no>
> 
> It should still be listed. I do believe I've finally identified the
> cause, and have provided a patch, however that fix has not yet been
> verified by Rich...
Comment 24 Rich Ercolani 2009-06-07 21:32:08 UTC
Confirm. I've tried hard to trigger this again, to no effect. :)
Comment 25 Trond Myklebust 2009-06-07 21:41:23 UTC
Thanks. I'll send it to Linus tomorrow...