Bug 9003

Summary: oops in nfs4_cb_recall...
Product: File System Reporter: Daniel J Blueman (daniel.blueman)
Component: NFSAssignee: bfields
Status: CLOSED CODE_FIX    
Severity: high CC: akpm, bfields, daniel.blueman, michal.k.k.piotrowski, protasnb
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.23-rc4 Subsystem:
Regression: --- Bisected commit-id:
Attachments: workaround patch
probable fix
additional fix

Description Daniel J Blueman 2007-09-10 07:35:42 UTC
Distribution:
 - Ubuntu Gutsy Gibbon Tribe 4/5

Hardware Environment:
 - x86-64 on server and client B; i686 on client A
 - via small gigabit network

Problem Description:

When accessing a directory inode from a single other client, NFSv4
callbacks catastrophically failed [1] on the NFS server with
plain 2.6.23-rc4; clients are both 2.6.22 (Ubuntu Gutsy build) - one x86-64 and 
one i686.

Decoded xprt_reserve() at [2]. Let me know if more info is needed.

Most recent kernel where this bug did not occur:
 - seen twice with 2.6.23-rc4, so perhaps 2.6.22/23 is stable in this respect
 - happened twice in two weeks of running 2.6.23-rc4 with 2 clients and daily activity

Steps to reproduce:
 1. login to client A with delegations correctly working on NFSv4
  -> ie ensure 'clientaddr' is set when mounting

 2. use (eg) files and directories in user's homedir

 3. shutdown client A (perhaps unmounting fails?)

 4. login to client B w/ NFSv4 and delegations working
<oops on server>

--- [1]

general protection fault: 0000 [1] SMP
CPU 1
Modules linked in: dvb_usb_dtt200u dvb_usb dvb_core firmware_class
i2c_core uhci_hcd ehci_hcd usbcore
Pid: 24009, comm: nfs4_cb_recall Not tainted 2.6.23-rc4-109 #1
RIP: 0010:[xprt_reserve+217/384]  [xprt_reserve+217/384] xprt_reserve+0xd9/0x180
RSP: 0018:ffff81003905de20  EFLAGS: 00010286
RAX: ffff81000a2600a8 RBX: ffff81003a1d8780 RCX: 4d00000000610d00
RDX: 0b66656403100000 RSI: 0000000000000000 RDI: ffff81000a260000
RBP: ffff81003ebf3000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff81000a260000
R13: ffff81003ebf3460 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff81003fdd4180(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b456c102468 CR3: 000000003e047000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
Process nfs4_cb_recall (pid: 24009, threadinfo ffff81003905c000, task
ffff81000a358000)
Stack:  ffff810032360040 0000000000000000 ffff81003a1d8780 ffffffff804d2170
 ffff81003a1d8870 ffffffff80483a3b 0000000000000000 ffff81003a1d8780
 ffff81003a1d8780 ffffffff804d2170 ffff81003905ded0 ffffffff8047d876
Call Trace:
 [__rpc_execute+107/656] __rpc_execute+0x6b/0x290
 [rpc_do_run_task+118/208] rpc_do_run_task+0x76/0xd0
 [rpc_call_sync+21/64] rpc_call_sync+0x15/0x40
 [nfsd4_cb_recall+258/304] nfsd4_cb_recall+0x102/0x130
 [do_recall+0/32] do_recall+0x0/0x20
 [do_recall+17/32] do_recall+0x11/0x20
 [kthread+75/128] kthread+0x4b/0x80
 [child_rip+10/18] child_rip+0xa/0x12
 [kthread+0/128] kthread+0x0/0x80

--- [2]

dumping the code at xprt_reserve+0xd9:
(note, taken from a rebuild w/ '-g', so differing offsets, but registers match above)

ffffffff8047f330 <xprt_reserve>:
xprt_reserve():
2.6.23-rc4/net/sunrpc/xprt.c:799
ffffffff8047f330:       48 83 ec 28             sub    $0x28,%rsp
ffffffff8047f334:       48 89 5c 24 08          mov    %rbx,0x8(%rsp)
ffffffff8047f339:       48 89 6c 24 10          mov    %rbp,0x10(%rsp)
ffffffff8047f33e:       48 89 fb                mov    %rdi,%rbx
ffffffff8047f341:       4c 89 6c 24 20          mov    %r13,0x20(%rsp)
ffffffff8047f346:       4c 89 64 24 18          mov    %r12,0x18(%rsp)
2.6.23-rc4/net/sunrpc/xprt.c:800
ffffffff8047f34b:       48 8b 47 20             mov    0x20(%rdi),%rax
ffffffff8047f34f:       48 8b 40 60             mov    0x60(%rax),%rax
2.6.23-rc4/net/sunrpc/xprt.c:802
ffffffff8047f353:       c7 47 30 fb ff ff ff    movl   $0xfffffffb,0x30(%rdi)
2.6.23-rc4/net/sunrpc/xprt.c:803
ffffffff8047f35a:       4c 8d a8 60 04 00 00    lea    0x460(%rax),%r13
ffffffff8047f361:       4c 89 ef                mov    %r13,%rdi
ffffffff8047f364:       e8 57 ec 01 00          callq  ffffffff8049dfc0 <_spin_lock>
do_xprt_reserve():
2.6.23-rc4/net/sunrpc/xprt.c:776
ffffffff8047f369:       48 83 7b 28 00          cmpq   $0x0,0x28(%rbx)
2.6.23-rc4/net/sunrpc/xprt.c:773
ffffffff8047f36e:       48 8b 43 20             mov    0x20(%rbx),%rax
ffffffff8047f372:       48 8b 68 60             mov    0x60(%rax),%rbp
2.6.23-rc4/net/sunrpc/xprt.c:775
ffffffff8047f376:       c7 43 30 00 00 00 00    movl   $0x0,0x30(%rbx)
2.6.23-rc4/net/sunrpc/xprt.c:776
ffffffff8047f37d:       74 21                   je     ffffffff8047f3a0 <xprt_reserve+0x70>
xprt_reserve():
2.6.23-rc4/net/sunrpc/xprt.c:805
ffffffff8047f37f:       4c 89 ef                mov    %r13,%rdi
2.6.23-rc4/net/sunrpc/xprt.c:806
ffffffff8047f382:       48 8b 5c 24 08          mov    0x8(%rsp),%rbx
ffffffff8047f387:       48 8b 6c 24 10          mov    0x10(%rsp),%rbp
ffffffff8047f38c:       4c 8b 64 24 18          mov    0x18(%rsp),%r12
ffffffff8047f391:       4c 8b 6c 24 20          mov    0x20(%rsp),%r13
ffffffff8047f396:       48 83 c4 28             add    $0x28,%rsp
2.6.23-rc4/net/sunrpc/xprt.c:805
ffffffff8047f39a:       e9 11 eb 01 00          jmpq   ffffffff8049deb0 <_spin_unlock>
ffffffff8047f39f:       90                      nop    
do_xprt_reserve():
2.6.23-rc4/net/sunrpc/xprt.c:777
ffffffff8047f3a0:       48 8d 85 60 03 00 00    lea    0x360(%rbp),%rax
2.6.23-rc4/net/sunrpc/xprt.c:778
ffffffff8047f3a7:       48 3b 85 60 03 00 00    cmp    0x360(%rbp),%rax
ffffffff8047f3ae:       75 34                   jne    ffffffff8047f3e4 <xprt_reserve+0xb4>
2.6.23-rc4/net/sunrpc/xprt.c:785
ffffffff8047f3b0:       f6 05 d9 26 5f 00 01    testb  $0x1,0x5f26d9(%rip)        # ffffffff80a71a90 <rpc_debug>
ffffffff8047f3b7:       0f 85 e0 00 00 00       jne    ffffffff8047f49d <xprt_reserve+0x16d>
2.6.23-rc4/net/sunrpc/xprt.c:788
ffffffff8047f3bd:       48 8d bd e0 02 00 00    lea    0x2e0(%rbp),%rdi
2.6.23-rc4/net/sunrpc/xprt.c:786
ffffffff8047f3c4:       c7 43 30 f5 ff ff ff    movl   $0xfffffff5,0x30(%rbx)
2.6.23-rc4/net/sunrpc/xprt.c:787
ffffffff8047f3cb:       48 c7 83 e0 00 00 00    movq   $0x0,0xe0(%rbx)
ffffffff8047f3d2:       00 00 00 00 
2.6.23-rc4/net/sunrpc/xprt.c:788
ffffffff8047f3d6:       31 c9                   xor    %ecx,%ecx
ffffffff8047f3d8:       31 d2                   xor    %edx,%edx
ffffffff8047f3da:       48 89 de                mov    %rbx,%rsi
ffffffff8047f3dd:       e8 2e 4a 00 00          callq  ffffffff80483e10 <rpc_sleep_on>
ffffffff8047f3e2:       eb 9b                   jmp    ffffffff8047f37f <xprt_reserve+0x4f>
2.6.23-rc4/net/sunrpc/xprt.c:779
ffffffff8047f3e4:       4c 8b a5 60 03 00 00    mov    0x360(%rbp),%r12
ffffffff8047f3eb:       49 81 ec a8 00 00 00    sub    $0xa8,%r12
ffffffff8047f3f2:       49 8d 84 24 a8 00 00    lea    0xa8(%r12),%rax
ffffffff8047f3f9:       00 
list_del_init():
2.6.23-rc4/include/linux/list.h:256
ffffffff8047f3fa:       49 8b 8c 24 a8 00 00    mov    0xa8(%r12),%rcx
ffffffff8047f401:       00 
xprt_request_init():
2.6.23-rc4/net/sunrpc/xprt.c:828
ffffffff8047f402:       4c 89 e7                mov    %r12,%rdi
list_del_init():
2.6.23-rc4/include/linux/list.h:256
ffffffff8047f405:       48 8b 50 08             mov    0x8(%rax),%rdx  <--------- failing address at xprt_reserve+0xd9
__list_del():
2.6.23-rc4/include/linux/list.h:157
ffffffff8047f409:       48 89 51 08             mov    %rdx,0x8(%rcx)
2.6.23-rc4/include/linux/list.h:158
ffffffff8047f40d:       48 89 0a                mov    %rcx,(%rdx)
INIT_LIST_HEAD():
2.6.23-rc4/include/linux/list.h:33
ffffffff8047f410:       48 89 40 08             mov    %rax,0x8(%rax)
2.6.23-rc4/include/linux/list.h:32
ffffffff8047f414:       49 89 84 24 a8 00 00    mov    %rax,0xa8(%r12)
ffffffff8047f41b:       00 
do_xprt_reserve():
Comment 1 Andrew Morton 2007-09-10 10:17:24 UTC
Michal, this might be a post-2.6.23 regression.
Comment 2 Michal Piotrowski 2007-09-11 06:25:57 UTC
Could you test the latest -rc?

There were a few NFS fixes in -rc5.
Comment 3 bfields 2007-09-11 15:02:20 UTC
Created attachment 12797 [details]
workaround patch

This isn't a fix, but if it eliminates the symptoms then it would help confirm that the problem is what I think it is.
Comment 4 bfields 2007-09-11 15:05:23 UTC
Looks to me like a symptom of buggy locking and reference counting in the nfsv4 state code.  For example, we're reference counting the structure representing the delegation, but no the structure that contains the rpc_client used to make the callbacks.

This doesn't appear to be a regression.
Comment 5 bfields 2007-09-12 18:51:31 UTC
Created attachment 12809 [details]
probable fix

This is closer to a real fix.  If you can confirm it solves the original problem, that'd be great.

This fix is for a couple races that have existed for a while, so (assuming it's the reported problem) the bug wasn't a regression.
Comment 6 Daniel J Blueman 2007-09-13 02:58:17 UTC
Thanks for the patch, Bruce.

I have been running 2.6.23-rc6 for 1.5 days without problem so far and start testing this latest patch in the next day or so, however it will take >~2 weeks to be confident it addresses the issue, as it's less so easy to hit.

I found that I needed the 'clientaddr' mount option to get delegations working correctly, so perhaps this shows there is less exposure than expected/desired?
Comment 7 bfields 2007-09-27 07:58:44 UTC
Any news?

I've queued up this patch for 2.6.24 (actually, same change broken up into 3 separate patches)--currently in git://linux-nfs.org/~bfields/linux.git nfs-server-stable.
Comment 8 Daniel J Blueman 2007-10-04 06:14:21 UTC
No news - I didn't reproduce the bug in 2.6.23-rc7 after ~2 weeks and am not with this system for a month for further testing.
Comment 9 bfields 2007-11-26 16:23:07 UTC
Created attachment 13765 [details]
additional fix

I believe that this fix (on top of 2.6.24-rc3, which also contains the previous patch) should fix another race that could cause the same oops.
Comment 10 Daniel J Blueman 2007-11-27 02:11:54 UTC
Incidentally, I've been tracking upstream -rc kernel verions, and haven't found symptom of this bug at all with my usual multi-system workload, so it would seem less acute as before.
Comment 11 Natalie Protasevich 2008-02-11 02:44:00 UTC
OK, we can close this bug now. Please reopen if the problem appears again.