Distribution:
- Ubuntu Gutsy Gibbon Tribe 4/5

Hardware Environment:
- x86-64 on server and client B; i686 on client A
- via small gigabit network

Problem Description:
When accessing a directory inode from a single other client, NFSv4 callbacks
catastrophically failed [1] on the NFS server with plain 2.6.23-rc4; clients
are both 2.6.22 (Ubuntu Gutsy build) - one x86-64 and one i686. Decoded
xprt_reserve() at [2]. Let me know if more info is needed.

Most recent kernel where this bug did not occur:
- seen twice with 2.6.23-rc4, so perhaps 2.6.22/23 is stable in this respect
- happened twice in two weeks of running 2.6.23-rc4 with 2 clients and daily
  activity

Steps to reproduce:
1. login to client A with delegations correctly working on NFSv4 -> ie ensure
   'clientaddr' is set when mounting
2. use (eg) files and directories in user's homedir
3. shutdown client A (perhaps unmounting fails?)
4. login to client B w/ NFSv4 and delegations working
<oops on server>

---
[1]

general protection fault: 0000 [1] SMP
CPU 1
Modules linked in: dvb_usb_dtt200u dvb_usb dvb_core firmware_class i2c_core uhci_hcd ehci_hcd usbcore
Pid: 24009, comm: nfs4_cb_recall Not tainted 2.6.23-rc4-109 #1
RIP: 0010:[xprt_reserve+217/384]  [xprt_reserve+217/384] xprt_reserve+0xd9/0x180
RSP: 0018:ffff81003905de20  EFLAGS: 00010286
RAX: ffff81000a2600a8 RBX: ffff81003a1d8780 RCX: 4d00000000610d00
RDX: 0b66656403100000 RSI: 0000000000000000 RDI: ffff81000a260000
RBP: ffff81003ebf3000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff81000a260000
R13: ffff81003ebf3460 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff81003fdd4180(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b456c102468 CR3: 000000003e047000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
Process nfs4_cb_recall (pid: 24009, threadinfo ffff81003905c000, task ffff81000a358000)
Stack:  ffff810032360040 0000000000000000 ffff81003a1d8780 ffffffff804d2170
 ffff81003a1d8870 ffffffff80483a3b 0000000000000000 ffff81003a1d8780
 ffff81003a1d8780 ffffffff804d2170 ffff81003905ded0 ffffffff8047d876
Call Trace:
 [__rpc_execute+107/656] __rpc_execute+0x6b/0x290
 [rpc_do_run_task+118/208] rpc_do_run_task+0x76/0xd0
 [rpc_call_sync+21/64] rpc_call_sync+0x15/0x40
 [nfsd4_cb_recall+258/304] nfsd4_cb_recall+0x102/0x130
 [do_recall+0/32] do_recall+0x0/0x20
 [do_recall+17/32] do_recall+0x11/0x20
 [kthread+75/128] kthread+0x4b/0x80
 [child_rip+10/18] child_rip+0xa/0x12
 [kthread+0/128] kthread+0x0/0x80

---
[2] dumping the code at xprt_reserve+0xd9:
(note, taken from a rebuild w/ '-g', so differing offsets, but registers match above)

ffffffff8047f330 <xprt_reserve>:
xprt_reserve():
2.6.23-rc4/net/sunrpc/xprt.c:799
ffffffff8047f330:  48 83 ec 28             sub    $0x28,%rsp
ffffffff8047f334:  48 89 5c 24 08          mov    %rbx,0x8(%rsp)
ffffffff8047f339:  48 89 6c 24 10          mov    %rbp,0x10(%rsp)
ffffffff8047f33e:  48 89 fb                mov    %rdi,%rbx
ffffffff8047f341:  4c 89 6c 24 20          mov    %r13,0x20(%rsp)
ffffffff8047f346:  4c 89 64 24 18          mov    %r12,0x18(%rsp)
2.6.23-rc4/net/sunrpc/xprt.c:800
ffffffff8047f34b:  48 8b 47 20             mov    0x20(%rdi),%rax
ffffffff8047f34f:  48 8b 40 60             mov    0x60(%rax),%rax
2.6.23-rc4/net/sunrpc/xprt.c:802
ffffffff8047f353:  c7 47 30 fb ff ff ff    movl   $0xfffffffb,0x30(%rdi)
2.6.23-rc4/net/sunrpc/xprt.c:803
ffffffff8047f35a:  4c 8d a8 60 04 00 00    lea    0x460(%rax),%r13
ffffffff8047f361:  4c 89 ef                mov    %r13,%rdi
ffffffff8047f364:  e8 57 ec 01 00          callq  ffffffff8049dfc0 <_spin_lock>
do_xprt_reserve():
2.6.23-rc4/net/sunrpc/xprt.c:776
ffffffff8047f369:  48 83 7b 28 00          cmpq   $0x0,0x28(%rbx)
2.6.23-rc4/net/sunrpc/xprt.c:773
ffffffff8047f36e:  48 8b 43 20             mov    0x20(%rbx),%rax
ffffffff8047f372:  48 8b 68 60             mov    0x60(%rax),%rbp
2.6.23-rc4/net/sunrpc/xprt.c:775
ffffffff8047f376:  c7 43 30 00 00 00 00    movl   $0x0,0x30(%rbx)
2.6.23-rc4/net/sunrpc/xprt.c:776
ffffffff8047f37d:  74 21                   je     ffffffff8047f3a0 <xprt_reserve+0x70>
xprt_reserve():
2.6.23-rc4/net/sunrpc/xprt.c:805
ffffffff8047f37f:  4c 89 ef                mov    %r13,%rdi
2.6.23-rc4/net/sunrpc/xprt.c:806
ffffffff8047f382:  48 8b 5c 24 08          mov    0x8(%rsp),%rbx
ffffffff8047f387:  48 8b 6c 24 10          mov    0x10(%rsp),%rbp
ffffffff8047f38c:  4c 8b 64 24 18          mov    0x18(%rsp),%r12
ffffffff8047f391:  4c 8b 6c 24 20          mov    0x20(%rsp),%r13
ffffffff8047f396:  48 83 c4 28             add    $0x28,%rsp
2.6.23-rc4/net/sunrpc/xprt.c:805
ffffffff8047f39a:  e9 11 eb 01 00          jmpq   ffffffff8049deb0 <_spin_unlock>
ffffffff8047f39f:  90                      nop
do_xprt_reserve():
2.6.23-rc4/net/sunrpc/xprt.c:777
ffffffff8047f3a0:  48 8d 85 60 03 00 00    lea    0x360(%rbp),%rax
2.6.23-rc4/net/sunrpc/xprt.c:778
ffffffff8047f3a7:  48 3b 85 60 03 00 00    cmp    0x360(%rbp),%rax
ffffffff8047f3ae:  75 34                   jne    ffffffff8047f3e4 <xprt_reserve+0xb4>
2.6.23-rc4/net/sunrpc/xprt.c:785
ffffffff8047f3b0:  f6 05 d9 26 5f 00 01    testb  $0x1,0x5f26d9(%rip)  # ffffffff80a71a90 <rpc_debug>
ffffffff8047f3b7:  0f 85 e0 00 00 00       jne    ffffffff8047f49d <xprt_reserve+0x16d>
2.6.23-rc4/net/sunrpc/xprt.c:788
ffffffff8047f3bd:  48 8d bd e0 02 00 00    lea    0x2e0(%rbp),%rdi
2.6.23-rc4/net/sunrpc/xprt.c:786
ffffffff8047f3c4:  c7 43 30 f5 ff ff ff    movl   $0xfffffff5,0x30(%rbx)
2.6.23-rc4/net/sunrpc/xprt.c:787
ffffffff8047f3cb:  48 c7 83 e0 00 00 00    movq   $0x0,0xe0(%rbx)
ffffffff8047f3d2:  00 00 00 00
2.6.23-rc4/net/sunrpc/xprt.c:788
ffffffff8047f3d6:  31 c9                   xor    %ecx,%ecx
ffffffff8047f3d8:  31 d2                   xor    %edx,%edx
ffffffff8047f3da:  48 89 de                mov    %rbx,%rsi
ffffffff8047f3dd:  e8 2e 4a 00 00          callq  ffffffff80483e10 <rpc_sleep_on>
ffffffff8047f3e2:  eb 9b                   jmp    ffffffff8047f37f <xprt_reserve+0x4f>
2.6.23-rc4/net/sunrpc/xprt.c:779
ffffffff8047f3e4:  4c 8b a5 60 03 00 00    mov    0x360(%rbp),%r12
ffffffff8047f3eb:  49 81 ec a8 00 00 00    sub    $0xa8,%r12
ffffffff8047f3f2:  49 8d 84 24 a8 00 00    lea    0xa8(%r12),%rax
ffffffff8047f3f9:  00
list_del_init():
2.6.23-rc4/include/linux/list.h:256
ffffffff8047f3fa:  49 8b 8c 24 a8 00 00    mov    0xa8(%r12),%rcx
ffffffff8047f401:  00
xprt_request_init():
2.6.23-rc4/net/sunrpc/xprt.c:828
ffffffff8047f402:  4c 89 e7                mov    %r12,%rdi
list_del_init():
2.6.23-rc4/include/linux/list.h:256
ffffffff8047f405:  48 8b 50 08             mov    0x8(%rax),%rdx   <--------- failing address at xprt_reserve+0xd9
__list_del():
2.6.23-rc4/include/linux/list.h:157
ffffffff8047f409:  48 89 51 08             mov    %rdx,0x8(%rcx)
2.6.23-rc4/include/linux/list.h:158
ffffffff8047f40d:  48 89 0a                mov    %rcx,(%rdx)
INIT_LIST_HEAD():
2.6.23-rc4/include/linux/list.h:33
ffffffff8047f410:  48 89 40 08             mov    %rax,0x8(%rax)
2.6.23-rc4/include/linux/list.h:32
ffffffff8047f414:  49 89 84 24 a8 00 00    mov    %rax,0xa8(%r12)
ffffffff8047f41b:  00
do_xprt_reserve():
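
For readers skimming the dump: the marked instruction sits in the list_del_init() that do_xprt_reserve() has inlined while taking a request off the transport's free list, so the fault looks like list pointers into freed or reused memory. Below is a minimal user-space sketch of that pointer manipulation (illustration only, with simplified names; it is not the kernel's list.h), showing which dereferences go wild when the containing object is stale:

/* Sketch only: mirrors the loads and stores that the inlined list_del_init()
 * performs.  If the object containing 'entry' has been freed and its memory
 * reused, next/prev hold unrelated bytes and the relinking stores fault. */
struct list_head {
        struct list_head *next, *prev;
};

static void sketch_list_del_init(struct list_head *entry)
{
        struct list_head *next = entry->next;   /* garbage if *entry is stale */
        struct list_head *prev = entry->prev;
        next->prev = prev;                      /* wild store -> protection fault */
        prev->next = next;
        entry->next = entry;                    /* INIT_LIST_HEAD() part */
        entry->prev = entry;
}

int main(void)
{
        struct list_head a = { &a, &a };
        sketch_list_del_init(&a);               /* fine on a sane, self-linked node */
        return 0;
}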
Michal, this might be a post-2.6.22 regression.
Could you test the latest -rc? There were a few NFS fixes in -rc5.
Created attachment 12797 [details]
workaround patch

This isn't a fix, but if it eliminates the symptoms then it would help confirm that the problem is what I think it is.
Looks to me like a symptom of buggy locking and reference counting in the nfsv4 state code. For example, we're reference counting the structure representing the delegation, but not the structure that contains the rpc_client used to make the callbacks. This doesn't appear to be a regression.
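
To make the refcounting point concrete, here is a rough sketch of the pattern being described: pin the object that owns the rpc_clnt for as long as a callback might still use it. The structure and helper names below are invented for illustration and are not the actual nfsd4 state structures:

#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/slab.h>
#include <linux/sunrpc/clnt.h>

/* Invented container: the point is only that whatever holds the rpc_clnt
 * gets its own refcount, independent of the delegation's refcount. */
struct cb_conn {
        struct kref      ref;
        struct rpc_clnt *client;
};

static void cb_conn_release(struct kref *ref)
{
        struct cb_conn *conn = container_of(ref, struct cb_conn, ref);

        rpc_shutdown_client(conn->client);
        kfree(conn);
}

/* A recall thread would take its own reference before issuing the RPC, so
 * that tearing down the client's state elsewhere cannot free the rpc_clnt
 * out from under the rpc_call_sync() in nfsd4_cb_recall(). */
static void cb_conn_use(struct cb_conn *conn)
{
        kref_get(&conn->ref);
        /* ... rpc_call_sync(conn->client, &msg, RPC_TASK_SOFT) ... */
        kref_put(&conn->ref, cb_conn_release);
}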
Created attachment 12809 [details]
probable fix

This is closer to a real fix. If you can confirm it solves the original problem, that'd be great. This fix addresses a couple of races that have existed for a while, so (assuming it's the reported problem) the bug wasn't a regression.
Thanks for the patch, Bruce. I have been running 2.6.23-rc6 for 1.5 days without problems so far, and will start testing this latest patch in the next day or so; however, it will take >~2 weeks to be confident it addresses the issue, as it's not easy to hit. I found that I needed the 'clientaddr' mount option to get delegations working correctly, so perhaps this shows there is less exposure than expected/desired?
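
For anyone else trying to reproduce this: delegations are only handed out when the server can reach the client's callback address, which is what clientaddr provides. Something along these lines (the address and export path are placeholders, not taken from this report):

  # NFSv4 mount with an explicit callback address; without a reachable
  # callback address the server will not grant delegations.
  mount -t nfs4 -o clientaddr=192.0.2.10 server:/export /mnt/export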
Any news? I've queued up this patch for 2.6.24 (actually, same change broken up into 3 separate patches)--currently in git://linux-nfs.org/~bfields/linux.git nfs-server-stable.
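
In case it helps other testers, the queued patches can be pulled from that branch with something like the following (the local directory and checkout names are arbitrary):

  git clone git://linux-nfs.org/~bfields/linux.git linux-nfs
  cd linux-nfs
  git checkout -b test origin/nfs-server-stable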
No news - I didn't reproduce the bug in 2.6.23-rc7 after ~2 weeks, and I won't have access to this system for a month for further testing.
Created attachment 13765 [details]
additional fix

I believe this patch (on top of 2.6.24-rc3, which also contains the previous patch) should fix another race that could cause the same oops.
Incidentally, I've been tracking the upstream -rc kernel versions and haven't seen any symptom of this bug with my usual multi-system workload, so it seems less acute than before.
OK, we can close this bug now. Please reopen if the problem appears again.