Bug 7916

Summary: kernel reports "kernel BUG at fs/inode.c:1117"
Product: File System
Reporter: richlv
Component: VFS
Assignee: Neil Brown (neilb)
Status: RESOLVED CODE_FIX
Severity: normal
CC: neilb
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.19.2
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: Patch which might fix the problem
             Revised patch

Description richlv 2007-02-01 02:07:21 UTC
summary is pretty uninformative, but i can't think of a better one.

Distribution: slackware 11.0
Hardware Environment: Supermicro X6DH8-G
Software Environment: 2.6.19.2 using ext3 and nfs.
Problem Description:

the following messages were printed (in between "cut here" marks).

kernel BUG at fs/inode.c:1117!
invalid opcode: 0000 [#1]
SMP
Modules linked in:
CPU:    1
EIP:    0060:[<c016e651>]    Not tainted VLI
EFLAGS: 00010246   (2.6.19.2 #5)
EIP is at iput+0x5b/0x65
eax: c04935a0   ebx: c87b20a8   ecx: 00000000   edx: 00000000
esi: 00000000   edi: f786e2c0   ebp: 00000000   esp: dbf69f20
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 25285, ti=dbf68000 task=f7b58a70 task.ti=dbf68000)
Stack: f74d6000 c03d5c62 f74d6000 00000000 c03d5ca1 f786e2c0 f74d6000 00000000
       00000042 c03d75c9 f786e2f8 f786e2c0 c03d7702 00000000 f75298f4 f79fd080
       f780e3c0 00057e40 00000000 f7b58a70 c011727a 00000000 00000000 0000003d
Call Trace:
 [<c03d5c62>] svc_sock_release+0xde/0x149
 [<c03d5ca1>] svc_sock_release+0x11d/0x149
 [<c03d75c9>] svc_recv+0x39a/0x415
 [<c03d7702>] svc_send+0x8f/0x11b
 [<c011727a>] default_wake_function+0x0/0xc
 [<c011727a>] default_wake_function+0x0/0xc
 [<c01f5439>] nfsd+0xc5/0x280
 [<c01f5374>] nfsd+0x0/0x280
 [<c010380f>] kernel_thread_helper+0x7/0x10
 =======================
Code: 00 85 c0 74 1e 8b 83 98 00 00 00 ba e5 e5 16 c0 8b 40 20 85 c0 74 08 8b 40 18 85 c0 0f 45 d0 89 d8 ff d2 5b c3 89 d8 ff d2 eb c9 <0f> 0b 5d 04 80 9d 40 c0 eb b4 83 ec 10 89 7c 24 08 89 6c 24 0c
EIP: [<c016e651>] iput+0x5b/0x65 SS:ESP 0068:dbf69f20
 <5>rpc-srv/tcp: nfsd: got error -104 when sending 4228 bytes - shutting down socket
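
The trace above can be cross-checked mechanically. The following is a minimal Python sketch, not kernel tooling. It assumes the i386 BUG() layout used by kernels of this era when built with CONFIG_DEBUG_BUGVERBOSE: a ud2 instruction (bytes 0f 0b) followed by a 16-bit line number and a 32-bit pointer to the file name. The function names decode_bug_trap and parse_symbol are hypothetical helpers introduced only for illustration.

```python
import re
import struct

# Bytes at and after the faulting instruction, copied from the "Code:" line.
# The oops marks the byte at EIP with angle brackets (<0f>).
TRAP_BYTES = "0f 0b 5d 04 80 9d 40 c0"


def decode_bug_trap(hex_bytes):
    """Decode a ud2-based BUG() record into (line number, file-name pointer).

    Assumed layout (i386, CONFIG_DEBUG_BUGVERBOSE): ud2, .word line, .long file.
    """
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    if raw[:2] != b"\x0f\x0b":
        raise ValueError("not a ud2 instruction")
    # Little-endian: 16-bit line number, then 32-bit pointer.
    line, file_ptr = struct.unpack_from("<HI", raw, 2)
    return line, file_ptr


def parse_symbol(text):
    """Split 'iput+0x5b/0x65' into (symbol, offset, function size)."""
    m = re.fullmatch(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    if m is None:
        raise ValueError("unrecognised symbol: %r" % text)
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


if __name__ == "__main__":
    line, file_ptr = decode_bug_trap(TRAP_BYTES)
    print("BUG line:", line, "file-name pointer: 0x%08x" % file_ptr)
    print(parse_symbol("iput+0x5b/0x65"))
```

Under that assumption the decoded line number is 1117, which agrees with the "kernel BUG at fs/inode.c:1117" message, and 0xc0409d80 would be the address of the file-name string.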
Comment 1 Neil Brown 2007-02-01 18:27:10 UTC
What version of nfs-utils (showmount -V)?
Were you starting or stopping things at the time, or did it just
die while in normal use?
Has this happened more than once (i.e. is it repeatable?)
Did any of the clients crash or do something unusual?

Thanks.
NeilBrown
Comment 2 richlv 2007-02-01 23:57:33 UTC
showmount complained about uppercase V, but gave the version with lowercase v :)
1.0.10

the problem happened during normal system use (though the system was under somewhat high load, mostly from nfs).
it is not happening regularly, but it seems to have happened twice, about a minute apart.

hard to tell about clients, most are not under my control.
Comment 3 Neil Brown 2007-02-04 21:55:36 UTC
Created attachment 10282 [details]
Patch which might fix the problem

I think this patch might fix the problem.
Comment 4 Neil Brown 2007-02-04 21:56:20 UTC
Are you in a position to test the patch?
Has it happened again since the two times you reported?

Thanks.
Comment 5 richlv 2007-02-05 02:42:20 UTC
patch results for 2.6.19.2 :
Hunk #1 succeeded at 499 (offset -100 lines).
Hunk #2 succeeded at 1243 (offset -52 lines).
Hunk #3 succeeded at 1664 with fuzz 1 (offset -108 lines).

patch results for 2.6.20 :
Hunk #1 succeeded at 529 (offset -70 lines).
Hunk #2 succeeded at 1247 (offset -48 lines).
Hunk #3 succeeded at 1698 with fuzz 1 (offset -74 lines).

so i applied the patch to 2.6.20 and that's what the machine is running now.
the problem has not occurred again, but there also has not been such a prolonged period of high load since then.
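
The hunk offsets above can be turned back into the line numbers the patch was written against. This is a minimal sketch, assuming that the offset reported by patch is the line where the hunk applied minus the line recorded in the hunk header; the function name expected_lines is hypothetical.

```python
import re

# Matches lines such as:
#   Hunk #3 succeeded at 1664 with fuzz 1 (offset -108 lines).
HUNK_RE = re.compile(
    r"Hunk #(\d+) succeeded at (\d+)(?: with fuzz (\d+))? "
    r"\(offset (-?\d+) lines?\)\."
)


def expected_lines(patch_output):
    """Map hunk number -> line the patch expected (applied line minus offset)."""
    result = {}
    for m in HUNK_RE.finditer(patch_output):
        hunk = int(m.group(1))
        applied = int(m.group(2))
        offset = int(m.group(4))
        result[hunk] = applied - offset
    return result


OUTPUT_2_6_19_2 = """\
Hunk #1 succeeded at 499 (offset -100 lines).
Hunk #2 succeeded at 1243 (offset -52 lines).
Hunk #3 succeeded at 1664 with fuzz 1 (offset -108 lines).
"""

OUTPUT_2_6_20 = """\
Hunk #1 succeeded at 529 (offset -70 lines).
Hunk #2 succeeded at 1247 (offset -48 lines).
Hunk #3 succeeded at 1698 with fuzz 1 (offset -74 lines).
"""

if __name__ == "__main__":
    print(expected_lines(OUTPUT_2_6_19_2))
    print(expected_lines(OUTPUT_2_6_20))
```

Both outputs resolve to the same expected lines (599, 1295 and 1772), which is what one would see if the same patch was applied to two trees that each differ from the tree it was made against. The smaller offsets for 2.6.20 suggest that tree is the closer of the two.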
Comment 6 Neil Brown 2007-02-06 15:57:23 UTC
Created attachment 10322 [details]
Revised patch

The previous patch had a bug: when you shut down the NFS server, it will not
close the sockets properly and a reboot will be required.  That probably won't
be a problem for you, as people rarely shut down the NFS server except when
they want to reboot.

The attached patch is against 2.6.20 and fixes this bug.  I will submit it for
2.6.20.1.

Thanks.
Comment 7 richlv 2007-02-06 22:47:01 UTC
indeed, nfs server shutdown is not something that occurs too often.

thanks for the fast response and fix, i suppose the report can be set to 'resolved' now :)
ah. i can do that myself. please, correct the status if i get it wrong.