Bug 7809

Summary: __lookup_hash() dereferences NULL pointer for pointer-to-function in fs/namei.c - causes oops.
Product: File System Reporter: James Vanns (james.vanns)
Component: VFSAssignee: fs_vfs
Status: REJECTED INVALID    
Severity: normal CC: cw
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.9-42.0.2.RHEL4 Subsystem:
Regression: --- Bisected commit-id:
Attachments: Patch to namei.c for testing NULL pointers.
Patch to namei.c for testing NULL pointers.

Description James Vanns 2007-01-11 04:32:24 UTC
Most recent kernel where this bug did *NOT* occur: No idea
Distribution: Redhat Enterprise Linux 4 (RHEL4)
Hardware Environment: HP ProLiant DL380 G4 (IA32)
Software Environment: Dedicated NFS file server (knfsd over 6.4T XFS file system)
Problem Description:

Kernel Oops here:

Jan 10 10:17:51 ss2a kernel: Unable to handle kernel NULL pointer dereference at
virtual address 00000000 
Jan 10 10:17:51 ss2a kernel:  printing eip: 
Jan 10 10:17:51 ss2a kernel: 00000000 
Jan 10 10:17:51 ss2a kernel: *pde = dda19067 
Jan 10 10:17:51 ss2a kernel: Oops: 0000 [#1] 
... 
Jan 10 10:17:51 ss2a kernel: CPU:    0 
Jan 10 10:17:51 ss2a kernel: EIP:    0060:[<00000000>]    Not tainted VLI 
Jan 10 10:17:51 ss2a kernel: EFLAGS: 00010286   (2.6.9-42.0.2.EL.R7smp) 
Jan 10 10:17:51 ss2a kernel: EIP is at 0x0 
Jan 10 10:17:51 ss2a kernel: eax: edfef1b4   ebx: c042f660   ecx: 00000000  
edx: d42fe4f0 
Jan 10 10:17:51 ss2a kernel: esi: d42fe4f0   edi: f1ef1ef8   ebp: 00000000  
esp: f1ef1edc 
Jan 10 10:17:51 ss2a kernel: ds: 007b   es: 007b   ss: 0068 
Jan 10 10:17:51 ss2a kernel: Process nfsd (pid: 3560, threadinfo=f1ef0000
task=f1ec8f10) 
Jan 10 10:17:51 ss2a kernel: Stack: c0160bbd edfef1b4 ffffffff f60ec07b 636378f7
f518ef58 c0160c35 636378f7 
Jan 10 10:17:51 ss2a kernel:        00000013 f60ec068 c033b240 f518ef58 f518ef58
00000013 f60eb004 c0160c4b 
Jan 10 10:17:51 ss2a kernel:        00000000 f8c69e13 f60ec068 c31e2e00 f7dc1e40
c31e2e00 f8c1838a f60eb800 
Jan 10 10:17:51 ss2a kernel: Call Trace: 
Jan 10 10:17:51 ss2a kernel:  [<c0160bbd>] __lookup_hash+0x70/0x89 
Jan 10 10:17:51 ss2a kernel:  [<c0160c35>] lookup_one_len_it+0x58/0x67 
Jan 10 10:17:51 ss2a kernel:  [<c0160c4b>] lookup_one_len+0x7/0x9 
Jan 10 10:17:51 ss2a kernel:  [<f8c69e13>] nfsd_lookup+0x304/0x390 [nfsd] 
Jan 10 10:17:51 ss2a kernel:  [<f8c1838a>] svcauth_unix_set_client+0xa7/0xb5
[sunrpc] 
Jan 10 10:17:51 ss2a kernel:  [<f8c720ad>] nfsd3_proc_lookup+0xa9/0xb3 [nfsd] 
Jan 10 10:17:51 ss2a kernel:  [<f8c740c8>] nfs3svc_decode_diropargs+0x0/0xfa [nfsd] 
Jan 10 10:17:51 ss2a kernel:  [<f8c67685>] nfsd_dispatch+0xba/0x16d [nfsd] 
Jan 10 10:17:51 ss2a kernel:  [<f8c15559>] svc_process+0x432/0x6e1 [sunrpc] 
Jan 10 10:17:51 ss2a kernel:  [<f8c6745e>] nfsd+0x1cc/0x339 [nfsd] 
Jan 10 10:17:51 ss2a kernel:  [<f8c67292>] nfsd+0x0/0x339 [nfsd] 
Jan 10 10:17:51 ss2a kernel:  [<c01041f5>] kernel_thread_helper+0x5/0xb 
Jan 10 10:17:51 ss2a kernel: Code:  Bad EIP value. 
Jan 10 10:17:51 ss2a kernel:  <0>Fatal exception: panic in 5 seconds 
... 

Calls to symbols found in System.map suggests this is the correct 'area', my
attempt at ASM disassembly 70 bytes into __lookup_hash() (0xc0160bbd) shows:

0xc0160ba2:     call   0xc0168d13 ; call - 'd_alloc()' 
0xc0160ba7:     test   %eax,%eax ; test for zero/null pointer 'if (!new)' 
0xc0160ba9:     mov    %eax,%esi ; dentry = ERR_PTR(-ENOMEM); 
0xc0160bab:     je     0xc0160bce ; jump if equal to VA 0xc0160bce - 'goto out;' 
0xc0160bad:     mov    (%esp),%eax ; parameter - 'nd,' 
0xc0160bb0:     mov    %ebp,%ecx ; parameter - 'new,' 
0xc0160bb2:     mov    %esi,%edx ; parameter - 'inode' 
0xc0160bb4:     mov    0x90(%eax),%ebx ; inode->i_op-> 
0xc0160bba:     call   *0x4(%ebx) ; FS specific lookup - lookup(); 
0xc0160bbd:     test   %eax,%eax ; test for zero/null pointer - 'if (!dentry)' 
; ... 
0xc0160bce:     pop    %esi ; begin to restore stack - '} out:' 
; ... 
0xc0160bd5:     ret ; return to caller (exit function) 'return dentry;}'

Which corresponds to this section of the __lookup_hash() function in
/usr/src/linux/fs/namei.c:

if (!dentry) { 
   struct dentry *new = d_alloc(base, name); 
   dentry = ERR_PTR(-ENOMEM); 
   if (!new) 
      goto out; 
   dentry = inode->i_op->lookup(inode, new, nd); 
   if (!dentry) 
      dentry = new; 
   else
      dput(new); 
}

Steps to reproduce:

Cannot reproduce easily in a production env. but we see this on at least one
server out of 50 approximately once a week. We use large XFS file systems
exported via knfsd and gets used heavily (large files, lots of I/O).

I believe I have diagnosed correctly (if not I'd appreciate a pointer as to
where the *real* problem is) and have provided this patch (below). But I cannot
suggested why it is happening. Someone with more experience with this FS (VFS?)
layer could perhaps suggest why this pointer is NULL and indeed the FS dependent
call to lookup() via i_op is essentially corrupt.

Indeed EIP does seem to want to execute 0 because the function pointer lookup()
dereferenced by inode->i_op is never checked for NULL - neither, in fact, are
the inode and i_op pointers.

Patch:

--- namei.c     2007-01-10 17:33:54.133034000 +0000 
+++ namei.jvanns        2007-01-11 11:39:02.201418217 +0000 
@@ -1280,7 +1280,15 @@ 
                dentry = ERR_PTR(-ENOMEM); 
                if (!new) 
                        goto out; 
-               dentry = inode->i_op->lookup(inode, new, nd); 
+ 
+               if (!inode) 
+                       printk ("FSCFC: __lookup_hash(): NULL pointer 'inode' on
line %d.\n", __LINE__); 
+               else if (!inode->i_op) 
+                       printk ("FSCFC: __lookup_hash(): NULL pointer
'inode->i_op' (Dev #: %0#8x, Inode #: %lu) on line %d.\n", 
+                               inode->i_rdev, inode->i_ino, __LINE__); 
+               else 
+                       dentry = inode->i_op->lookup(inode, new, nd); 
+                
                if (!dentry) 
                        dentry = new; 
                else

Ignore printk() statements if needed - this is for our purposes.

Regards,

Jim Vanns
Comment 1 James Vanns 2007-01-11 04:33:54 UTC
Created attachment 10056 [details]
Patch to namei.c for testing NULL pointers.
Comment 2 James Vanns 2007-01-11 04:37:58 UTC
Comment on attachment 10056 [details]
Patch to namei.c for testing NULL pointers.

PLEASE IGNORE THIS PATCH. CHOOSE THE OTHER, NEWER PATCH.
Comment 3 James Vanns 2007-01-11 04:40:11 UTC
Created attachment 10057 [details]
Patch to namei.c for testing NULL pointers.

New patch actually tests lookup() for NULL.
Comment 4 Ian Soboroff 2008-06-23 11:36:20 UTC
I have this same oops, and reported it to lkml.

Daniel Blueman (daniel dot blueman at gmail) suggested that 4KSTACKS might be triggering this.  My kernel, being the stock RHEL kernel with XFS enabled, has 4KSTACKS enabled.  I will try running without 4KSTACKS, but it's hard to trigger this oops.
Comment 5 Alan 2008-09-23 09:50:07 UTC
Sorry this should have been closed ages ago - RHEL specific kernel bugs should go to the vendor and won't be handled here as the RHEL kernel has deviated quite a bit from upstream as it is an older kernel with backports