Bug 13998

Summary: fscache+nfs BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
Product: File System Reporter: joshua (jlevine)
Component: OtherAssignee: David Howells (dhowells)
Status: RESOLVED OBSOLETE    
Severity: normal CC: akpm, alan, dhowells, esmil, jlevine, nyerup
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.38.8 Subsystem:
Regression: No Bisected commit-id:
Attachments: 2.6.38.8 kernel log with null pointer dereference

Description joshua 2009-08-16 17:30:32 UTC
Aug 16 10:09:37 dload04 kernel: [38708.623072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
Aug 16 10:09:37 dload04 kernel: [38708.623179] IP: [<ffffffffa01b12ea>] fscache_object_slow_work_execute+0x7ba/0xc30 [fscache]
Aug 16 10:09:37 dload04 kernel: [38708.623292] PGD 11d985067 PUD 1035a2067 PMD 0 
Aug 16 10:09:37 dload04 kernel: [38708.623351] Oops: 0002 [#1] SMP 
Aug 16 10:09:37 dload04 kernel: [38708.623403] last sysfs file: /sys/kernel/uevent_seqnum
Aug 16 10:09:37 dload04 kernel: [38708.623460] CPU 3 
Aug 16 10:09:37 dload04 kernel: [38708.623504] Modules linked in: cachefiles autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc iptable_filter ip_tables x_tables usbhid psmouse serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc amd64_edac_mod parport_pc parport i2c_amd8111 shpchp edac_core pcspkr i2c_amd756 amd_rng k8temp 3w_xxxx sata_mv tg3 floppy
Aug 16 10:09:37 dload04 kernel: [38708.623850] Pid: 6667, comm: kslowd Not tainted 2.6.31-rc6-iwin-linus-cachefs #1 H8DAR-T
Aug 16 10:09:37 dload04 kernel: [38708.623950] RIP: 0010:[<ffffffffa01b12ea>]  [<ffffffffa01b12ea>] fscache_object_slow_work_execute+0x7ba/0xc30 [fscache]
Aug 16 10:09:37 dload04 kernel: [38708.624074] RSP: 0018:ffff88000cc9fe40  EFLAGS: 00010246
Aug 16 10:09:37 dload04 kernel: [38708.624131] RAX: 0000000000000000 RBX: ffff880068324e38 RCX: ffffffffa023d005
Aug 16 10:09:37 dload04 kernel: [38708.624195] RDX: 0000000000000000 RSI: ffffc90000610740 RDI: ffff880068324ddc
Aug 16 10:09:37 dload04 kernel: [38708.624259] RBP: ffff88000cc9fe70 R08: 0000000000000000 R09: ffffffff81788200
Aug 16 10:09:37 dload04 kernel: [38708.624323] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff818d4c20
Aug 16 10:09:37 dload04 kernel: [38708.624387] R13: ffff880068324ddc R14: ffff880068324dc0 R15: 0000000000093620
Aug 16 10:09:37 dload04 kernel: [38708.624452] FS:  0000000046c84950(0000) GS:ffffc90000600000(0000) knlGS:0000000000000000
Aug 16 10:09:37 dload04 kernel: [38708.624552] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Aug 16 10:09:37 dload04 kernel: [38708.624612] CR2: 0000000000000040 CR3: 00000001180fb000 CR4: 00000000000006a0
Aug 16 10:09:37 dload04 kernel: [38708.624677] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 16 10:09:37 dload04 kernel: [38708.624743] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 16 10:09:37 dload04 kernel: [38708.624808] Process kslowd (pid: 6667, threadinfo ffff88000cc9e000, task ffff880066b696b0)
Aug 16 10:09:37 dload04 kernel: [38708.624907] Stack:
Aug 16 10:09:37 dload04 kernel: [38708.624949]  000000010092846b ffff880068324e38 ffffffff818d4c20 0000000000000001
Aug 16 10:09:37 dload04 kernel: [38708.625018] <0> ffff88000cc9fe80 0000000000093620 ffff88000cc9fee0 ffffffff810bf0cf
Aug 16 10:09:37 dload04 kernel: [38708.625123] <0> 0000000000000001 ffff880066b696b0 ffffffff81069450 ffff88000cc9fe98
Aug 16 10:09:37 dload04 kernel: [38708.625264] Call Trace:
Aug 16 10:09:37 dload04 kernel: [38708.625318]  [<ffffffff810bf0cf>] slow_work_thread+0x2df/0x4b0
Aug 16 10:09:37 dload04 kernel: [38708.625382]  [<ffffffff81069450>] ? autoremove_wake_function+0x0/0x40
Aug 16 10:09:37 dload04 kernel: [38708.625447]  [<ffffffff810bedf0>] ? slow_work_thread+0x0/0x4b0
Aug 16 10:09:37 dload04 kernel: [38708.625507]  [<ffffffff81069026>] kthread+0xa6/0xb0
Aug 16 10:09:37 dload04 kernel: [38708.625566]  [<ffffffff8179f140>] ? early_idt_handler+0x0/0x71
Aug 16 10:09:37 dload04 kernel: [38708.625627]  [<ffffffff8100cf4a>] child_rip+0xa/0x20
Aug 16 10:09:37 dload04 kernel: [38708.625683]  [<ffffffff8179f140>] ? early_idt_handler+0x0/0x71
Aug 16 10:09:37 dload04 kernel: [38708.625743]  [<ffffffff81068f80>] ? kthread+0x0/0xb0
Aug 16 10:09:37 dload04 kernel: [38708.625799]  [<ffffffff8100cf40>] ? child_rip+0x0/0x20
Aug 16 10:09:37 dload04 kernel: [38708.625855] Code: ff ff 49 8b 46 60 49 83 66 28 fd 4d 8d 6e 1c 4c 89 f7 48 8b 00 ff 50 18 4c 89 ef e8 e1 90 32 e1 41 c7 06 06 00 00 00 49 8b 46 68 <f0> 0f ba 70 40 01 19 d2 85 d2 74 12 49 8b 7e 68 be 01 00 00 00 
Aug 16 10:09:37 dload04 kernel: [38708.626135] RIP  [<ffffffffa01b12ea>] fscache_object_slow_work_execute+0x7ba/0xc30 [fscache]
Aug 16 10:09:37 dload04 kernel: [38708.626135]  RSP <ffff88000cc9fe40>
Aug 16 10:09:37 dload04 kernel: [38708.626135] CR2: 0000000000000040
Aug 16 10:09:37 dload04 kernel: [38708.626854] ---[ end trace 1460164d9f48e424 ]---
Comment 1 joshua 2009-08-16 17:38:49 UTC
Aug 16 10:14:13 dload04 kernel: [38984.345500] BUG: soft lockup - CPU#3 stuck for 61s! [cachefilesd:6652]
Aug 16 10:14:13 dload04 kernel: [38984.345506] Modules linked in: cachefiles autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc iptable_filter ip_tables x_tables usbhid psmouse serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc amd64_edac_mod parport_pc parport i2c_amd8111 shpchp edac_core pcspkr i2c_amd756 amd_rng k8temp 3w_xxxx sata_mv tg3 floppy
Aug 16 10:14:13 dload04 kernel: [38984.345506] CPU 3:
Aug 16 10:14:13 dload04 kernel: [38984.345506] Modules linked in: cachefiles autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc iptable_filter ip_tables x_tables usbhid psmouse serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc amd64_edac_mod parport_pc parport i2c_amd8111 shpchp edac_core pcspkr i2c_amd756 amd_rng k8temp 3w_xxxx sata_mv tg3 floppy
Aug 16 10:14:13 dload04 kernel: [38984.349016] Pid: 6652, comm: cachefilesd Tainted: G      D    2.6.31-rc6-iwin-linus-cachefs #1 H8DAR-T
Aug 16 10:14:13 dload04 kernel: [38984.349016] RIP: 0010:[<ffffffff8102a156>]  [<ffffffff8102a156>] __ticket_spin_lock+0x16/0x20
Aug 16 10:14:13 dload04 kernel: [38984.349016] RSP: 0018:ffff8800785ddcd8  EFLAGS: 00000297
Aug 16 10:14:13 dload04 kernel: [38984.349016] RAX: 0000000000000605 RBX: ffff8800785ddcd8 RCX: 00000000c0000100
Aug 16 10:14:13 dload04 kernel: [38984.349016] RDX: ffff8800785ddd48 RSI: 0000000000000003 RDI: ffff880068324ddc
Aug 16 10:14:13 dload04 kernel: [38984.349016] RBP: ffffffff8100ca2e R08: ffff8800785dc000 R09: 00000000ffffffff
Aug 16 10:14:13 dload04 kernel: [38984.349016] R10: ffff8801385c4448 R11: 0000000000000000 R12: 0000000000013e40
Aug 16 10:14:13 dload04 kernel: [38984.349016] R13: 0000000000013e40 R14: 0000000000013e40 R15: 0000000000013e40
Aug 16 10:14:13 dload04 kernel: [38984.349016] FS:  00007f6a7da576e0(0000) GS:ffffc90000600000(0000) knlGS:0000000000000000
Aug 16 10:14:13 dload04 kernel: [38984.349016] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 16 10:14:13 dload04 kernel: [38984.349016] CR2: 00007f20e8604cf0 CR3: 00000000bde80000 CR4: 00000000000006a0
Aug 16 10:14:13 dload04 kernel: [38984.349016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 16 10:14:13 dload04 kernel: [38984.349016] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 16 10:14:13 dload04 kernel: [38984.349016] Call Trace:
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff8104a52c>] ? __cond_resched+0x1c/0x50
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff814da3c9>] ? _spin_lock+0x9/0x10
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffffa01ae56c>] ? fscache_withdraw_cache+0x10c/0x3d0 [fscache]
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff810cff72>] ? release_pages+0x1d2/0x210
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff814da3c9>] ? _spin_lock+0x9/0x10
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffffa023b950>] ? cachefiles_daemon_unbind+0xc0/0xf0 [cachefiles]
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffffa023ba27>] ? cachefiles_daemon_release+0x37/0x110 [cachefiles]
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff811071de>] ? __fput+0xee/0x210
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff81107587>] ? fput+0x17/0x20
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff81103f28>] ? filp_close+0x58/0x90
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff81051dac>] ? put_files_struct+0xcc/0xe0
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff81051e10>] ? exit_files+0x50/0x60
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff810529df>] ? do_exit+0x17f/0x800
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff814dcd1e>] ? do_page_fault+0x1ae/0x350
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff8105309e>] ? do_group_exit+0x3e/0xb0
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff81053122>] ? sys_exit_group+0x12/0x20
Aug 16 10:14:13 dload04 kernel: [38984.349016]  [<ffffffff8100be82>] ? system_call_fastpath+0x16/0x1b
Aug 16 10:14:16 dload04 varnishd[5325]: Child (5326) not responding to ping, killing it.
Comment 2 joshua 2009-08-17 18:10:48 UTC
The patch referenced here:

http://bugzilla.kernel.org/show_bug.cgi?id=13469

Was applied prior to the build.

Thank you,
Joshua
Comment 3 Andrew Morton 2009-08-19 18:35:42 UTC
Assigned to David.
Comment 4 Jesper Nyerup 2011-06-22 14:25:55 UTC
This appears to still be an issue in kernel 2.6.38.8.

I can reproduce this with an NFS backend, using cachefilesd to register a cache on both XFS and EXT3 file systems.

See attached kernel log for details, and feel free to request additional details.
Comment 5 Jesper Nyerup 2011-06-22 14:27:43 UTC
Created attachment 63212 [details]
2.6.38.8 kernel log with null pointer dereference
Comment 6 Alan 2013-12-10 16:47:46 UTC
If this is still seen with modern kernels please re-open