Bug 6854 - NFS-stress triggers Kernel BUG at mm/truncate.c:76
Summary: NFS-stress triggers Kernel BUG at mm/truncate.c:76
Status: CLOSED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: NFS (show other bugs)
Hardware: i386 Linux
: P2 normal
Assignee: Trond Myklebust
URL:
Keywords:
: 6852 6853 (view as bug list)
Depends on:
Blocks:
 
Reported: 2006-07-17 11:24 UTC by Johan van Baarlen
Modified: 2006-08-24 02:18 UTC (History)
0 users

See Also:
Kernel Version: 2.6.17.6, x86_64
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Patch for NFS-instability on 2.6.15.x (2.22 KB, patch)
2006-08-13 03:04 UTC, Johan van Baarlen
Details | Diff
Fix a potential deadlock in nfs_release_page (997 bytes, patch)
2006-08-13 10:43 UTC, Trond Myklebust
Details | Diff

Description Johan van Baarlen 2006-07-17 11:24:06 UTC
Running a machine with 8 dualcore Opterons 875 (16cores total), which makes it 
far more sensitive to locking-, timing- and racing bugs than the average linux-
machine.
Running a multi-threaded app (16 threads) that reads small portions of data 
from a large file (stored on remote server, accessed over NFS-udp), and write 
back to another file - same NFS-share. System crashes really hard (beyond 
softwatchdog- and oops-recovery) anywhere between 1 and 15 minutes.

On 2.6.15.x, problem appeared first, was fixed by kernelpatches to file.c and 
pagelist.c.

Went to 2.6.17.6 for the much improved multi-dualcore support, same problem 
appeared - unfortunately the original patch is integrated already, so it must 
be something else this time.

Kernel does not OOPS, but it locks up - on all CPUs, according to the logs.

Kernel BUG at mm/truncate.c:76
invalid opcode: 0000 [1] SMP
CPU 14
Modules linked in: nfs netconsole sch_sfq cls_u32 sch_tbf sch_prio 
iptable_filter ip_tables x_tables nfsd exportfs lockd 8250 seri
al_core ipv6 parport_pc lp parport autofs4 sunrpc w83627hf_wdt binfmt_misc xfs 
dm_mod video button battery ac ohci1394 ieee1394 oh
ci_hcd ehci_hcd i2c_nforce2 i2c_core tg3 floppy ide_cd cdrom
Pid: 8086, comm: dipfilter.x Not tainted 2.6.17.6 #1
RIP: 0010:[<ffffffff8025eb96>] <ffffffff8025eb96>{invalidate_complete_page+86}
RSP: 0018:ffff810393c09ca8  EFLAGS: 00010002
RAX: 0000000000000825 RBX: ffff8105ff8d86f0 RCX: ffff8103f271bb08
RDX: 0000000000000000 RSI: ffff810393c09c48 RDI: ffff8103f271bd88
RBP: ffff8103f271bd70 R08: 0000000000000001 R09: 000000000000002c
R10: 000000000000002c R11: ffff8105f8a26240 R12: ffff810393c09e08
R13: 0000000000000000 R14: ffff8103f271bd70 R15: 00000000000d1a2c
FS:  00002accace13dc0(0000) GS:ffff810e001bd9c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b050fbff000 CR3: 0000000ff7c21000 CR4: 00000000000006e0
Process dipfilter.x (pid: 8086, threadinfo ffff810393c08000, task 
ffff8103f8b70280)
Stack: 00000000000d1a2b ffff8105ff8d86f0 0000000000000000 ffffffff8025f0c0
       0000000000000000 00000000f271bc28 0000000000000000 ffffffffffffffff
       000000000000000e 0000000000000000
Call Trace: <ffffffff8025f0c0>{invalidate_inode_pages2_range+320}
       <ffffffff882042d9>{:nfs:nfs_revalidate_mapping+105}
       <ffffffff88202e29>{:nfs:nfs_file_write+169} <ffffffff8027d710>
{do_sync_write+208}
       <ffffffff802424e0>{autoremove_wake_function+0} <ffffffff80227a00>
{default_wake_function+0}
       <ffffffff8027d80f>{vfs_write+191} <ffffffff8027d9a3>{sys_write+83}
       <ffffffff80209c06>{system_call+126}
Code: 0f 0b 68 dc f2 46 80 c2 4c 00 48 89 df e8 18 74 ff ff f0 81
RIP <ffffffff8025eb96>{invalidate_complete_page+86} RSP <ffff810393c09ca8>
 NMI Watchdog detected LOCKUP on CPU 10
CPU 10
Modules linked in: nfs netconsole sch_sfq cls_u32 sch_tbf sch_prio 
iptable_filter ip_tables x_tables nfsd exportfs lockd 8250 seri
al_core ipv6 parport_pc lp parport autofs4 sunrpc w83627hf_wdt binfmt_misc xfs 
dm_mod video button battery ac ohci1394 ieee1394 oh
ci_hcd ehci_hcd i2c_nforce2 i2c_core tg3 floppy ide_cd cdrom
Pid: 8106, comm: dipfilter.x Not tainted 2.6.17.6 #1
RIP: 0010:[<ffffffff80309579>] <ffffffff80309579>{__read_lock_failed+5}
RSP: 0018:ffff8103f8647c08  EFLAGS: 00000097
RAX: ffff8103f271bd88 RBX: 00000000000d2431 RCX: ffff8103f8647d58
RDX: 0000000000000000 RSI: 00000000000d2431 RDI: ffff8103f271bd88
RBP: ffff8103f271bd70 R08: ffff8103f8646000 R09: 00000000ffffffff
R10: 00000000d2864068 R11: 0000000000000001 R12: ffff8103f271bd70
R13: 0000000000001000 R14: 0000000000001000 R15: ffff8103f8e83be8
FS:  00002b15cc9e2dc0(0000) GS:ffff810a0016ad40(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff0b64dbd8 CR3: 0000000393c21000 CR4: 00000000000006e0
Process dipfilter.x (pid: 8106, threadinfo ffff8103f8646000, task 
ffff8103f8645820)
Stack: ffffffff8044802a ffffffff802569b5 ffff8103f271bd70 ffff810bff67f678
       00000000000d2431 ffffffff80256f7d ffffffff804479ec 0000000000000000
       0000000000000000 00000000d2438000
Call Trace: <ffffffff8044802a>{.text.lock.spinlock+83}
       <ffffffff802569b5>{find_get_page+21} <ffffffff80256f7d>
{do_generic_mapping_read+397}
       <ffffffff804479ec>{__up_wakeup+53} <ffffffff80257360>{file_read_actor+0}
       <ffffffff802591e9>{__generic_file_aio_read+425} <ffffffff802593f4>
{generic_file_aio_read+52}
       <ffffffff88202aba>{:nfs:nfs_file_read+170} <ffffffff8027d490>
{do_sync_read+208}
       <ffffffff802424e0>{autoremove_wake_function+0} <ffffffff80227a00>
{default_wake_function+0}
       <ffffffff8027d58c>{vfs_read+188} <ffffffff8027d913>{sys_read+83}
       <ffffffff80209c06>{system_call+126}

Code: 83 38 01 78 f9 f0 ff 08 0f 88 ed ff ff ff c3 90 90 90 90 90
console shuts up ...
 NMI Watchdog detected LOCKUP on CPU 0
<and so on for all CPUs>
Comment 1 Trond Myklebust 2006-07-17 11:43:10 UTC
Have you tried 2.6.18-rc1? That contains a patch that ensures pending writes
that would be affected by a truncate() call are cancelled.
Comment 2 Adrian Bunk 2006-07-17 12:04:34 UTC
*** Bug 6852 has been marked as a duplicate of this bug. ***
Comment 3 Adrian Bunk 2006-07-17 12:04:45 UTC
*** Bug 6853 has been marked as a duplicate of this bug. ***
Comment 4 Johan van Baarlen 2006-07-18 02:12:16 UTC
Just tried 2.6.18-rc2, same problem, although the bug surfaces at a different
location. Plus my consoles gets swamped with messages about reducing readahead,
and there is a dead loop problem on eth0 (broadcom tg3, can't find it anywhere
in the logs but it was visible on the console), but those don't seem to affect
normal operation.


Kernel BUG at fs/nfs/pagelist.c:87
invalid opcode: 0000 [1] SMP
CPU 6
Modules linked in: nfs netconsole sch_sfq cls_u32 sch_tbf sch_prio
iptable_filter ip_tables x_tables nfsd exportfs lockd 8250 serial_core ipv6
parport_pc lp parport autofs4 sunrpc w83627hf_wdt binfmt_misc xfs dm_mod video
button battery ac ohci1394 ieee1394 ohci_hcd ehci_hcd i2c_nforce2 i2c_core tg3
floppy ide_cd [<ffffffff8820e6f8>] :nfs:nfs_readpage+0x158/0x4a0
 [<ffffffff8025e09f>] do_generic_mapping_read+0x26f/0x580
 [<ffffffff8025e3b0>] file_read_actor+0x0/0x110
 [<ffffffff80260299>] __generic_file_aio_read+0x1a9/0x200
 [<ffffffff802604a4>] generic_file_aio_read+0x34/0x40
 [<ffffffff88206990>] :nfs:nfs_file_llseek+0x0/0x70
Dead loop on netdevice eth0, fix it urgently!
 [<ffffffff8028548c>] vfs_read+0xbc/0x170
 [<ffffffff80285813>] sys_read+0x53/0x90
 [<ffffffff80209c26>] system_call+0x7e/0x83


Code: 0f 0b 68 10 68 21 88 c2 57 
Comment 5 Trond Myklebust 2006-07-18 06:32:26 UTC
OK, that 'debugging code' is definitely a bug. Could you remove the line
    BUG_ON(PagePrivate(page));
on line 87 of fs/nfs/pagelist.c and then try again?
Comment 6 Johan van Baarlen 2006-07-21 00:41:02 UTC
Removed the line, no change - crashed hard, this time in truncate.c. Again on a
PagePrivate(page) line, but you probably saw that already. 
Every crash is related to NFS-writes, I've tried the same program reading NFS
but writing local disk - disk was unhappy with 16 threads hammering it, but
kernel did not crash.

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/truncate.c:76
invalid opcode: 0000 [1] SMP
CPU 1
Modules linked in: nfs netconsole sch_sfq cls_u32 sch_tbf sch_prio
iptable_filter ip_tables x_tables nfsd exportfs lockd 8250 serial_core ipv6
parport_pc lp parport autofs4 sunrpc w83627hf_wdt binfmt_misc xfs dm_mod video
button battery ac ohci1394 ieee1394 ohci_hcd ehci_hcd i2c_nforce2 i2c_core tg3
floppy ide_cd cdromPid: 6542, comm: dipfilter.x Not tainted 2.6.18-rc2 #3
RIP: 0010:[<ffffffff802653a6>]  [<ffffffff802653a6>]
invalidate_complete_page+0x56/0xa0
RSP: 0018:ffff8101f7d81ca8  EFLAGS: 00010002
RAX: 0000000000000825 RBX: ffff8109ffcff998 RCX: ffff8103f8574b08
RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffff8103f8574d88
RBP: ffff8103f8574d70 R08: 0000000000040000 R09: 0000000000040000
R10: ffff810bf4552930 R11: 0000000000000003 R12: 0000000000000000
R13: 0000000000000002 R14: ffff8103f8574d70 R15: 000000000001cfac
FS:  00002ad29e1eedc0(0000) GS:ffff810400091340(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b1bc142e000 CR3: 00000005f77f1000 CR4: 00000000000006e0
Process dipfilter.x (pid: 6542, threadinfo ffff8101f7d80000, task ffff8101f22d9180)
Stack:  000000000001cfab ffff8109ffcff998 0000000000000000 ffffffff802658c0
 ffffffffffffffff 000000008820fb50 0000000000000000 ffffffffffffffff
 0000000000000009 0000000000000000 ffff810dffd11ba8 ffff810dffcd8b40
Call Trace:
 [<ffffffff802658c0>] invalidate_inode_pages2_range+0x140/0x210
 [<ffffffff882085f9>] :nfs:nfs_revalidate_mapping+0xb9/0x150
 [<ffffffff88206b61>] :nfs:nfs_file_read+0x71/0xe0
 [<ffffffff80285390>] do_sync_read+0xd0/0x110
 [<ffffffff80244450>] autoremove_wake_function+0x0/0x30
 [<ffffffff8045faef>] thread_return+0x0/0xf1
 [<ffffffff8028548c>] vfs_read+0xbc/0x170
 [<ffffffff80285813>] sys_read+0x53/0x90
 [<ffffffff80209c26>] system_call+0x7e/0x83


Code: 0f 0b 68 e6 e1 49 80 c2 4c 00 48 89 df e8 28 7c ff ff f0 81
RIP  [<ffffffff802653a6>] invalidate_complete_page+0x56/0xa0
 RSP <ffff8101f7d81ca8>
 NMI Watchdog detected LOCKUP on CPU 0
CPU 0
Modules linked in: nfs netconsole sch_sfq cls_u32 sch_tbf sch_prio
iptable_filter ip_tables x_tables nfsd exportfs lockd 8250 serial_core ipv6
parport_pc lp parport autofs4 sunrpc w83627hf_wdt binfmt_misc xfs dm_mod video
button battery ac ohci1394 ieee1394 ohci_hcd ehci_hcd i2c_nforce2 i2c_core tg3
floppy ide_cd cdromPid: 5496, comm: rpciod/0 Not tainted 2.6.18-rc2 #3
RIP: 0010:[<ffffffff8031403d>]  [<ffffffff8031403d>] __write_lock_failed+0x9/0x20
RSP: 0018:ffff8103f7cb9d78  EFLAGS: 00000087
RAX: ffff8103f8574d88 RBX: ffff8109ffd3db60 RCX: 0000000000000282
RDX: ffff8109f7d74e40 RSI: 0000000000000286 RDI: ffff8103f8574d88
RBP: ffff8109ffd3db60 R08: ffff810400396440 R09: 0000000000000000
R10: 0000000000000000 R11: ffffffff80219210 R12: ffff8103f8574d70
R13: ffff8103f8574d88 R14: ffff810bf7e73d58 R15: ffffffff88126b00
FS:  00002b563f72a7e0(0000) GS:ffffffff805b1000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000003d2f530eb0 CR3: 0000000bf4c31000 CR4: 00000000000006e0
Process rpciod/0 (pid: 5496, threadinfo ffff8103f7cb8000, task
ffff8101f8f00080)Stack:  ffffffff80461dad ffffffff80263b5b ffff8109ffd3db60
ffff8109ffd3db60
 ffff810bf7e73b00 ffff810bf7e73b08 ffff810bf7e73d58 ffffffff8025d8c0
 ffff8109f84c0740 ffffffff88210c6b ffff810bf7e73b08 0000000000000000
Call Trace:
 [<ffffffff80461dad>] .text.lock.spinlock+0x67/0x8a
 [<ffffffff80263b5b>] test_clear_page_writeback+0x4b/0xd0
 [<ffffffff8025d8c0>] end_page_writeback+0x20/0x50
 [<ffffffff88210c6b>] :nfs:nfs_writeback_done_full+0xfb/0x180
 [<ffffffff881263d5>] :sunrpc:rpc_exit_task+0x25/0x80
 [<ffffffff881269b3>] :sunrpc:__rpc_execute+0xa3/0x1f0
 [<ffffffff802406d0>] run_workqueue+0xb0/0x100
 [<ffffffff80243e60>] keventd_create_kthread+0x0/0x80
 [<ffffffff8024086a>] worker_thread+0x14a/0x190
 [<ffffffff80228f50>] default_wake_function+0x0/0x10
 [<ffffffff80228f50>] default_wake_function+0x0/0x10
 [<ffffffff80240720>] worker_thread+0x0/0x190
 [<ffffffff80243e19>] kthread+0xd9/0x120
 [<ffffffff8020aaae>] child_rip+0x8/0x12
 [<ffffffff80243e60>] keventd_create_kthread+0x0/0x80
 [<ffffffff80243d40>] kthread+0x0/0x120
 [<ffffffff8020aaa6>] child_rip+0x0/0x12


Code: 81 38 00 00 00 01 75 f6 f0 81 28 00 00 00 01 0f 85 e2 ff ff
console shuts up ...
<..>
and so on for every CPU in the box.
Comment 7 Johan van Baarlen 2006-08-13 03:04:00 UTC
Created attachment 8769 [details]
Patch for NFS-instability on 2.6.15.x
Comment 8 Johan van Baarlen 2006-08-13 03:04:35 UTC
And it keeps happening on -rc3 and -rc4... I've attached the patch that made 
the 2.6.15-series work alright, but I'm not sure whether it can be applied to 
later kernels - a lot of it seems already in-place.
Comment 9 Trond Myklebust 2006-08-13 10:43:06 UTC
Created attachment 8773 [details]
Fix a potential deadlock in nfs_release_page

From: Nikita Danilov <nikita@clusterfs.com>

nfs_wb_page() waits on request completion and, as a result, is not safe to be
called from nfs_release_page() invoked by VM scanner as part of GFP_NOFS
allocation. Fix possible deadlock by analyzing gfp mask and refusing to release
page if __GFP_FS is not set.

Signed-off-by: Nikita Danilov <danilov@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Comment 10 Johan van Baarlen 2006-08-24 02:18:06 UTC
System has been stable for 9 days now, I think the bug can be closed!

Note You need to log in before you can comment on or make changes to this bug.