Bug 15927

Summary: kernel BUG at fs/inode.c:313!
Product: File System Reporter: Tomas Zvala (fox)
Component: VFSAssignee: Miklos Szeredi (miklos)
Status: CLOSED OBSOLETE    
Severity: normal CC: akpm, alan, bblough, fox
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.33 Subsystem:
Regression: No Bisected commit-id:
Attachments: test patch
test patch2
Another test patch
Another test patch, updated

Description Tomas Zvala 2010-05-07 08:46:36 UTC
4 SPMC SBI7126T-S6 blades with 12G ram and 2 Xeons E5520 running glusterfs 3.0.4 through fuse 2.8.4 on kernel 2.6.33. Only one of those 4 boxes randomly triggers this bug and the system becomes unusable (some file handles on glusterfs hang) and it freezes on reboot.

Apr 22 17:22:45 imsblade3 [2430279.243581] ------------[ cut here ]------------
Apr 22 17:22:45 imsblade3 [2430279.244002] kernel BUG at fs/inode.c:313!
Apr 22 17:22:45 imsblade3 [2430279.244002] invalid opcode: 0000 [#1] SMP
Apr 22 17:22:45 imsblade3 [2430279.244002] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit
Apr 22 17:22:45 imsblade3 [2430279.244002] CPU 0
Apr 22 17:22:45 imsblade3 [2430279.244002] Pid: 32604, comm: dmon3.pl Not tainted 2.6.33imsblade #2 B7DW3/B7DW3
Apr 22 17:22:45 imsblade3 [2430279.244002] RIP: 0010:[<ffffffff810c47e8>]  [<ffffffff810c47e8>] clear_inode+0x108/0x110
Apr 22 17:22:45 imsblade3 [2430279.244002] RSP: 0018:ffff8801b4add728  EFLAGS: 00010202
Apr 22 17:22:45 imsblade3 [2430279.244002] RAX: 0000000000000000 RBX: ffff8800a7454dc0 RCX: ffff880002188f38
Apr 22 17:22:45 imsblade3 [2430279.244002] RDX: 0000000000000000 RSI: ffff8801b4add688 RDI: ffff8800a7454dc0
Apr 22 17:22:45 imsblade3 [2430279.244002] RBP: 0000000000000000 R08: ffff8801b4add5a8 R09: 0000000000000000
Apr 22 17:22:45 imsblade3 [2430279.244002] R10: 0000000000000040 R11: 0000000000000040 R12: ffff88022e54a000
Apr 22 17:22:45 imsblade3 [2430279.244002] R13: 0000000000000000 R14: ffff88022e54a0e8 R15: 0000000000000008
Apr 22 17:22:45 imsblade3 [2430279.244002] FS:  00007f26afbd16f0(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
Apr 22 17:22:45 imsblade3 [2430279.244002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 22 17:22:45 imsblade3 [2430279.244002] CR2: 0000000004375000 CR3: 0000000191152000 CR4: 00000000000006f0
Apr 22 17:22:45 imsblade3 [2430279.244002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 22 17:22:45 imsblade3 [2430279.244002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 22 17:22:45 imsblade3 [2430279.244002] Process dmon3.pl (pid: 32604, threadinfo ffff8801b4adc000, task ffff88022e525e00)
Apr 22 17:22:45 imsblade3 [2430279.244002] Stack:
Apr 22 17:22:45 imsblade3 [2430279.244002]  ffff8800a7454dc0 ffffffff810c5056 ffff880012d24a88 ffff880012d24a80
Apr 22 17:22:45 imsblade3 [2430279.244002] <0> ffff880012d24a88 ffffffff810c0db7 0000003f0000003f ffff880012d24a80
Apr 22 17:22:45 imsblade3 [2430279.244002] <0> ffff88022e54a000 ffffffff810c1077 ffff8801b4add7a8 ffff8801b4add798
Apr 22 17:22:45 imsblade3 [2430279.244002] Call Trace:
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810c5056>] ? generic_delete_inode+0x166/0x1a0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810c0db7>] ? d_kill+0x47/0x80
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff81084dcb>] ? try_to_free_pages+0x23b/0x380
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff8107e2b2>] ? __alloc_pages_nodemask+0x3b2/0x670
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff815e5e50>] ? schedule+0x310/0x930
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810a7d16>] ? __slab_alloc+0x1d6/0x5e0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff811eebd2>] ? fuse_alloc_inode+0x12/0xd0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff815e5e50>] ? schedule+0x310/0x930
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff811ee3b0>] ? fuse_inode_eq+0x0/0x20
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff811ee3d0>] ? fuse_inode_set+0x0/0x10
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810a831f>] ? kmem_cache_alloc+0x9f/0xb0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff811eebd2>] ? fuse_alloc_inode+0x12/0xd0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810c52bf>] ? alloc_inode+0x1f/0xa0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810c5558>] ? iget5_locked+0x98/0x1c0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff811ef555>] ? fuse_iget+0x85/0x1e0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff811e9532>] ? fuse_lookup_name+0x1b2/0x1e0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff811e9bf9>] ? fuse_lookup+0x69/0x200
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810b833a>] ? do_lookup+0xfa/0x2a0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810ba53e>] ? link_path_walk+0x54e/0xe60
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810bb00e>] ? path_walk+0x6e/0xf0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810bb1ab>] ? do_path_lookup+0x5b/0x60
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810bbe87>] ? user_path_at+0x97/0xd0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810a7eed>] ? __slab_alloc+0x3ad/0x5e0
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810b2ed4>] ? cp_new_stat+0xe4/0x100
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810b2d67>] ? vfs_fstatat+0x37/0x80
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff810b2faf>] ? sys_newstat+0x1f/0x50
Apr 22 17:22:45 imsblade3 [2430279.244002]  [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b
Apr 22 17:22:45 imsblade3 [2430279.244002] Code: c7 c2 a0 3a 0c 81 be 07 00 00 00 e8 43 22 52 00 e9 44 ff ff ff 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 0f 1f 40 00 53 31 c0 48 89 fb b9 47 00 00 00 48 c7
Apr 22 17:22:45 imsblade3 [2430279.244002] RIP  [<ffffffff810c47e8>] clear_inode+0x108/0x110
Apr 22 17:22:45 imsblade3 [2430279.244002]  RSP <ffff8801b4add728>
Apr 22 17:22:45 imsblade3 [2430279.616368] ---[ end trace c1617eab624e0059 ]---



May  3 04:56:07 imsblade3 [585877.972567] kernel BUG at fs/inode.c:313!
May  3 04:56:07 imsblade3 [585877.972567] invalid opcode: 0000 [#1] SMP
May  3 04:56:07 imsblade3 [585877.972567] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit
May  3 04:56:07 imsblade3 [585877.972567] CPU 0
May  3 04:56:07 imsblade3 [585877.972567] Pid: 28974, comm: dmon3.pl Not tainted 2.6.33imsblade #2 B7DW3/B7DW3
May  3 04:56:07 imsblade3 [585877.972567] RIP: 0010:[<ffffffff810c47e8>]  [<ffffffff810c47e8>] clear_inode+0x108/0x110
May  3 04:56:07 imsblade3 [585877.972567] RSP: 0018:ffff8801dfb41728  EFLAGS: 00010202
May  3 04:56:07 imsblade3 [585877.972567] RAX: 0000000000000000 RBX: ffff880059454dc0 RCX: ffff880059454ee0
May  3 04:56:07 imsblade3 [585877.972567] RDX: 0000000000000000 RSI: ffff8801dfb41688 RDI: ffff880059454dc0
May  3 04:56:07 imsblade3 [585877.972567] RBP: 0000000000000000 R08: ffff8801dfb415a8 R09: 0000000000000001
May  3 04:56:07 imsblade3 [585877.972567] R10: 0000000000000040 R11: 0000000000000040 R12: ffff88022a13c400
May  3 04:56:07 imsblade3 [585877.972567] R13: 0000000000000000 R14: ffff88022a13c4e8 R15: 0000000000000008
May  3 04:56:07 imsblade3 [585877.972567] FS:  00007f0a6429c6f0(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
May  3 04:56:07 imsblade3 [585877.972567] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  3 04:56:07 imsblade3 [585877.972567] CR2: 0000000004c4e000 CR3: 00000000599f7000 CR4: 00000000000006f0
May  3 04:56:07 imsblade3 [585877.972567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May  3 04:56:07 imsblade3 [585877.972567] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May  3 04:56:07 imsblade3 [585877.972567] Process dmon3.pl (pid: 28974, threadinfo ffff8801dfb40000, task ffff880051dd8bc0)
May  3 04:56:07 imsblade3 [585877.972567] Stack:
May  3 04:56:07 imsblade3 [585877.972567]  ffff880059454dc0 ffffffff810c5056 ffff8800726906c8 ffff8800726906c0
May  3 04:56:07 imsblade3 [585877.972567] <0> ffff8800726906c8 ffffffff810c0db7 0000009100000091 ffff8800726906c0
May  3 04:56:07 imsblade3 [585877.972567] <0> ffff88022a13c400 ffffffff810c1077 ffff8801dfb417a8 ffff8801dfb41798
May  3 04:56:07 imsblade3 [585877.972567] Call Trace:
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810c5056>] ? generic_delete_inode+0x166/0x1a0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810c0db7>] ? d_kill+0x47/0x80
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff81084dcb>] ? try_to_free_pages+0x23b/0x380
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff8107e2b2>] ? __alloc_pages_nodemask+0x3b2/0x670
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff815e5e50>] ? schedule+0x310/0x930
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810a7d16>] ? __slab_alloc+0x1d6/0x5e0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff811eebd2>] ? fuse_alloc_inode+0x12/0xd0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff811ee3b0>] ? fuse_inode_eq+0x0/0x20
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff811ee3d0>] ? fuse_inode_set+0x0/0x10
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810a831f>] ? kmem_cache_alloc+0x9f/0xb0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff811eebd2>] ? fuse_alloc_inode+0x12/0xd0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810c52bf>] ? alloc_inode+0x1f/0xa0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810c5558>] ? iget5_locked+0x98/0x1c0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff811ef555>] ? fuse_iget+0x85/0x1e0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff811e9532>] ? fuse_lookup_name+0x1b2/0x1e0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff811e9bf9>] ? fuse_lookup+0x69/0x200
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810b833a>] ? do_lookup+0xfa/0x2a0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810ba53e>] ? link_path_walk+0x54e/0xe60
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810bb00e>] ? path_walk+0x6e/0xf0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810bb1ab>] ? do_path_lookup+0x5b/0x60
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810bbe87>] ? user_path_at+0x97/0xd0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff81098668>] ? page_add_new_anon_rmap+0x58/0x90
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff8108f3dc>] ? handle_mm_fault+0x70c/0x800
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810b2d67>] ? vfs_fstatat+0x37/0x80
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff810b2faf>] ? sys_newstat+0x1f/0x50
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff81024d3f>] ? do_page_fault+0x14f/0x370
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff812ab201>] ? __up_read+0x21/0xc0
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff815e841f>] ? page_fault+0x1f/0x30
May  3 04:56:07 imsblade3 [585877.972567]  [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b
May  3 04:56:07 imsblade3 [585877.972567] Code: c7 c2 a0 3a 0c 81 be 07 00 00 00 e8 43 22 52 00 e9 44 ff ff ff 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 0f 1f 40 00 53 31 c0 48 89 fb b9 47 00 00 00 48 c7
May  3 04:56:07 imsblade3 [585877.972567] RIP  [<ffffffff810c47e8>] clear_inode+0x108/0x110
May  3 04:56:07 imsblade3 [585877.972567]  RSP <ffff8801dfb41728>
May  3 04:56:07 imsblade3 [585878.348434] ---[ end trace ff65c60b27049d37 ]---
Comment 1 Tomas Zvala 2010-05-11 00:31:32 UTC
Well i know it's not a best bugreport ever, but if there's anything else i can provide, please let me know. I can also provide more 'instances' of this bug .... like this one without fuse involved (actually i'm not sure its not involved, but it's not mentioned).


May 10 05:41:04 imsblade3 [242236.814441] ------------[ cut here ]------------
May 10 05:41:04 imsblade3 [242236.815190] kernel BUG at fs/inode.c:313!
May 10 05:41:04 imsblade3 [242236.815190] invalid opcode: 0000 [#1] SMP
May 10 05:41:04 imsblade3 [242236.815190] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit
May 10 05:41:04 imsblade3 [242236.815190] CPU 3
May 10 05:41:04 imsblade3 [242236.815190] Pid: 17097, comm: dmon3.pl Not tainted 2.6.33imsblade #2 B7DW3/B7DW3
May 10 05:41:04 imsblade3 [242236.815190] RIP: 0010:[<ffffffff810c47e8>]  [<ffffffff810c47e8>] clear_inode+0x108/0x110
May 10 05:41:04 imsblade3 [242236.815190] RSP: 0018:ffff8800b5b2fad0  EFLAGS: 00010202
May 10 05:41:04 imsblade3 [242236.815190] RAX: 0000000000000000 RBX: ffff88001909db80 RCX: ffff8801ef25a748
May 10 05:41:04 imsblade3 [242236.815190] RDX: 0000000000000000 RSI: ffff8800b5b2fa30 RDI: ffff88001909db80
May 10 05:41:04 imsblade3 [242236.815190] RBP: 0000000000000000 R08: ffff8800b5b2f950 R09: 0000000000000000
May 10 05:41:04 imsblade3 [242236.815190] R10: 0000000000000040 R11: 0000000000000040 R12: ffff88022e63a400
May 10 05:41:04 imsblade3 [242236.815190] R13: 0000000000000000 R14: ffff88022e63a4e8 R15: 0000000000000008
May 10 05:41:04 imsblade3 [242236.815190] FS:  00007fd1b67856f0(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000
May 10 05:41:04 imsblade3 [242236.815190] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 10 05:41:04 imsblade3 [242236.815190] CR2: 00000000043516e0 CR3: 000000007d58b000 CR4: 00000000000006e0
May 10 05:41:04 imsblade3 [242236.815190] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 10 05:41:04 imsblade3 [242236.815190] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 10 05:41:04 imsblade3 [242236.815190] Process dmon3.pl (pid: 17097, threadinfo ffff8800b5b2e000, task ffff88019253d240)
May 10 05:41:04 imsblade3 [242236.815190] Stack:
May 10 05:41:04 imsblade3 [242236.815190]  ffff88001909db80 ffffffff810c5056 ffff8800c0224548 ffff8800c0224540
May 10 05:41:04 imsblade3 [242236.815190] <0> ffff8800c0224548 ffffffff810c0db7 000000cb000000cb ffff8800c0224540
May 10 05:41:04 imsblade3 [242236.815190] <0> ffff88022e63a400 ffffffff810c1077 ffff8800b5b2fb50 ffff8800b5b2fb40
May 10 05:41:04 imsblade3 [242236.815190] Call Trace:
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff810c5056>] ? generic_delete_inode+0x166/0x1a0
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff810c0db7>] ? d_kill+0x47/0x80
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff81084dcb>] ? try_to_free_pages+0x23b/0x380
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff8107e2b2>] ? __alloc_pages_nodemask+0x3b2/0x670
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff8107e582>] ? __get_free_pages+0x12/0x60
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff81037d63>] ? copy_process+0xd3/0x1100
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff810a82eb>] ? kmem_cache_alloc+0x6b/0xb0
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff81038e39>] ? do_fork+0xa9/0x3a0
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff810c6a0a>] ? alloc_fd+0x4a/0x140
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff810accde>] ? fd_install+0x2e/0x60
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff810b6c71>] ? do_pipe_flags+0xc1/0x100
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff81002f93>] ? stub_clone+0x13/0x20
May 10 05:41:04 imsblade3 [242236.815190]  [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b
May 10 05:41:04 imsblade3 [242236.815190] Code: c7 c2 a0 3a 0c 81 be 07 00 00 00 e8 43 22 52 00 e9 44 ff ff ff 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 0f 1f 40 00 53 31 c0 48 89 fb b9 47 00 00 00 48 c7
May 10 05:41:04 imsblade3 [242236.815190] RIP  [<ffffffff810c47e8>] clear_inode+0x108/0x110
May 10 05:41:04 imsblade3 [242236.815190]  RSP <ffff8800b5b2fad0>
May 10 05:41:04 imsblade3 [242237.102362] ---[ end trace c68208beaf852921 ]---
May 10 05:44:02 imsblade3 [242414.273198] i801_smbus 0000:00:1f.3: Transaction timeout
May 10 05:44:02 imsblade3 [242414.486198] i801_smbus 0000:00:1f.3: Transaction timeout
May 10 05:45:47 imsblade3 [242519.902378] ------------[ cut here ]------------




May 10 05:45:47 imsblade3 [242519.902378] ------------[ cut here ]------------
May 10 05:45:47 imsblade3 [242519.903315] kernel BUG at fs/inode.c:313!
May 10 05:45:47 imsblade3 [242519.903315] invalid opcode: 0000 [#2] SMP
May 10 05:45:47 imsblade3 [242519.903315] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit
May 10 05:45:47 imsblade3 [242519.903315] CPU 5
May 10 05:45:47 imsblade3 [242519.903315] Pid: 549, comm: kswapd0 Tainted: G      D    2.6.33imsblade #2 B7DW3/B7DW3
May 10 05:45:47 imsblade3 [242519.903315] RIP: 0010:[<ffffffff810c47e8>]  [<ffffffff810c47e8>] clear_inode+0x108/0x110
May 10 05:45:47 imsblade3 [242519.903315] RSP: 0018:ffff88022f29fc40  EFLAGS: 00010202
May 10 05:45:47 imsblade3 [242519.903315] RAX: 0000000000000000 RBX: ffff8800305b23c0 RCX: ffff8800305b24e0
May 10 05:45:47 imsblade3 [242519.903315] RDX: 0000000000000000 RSI: ffff88022f29fba0 RDI: ffff8800305b23c0
May 10 05:45:47 imsblade3 [242519.903315] RBP: 0000000000000000 R08: ffff88022f29fac0 R09: 0000000000000000
May 10 05:45:47 imsblade3 [242519.903315] R10: 0000000000000040 R11: 0000000000000040 R12: ffff88022e63a400
May 10 05:45:47 imsblade3 [242519.903315] R13: 0000000000000000 R14: ffff88022e63a4e8 R15: 0000000000000008
May 10 05:45:47 imsblade3 [242519.903315] FS:  0000000000000000(0000) GS:ffff880028340000(0000) knlGS:0000000000000000
May 10 05:45:47 imsblade3 [242519.903315] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 10 05:45:47 imsblade3 [242519.903315] CR2: 00000000043bf5f8 CR3: 0000000018691000 CR4: 00000000000006e0
May 10 05:45:47 imsblade3 [242519.903315] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 10 05:45:47 imsblade3 [242519.903315] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 10 05:45:47 imsblade3 [242519.903315] Process kswapd0 (pid: 549, threadinfo ffff88022f29e000, task ffff88022fa24680)
May 10 05:45:47 imsblade3 [242519.903315] Stack:
May 10 05:45:47 imsblade3 [242519.903315]  ffff8800305b23c0 ffffffff810c5056 ffff880146b26e48 ffff880146b26e40
May 10 05:45:47 imsblade3 [242519.903315] <0> ffff880146b26e48 ffffffff810c0db7 000000c9000000c9 ffff880146b26e40
May 10 05:45:47 imsblade3 [242519.903315] <0> ffff88022e63a400 ffffffff810c1077 ffff88022f29fcc0 ffff88022f29fcb0
May 10 05:45:47 imsblade3 [242519.903315] Call Trace:
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff810c5056>] ? generic_delete_inode+0x166/0x1a0
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff810c0db7>] ? d_kill+0x47/0x80
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff8108481f>] ? kswapd+0x4af/0x820
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff8104fd00>] ? autoremove_wake_function+0x0/0x30
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff8102dd7b>] ? __wake_up_common+0x5b/0x90
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff81084370>] ? kswapd+0x0/0x820
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff8104f866>] ? kthread+0x96/0xb0
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff81003a54>] ? kernel_thread_helper+0x4/0x10
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff8104f7d0>] ? kthread+0x0/0xb0
May 10 05:45:47 imsblade3 [242519.903315]  [<ffffffff81003a50>] ? kernel_thread_helper+0x0/0x10
May 10 05:45:47 imsblade3 [242519.903315] Code: c7 c2 a0 3a 0c 81 be 07 00 00 00 e8 43 22 52 00 e9 44 ff ff ff 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 0f 1f 40 00 53 31 c0 48 89 fb b9 47 00 00 00 48 c7
May 10 05:45:47 imsblade3 [242519.903315] RIP  [<ffffffff810c47e8>] clear_inode+0x108/0x110
May 10 05:45:47 imsblade3 [242519.903315]  RSP <ffff88022f29fc40>
May 10 05:45:47 imsblade3 [242520.172618] ---[ end trace c68208beaf852922 ]---
Comment 2 Miklos Szeredi 2010-05-11 10:33:47 UTC
Created attachment 26339 [details]
test patch

Tomas, can you please try a kernel with the attached patch and report the result?
Comment 3 Tomas Zvala 2010-05-25 06:56:31 UTC
Hello Miklos, 
it finally happened again (truly random :-O).

Does this help? 


t.

[403756.196269] inode->i_data.nrpages: 1 in fuse
[403756.196280] ------------[ cut here ]------------
[403756.196377] kernel BUG at fs/inode.c:316!
[403756.196377] invalid opcode: 0000 [#1] SMP
[403756.196377] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit
[403756.196377] CPU 6
[403756.196377] Pid: 13463, comm: dmon3.pl Not tainted 2.6.33imsblade #3 B7DW3/B7DW3
[403756.196377] RIP: 0010:[<ffffffff810c4804>]  [<ffffffff810c4804>] clear_inode+0x124/0x130
[403756.196377] RSP: 0018:ffff8801556cdad0  EFLAGS: 00010282
[403756.196377] RAX: 0000000000000037 RBX: ffff88019780b700 RCX: 00000000000fffff
[403756.196377] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff818b8b10
[403756.196377] RBP: 0000000000000000 R08: ffffffff816e40ad R09: ffff8801d56cd96f
[403756.196377] R10: 0000000000000000 R11: 0000000000000003 R12: ffff88022e63c400
[403756.196377] R13: 0000000000000000 R14: ffff88022e63c4e8 R15: 0000000000000008
[403756.196377] FS:  00007fce3aaa76f0(0000) GS:ffff880028380000(0000) knlGS:0000000000000000
[403756.196377] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[403756.196377] CR2: 000000000250d6e0 CR3: 0000000144863000 CR4: 00000000000006e0
[403756.196377] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[403756.196377] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[403756.196377] Process dmon3.pl (pid: 13463, threadinfo ffff8801556cc000, task ffff880087452340)
[403756.196377] Stack:
[403756.196377]  ffff88019780b700 ffffffff810c5076 ffff8801cdccd9c8 ffff8801cdccd9c0
[403756.196377] <0> ffff8801cdccd9c8 ffffffff810c0db7 0000009400000094 ffff8801cdccd9c0
[403756.196377] <0> ffff88022e63c400 ffffffff810c1077 ffff8801556cdb50 ffff8801556cdb40
[403756.196377] Call Trace:
[403756.196377]  [<ffffffff810c5076>] ? generic_delete_inode+0x166/0x1a0
[403756.196377]  [<ffffffff810c0db7>] ? d_kill+0x47/0x80
[403756.196377]  [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370
[403756.196377]  [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0
[403756.196377]  [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180
[403756.196377]  [<ffffffff81084dcb>] ? try_to_free_pages+0x23b/0x380
[403756.196377]  [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250
[403756.196377]  [<ffffffff8107e2b2>] ? __alloc_pages_nodemask+0x3b2/0x670
[403756.196377]  [<ffffffff81029831>] ? flush_tlb_page+0x41/0xa0
[403756.196377]  [<ffffffff8107e582>] ? __get_free_pages+0x12/0x60
[403756.196377]  [<ffffffff81037d63>] ? copy_process+0xd3/0x1100
[403756.196377]  [<ffffffff810a82eb>] ? kmem_cache_alloc+0x6b/0xb0
[403756.196377]  [<ffffffff81038e39>] ? do_fork+0xa9/0x3a0
[403756.196377]  [<ffffffff810c6a2a>] ? alloc_fd+0x4a/0x140
[403756.196377]  [<ffffffff810accde>] ? fd_install+0x2e/0x60
[403756.196377]  [<ffffffff810b6c71>] ? do_pipe_flags+0xc1/0x100
[403756.196377]  [<ffffffff81002f93>] ? stub_clone+0x13/0x20
[403756.196377]  [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b
[403756.196377] Code: 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 48 8b 83 f8 00 00 00 48 c7 c7 68 3b 6f 81 48 8b 40 30 48 8b 10 31 c0 e8 5d 0e 52 00 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 53                                              31 c0 48 89 fb b9 47 00
[403756.196377] RIP  [<ffffffff810c4804>] clear_inode+0x124/0x130
[403756.196377]  RSP <ffff8801556cdad0>
[403756.488905] ---[ end trace 27fe7a3307e16daa ]---
Comment 4 Miklos Szeredi 2010-05-25 10:28:17 UTC
Thanks.  It shows that the inode is coming from a fuse filesystem.  I looked at the code very hard but still have no idea how that mapping->nrpages != 0 can happen.

I notice that the kernel version has a "imsblade" localversion.  Do you have any patches applied on top of vanilla?
Comment 5 Tomas Zvala 2010-05-25 12:11:15 UTC
No, no, imsblade is pretty much just the config version. The source is vanilla.

btw. we tested 2.6.34-rc6 with the same result.
Comment 6 Miklos Szeredi 2010-05-25 13:47:14 UTC
Created attachment 26538 [details]
test patch2

Thanks for the info.

Here's another test patch.  Is this WARN_ON triggered before the BUG?
Comment 7 Tomas Zvala 2010-06-18 18:50:20 UTC
Hey. Finally got another BUG! ... the WARN_ON seems not to be triggered ... 

[819678.709654] inode->i_data.nrpages: 1 in fuse
[819678.709667] ------------[ cut here ]------------
[819678.711601] kernel BUG at fs/inode.c:316!
[819678.711601] invalid opcode: 0000 [#1] SMP
[819678.711601] last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:03:00.0/host0/scsi_host/host0/proc_name
[819678.711601] CPU 7
[819678.711601] Pid: 15126, comm: dmon.pl Not tainted 2.6.33imsblade #4 B7DW3/B7DW3
[819678.711601] RIP: 0010:[<ffffffff810c4834>]  [<ffffffff810c4834>] clear_inode+0x124/0x130
[819678.711601] RSP: 0018:ffff880110011ad0  EFLAGS: 00010282
[819678.711601] RAX: 0000000000000037 RBX: ffff88008877e3c0 RCX: 00000000000fffff
[819678.711601] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff818b8b10
[819678.711601] RBP: 0000000000000000 R08: ffffffff816e40ad R09: ffff88019001196f
[819678.711601] R10: 0000000000000000 R11: 0000000000000003 R12: ffff88022e639800
[819678.711601] R13: 0000000000000000 R14: ffff88022e6398e8 R15: 0000000000000008
[819678.711601] FS:  00007f3b165406f0(0000) GS:ffff8800283c0000(0000) knlGS:0000000000000000
[819678.711601] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[819678.711601] CR2: 0000000003546360 CR3: 00000001794ff000 CR4: 00000000000006e0
[819678.711601] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[819678.711601] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[819678.711601] Process dmon.pl (pid: 15126, threadinfo ffff880110010000, task ffff88022a72e9c0)
[819678.711601] Stack:
[819678.711601]  ffff88008877e3c0 ffffffff810c50a6 ffff880167594d88 ffff880167594d80
[819678.711601] <0> ffff880167594d88 ffffffff810c0de7 000000a4000000a4 ffff880167594d80
[819678.711601] <0> ffff88022e639800 ffffffff810c10a7 ffff880110011b50 ffff880110011b40
[819678.711601] Call Trace:
[819678.711601]  [<ffffffff810c50a6>] ? generic_delete_inode+0x166/0x1a0
[819678.711601]  [<ffffffff810c0de7>] ? d_kill+0x47/0x80
[819678.711601]  [<ffffffff810c10a7>] ? __shrink_dcache_sb+0x277/0x370
[819678.711601]  [<ffffffff810c12d0>] ? shrink_dcache_memory+0x120/0x1c0
[819678.711601]  [<ffffffff8108433f>] ? shrink_slab+0x11f/0x180
[819678.711601]  [<ffffffff81084dfb>] ? try_to_free_pages+0x23b/0x380
[819678.711601]  [<ffffffff81081de0>] ? isolate_pages_global+0x0/0x250
[819678.711601]  [<ffffffff8107e2e2>] ? __alloc_pages_nodemask+0x3b2/0x670
[819678.711601]  [<ffffffff81029831>] ? flush_tlb_page+0x41/0xa0
[819678.711601]  [<ffffffff8107e5b2>] ? __get_free_pages+0x12/0x60
[819678.711601]  [<ffffffff81037d63>] ? copy_process+0xd3/0x1100
[819678.711601]  [<ffffffff81038e39>] ? do_fork+0xa9/0x3a0
[819678.711601]  [<ffffffff810c6a5a>] ? alloc_fd+0x4a/0x140
[819678.711601]  [<ffffffff810acd0e>] ? fd_install+0x2e/0x60
[819678.711601]  [<ffffffff810b6ca1>] ? do_pipe_flags+0xc1/0x100
[819678.711601]  [<ffffffff81002f93>] ? stub_clone+0x13/0x20
[819678.711601]  [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b
[819678.711601] Code: 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 48 8b 83 f8 00 00 00 48 c7 c7 68 3b 6f 81 48 8b 40 30 48 8b 10 31 c0 e8 4d 0e 52 00 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 53 31 c0 48 89 fb b9 47 00
[819678.711601] RIP  [<ffffffff810c4834>] clear_inode+0x124/0x130
[819678.711601]  RSP <ffff880110011ad0>
[819679.001188] ---[ end trace 2c3c3cc4e2ff89d4 ]---
Comment 8 Tomas Zvala 2010-10-25 14:29:29 UTC
Hello, 
we're still experiencing the same bug. I've also found two other bugreports looking pretty similar.

https://bugs.launchpad.net/ubuntu/+source/ntfs-3g/+bug/599249
https://bugzilla.redhat.com/show_bug.cgi?id=644085
Comment 9 Miklos Szeredi 2010-10-26 19:55:30 UTC
Created attachment 35132 [details]
Another test patch

There's a pattern, definitely.  It's always fuse and always triggers after some use.  But however hard I look at it I can't see anything that fuse is doing wrong, and I've been staring at this for many hours.

Look: generic_delete_inode calls truncate_inode_pages immediately before clear_inode (fuse doesn't define .delete_inode).

truncate_inode_pages iterates through all the pages in the mapping and calls truncate_inode_page on them, which removes the page from the mapping and decrements nrpages.  So I have no clue.

The above patch is an attempt to find out what exactly the situation is, whether only the count is wrong or truncate_inode_pages doesn't manage to get rid of all pages or what.

Can you please try this patch?
Comment 10 Miklos Szeredi 2010-10-26 19:59:55 UTC
Created attachment 35142 [details]
Another test patch, updated

Oops forgot to refresh the final one...