4 SPMC SBI7126T-S6 blades with 12G ram and 2 Xeons E5520 running glusterfs 3.0.4 through fuse 2.8.4 on kernel 2.6.33. Only one of those 4 boxes randomly triggers this bug and the system becomes unusable (some file handles on glusterfs hang) and it freezes on reboot. Apr 22 17:22:45 imsblade3 [2430279.243581] ------------[ cut here ]------------ Apr 22 17:22:45 imsblade3 [2430279.244002] kernel BUG at fs/inode.c:313! Apr 22 17:22:45 imsblade3 [2430279.244002] invalid opcode: 0000 [#1] SMP Apr 22 17:22:45 imsblade3 [2430279.244002] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit Apr 22 17:22:45 imsblade3 [2430279.244002] CPU 0 Apr 22 17:22:45 imsblade3 [2430279.244002] Pid: 32604, comm: dmon3.pl Not tainted 2.6.33imsblade #2 B7DW3/B7DW3 Apr 22 17:22:45 imsblade3 [2430279.244002] RIP: 0010:[<ffffffff810c47e8>] [<ffffffff810c47e8>] clear_inode+0x108/0x110 Apr 22 17:22:45 imsblade3 [2430279.244002] RSP: 0018:ffff8801b4add728 EFLAGS: 00010202 Apr 22 17:22:45 imsblade3 [2430279.244002] RAX: 0000000000000000 RBX: ffff8800a7454dc0 RCX: ffff880002188f38 Apr 22 17:22:45 imsblade3 [2430279.244002] RDX: 0000000000000000 RSI: ffff8801b4add688 RDI: ffff8800a7454dc0 Apr 22 17:22:45 imsblade3 [2430279.244002] RBP: 0000000000000000 R08: ffff8801b4add5a8 R09: 0000000000000000 Apr 22 17:22:45 imsblade3 [2430279.244002] R10: 0000000000000040 R11: 0000000000000040 R12: ffff88022e54a000 Apr 22 17:22:45 imsblade3 [2430279.244002] R13: 0000000000000000 R14: ffff88022e54a0e8 R15: 0000000000000008 Apr 22 17:22:45 imsblade3 [2430279.244002] FS: 00007f26afbd16f0(0000) GS:ffff880028200000(0000) knlGS:0000000000000000 Apr 22 17:22:45 imsblade3 [2430279.244002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 22 17:22:45 imsblade3 [2430279.244002] CR2: 0000000004375000 CR3: 0000000191152000 CR4: 00000000000006f0 Apr 22 17:22:45 imsblade3 [2430279.244002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 22 17:22:45 imsblade3 [2430279.244002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Apr 22 17:22:45 imsblade3 [2430279.244002] Process dmon3.pl (pid: 32604, threadinfo ffff8801b4adc000, task ffff88022e525e00) Apr 22 17:22:45 imsblade3 [2430279.244002] Stack: Apr 22 17:22:45 imsblade3 [2430279.244002] ffff8800a7454dc0 ffffffff810c5056 ffff880012d24a88 ffff880012d24a80 Apr 22 17:22:45 imsblade3 [2430279.244002] <0> ffff880012d24a88 ffffffff810c0db7 0000003f0000003f ffff880012d24a80 Apr 22 17:22:45 imsblade3 [2430279.244002] <0> ffff88022e54a000 ffffffff810c1077 ffff8801b4add7a8 ffff8801b4add798 Apr 22 17:22:45 imsblade3 [2430279.244002] Call Trace: Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810c5056>] ? generic_delete_inode+0x166/0x1a0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810c0db7>] ? d_kill+0x47/0x80 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff81084dcb>] ? try_to_free_pages+0x23b/0x380 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff8107e2b2>] ? __alloc_pages_nodemask+0x3b2/0x670 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff815e5e50>] ? schedule+0x310/0x930 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810a7d16>] ? __slab_alloc+0x1d6/0x5e0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff811eebd2>] ? fuse_alloc_inode+0x12/0xd0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff815e5e50>] ? schedule+0x310/0x930 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff811ee3b0>] ? fuse_inode_eq+0x0/0x20 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff811ee3d0>] ? fuse_inode_set+0x0/0x10 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810a831f>] ? kmem_cache_alloc+0x9f/0xb0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff811eebd2>] ? fuse_alloc_inode+0x12/0xd0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810c52bf>] ? alloc_inode+0x1f/0xa0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810c5558>] ? iget5_locked+0x98/0x1c0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff811ef555>] ? fuse_iget+0x85/0x1e0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff811e9532>] ? fuse_lookup_name+0x1b2/0x1e0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff811e9bf9>] ? fuse_lookup+0x69/0x200 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810b833a>] ? do_lookup+0xfa/0x2a0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810ba53e>] ? link_path_walk+0x54e/0xe60 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810bb00e>] ? path_walk+0x6e/0xf0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810bb1ab>] ? do_path_lookup+0x5b/0x60 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810bbe87>] ? user_path_at+0x97/0xd0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810a7eed>] ? __slab_alloc+0x3ad/0x5e0 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810b2ed4>] ? cp_new_stat+0xe4/0x100 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810b2d67>] ? vfs_fstatat+0x37/0x80 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff810b2faf>] ? sys_newstat+0x1f/0x50 Apr 22 17:22:45 imsblade3 [2430279.244002] [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b Apr 22 17:22:45 imsblade3 [2430279.244002] Code: c7 c2 a0 3a 0c 81 be 07 00 00 00 e8 43 22 52 00 e9 44 ff ff ff 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 0f 1f 40 00 53 31 c0 48 89 fb b9 47 00 00 00 48 c7 Apr 22 17:22:45 imsblade3 [2430279.244002] RIP [<ffffffff810c47e8>] clear_inode+0x108/0x110 Apr 22 17:22:45 imsblade3 [2430279.244002] RSP <ffff8801b4add728> Apr 22 17:22:45 imsblade3 [2430279.616368] ---[ end trace c1617eab624e0059 ]--- May 3 04:56:07 imsblade3 [585877.972567] kernel BUG at fs/inode.c:313! May 3 04:56:07 imsblade3 [585877.972567] invalid opcode: 0000 [#1] SMP May 3 04:56:07 imsblade3 [585877.972567] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit May 3 04:56:07 imsblade3 [585877.972567] CPU 0 May 3 04:56:07 imsblade3 [585877.972567] Pid: 28974, comm: dmon3.pl Not tainted 2.6.33imsblade #2 B7DW3/B7DW3 May 3 04:56:07 imsblade3 [585877.972567] RIP: 0010:[<ffffffff810c47e8>] [<ffffffff810c47e8>] clear_inode+0x108/0x110 May 3 04:56:07 imsblade3 [585877.972567] RSP: 0018:ffff8801dfb41728 EFLAGS: 00010202 May 3 04:56:07 imsblade3 [585877.972567] RAX: 0000000000000000 RBX: ffff880059454dc0 RCX: ffff880059454ee0 May 3 04:56:07 imsblade3 [585877.972567] RDX: 0000000000000000 RSI: ffff8801dfb41688 RDI: ffff880059454dc0 May 3 04:56:07 imsblade3 [585877.972567] RBP: 0000000000000000 R08: ffff8801dfb415a8 R09: 0000000000000001 May 3 04:56:07 imsblade3 [585877.972567] R10: 0000000000000040 R11: 0000000000000040 R12: ffff88022a13c400 May 3 04:56:07 imsblade3 [585877.972567] R13: 0000000000000000 R14: ffff88022a13c4e8 R15: 0000000000000008 May 3 04:56:07 imsblade3 [585877.972567] FS: 00007f0a6429c6f0(0000) GS:ffff880028200000(0000) knlGS:0000000000000000 May 3 04:56:07 imsblade3 [585877.972567] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b May 3 04:56:07 imsblade3 [585877.972567] CR2: 0000000004c4e000 CR3: 00000000599f7000 CR4: 00000000000006f0 May 3 04:56:07 imsblade3 [585877.972567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 May 3 04:56:07 imsblade3 [585877.972567] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 May 3 04:56:07 imsblade3 [585877.972567] Process dmon3.pl (pid: 28974, threadinfo ffff8801dfb40000, task ffff880051dd8bc0) May 3 04:56:07 imsblade3 [585877.972567] Stack: May 3 04:56:07 imsblade3 [585877.972567] ffff880059454dc0 ffffffff810c5056 ffff8800726906c8 ffff8800726906c0 May 3 04:56:07 imsblade3 [585877.972567] <0> ffff8800726906c8 ffffffff810c0db7 0000009100000091 ffff8800726906c0 May 3 04:56:07 imsblade3 [585877.972567] <0> ffff88022a13c400 ffffffff810c1077 ffff8801dfb417a8 ffff8801dfb41798 May 3 04:56:07 imsblade3 [585877.972567] Call Trace: May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810c5056>] ? generic_delete_inode+0x166/0x1a0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810c0db7>] ? d_kill+0x47/0x80 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff81084dcb>] ? try_to_free_pages+0x23b/0x380 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff8107e2b2>] ? __alloc_pages_nodemask+0x3b2/0x670 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff815e5e50>] ? schedule+0x310/0x930 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810a7d16>] ? __slab_alloc+0x1d6/0x5e0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff811eebd2>] ? fuse_alloc_inode+0x12/0xd0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff811ee3b0>] ? fuse_inode_eq+0x0/0x20 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff811ee3d0>] ? fuse_inode_set+0x0/0x10 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810a831f>] ? kmem_cache_alloc+0x9f/0xb0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff811eebd2>] ? fuse_alloc_inode+0x12/0xd0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810c52bf>] ? alloc_inode+0x1f/0xa0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810c5558>] ? iget5_locked+0x98/0x1c0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff811ef555>] ? fuse_iget+0x85/0x1e0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff811e9532>] ? fuse_lookup_name+0x1b2/0x1e0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff811e9bf9>] ? fuse_lookup+0x69/0x200 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810b833a>] ? do_lookup+0xfa/0x2a0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810ba53e>] ? link_path_walk+0x54e/0xe60 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810bb00e>] ? path_walk+0x6e/0xf0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810bb1ab>] ? do_path_lookup+0x5b/0x60 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810bbe87>] ? user_path_at+0x97/0xd0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff81098668>] ? page_add_new_anon_rmap+0x58/0x90 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff8108f3dc>] ? handle_mm_fault+0x70c/0x800 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810b2d67>] ? vfs_fstatat+0x37/0x80 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff810b2faf>] ? sys_newstat+0x1f/0x50 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff81024d3f>] ? do_page_fault+0x14f/0x370 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff812ab201>] ? __up_read+0x21/0xc0 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff815e841f>] ? page_fault+0x1f/0x30 May 3 04:56:07 imsblade3 [585877.972567] [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b May 3 04:56:07 imsblade3 [585877.972567] Code: c7 c2 a0 3a 0c 81 be 07 00 00 00 e8 43 22 52 00 e9 44 ff ff ff 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 0f 1f 40 00 53 31 c0 48 89 fb b9 47 00 00 00 48 c7 May 3 04:56:07 imsblade3 [585877.972567] RIP [<ffffffff810c47e8>] clear_inode+0x108/0x110 May 3 04:56:07 imsblade3 [585877.972567] RSP <ffff8801dfb41728> May 3 04:56:07 imsblade3 [585878.348434] ---[ end trace ff65c60b27049d37 ]---
Well i know it's not a best bugreport ever, but if there's anything else i can provide, please let me know. I can also provide more 'instances' of this bug .... like this one without fuse involved (actually i'm not sure its not involved, but it's not mentioned). May 10 05:41:04 imsblade3 [242236.814441] ------------[ cut here ]------------ May 10 05:41:04 imsblade3 [242236.815190] kernel BUG at fs/inode.c:313! May 10 05:41:04 imsblade3 [242236.815190] invalid opcode: 0000 [#1] SMP May 10 05:41:04 imsblade3 [242236.815190] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit May 10 05:41:04 imsblade3 [242236.815190] CPU 3 May 10 05:41:04 imsblade3 [242236.815190] Pid: 17097, comm: dmon3.pl Not tainted 2.6.33imsblade #2 B7DW3/B7DW3 May 10 05:41:04 imsblade3 [242236.815190] RIP: 0010:[<ffffffff810c47e8>] [<ffffffff810c47e8>] clear_inode+0x108/0x110 May 10 05:41:04 imsblade3 [242236.815190] RSP: 0018:ffff8800b5b2fad0 EFLAGS: 00010202 May 10 05:41:04 imsblade3 [242236.815190] RAX: 0000000000000000 RBX: ffff88001909db80 RCX: ffff8801ef25a748 May 10 05:41:04 imsblade3 [242236.815190] RDX: 0000000000000000 RSI: ffff8800b5b2fa30 RDI: ffff88001909db80 May 10 05:41:04 imsblade3 [242236.815190] RBP: 0000000000000000 R08: ffff8800b5b2f950 R09: 0000000000000000 May 10 05:41:04 imsblade3 [242236.815190] R10: 0000000000000040 R11: 0000000000000040 R12: ffff88022e63a400 May 10 05:41:04 imsblade3 [242236.815190] R13: 0000000000000000 R14: ffff88022e63a4e8 R15: 0000000000000008 May 10 05:41:04 imsblade3 [242236.815190] FS: 00007fd1b67856f0(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000 May 10 05:41:04 imsblade3 [242236.815190] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b May 10 05:41:04 imsblade3 [242236.815190] CR2: 00000000043516e0 CR3: 000000007d58b000 CR4: 00000000000006e0 May 10 05:41:04 imsblade3 [242236.815190] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 May 10 05:41:04 imsblade3 [242236.815190] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 May 10 05:41:04 imsblade3 [242236.815190] Process dmon3.pl (pid: 17097, threadinfo ffff8800b5b2e000, task ffff88019253d240) May 10 05:41:04 imsblade3 [242236.815190] Stack: May 10 05:41:04 imsblade3 [242236.815190] ffff88001909db80 ffffffff810c5056 ffff8800c0224548 ffff8800c0224540 May 10 05:41:04 imsblade3 [242236.815190] <0> ffff8800c0224548 ffffffff810c0db7 000000cb000000cb ffff8800c0224540 May 10 05:41:04 imsblade3 [242236.815190] <0> ffff88022e63a400 ffffffff810c1077 ffff8800b5b2fb50 ffff8800b5b2fb40 May 10 05:41:04 imsblade3 [242236.815190] Call Trace: May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff810c5056>] ? generic_delete_inode+0x166/0x1a0 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff810c0db7>] ? d_kill+0x47/0x80 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff81084dcb>] ? try_to_free_pages+0x23b/0x380 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff8107e2b2>] ? __alloc_pages_nodemask+0x3b2/0x670 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff8107e582>] ? __get_free_pages+0x12/0x60 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff81037d63>] ? copy_process+0xd3/0x1100 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff810a82eb>] ? kmem_cache_alloc+0x6b/0xb0 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff81038e39>] ? do_fork+0xa9/0x3a0 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff810c6a0a>] ? alloc_fd+0x4a/0x140 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff810accde>] ? fd_install+0x2e/0x60 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff810b6c71>] ? do_pipe_flags+0xc1/0x100 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff81002f93>] ? stub_clone+0x13/0x20 May 10 05:41:04 imsblade3 [242236.815190] [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b May 10 05:41:04 imsblade3 [242236.815190] Code: c7 c2 a0 3a 0c 81 be 07 00 00 00 e8 43 22 52 00 e9 44 ff ff ff 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 0f 1f 40 00 53 31 c0 48 89 fb b9 47 00 00 00 48 c7 May 10 05:41:04 imsblade3 [242236.815190] RIP [<ffffffff810c47e8>] clear_inode+0x108/0x110 May 10 05:41:04 imsblade3 [242236.815190] RSP <ffff8800b5b2fad0> May 10 05:41:04 imsblade3 [242237.102362] ---[ end trace c68208beaf852921 ]--- May 10 05:44:02 imsblade3 [242414.273198] i801_smbus 0000:00:1f.3: Transaction timeout May 10 05:44:02 imsblade3 [242414.486198] i801_smbus 0000:00:1f.3: Transaction timeout May 10 05:45:47 imsblade3 [242519.902378] ------------[ cut here ]------------ May 10 05:45:47 imsblade3 [242519.902378] ------------[ cut here ]------------ May 10 05:45:47 imsblade3 [242519.903315] kernel BUG at fs/inode.c:313! May 10 05:45:47 imsblade3 [242519.903315] invalid opcode: 0000 [#2] SMP May 10 05:45:47 imsblade3 [242519.903315] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit May 10 05:45:47 imsblade3 [242519.903315] CPU 5 May 10 05:45:47 imsblade3 [242519.903315] Pid: 549, comm: kswapd0 Tainted: G D 2.6.33imsblade #2 B7DW3/B7DW3 May 10 05:45:47 imsblade3 [242519.903315] RIP: 0010:[<ffffffff810c47e8>] [<ffffffff810c47e8>] clear_inode+0x108/0x110 May 10 05:45:47 imsblade3 [242519.903315] RSP: 0018:ffff88022f29fc40 EFLAGS: 00010202 May 10 05:45:47 imsblade3 [242519.903315] RAX: 0000000000000000 RBX: ffff8800305b23c0 RCX: ffff8800305b24e0 May 10 05:45:47 imsblade3 [242519.903315] RDX: 0000000000000000 RSI: ffff88022f29fba0 RDI: ffff8800305b23c0 May 10 05:45:47 imsblade3 [242519.903315] RBP: 0000000000000000 R08: ffff88022f29fac0 R09: 0000000000000000 May 10 05:45:47 imsblade3 [242519.903315] R10: 0000000000000040 R11: 0000000000000040 R12: ffff88022e63a400 May 10 05:45:47 imsblade3 [242519.903315] R13: 0000000000000000 R14: ffff88022e63a4e8 R15: 0000000000000008 May 10 05:45:47 imsblade3 [242519.903315] FS: 0000000000000000(0000) GS:ffff880028340000(0000) knlGS:0000000000000000 May 10 05:45:47 imsblade3 [242519.903315] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b May 10 05:45:47 imsblade3 [242519.903315] CR2: 00000000043bf5f8 CR3: 0000000018691000 CR4: 00000000000006e0 May 10 05:45:47 imsblade3 [242519.903315] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 May 10 05:45:47 imsblade3 [242519.903315] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 May 10 05:45:47 imsblade3 [242519.903315] Process kswapd0 (pid: 549, threadinfo ffff88022f29e000, task ffff88022fa24680) May 10 05:45:47 imsblade3 [242519.903315] Stack: May 10 05:45:47 imsblade3 [242519.903315] ffff8800305b23c0 ffffffff810c5056 ffff880146b26e48 ffff880146b26e40 May 10 05:45:47 imsblade3 [242519.903315] <0> ffff880146b26e48 ffffffff810c0db7 000000c9000000c9 ffff880146b26e40 May 10 05:45:47 imsblade3 [242519.903315] <0> ffff88022e63a400 ffffffff810c1077 ffff88022f29fcc0 ffff88022f29fcb0 May 10 05:45:47 imsblade3 [242519.903315] Call Trace: May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff810c5056>] ? generic_delete_inode+0x166/0x1a0 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff810c0db7>] ? d_kill+0x47/0x80 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff8108481f>] ? kswapd+0x4af/0x820 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff8104fd00>] ? autoremove_wake_function+0x0/0x30 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff8102dd7b>] ? __wake_up_common+0x5b/0x90 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff81084370>] ? kswapd+0x0/0x820 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff8104f866>] ? kthread+0x96/0xb0 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff81003a54>] ? kernel_thread_helper+0x4/0x10 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff8104f7d0>] ? kthread+0x0/0xb0 May 10 05:45:47 imsblade3 [242519.903315] [<ffffffff81003a50>] ? kernel_thread_helper+0x0/0x10 May 10 05:45:47 imsblade3 [242519.903315] Code: c7 c2 a0 3a 0c 81 be 07 00 00 00 e8 43 22 52 00 e9 44 ff ff ff 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 0f 1f 40 00 53 31 c0 48 89 fb b9 47 00 00 00 48 c7 May 10 05:45:47 imsblade3 [242519.903315] RIP [<ffffffff810c47e8>] clear_inode+0x108/0x110 May 10 05:45:47 imsblade3 [242519.903315] RSP <ffff88022f29fc40> May 10 05:45:47 imsblade3 [242520.172618] ---[ end trace c68208beaf852922 ]---
Created attachment 26339 [details] test patch Tomas, can you please try a kernel with the attached patch and report the result?
Hello Miklos, it finally happened again (truly random :-O). Does this help? t. [403756.196269] inode->i_data.nrpages: 1 in fuse [403756.196280] ------------[ cut here ]------------ [403756.196377] kernel BUG at fs/inode.c:316! [403756.196377] invalid opcode: 0000 [#1] SMP [403756.196377] last sysfs file: /sys/devices/platform/coretemp.7/temp1_crit [403756.196377] CPU 6 [403756.196377] Pid: 13463, comm: dmon3.pl Not tainted 2.6.33imsblade #3 B7DW3/B7DW3 [403756.196377] RIP: 0010:[<ffffffff810c4804>] [<ffffffff810c4804>] clear_inode+0x124/0x130 [403756.196377] RSP: 0018:ffff8801556cdad0 EFLAGS: 00010282 [403756.196377] RAX: 0000000000000037 RBX: ffff88019780b700 RCX: 00000000000fffff [403756.196377] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff818b8b10 [403756.196377] RBP: 0000000000000000 R08: ffffffff816e40ad R09: ffff8801d56cd96f [403756.196377] R10: 0000000000000000 R11: 0000000000000003 R12: ffff88022e63c400 [403756.196377] R13: 0000000000000000 R14: ffff88022e63c4e8 R15: 0000000000000008 [403756.196377] FS: 00007fce3aaa76f0(0000) GS:ffff880028380000(0000) knlGS:0000000000000000 [403756.196377] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [403756.196377] CR2: 000000000250d6e0 CR3: 0000000144863000 CR4: 00000000000006e0 [403756.196377] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [403756.196377] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [403756.196377] Process dmon3.pl (pid: 13463, threadinfo ffff8801556cc000, task ffff880087452340) [403756.196377] Stack: [403756.196377] ffff88019780b700 ffffffff810c5076 ffff8801cdccd9c8 ffff8801cdccd9c0 [403756.196377] <0> ffff8801cdccd9c8 ffffffff810c0db7 0000009400000094 ffff8801cdccd9c0 [403756.196377] <0> ffff88022e63c400 ffffffff810c1077 ffff8801556cdb50 ffff8801556cdb40 [403756.196377] Call Trace: [403756.196377] [<ffffffff810c5076>] ? generic_delete_inode+0x166/0x1a0 [403756.196377] [<ffffffff810c0db7>] ? d_kill+0x47/0x80 [403756.196377] [<ffffffff810c1077>] ? __shrink_dcache_sb+0x277/0x370 [403756.196377] [<ffffffff810c12a0>] ? shrink_dcache_memory+0x120/0x1c0 [403756.196377] [<ffffffff8108430f>] ? shrink_slab+0x11f/0x180 [403756.196377] [<ffffffff81084dcb>] ? try_to_free_pages+0x23b/0x380 [403756.196377] [<ffffffff81081db0>] ? isolate_pages_global+0x0/0x250 [403756.196377] [<ffffffff8107e2b2>] ? __alloc_pages_nodemask+0x3b2/0x670 [403756.196377] [<ffffffff81029831>] ? flush_tlb_page+0x41/0xa0 [403756.196377] [<ffffffff8107e582>] ? __get_free_pages+0x12/0x60 [403756.196377] [<ffffffff81037d63>] ? copy_process+0xd3/0x1100 [403756.196377] [<ffffffff810a82eb>] ? kmem_cache_alloc+0x6b/0xb0 [403756.196377] [<ffffffff81038e39>] ? do_fork+0xa9/0x3a0 [403756.196377] [<ffffffff810c6a2a>] ? alloc_fd+0x4a/0x140 [403756.196377] [<ffffffff810accde>] ? fd_install+0x2e/0x60 [403756.196377] [<ffffffff810b6c71>] ? do_pipe_flags+0xc1/0x100 [403756.196377] [<ffffffff81002f93>] ? stub_clone+0x13/0x20 [403756.196377] [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b [403756.196377] Code: 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 48 8b 83 f8 00 00 00 48 c7 c7 68 3b 6f 81 48 8b 40 30 48 8b 10 31 c0 e8 5d 0e 52 00 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 53 31 c0 48 89 fb b9 47 00 [403756.196377] RIP [<ffffffff810c4804>] clear_inode+0x124/0x130 [403756.196377] RSP <ffff8801556cdad0> [403756.488905] ---[ end trace 27fe7a3307e16daa ]---
Thanks. It shows that the inode is coming from a fuse filesystem. I looked at the code very hard but still have no idea how that mapping->nrpages != 0 can happen. I notice that the kernel version has a "imsblade" localversion. Do you have any patches applied on top of vanilla?
No, no, imsblade is pretty much just the config version. The source is vanilla. btw. we tested 2.6.34-rc6 with the same result.
Created attachment 26538 [details] test patch2 Thanks for the info. Here's another test patch. Is this WARN_ON triggered before the BUG?
Hey. Finally got another BUG! ... the WARN_ON seems not to be triggered ... [819678.709654] inode->i_data.nrpages: 1 in fuse [819678.709667] ------------[ cut here ]------------ [819678.711601] kernel BUG at fs/inode.c:316! [819678.711601] invalid opcode: 0000 [#1] SMP [819678.711601] last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:03:00.0/host0/scsi_host/host0/proc_name [819678.711601] CPU 7 [819678.711601] Pid: 15126, comm: dmon.pl Not tainted 2.6.33imsblade #4 B7DW3/B7DW3 [819678.711601] RIP: 0010:[<ffffffff810c4834>] [<ffffffff810c4834>] clear_inode+0x124/0x130 [819678.711601] RSP: 0018:ffff880110011ad0 EFLAGS: 00010282 [819678.711601] RAX: 0000000000000037 RBX: ffff88008877e3c0 RCX: 00000000000fffff [819678.711601] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff818b8b10 [819678.711601] RBP: 0000000000000000 R08: ffffffff816e40ad R09: ffff88019001196f [819678.711601] R10: 0000000000000000 R11: 0000000000000003 R12: ffff88022e639800 [819678.711601] R13: 0000000000000000 R14: ffff88022e6398e8 R15: 0000000000000008 [819678.711601] FS: 00007f3b165406f0(0000) GS:ffff8800283c0000(0000) knlGS:0000000000000000 [819678.711601] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [819678.711601] CR2: 0000000003546360 CR3: 00000001794ff000 CR4: 00000000000006e0 [819678.711601] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [819678.711601] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [819678.711601] Process dmon.pl (pid: 15126, threadinfo ffff880110010000, task ffff88022a72e9c0) [819678.711601] Stack: [819678.711601] ffff88008877e3c0 ffffffff810c50a6 ffff880167594d88 ffff880167594d80 [819678.711601] <0> ffff880167594d88 ffffffff810c0de7 000000a4000000a4 ffff880167594d80 [819678.711601] <0> ffff88022e639800 ffffffff810c10a7 ffff880110011b50 ffff880110011b40 [819678.711601] Call Trace: [819678.711601] [<ffffffff810c50a6>] ? generic_delete_inode+0x166/0x1a0 [819678.711601] [<ffffffff810c0de7>] ? d_kill+0x47/0x80 [819678.711601] [<ffffffff810c10a7>] ? __shrink_dcache_sb+0x277/0x370 [819678.711601] [<ffffffff810c12d0>] ? shrink_dcache_memory+0x120/0x1c0 [819678.711601] [<ffffffff8108433f>] ? shrink_slab+0x11f/0x180 [819678.711601] [<ffffffff81084dfb>] ? try_to_free_pages+0x23b/0x380 [819678.711601] [<ffffffff81081de0>] ? isolate_pages_global+0x0/0x250 [819678.711601] [<ffffffff8107e2e2>] ? __alloc_pages_nodemask+0x3b2/0x670 [819678.711601] [<ffffffff81029831>] ? flush_tlb_page+0x41/0xa0 [819678.711601] [<ffffffff8107e5b2>] ? __get_free_pages+0x12/0x60 [819678.711601] [<ffffffff81037d63>] ? copy_process+0xd3/0x1100 [819678.711601] [<ffffffff81038e39>] ? do_fork+0xa9/0x3a0 [819678.711601] [<ffffffff810c6a5a>] ? alloc_fd+0x4a/0x140 [819678.711601] [<ffffffff810acd0e>] ? fd_install+0x2e/0x60 [819678.711601] [<ffffffff810b6ca1>] ? do_pipe_flags+0xc1/0x100 [819678.711601] [<ffffffff81002f93>] ? stub_clone+0x13/0x20 [819678.711601] [<ffffffff81002d2b>] ? system_call_fastpath+0x16/0x1b [819678.711601] Code: 1f 84 00 00 00 00 00 0f 0b eb fe 0f 1f 40 00 48 8b 83 f8 00 00 00 48 c7 c7 68 3b 6f 81 48 8b 40 30 48 8b 10 31 c0 e8 4d 0e 52 00 <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 53 31 c0 48 89 fb b9 47 00 [819678.711601] RIP [<ffffffff810c4834>] clear_inode+0x124/0x130 [819678.711601] RSP <ffff880110011ad0> [819679.001188] ---[ end trace 2c3c3cc4e2ff89d4 ]---
Hello, we're still experiencing the same bug. I've also found two other bugreports looking pretty similar. https://bugs.launchpad.net/ubuntu/+source/ntfs-3g/+bug/599249 https://bugzilla.redhat.com/show_bug.cgi?id=644085
Created attachment 35132 [details] Another test patch There's a pattern, definitely. It's always fuse and always triggers after some use. But however hard I look at it I can't see anything that fuse is doing wrong, and I've been staring at this for many hours. Look: generic_delete_inode calls truncate_inode_pages immediately before clear_inode (fuse doesn't define .delete_inode). truncate_inode_pages iterates through all the pages in the mapping and calls truncate_inode_page on them, which removes the page from the mapping and decrements nrpages. So I have no clue. The above patch is an attempt to find out what exactly the situation is, whether only the count is wrong or truncate_inode_pages doesn't manage to get rid of all pages or what. Can you please try this patch?
Created attachment 35142 [details] Another test patch, updated Oops forgot to refresh the final one...