hi, i'm currently testing an OCFS2-over-LVM-over-AoE and getting an oops while 'svn up' warmup. the machine has a dual opteron (8-core) with 64GB ecc ram. here's the dmesg log: (...) [91168.611032] ------------[ cut here ]------------ [91168.614320] kernel BUG at /home/users/builder/rpm/BUILD/kernel-2.6.36.2/linux-2.6.36/mm/slub.c:2874! [91168.614320] invalid opcode: 0000 [#1] SMP [91168.614320] last sysfs file: /sys/devices/virtual/bdi/0:31/uevent [91168.614320] CPU 8 [91168.614320] Modules linked in: ocfs2_stack_o2cb ocfs2 quota_tree ocfs2_dlmfs ocfs2_stackglue ocfs2_dlm ocfs2_nodemanager configfs nfs fscache binfmt_misc nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs sch_sfq iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter xt_TCPMSS xt_tcpudp iptable_mangle ip_tables ip6table_filter ip6_tables x_tables ext4 jbd2 crc16 raid10 raid0 dm_mod uvesafb cn aoe autofs4 dummy ide_cd_mod cdrom ata_generic pata_acpi ide_pci_generic pata_atiixp joydev usbhid hid atiixp igb ohci_hcd ssb ehci_hcd mmc_core ide_core amd64_edac_mod usbcore dca i2c_piix4 pcmcia thermal sg evdev edac_core k10temp ghes i2c_core shpchp pcmcia_core hwmon pcspkr processor button edac_mce_amd hed pci_hotplug psmouse serio_raw sd_mod crc_t10dif raid1 md_mod ext3 jbd mbcache ahci libahci libata scsi_mod [last unloaded: configfs] [91168.614320] [91168.614320] Pid: 12858, xid: #0, comm: svn Not tainted 2.6.36.2-1 #1 H8DGU/H8DGU [91168.614320] RIP: 0010:[<ffffffff811322fa>] [<ffffffff811322fa>] kfree+0x10a/0x110 [91168.614320] RSP: 0018:ffff88100d529d18 EFLAGS: 00010246 [91168.614320] RAX: ffffea0038000000 RBX: ffff881000000001 RCX: 0000000000000002 [91168.614320] RDX: 0000000000000000 RSI: 0000000000000800 RDI: ffff881000000001 [91168.614320] RBP: ffff88100d529d38 R08: 0000000000000000 R09: 0000000000000000 [91168.614320] R10: 000000000000b77c R11: 0000000008000080 R12: ffffffff81152b99 [91168.614320] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880cfdcebb30 [91168.614320] FS: 00007fdd04656760(0000) GS:ffff880c2e200000(0000) knlGS:00000000f6d436c0 [91168.614320] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [91168.614320] CR2: 000000004af7e000 CR3: 0000000659bad000 CR4: 00000000000006f0 [91168.614320] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [91168.614320] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [91168.614320] Process svn (pid: 12858, threadinfo ffff88100d528000, task ffff88100d9c9560) [91168.614320] Stack: [91168.614320] ffff880d0bab4840 0000000000000000 ffff880d0bab4840 0000000000000000 [91168.614320] <0> ffff88100d529da8 ffffffff81152b99 ffff881008000080 000000000000b77c [91168.614320] <0> ffff880d0bab48e0 ffff880cfdd1c2f0 ffff881000000001 ffff880cfdcebb30 [91168.614320] Call Trace: [91168.614320] [<ffffffff81152b99>] vfs_rename+0x2c9/0x440 [91168.614320] [<ffffffff8115433b>] sys_renameat+0x2cb/0x320 [91168.614320] [<ffffffff811448d2>] ? do_sync_read+0xd2/0x110 [91168.614320] [<ffffffff81072cb5>] ? sched_clock_local+0x25/0x90 [91168.614320] [<ffffffff8103ccfd>] ? finish_task_switch+0x3d/0xc0 [91168.614320] [<ffffffff811543a6>] sys_rename+0x16/0x20 [91168.614320] [<ffffffff81002f1b>] system_call_fastpath+0x16/0x1b [91168.614320] Code: 4d 85 ed 0f 84 3c ff ff ff 49 8b 45 00 49 8b 7d 08 49 83 c5 10 48 89 da 4c 89 e6 ff d0 49 8b 45 00 48 85 c0 75 e7 e9 1a ff ff ff <0f> 0b 0f 1f 40 00 55 48 83 ef 70 48 89 e5 e8 e3 fe ff ff c9 c3 [91168.614320] RIP [<ffffffff811322fa>] kfree+0x10a/0x110 [91168.614320] RSP <ffff88100d529d18> [91168.614320] ------------[ cut here ]------------ [91168.614320] WARNING: at /home/users/builder/rpm/BUILD/kernel-2.6.36.2/linux-2.6.36/kernel/softirq.c:144 local_bh_enable_ip+0x72/0xa0() [91168.614320] Hardware name: H8DGU [91168.614320] Modules linked in: ocfs2_stack_o2cb ocfs2 quota_tree ocfs2_dlmfs ocfs2_stackglue ocfs2_dlm ocfs2_nodemanager configfs nfs fscache binfmt_misc nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs sch_sfq iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter xt_TCPMSS xt_tcpudp iptable_mangle ip_tables ip6table_filter ip6_tables x_tables ext4 jbd2 crc16 raid10 raid0 dm_mod uvesafb cn aoe autofs4 dummy ide_cd_mod cdrom ata_generic pata_acpi ide_pci_generic pata_atiixp joydev usbhid hid atiixp igb ohci_hcd ssb ehci_hcd mmc_core ide_core amd64_edac_mod usbcore dca i2c_piix4 pcmcia thermal sg evdev edac_core k10temp ghes i2c_core shpchp pcmcia_core hwmon pcspkr processor button edac_mce_amd hed pci_hotplug psmouse serio_raw sd_mod crc_t10dif raid1 md_mod ext3 jbd mbcache ahci libahci libata scsi_mod [last unloaded: configfs] [91168.614320] Pid: 12858, comm: svn xid: #0 Not tainted 2.6.36.2-1 #1 [91168.614320] Call Trace: [91168.614320] [<ffffffff810490aa>] warn_slowpath_common+0x7a/0xb0 [91168.614320] [<ffffffff810490f5>] warn_slowpath_null+0x15/0x20 [91168.614320] [<ffffffff8104ff42>] local_bh_enable_ip+0x72/0xa0 [91168.614320] [<ffffffff8143bc1f>] _raw_spin_unlock_bh+0xf/0x20 [91168.614320] [<ffffffffa02169a3>] cn_netlink_send+0xa3/0x1e0 [cn] [91168.614320] [<ffffffffa0221773>] __this_module+0x3373/0x4720 [uvesafb] [91168.614320] [<ffffffffa0221526>] ? __this_module+0x3126/0x4720 [uvesafb] [91168.614320] [<ffffffff812d8019>] ? notify_update+0x29/0x30 [91168.614320] [<ffffffffa0221a3c>] __this_module+0x363c/0x4720 [uvesafb] [91168.614320] [<ffffffff812736c9>] fb_blank+0x39/0x70 [91168.614320] [<ffffffff8128203b>] fbcon_blank+0x22b/0x2e0 [91168.614320] [<ffffffff810497c8>] ? release_console_sem+0x1a8/0x1f0 [91168.614320] [<ffffffff81059876>] ? lock_timer_base.clone.23+0x36/0x70 [91168.614320] [<ffffffff81059e88>] ? mod_timer+0x158/0x2c0 [91168.614320] [<ffffffff812da851>] do_unblank_screen+0xb1/0x1c0 [91168.614320] [<ffffffff812da96b>] unblank_screen+0xb/0x10 [91168.614320] [<ffffffff8124e1ed>] bust_spinlocks+0x1d/0x40 [91168.614320] [<ffffffff8143cdfb>] oops_end+0x3b/0xe0 [91168.614320] [<ffffffff81006e76>] die+0x56/0x90 [91168.614320] [<ffffffff8143c5d4>] do_trap+0xc4/0x170 [91168.614320] [<ffffffff81004010>] do_invalid_op+0x90/0xb0 [91168.614320] [<ffffffff811322fa>] ? kfree+0x10a/0x110 [91168.614320] [<ffffffff81152b99>] ? vfs_rename+0x2c9/0x440 [91168.614320] [<ffffffff81003b5a>] invalid_op+0x1a/0x20 [91168.614320] [<ffffffff81152b99>] ? vfs_rename+0x2c9/0x440 [91168.614320] [<ffffffff811322fa>] ? kfree+0x10a/0x110 [91168.614320] [<ffffffff81132228>] ? kfree+0x38/0x110 [91168.614320] [<ffffffff81152b99>] vfs_rename+0x2c9/0x440 [91168.614320] [<ffffffff8115433b>] sys_renameat+0x2cb/0x320 [91168.614320] [<ffffffff811448d2>] ? do_sync_read+0xd2/0x110 [91168.614320] [<ffffffff81072cb5>] ? sched_clock_local+0x25/0x90 [91168.614320] [<ffffffff8103ccfd>] ? finish_task_switch+0x3d/0xc0 [91168.614320] [<ffffffff811543a6>] sys_rename+0x16/0x20 [91168.614320] [<ffffffff81002f1b>] system_call_fastpath+0x16/0x1b [91168.614320] ---[ end trace 9f14f1964ebea55b ]--- [91169.894441] ---[ end trace 9f14f1964ebea55c ]--- # df -h Filesystem Size Used Avail Use% Mounted on /dev/md0 7.6G 1.3G 5.9G 18% / (...) /dev/mapper/vg_storage-lv_storage 811G 18G 793G 3% /remote/cluster # lvdisplay --- Logical volume --- LV Name /dev/vg_storage/lv_storage VG Name vg_storage LV UUID LsQHHY-FJC7-7wId-f1Ge-SSUp-d2SR-ILWmPa LV Write Access read/write LV Status available # open 2 LV Size 810.16 GiB Current LE 207400 Segments 2 Allocation inherit Read ahead sectors auto - currently set to 256 Block device 253:0 # pvdisplay --- Physical volume --- PV Name /dev/etherd/e92.0 VG Name vg_storage PV Size 405.08 GiB / not usable 2.00 MiB Allocatable yes (but full) PE Size 4.00 MiB Total PE 103700 Free PE 0 Allocated PE 103700 PV UUID eou0n9-3IiH-NHGk-vU40-Q4N0-OYM2-w8nrJB --- Physical volume --- PV Name /dev/etherd/e93.0 VG Name vg_storage PV Size 405.08 GiB / not usable 2.00 MiB Allocatable yes (but full) PE Size 4.00 MiB Total PE 103700 Free PE 0 Allocated PE 103700 PV UUID b1eAIC-wIcc-303u-Pyj3-rXdf-RaYC-rXFqmh
I can't see a BUG statement at line 2874 of 2.6.36.2's mm/slub.c. Please let us know what additional patches have been applied and please let us know what your slub,c:2874 looks like.
(In reply to comment #1) > I can't see a BUG statement at line 2874 of 2.6.36.2's mm/slub.c. Please let > us know what additional patches have been applied and please let us know what > your slub,c:2874 looks like. the are applied (by vendor) grsec and vserver patches (not used on this machine). on pure 2.6.36.2 the suitable BUG_ON() is at line 2833: http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.36.y.git;a=blob;f=mm/slub.c;h=13fffe1f0f3dc5992471ffd590326b9517b0cc3a;hb=a1346c99fc89f2b3d35c7d7e2e4aef8ea4124342#l2833 should i build a pure 2.6.36.2 from git and stress it again?
Yes, please. I doubt you will able to trigger the problem with mainline, though. The oops looks like slab corruption where vfs_rename() passes an invalid pointer to kfree(). If the problem was in mainline, I suspect a lot more people would be triggering it.
ok, i've built the pure 2.6.36.2 and boot on both machines with slub_debug=FZPU. first test machine (server, hostname: odra) have 2 raid-10 devices exported via AoE (ggaoed daemon) as e92.0/e93.0 with linear lvm and ocfs2 on top. on this machine the ocfs2 is mounted in /remote/cluster and it's locally unused. second test machine (client, hostname: hal) imports lvm on AoE devices and mounts in /remote/cluster too. now, on the client machine in /remote/cluster i'm running 'svn up' on a few GB repo from external svn server and observing svn logs. after few minutes the client machine just reboots. there's no ooops nor panic on console. i've checked the bios and there's no logs from health monitoring, so it looks like a total kernel memory corruption. from the other side, on the server machine i've logged an oops from slab checker. here's dmesg fragment: (...) Dec 29 16:09:22 odra kernel: [ 4307.772621] o2net: no longer connected to node hal (num 0) at 10.0.2.24:7777 ^^^^^^^^^ here, the server node noticed the client reboot. Dec 29 16:09:22 odra kernel: [ 4307.772685] (o2hb-56CBFCA7B6,4162,10):o2dlm_eviction_cb:267 o2dlm has evicted node 0 from group 56CBFCA7B60B4C7BBA86D0ABE068A1FD Dec 29 16:09:22 odra kernel: [ 4307.828455] (ocfs2rec,7005,10):ocfs2_replay_journal:1605 Recovering node 0 from slot 1 on device (253,0) Dec 29 16:09:23 odra kernel: [ 4309.206338] (dlm_reco_thread,4177,12):dlm_get_lock_resource:836 56CBFCA7B60B4C7BBA86D0ABE068A1FD:$RECOVERY: at least one node (0) to recover before lock mastery can begin Dec 29 16:09:23 odra kernel: [ 4309.206345] (dlm_reco_thread,4177,12):dlm_get_lock_resource:870 56CBFCA7B60B4C7BBA86D0ABE068A1FD: recovery map is not empty, but must master $RECOVERY lock now Dec 29 16:09:23 odra kernel: [ 4309.206374] (dlm_reco_thread,4177,12):dlm_do_recovery:523 (4177) Node 2 is the Recovery Master for the Dead Node 0 for Domain 56CBFCA7B60B4C7BBA86D0ABE068A1FD Dec 29 16:09:46 odra kernel: [ 4332.206553] (ocfs2rec,7005,10):ocfs2_begin_quota_recovery:407 Beginning quota recovery in slot 1 Dec 29 16:09:46 odra kernel: [ 4332.233272] (kworker/u:0,6651,10):ocfs2_finish_quota_recovery:598 Finishing quota recovery in slot 1 Dec 29 16:23:53 odra kernel: [ 5178.862323] ============================================================================= Dec 29 16:23:53 odra kernel: [ 5178.862327] BUG kmalloc-16: Redzone overwritten Dec 29 16:23:53 odra kernel: [ 5178.862329] ----------------------------------------------------------------------------- Dec 29 16:23:53 odra kernel: [ 5178.862331] Dec 29 16:23:53 odra kernel: [ 5178.862334] INFO: 0xffff88080d41e4e0-0xffff88080d41e4e3. First byte 0x0 instead of 0xcc Dec 29 16:23:53 odra kernel: [ 5178.862366] INFO: Allocated in ocfs2_recovery_init+0x66/0xf0 [ocfs2] age=1478919 cpu=4 pid=4161 Dec 29 16:23:53 odra kernel: [ 5178.862373] INFO: Freed in dev_uevent+0x133/0x160 age=1497427 cpu=4 pid=3928 Dec 29 16:23:53 odra kernel: [ 5178.862376] INFO: Slab 0xffffea001c2e6690 objects=46 used=40 fp=0xffff88080d41e580 flags=0x6000000000000c1 Dec 29 16:23:53 odra kernel: [ 5178.862379] INFO: Object 0xffff88080d41e4d0 @offset=1232 fp=0xffff88080d41e528 Dec 29 16:23:53 odra kernel: [ 5178.862380] Dec 29 16:23:53 odra kernel: [ 5178.862382] Bytes b4 0xffff88080d41e4c0: 0a b3 fe ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a .���....ZZZZZZZZ Dec 29 16:23:53 odra kernel: [ 5178.862391] Object 0xffff88080d41e4d0: 00 00 00 00 00 00 00 00 e0 e4 41 0d 08 88 ff ff ........��A...�� Dec 29 16:23:53 odra kernel: [ 5178.862399] Redzone 0xffff88080d41e4e0: 00 00 00 00 cc cc cc cc ....���� Dec 29 16:23:53 odra kernel: [ 5178.862406] Padding 0xffff88080d41e520: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ Dec 29 16:23:53 odra kernel: [ 5178.862415] Pid: 7619, comm: umount Not tainted 2.6.36.2 #3 Dec 29 16:23:53 odra kernel: [ 5178.862417] Call Trace: Dec 29 16:23:53 odra kernel: [ 5178.862427] [<ffffffff811168ee>] print_trailer+0xfe/0x160 Dec 29 16:23:53 odra kernel: [ 5178.862431] [<ffffffff81116f44>] check_bytes_and_report+0xf4/0x130 Dec 29 16:23:53 odra kernel: [ 5178.862435] [<ffffffff81116fe3>] check_object+0x63/0x260 Dec 29 16:23:53 odra kernel: [ 5178.862449] [<ffffffffa05b724d>] ? ocfs2_recovery_exit+0xbd/0xd0 [ocfs2] Dec 29 16:23:53 odra kernel: [ 5178.862453] [<ffffffff81118355>] __slab_free+0x225/0x390 Dec 29 16:23:53 odra kernel: [ 5178.862457] [<ffffffff81118707>] kfree+0xb7/0x110 Dec 29 16:23:53 odra kernel: [ 5178.862469] [<ffffffffa05b724d>] ocfs2_recovery_exit+0xbd/0xd0 [ocfs2] Dec 29 16:23:53 odra kernel: [ 5178.862473] [<ffffffff81140f7c>] ? iput+0x2c/0x280 Dec 29 16:23:53 odra kernel: [ 5178.862482] [<ffffffffa057ee2d>] ? ocfs2_truncate_log_shutdown+0x7d/0x1c0 [ocfs2] Dec 29 16:23:53 odra kernel: [ 5178.862498] [<ffffffffa05e38b2>] ocfs2_dismount_volume+0xa2/0x430 [ocfs2] Dec 29 16:23:53 odra kernel: [ 5178.862510] [<ffffffffa05e3c90>] ocfs2_put_super+0x50/0x110 [ocfs2] Dec 29 16:23:53 odra kernel: [ 5178.862514] [<ffffffff8112c3a1>] generic_shutdown_super+0x51/0xd0 Dec 29 16:23:53 odra kernel: [ 5178.862517] [<ffffffff8112c44c>] kill_block_super+0x2c/0x50 Dec 29 16:23:53 odra kernel: [ 5178.862529] [<ffffffffa05e1872>] ocfs2_kill_sb+0x72/0x80 [ocfs2] Dec 29 16:23:53 odra kernel: [ 5178.862533] [<ffffffff8112c755>] deactivate_locked_super+0x45/0x60 Dec 29 16:23:53 odra kernel: [ 5178.862536] [<ffffffff8112d4a5>] deactivate_super+0x45/0x60 Dec 29 16:23:53 odra kernel: [ 5178.862540] [<ffffffff81144e86>] mntput_no_expire+0x86/0xf0 Dec 29 16:23:53 odra kernel: [ 5178.862544] [<ffffffff81145997>] sys_umount+0x67/0x370 Dec 29 16:23:53 odra kernel: [ 5178.862550] [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b Dec 29 16:23:53 odra kernel: [ 5178.862553] FIX kmalloc-16: Restoring 0xffff88080d41e4e0-0xffff88080d41e4e3=0xcc Dec 29 16:23:53 odra kernel: [ 5178.862554] Dec 29 16:23:58 odra kernel: [ 5183.637198] ocfs2: Unmounting device (253,0) on (node 2) i suppose that AoE kernel module could overwrite something on both nodes but i don't see any debugging options in `modinfo aoe`.
Created attachment 41882 [details] my modular kernel config.
The oops says that "osb->recovery_map" passed to kfree() in ocfs2_recovery_exit() has is red-zone overwritten (data written below or above the struct). Looks like OCFS2 bug to me.
This patch in mainline probably cures the problem: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=75d9bbc73804285020aa4d99bd2a9600edea8945 That should be tagged for stable. Andrew?
(In reply to comment #7) > This patch in mainline probably cures the problem: > > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=75d9bbc73804285020aa4d99bd2a9600edea8945 is't a cure for client reboot too or only for server redzone damage?
Just server, probably. Might be worth-while to try out 2.6.37-rc8 on both client and server side.
(In reply to comment #9) > Just server, probably. Might be worth-while to try out 2.6.37-rc8 on both > client and server side. on 2.6.37-rc8 there's no redzone corruption but both machines reboot after few minutes of svn i/o. now, i'll try isolate faulty layer on 1-node ocfs2 cluster with and without ata-over-ethernet...
i've isolated from this combo problem one more testcase for redzone corruption (only AoE+LVM) and filled as PR26012.
Closing as obsolete, if this is still seen with modern kernels please re-open and update