Bug 25782 - kernel BUG at linux-2.6.36/mm/slub.c:2874!
Summary: kernel BUG at linux-2.6.36/mm/slub.c:2874!
Status: RESOLVED OBSOLETE
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Slab Allocator
Hardware: All Linux
Importance: P1 high
Assignee: Pekka Enberg
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-28 22:18 UTC by Pawel Sikora
Modified: 2012-11-20 16:57 UTC
CC List: 1 user

See Also:
Kernel Version: 2.6.36.2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
my modular kernel config. (116.20 KB, application/octet-stream)
2010-12-29 16:14 UTC, Pawel Sikora

Description Pawel Sikora 2010-12-28 22:18:03 UTC
hi,

i'm currently testing an OCFS2-over-LVM-over-AoE stack and getting
an oops during an 'svn up' warm-up run.

the machine is a dual-opteron (8-core) box with 64GB of ecc ram.

here's the dmesg log:

(...)
[91168.611032] ------------[ cut here ]------------
[91168.614320] kernel BUG at /home/users/builder/rpm/BUILD/kernel-2.6.36.2/linux-2.6.36/mm/slub.c:2874!
[91168.614320] invalid opcode: 0000 [#1] SMP 
[91168.614320] last sysfs file: /sys/devices/virtual/bdi/0:31/uevent
[91168.614320] CPU 8 
[91168.614320] Modules linked in: ocfs2_stack_o2cb ocfs2 quota_tree ocfs2_dlmfs ocfs2_stackglue ocfs2_dlm ocfs2_nodemanager configfs nfs fscache binfmt_misc nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs sch_sfq iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter xt_TCPMSS xt_tcpudp iptable_mangle ip_tables ip6table_filter ip6_tables x_tables ext4 jbd2 crc16 raid10 raid0 dm_mod uvesafb cn aoe autofs4 dummy ide_cd_mod cdrom ata_generic pata_acpi ide_pci_generic pata_atiixp joydev usbhid hid atiixp igb ohci_hcd ssb ehci_hcd mmc_core ide_core amd64_edac_mod usbcore dca i2c_piix4 pcmcia thermal sg evdev edac_core k10temp ghes i2c_core shpchp pcmcia_core hwmon pcspkr processor button edac_mce_amd hed pci_hotplug psmouse serio_raw sd_mod crc_t10dif raid1 md_mod ext3 jbd mbcache ahci libahci libata scsi_mod [last unloaded: configfs]
[91168.614320] 
[91168.614320] Pid: 12858, xid: #0, comm: svn Not tainted 2.6.36.2-1 #1 H8DGU/H8DGU
[91168.614320] RIP: 0010:[<ffffffff811322fa>]  [<ffffffff811322fa>] kfree+0x10a/0x110
[91168.614320] RSP: 0018:ffff88100d529d18  EFLAGS: 00010246
[91168.614320] RAX: ffffea0038000000 RBX: ffff881000000001 RCX: 0000000000000002
[91168.614320] RDX: 0000000000000000 RSI: 0000000000000800 RDI: ffff881000000001
[91168.614320] RBP: ffff88100d529d38 R08: 0000000000000000 R09: 0000000000000000
[91168.614320] R10: 000000000000b77c R11: 0000000008000080 R12: ffffffff81152b99
[91168.614320] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880cfdcebb30
[91168.614320] FS:  00007fdd04656760(0000) GS:ffff880c2e200000(0000) knlGS:00000000f6d436c0
[91168.614320] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[91168.614320] CR2: 000000004af7e000 CR3: 0000000659bad000 CR4: 00000000000006f0
[91168.614320] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[91168.614320] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[91168.614320] Process svn (pid: 12858, threadinfo ffff88100d528000, task ffff88100d9c9560)
[91168.614320] Stack:
[91168.614320]  ffff880d0bab4840 0000000000000000 ffff880d0bab4840 0000000000000000
[91168.614320] <0> ffff88100d529da8 ffffffff81152b99 ffff881008000080 000000000000b77c
[91168.614320] <0> ffff880d0bab48e0 ffff880cfdd1c2f0 ffff881000000001 ffff880cfdcebb30
[91168.614320] Call Trace:
[91168.614320]  [<ffffffff81152b99>] vfs_rename+0x2c9/0x440
[91168.614320]  [<ffffffff8115433b>] sys_renameat+0x2cb/0x320
[91168.614320]  [<ffffffff811448d2>] ? do_sync_read+0xd2/0x110
[91168.614320]  [<ffffffff81072cb5>] ? sched_clock_local+0x25/0x90
[91168.614320]  [<ffffffff8103ccfd>] ? finish_task_switch+0x3d/0xc0
[91168.614320]  [<ffffffff811543a6>] sys_rename+0x16/0x20
[91168.614320]  [<ffffffff81002f1b>] system_call_fastpath+0x16/0x1b
[91168.614320] Code: 4d 85 ed 0f 84 3c ff ff ff 49 8b 45 00 49 8b 7d 08 49 83 c5 10 48 89 da 4c 89 e6 ff d0 49 8b 45 00 48 85 c0 75 e7 e9 1a ff ff ff <0f> 0b 0f 1f 40 00 55 48 83 ef 70 48 89 e5 e8 e3 fe ff ff c9 c3
[91168.614320] RIP  [<ffffffff811322fa>] kfree+0x10a/0x110
[91168.614320]  RSP <ffff88100d529d18>
[91168.614320] ------------[ cut here ]------------
[91168.614320] WARNING: at /home/users/builder/rpm/BUILD/kernel-2.6.36.2/linux-2.6.36/kernel/softirq.c:144 local_bh_enable_ip+0x72/0xa0()
[91168.614320] Hardware name: H8DGU
[91168.614320] Modules linked in: ocfs2_stack_o2cb ocfs2 quota_tree ocfs2_dlmfs ocfs2_stackglue ocfs2_dlm ocfs2_nodemanager configfs nfs fscache binfmt_misc nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs sch_sfq iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter xt_TCPMSS xt_tcpudp iptable_mangle ip_tables ip6table_filter ip6_tables x_tables ext4 jbd2 crc16 raid10 raid0 dm_mod uvesafb cn aoe autofs4 dummy ide_cd_mod cdrom ata_generic pata_acpi ide_pci_generic pata_atiixp joydev usbhid hid atiixp igb ohci_hcd ssb ehci_hcd mmc_core ide_core amd64_edac_mod usbcore dca i2c_piix4 pcmcia thermal sg evdev edac_core k10temp ghes i2c_core shpchp pcmcia_core hwmon pcspkr processor button edac_mce_amd hed pci_hotplug psmouse serio_raw sd_mod crc_t10dif raid1 md_mod ext3 jbd mbcache ahci libahci libata scsi_mod [last unloaded: configfs]
[91168.614320] Pid: 12858, comm: svn xid: #0 Not tainted 2.6.36.2-1 #1
[91168.614320] Call Trace:
[91168.614320]  [<ffffffff810490aa>] warn_slowpath_common+0x7a/0xb0
[91168.614320]  [<ffffffff810490f5>] warn_slowpath_null+0x15/0x20
[91168.614320]  [<ffffffff8104ff42>] local_bh_enable_ip+0x72/0xa0
[91168.614320]  [<ffffffff8143bc1f>] _raw_spin_unlock_bh+0xf/0x20
[91168.614320]  [<ffffffffa02169a3>] cn_netlink_send+0xa3/0x1e0 [cn]
[91168.614320]  [<ffffffffa0221773>] __this_module+0x3373/0x4720 [uvesafb]
[91168.614320]  [<ffffffffa0221526>] ? __this_module+0x3126/0x4720 [uvesafb]
[91168.614320]  [<ffffffff812d8019>] ? notify_update+0x29/0x30
[91168.614320]  [<ffffffffa0221a3c>] __this_module+0x363c/0x4720 [uvesafb]
[91168.614320]  [<ffffffff812736c9>] fb_blank+0x39/0x70
[91168.614320]  [<ffffffff8128203b>] fbcon_blank+0x22b/0x2e0
[91168.614320]  [<ffffffff810497c8>] ? release_console_sem+0x1a8/0x1f0
[91168.614320]  [<ffffffff81059876>] ? lock_timer_base.clone.23+0x36/0x70
[91168.614320]  [<ffffffff81059e88>] ? mod_timer+0x158/0x2c0
[91168.614320]  [<ffffffff812da851>] do_unblank_screen+0xb1/0x1c0
[91168.614320]  [<ffffffff812da96b>] unblank_screen+0xb/0x10
[91168.614320]  [<ffffffff8124e1ed>] bust_spinlocks+0x1d/0x40
[91168.614320]  [<ffffffff8143cdfb>] oops_end+0x3b/0xe0
[91168.614320]  [<ffffffff81006e76>] die+0x56/0x90
[91168.614320]  [<ffffffff8143c5d4>] do_trap+0xc4/0x170
[91168.614320]  [<ffffffff81004010>] do_invalid_op+0x90/0xb0
[91168.614320]  [<ffffffff811322fa>] ? kfree+0x10a/0x110
[91168.614320]  [<ffffffff81152b99>] ? vfs_rename+0x2c9/0x440
[91168.614320]  [<ffffffff81003b5a>] invalid_op+0x1a/0x20
[91168.614320]  [<ffffffff81152b99>] ? vfs_rename+0x2c9/0x440
[91168.614320]  [<ffffffff811322fa>] ? kfree+0x10a/0x110
[91168.614320]  [<ffffffff81132228>] ? kfree+0x38/0x110
[91168.614320]  [<ffffffff81152b99>] vfs_rename+0x2c9/0x440
[91168.614320]  [<ffffffff8115433b>] sys_renameat+0x2cb/0x320
[91168.614320]  [<ffffffff811448d2>] ? do_sync_read+0xd2/0x110
[91168.614320]  [<ffffffff81072cb5>] ? sched_clock_local+0x25/0x90
[91168.614320]  [<ffffffff8103ccfd>] ? finish_task_switch+0x3d/0xc0
[91168.614320]  [<ffffffff811543a6>] sys_rename+0x16/0x20
[91168.614320]  [<ffffffff81002f1b>] system_call_fastpath+0x16/0x1b
[91168.614320] ---[ end trace 9f14f1964ebea55b ]---
[91169.894441] ---[ end trace 9f14f1964ebea55c ]---


# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md0              7.6G  1.3G  5.9G  18% /
(...)
/dev/mapper/vg_storage-lv_storage
                      811G   18G  793G   3% /remote/cluster

# lvdisplay

  --- Logical volume ---
  LV Name                /dev/vg_storage/lv_storage
  VG Name                vg_storage
  LV UUID                LsQHHY-FJC7-7wId-f1Ge-SSUp-d2SR-ILWmPa
  LV Write Access        read/write
  LV Status              available
  # open                 2
  LV Size                810.16 GiB
  Current LE             207400
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

# pvdisplay

  --- Physical volume ---
  PV Name               /dev/etherd/e92.0
  VG Name               vg_storage
  PV Size               405.08 GiB / not usable 2.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              103700
  Free PE               0
  Allocated PE          103700
  PV UUID               eou0n9-3IiH-NHGk-vU40-Q4N0-OYM2-w8nrJB

  --- Physical volume ---
  PV Name               /dev/etherd/e93.0
  VG Name               vg_storage
  PV Size               405.08 GiB / not usable 2.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              103700
  Free PE               0
  Allocated PE          103700
  PV UUID               b1eAIC-wIcc-303u-Pyj3-rXdf-RaYC-rXFqmh
Comment 1 Andrew Morton 2010-12-28 22:54:59 UTC
I can't see a BUG statement at line 2874 of 2.6.36.2's mm/slub.c.  Please let us know what additional patches have been applied, and what your slub.c:2874 looks like.
Comment 2 Pawel Sikora 2010-12-28 23:19:54 UTC
(In reply to comment #1)
> I can't see a BUG statement at line 2874 of 2.6.36.2's mm/slub.c.  Please let
> us know what additional patches have been applied, and what your slub.c:2874
> looks like.

there are grsec and vserver patches applied (by the vendor), but their features are not used on this machine.
on pure 2.6.36.2 the corresponding BUG_ON() is at line 2833:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.36.y.git;a=blob;f=mm/slub.c;h=13fffe1f0f3dc5992471ffd590326b9517b0cc3a;hb=a1346c99fc89f2b3d35c7d7e2e4aef8ea4124342#l2833

should i build a pure 2.6.36.2 from git and stress it again?
Comment 3 Pekka Enberg 2010-12-29 06:51:51 UTC
Yes, please. I doubt you will be able to trigger the problem with mainline, though. The oops looks like slab corruption where vfs_rename() passes an invalid pointer to kfree(). If the problem were in mainline, I suspect a lot more people would be triggering it.
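
For reference, the BUG_ON() in question is in kfree() itself. A simplified, from-memory sketch of the 2.6.36-era code (not a verbatim quote) looks like this:

void kfree(const void *x)
{
        struct page *page;

        if (unlikely(ZERO_OR_NULL_PTR(x)))
                return;

        page = virt_to_head_page(x);
        if (unlikely(!PageSlab(page))) {
                /* not a slab object: it must be a large kmalloc() backed
                 * by a compound page, otherwise the pointer is bogus */
                BUG_ON(!PageCompound(page));
                put_page(page);
                return;
        }
        slab_free(page->slab, page, (void *)x, _RET_IP_);
}

So the BUG fires when the freed pointer belongs neither to a slab page nor to a compound (large-kmalloc) page, which is what you would expect from a corrupted or stale pointer.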
Comment 4 Pawel Sikora 2010-12-29 16:07:40 UTC
ok, i've built a pure 2.6.36.2 and booted both machines with slub_debug=FZPU.
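
for the record, per Documentation/vm/slub.txt the letters in slub_debug=FZPU
enable F = sanity checks, Z = red zoning, P = poisoning and U = user
(alloc/free owner) tracking. the boot entry looked roughly like this
(kernel path and root device are illustrative, not copied from the real
config):

kernel /boot/vmlinuz-2.6.36.2 root=/dev/md0 ro slub_debug=FZPU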

the first test machine (server, hostname: odra) has 2 raid-10 devices
exported via AoE (ggaoed daemon) as e92.0/e93.0, with a linear lvm
and ocfs2 on top. on this machine the ocfs2 volume is mounted at
/remote/cluster and is locally unused.

the second test machine (client, hostname: hal) imports the lvm over
the AoE devices and mounts it at /remote/cluster too.
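
the client-side stack is brought up roughly like this (the commands are
illustrative; device and mount names are the ones listed above):

# modprobe aoe && aoe-discover        # targets appear as /dev/etherd/e92.0 and e93.0
# vgscan && vgchange -ay vg_storage   # activate the volume group on top of the AoE devices
# mount -t ocfs2 /dev/vg_storage/lv_storage /remote/cluster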

now, on the client machine in /remote/cluster i'm running 'svn up'
on a few-GB repo from an external svn server and watching the svn logs.
after a few minutes the client machine just reboots. there's no oops
nor panic on the console. i've checked the bios and there are no entries
from health monitoring, so it looks like total kernel memory corruption.
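
(since the console shows nothing before the reset, a netconsole capture to
another box might catch the last messages; a hypothetical example, with
illustrative addresses:

# modprobe netconsole netconsole=@/eth0,6666@10.0.2.1/

where 10.0.2.1 would be a machine listening on udp port 6666, e.g. with netcat.)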

on the other side, on the server machine, i've logged a report from the
slab checker. here's the dmesg fragment:

(...)
Dec 29 16:09:22 odra kernel: [ 4307.772621] o2net: no longer connected to node hal (num 0) at 10.0.2.24:7777

^^^^^^^^^ here, the server node noticed the client reboot.

Dec 29 16:09:22 odra kernel: [ 4307.772685] (o2hb-56CBFCA7B6,4162,10):o2dlm_eviction_cb:267 o2dlm has evicted node 0 from group 56CBFCA7B60B4C7BBA86D0ABE068A1FD
Dec 29 16:09:22 odra kernel: [ 4307.828455] (ocfs2rec,7005,10):ocfs2_replay_journal:1605 Recovering node 0 from slot 1 on device (253,0)
Dec 29 16:09:23 odra kernel: [ 4309.206338] (dlm_reco_thread,4177,12):dlm_get_lock_resource:836 56CBFCA7B60B4C7BBA86D0ABE068A1FD:$RECOVERY: at least one node (0) to recover before lock mastery can begin
Dec 29 16:09:23 odra kernel: [ 4309.206345] (dlm_reco_thread,4177,12):dlm_get_lock_resource:870 56CBFCA7B60B4C7BBA86D0ABE068A1FD: recovery map is not empty, but must master $RECOVERY lock now
Dec 29 16:09:23 odra kernel: [ 4309.206374] (dlm_reco_thread,4177,12):dlm_do_recovery:523 (4177) Node 2 is the Recovery Master for the Dead Node 0 for Domain 56CBFCA7B60B4C7BBA86D0ABE068A1FD
Dec 29 16:09:46 odra kernel: [ 4332.206553] (ocfs2rec,7005,10):ocfs2_begin_quota_recovery:407 Beginning quota recovery in slot 1
Dec 29 16:09:46 odra kernel: [ 4332.233272] (kworker/u:0,6651,10):ocfs2_finish_quota_recovery:598 Finishing quota recovery in slot 1
Dec 29 16:23:53 odra kernel: [ 5178.862323] =============================================================================
Dec 29 16:23:53 odra kernel: [ 5178.862327] BUG kmalloc-16: Redzone overwritten
Dec 29 16:23:53 odra kernel: [ 5178.862329] -----------------------------------------------------------------------------
Dec 29 16:23:53 odra kernel: [ 5178.862331]
Dec 29 16:23:53 odra kernel: [ 5178.862334] INFO: 0xffff88080d41e4e0-0xffff88080d41e4e3. First byte 0x0 instead of 0xcc
Dec 29 16:23:53 odra kernel: [ 5178.862366] INFO: Allocated in ocfs2_recovery_init+0x66/0xf0 [ocfs2] age=1478919 cpu=4 pid=4161
Dec 29 16:23:53 odra kernel: [ 5178.862373] INFO: Freed in dev_uevent+0x133/0x160 age=1497427 cpu=4 pid=3928
Dec 29 16:23:53 odra kernel: [ 5178.862376] INFO: Slab 0xffffea001c2e6690 objects=46 used=40 fp=0xffff88080d41e580 flags=0x6000000000000c1
Dec 29 16:23:53 odra kernel: [ 5178.862379] INFO: Object 0xffff88080d41e4d0 @offset=1232 fp=0xffff88080d41e528
Dec 29 16:23:53 odra kernel: [ 5178.862380]
Dec 29 16:23:53 odra kernel: [ 5178.862382] Bytes b4 0xffff88080d41e4c0:  0a b3 fe ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a .���....ZZZZZZZZ
Dec 29 16:23:53 odra kernel: [ 5178.862391]   Object 0xffff88080d41e4d0:  00 00 00 00 00 00 00 00 e0 e4 41 0d 08 88 ff ff ........��A...��
Dec 29 16:23:53 odra kernel: [ 5178.862399]  Redzone 0xffff88080d41e4e0:  00 00 00 00 cc cc cc cc                         ....����
Dec 29 16:23:53 odra kernel: [ 5178.862406]  Padding 0xffff88080d41e520:  5a 5a 5a 5a 5a 5a 5a 5a                         ZZZZZZZZ
Dec 29 16:23:53 odra kernel: [ 5178.862415] Pid: 7619, comm: umount Not tainted 2.6.36.2 #3
Dec 29 16:23:53 odra kernel: [ 5178.862417] Call Trace:
Dec 29 16:23:53 odra kernel: [ 5178.862427]  [<ffffffff811168ee>] print_trailer+0xfe/0x160
Dec 29 16:23:53 odra kernel: [ 5178.862431]  [<ffffffff81116f44>] check_bytes_and_report+0xf4/0x130
Dec 29 16:23:53 odra kernel: [ 5178.862435]  [<ffffffff81116fe3>] check_object+0x63/0x260
Dec 29 16:23:53 odra kernel: [ 5178.862449]  [<ffffffffa05b724d>] ? ocfs2_recovery_exit+0xbd/0xd0 [ocfs2]
Dec 29 16:23:53 odra kernel: [ 5178.862453]  [<ffffffff81118355>] __slab_free+0x225/0x390
Dec 29 16:23:53 odra kernel: [ 5178.862457]  [<ffffffff81118707>] kfree+0xb7/0x110
Dec 29 16:23:53 odra kernel: [ 5178.862469]  [<ffffffffa05b724d>] ocfs2_recovery_exit+0xbd/0xd0 [ocfs2]
Dec 29 16:23:53 odra kernel: [ 5178.862473]  [<ffffffff81140f7c>] ? iput+0x2c/0x280
Dec 29 16:23:53 odra kernel: [ 5178.862482]  [<ffffffffa057ee2d>] ? ocfs2_truncate_log_shutdown+0x7d/0x1c0 [ocfs2]
Dec 29 16:23:53 odra kernel: [ 5178.862498]  [<ffffffffa05e38b2>] ocfs2_dismount_volume+0xa2/0x430 [ocfs2]
Dec 29 16:23:53 odra kernel: [ 5178.862510]  [<ffffffffa05e3c90>] ocfs2_put_super+0x50/0x110 [ocfs2]
Dec 29 16:23:53 odra kernel: [ 5178.862514]  [<ffffffff8112c3a1>] generic_shutdown_super+0x51/0xd0
Dec 29 16:23:53 odra kernel: [ 5178.862517]  [<ffffffff8112c44c>] kill_block_super+0x2c/0x50
Dec 29 16:23:53 odra kernel: [ 5178.862529]  [<ffffffffa05e1872>] ocfs2_kill_sb+0x72/0x80 [ocfs2]
Dec 29 16:23:53 odra kernel: [ 5178.862533]  [<ffffffff8112c755>] deactivate_locked_super+0x45/0x60
Dec 29 16:23:53 odra kernel: [ 5178.862536]  [<ffffffff8112d4a5>] deactivate_super+0x45/0x60
Dec 29 16:23:53 odra kernel: [ 5178.862540]  [<ffffffff81144e86>] mntput_no_expire+0x86/0xf0
Dec 29 16:23:53 odra kernel: [ 5178.862544]  [<ffffffff81145997>] sys_umount+0x67/0x370
Dec 29 16:23:53 odra kernel: [ 5178.862550]  [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
Dec 29 16:23:53 odra kernel: [ 5178.862553] FIX kmalloc-16: Restoring 0xffff88080d41e4e0-0xffff88080d41e4e3=0xcc
Dec 29 16:23:53 odra kernel: [ 5178.862554]
Dec 29 16:23:58 odra kernel: [ 5183.637198] ocfs2: Unmounting device (253,0) on (node 2)

i suppose the AoE kernel module could be overwriting something on
both nodes, but i don't see any debugging options in `modinfo aoe`.
Comment 5 Pawel Sikora 2010-12-29 16:14:17 UTC
Created attachment 41882 [details]
my modular kernel config.
Comment 6 Pekka Enberg 2010-12-30 07:41:11 UTC
The report says that the "osb->recovery_map" passed to kfree() in ocfs2_recovery_exit() has had its red zone overwritten (data written below or above the struct). Looks like an OCFS2 bug to me.
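
In other words something wrote just past (or before) the 16-byte allocation that holds the recovery map, and the Z part of slub_debug caught it on the later kfree(). The shape of bug the checker flags is roughly this (a purely illustrative fragment, names and sizes made up, not taken from ocfs2; needs <linux/slab.h>):

static int example_overrun(unsigned int nr_slots)
{
        unsigned int *map = kmalloc(nr_slots * sizeof(*map), GFP_KERNEL);

        if (!map)
                return -ENOMEM;
        map[nr_slots] = 0;      /* one element past the object: lands in the redzone */
        kfree(map);             /* slub_debug=Z reports "Redzone overwritten" here */
        return 0;
}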
Comment 7 Pekka Enberg 2010-12-30 07:49:43 UTC
This patch in mainline probably cures the problem:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=75d9bbc73804285020aa4d99bd2a9600edea8945

That should be tagged for stable. Andrew?
Comment 8 Pawel Sikora 2010-12-30 08:24:27 UTC
(In reply to comment #7)
> This patch in mainline probably cures the problem:
> 
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=75d9bbc73804285020aa4d99bd2a9600edea8945

is it a cure for the client reboot too, or only for the server-side redzone damage?
Comment 9 Pekka Enberg 2010-12-30 08:37:49 UTC
Just the server, probably. Might be worthwhile to try out 2.6.37-rc8 on both the client and the server side.
Comment 10 Pawel Sikora 2010-12-30 12:58:01 UTC
(In reply to comment #9)
> Just server, probably. Might be worth-while to try out 2.6.37-rc8 on both
> client and server side.

on 2.6.37-rc8 there's no redzone corruption, but both machines reboot
after a few minutes of svn i/o. now i'll try to isolate the faulty layer
on a 1-node ocfs2 cluster, with and without ata-over-ethernet...
Comment 11 Pawel Sikora 2011-01-01 16:58:43 UTC
i've isolated one more testcase for the redzone corruption from this
combo (AoE+LVM only) and filed it as PR26012.
Comment 12 Alan 2012-11-20 16:57:52 UTC
Closing as obsolete; if this is still seen with modern kernels, please re-open and update.
