Bug 7614 - kernel BUG at mm/slab.c:594! Seems to affect various SCSI systems in various ways.
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: i386 Linux
Importance: P2 normal
Assignee: io_scsi
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-12-01 15:41 UTC by Dylan Martin
Modified: 2008-04-08 15:45 UTC
CC List: 2 users

See Also:
Kernel Version: 2.6.18
Subsystem:
Regression: ---
Bisected commit-id:



Description Dylan Martin 2006-12-01 15:41:39 UTC
Most recent kernel where this bug did *NOT* occur: Unknown
Distribution: Fedora
Hardware Environment: i386 with aic7XXX and other
Software Environment: 
Problem Description: In my system, my tape drive generates the error below from
time to time, seemingly randomly.  Others seem to have had the same problem with
different hardware using different scsi devices.  

Here are other people with the same error:

http://lkml.org/lkml/2006/11/12/63
http://www.mail-archive.com/bcm43xx-dev@lists.berlios.de/msg02470.html
http://www.spinics.net/lists/linux-scsi/msg12907.html
http://www.spinics.net/lists/linux-scsi/msg12783.html

------------[ cut here ]------------
kernel BUG at mm/slab.c:594!
invalid opcode: 0000 [#1]
last sysfs file: /class/net/vmnet1/address
Modules linked in: ipv6 vmnet(U) vmmon(U) video sbs i2c_ec container button
battery asus_acpi ac lp parport_pc parport ohci_hcd floppy st sg serio_raw
i2c_piix4 i2c_core cfi_probe gen_probe scb2_flash mtdcore e1000 chipreg
map_funcs pcspkr ide_cd cdrom dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd
aacraid aic7xxx scsi_transport_spi sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c045d3ac>]    Tainted: P      VLI
EFLAGS: 00010202   (2.6.18-1.2239.fc5 #1) 
EIP is at kmem_cache_free+0x29/0x62
eax: c0010068   ebx: f7ffe860   ecx: f7ff5620   edx: c1800000
esi: f7ffcf40   edi: 00000000   ebp: f6b4ac90   esp: ebba4e28
ds: 007b   es: 007b   ss: 0068
Process tar (pid: 5584, ti=ebba4000 task=dff52270 task.ti=ebba4000)
Stack: f7ffe860 f7ffcf40 00000000 c0448c0e 00000000 e0963da0 f7ffcf40 e0963da0 
       c04653e6 00000800 00000000 c046510c f8843f74 00000800 00000000 00000006 
       ebba4f5c da3876c8 00000800 00000000 00000002 c1704480 00002000 00000000 
Call Trace:
 [<c0448c0e>] mempool_free+0x61/0x66
 [<c04653e6>] bio_free+0x25/0x30
 [<c046510c>] bio_put+0x27/0x28
 [<f8843f74>] scsi_execute_async+0x15a/0x32d [scsi_mod]
 [<f8952c57>] st_do_scsi+0x1d1/0x221 [st]
 [<f89565d8>] st_read+0x2ed/0x7a6 [st]
 [<c04613d8>] vfs_read+0xa6/0x157
 [<c0461760>] sys_read+0x41/0x67
 [<c0402d9b>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
Leftover inexact backtrace:
 =======================
Code: 5f c3 57 89 d7 8d 92 00 00 00 40 89 c1 c1 ea 0c c1 e2 05 03 15 70 53 79 c0
 56 53 8b 02 f6 c4 40 74 03 8b 52 0c 8b 02 84 c0 78 08 <0f> 0b 52 02 e3 f8 61 c0
 39 4a 18 74 08 0f 0b 99 0d e3 f8 61 c0 
EIP: [<c045d3ac>] kmem_cache_free+0x29/0x62 SS:ESP 0068:ebba4e28

Steps to reproduce: Unknown.  Seriously... Argh.

Sorry that this bug report isn't very helpful.  I just figured having SCSI
problems was serious enough to warrant posting without better information.
Comment 1 Dylan Martin 2006-12-01 15:43:53 UTC
Oh, I forgot to mention, this causes my tape drive to become unreachable.  Other
people have mentioned it causing a system crash.
Comment 2 Arjan van de Ven 2006-12-02 01:35:00 UTC
Does this happen without the proprietary modules as well?
Comment 3 Dylan Martin 2006-12-04 15:22:31 UTC
I can't test that myself, but other people (links above) are having this problem
with what seems like the standard kernel.  Again, sorry I can't do a better bug
report.  I don't have time to do this right, but I thought I should start something.
Comment 4 Juan David Ruiz 2007-01-04 15:47:12 UTC
I get the same error when trying to connect a Nokia 5200 cell phone to my
updated Fedora 6 system (2.6.18-1.2869.fc6)

------------[ cut here ]------------
kernel BUG at mm/slab.c:594!
invalid opcode: 0000 [#1]
SMP 
last sysfs file:
/devices/pci0000:00/0000:00:1e.0/0000:02:0e.0/usb1/1-2/1-2:1.10/usbdev1.4_ep87/dev
Modules linked in: rndis_host cdc_ether usbnet cdc_acm michael_mic arc4
ieee80211_crypt_tkip irnet ppp_generic slhc irtty_sir sir_dev ircomm_tty ircomm
irda crc_ccitt autofs4 hidp rfcomm l2cap bluetooth ip_conntrack_netbios_ns
ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT
xt_tcpudp ip6table_filter ip6_tables x_tables dm_mirror dm_multipath dm_mod
video sbs i2c_ec i2c_core button battery asus_acpi ac radeon drm ipv6 parport_pc
lp parport joydev floppy snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_seq_dummy
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
snd_pcm snd_timer snd ipw2200 soundcore ieee80211 ide_cd e100 pcspkr
ieee80211_crypt serio_raw snd_page_alloc cdrom mii ext3 jbd ehci_hcd ohci_hcd
uhci_hcd
CPU:    0
EIP:    0060:[<c046b395>]    Not tainted VLI
EFLAGS: 00010006   (2.6.18-1.2869.fc6 #1) 
EIP is at kfree+0x2e/0x65
eax: 8000086c   ebx: ffffff92   ecx: c8a4c000   edx: c11148c0
esi: 00000286   edi: c8a46cbe   ebp: c8a4c400   esp: d674fdac
ds: 007b   es: 007b   ss: 0068
Process modprobe (pid: 2859, ti=d674f000 task=c860f380 task.ti=d674f000)
Stack: ffffff92 f0d09906 c8a4c006 f0d0912c c1606680 00000000 d674fdd4 00000000 
       d674fdf0 c04068d1 c0614616 c0612d84 00000003 00000000 c06a1550 f0d1b6bc 
       cc46e800 f0d1a720 c8a4c000 d5e156c8 cc379c00 f0d1b680 f0d1b6b4 c0584913 
Call Trace:
 [<f0d0912c>] usbnet_probe+0x561/0x574 [usbnet]
 [<c0584c22>] usb_probe_interface+0x58/0x86
 [<c0552ad5>] driver_probe_device+0x45/0x9a
 [<c0552c00>] __driver_attach+0x65/0x8f
 [<c055255a>] bus_for_each_dev+0x37/0x59
 [<c0552a36>] driver_attach+0x16/0x18
 [<c0552252>] bus_add_driver+0x6f/0x10d
 [<c0584a3f>] usb_register_driver+0x65/0xcb
 [<c043f237>] sys_init_module+0x17de/0x1977
 [<c0404013>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
Leftover inexact backtrace:
 =======================
Code: 56 89 c7 53 74 58 9c 5e fa 8d 90 00 00 00 40 c1 ea 0c c1 e2 05 03 15 90 c6
80 c0 8b 02 f6 c4 40 74 03 8b 52 0c 8b 02 84 c0 78 08 <0f> 0b 52 02 f3 90 63 c0
89 e0 8b 4a 18 25 00 f0 ff ff 8b 40 10 
EIP: [<c046b395>] kfree+0x2e/0x65 SS:ESP 0068:d674fdac
 <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
 [<c04051db>] dump_trace+0x69/0x1af
 [<c0405339>] show_trace_log_lvl+0x18/0x2c
 [<c04058ed>] show_trace+0xf/0x11
 [<c04059ea>] dump_stack+0x15/0x17
 [<c0439482>] down_read+0x12/0x20
 [<c04315e1>] blocking_notifier_call_chain+0xe/0x29
 [<c0427638>] do_exit+0x1b/0x776
 [<c040588e>] die+0x29d/0x2c2
 [<c0405fd3>] do_invalid_op+0xa2/0xab
 [<c0404b85>] error_code+0x39/0x40
DWARF2 unwinder stuck at error_code+0x39/0x40
Leftover inexact backtrace:
 [<c046b395>] kfree+0x2e/0x65
 [<f0d0912c>] usbnet_probe+0x561/0x574 [usbnet]
 [<c04068d1>] do_IRQ+0xb0/0xbc
 [<c0614616>] _spin_unlock_irq+0x5/0x7
 [<c0612d84>] schedule+0x960/0x9dd
 [<c0584913>] usb_match_dynamic_id+0x4d/0x55
 [<c0552b9b>] __driver_attach+0x0/0x8f
 [<c0584c22>] usb_probe_interface+0x58/0x86
 [<c0552ad5>] driver_probe_device+0x45/0x9a
 [<c0552c00>] __driver_attach+0x65/0x8f
 [<c055255a>] bus_for_each_dev+0x37/0x59
 [<c0552a36>] driver_attach+0x16/0x18
 [<c0552b9b>] __driver_attach+0x0/0x8f
 [<c0552252>] bus_add_driver+0x6f/0x10d
 [<c0584a3f>] usb_register_driver+0x65/0xcb
 [<c04315f6>] blocking_notifier_call_chain+0x23/0x29
 [<c043f237>] sys_init_module+0x17de/0x1977
 [<c046fba8>] vfs_read+0xa6/0x157
 [<c0404013>] syscall_call+0x7/0xb
 =======================
Comment 5 Niel Lambrechts 2007-01-15 02:28:40 UTC
This can be reproduced on 2.6.19.2. Simply attach a Nokia N73 cellular phone 
via USB with the phone in "PC Suite" mode. No errors are seen when the phone is 
in "mass storage device" mode. 

(Although my kernel is tainted because of Madwifi + fglrx, the problem can be 
reproduced without these modules being loaded)

Jan 14 23:38:30 atomheart kernel: [ 3509.412000] ------------[ cut here ]------------
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] kernel BUG at mm/slab.c:594!
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] invalid opcode: 0000 [#1]
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] SMP
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] Modules linked in: rndis_host cdc_ether usbnet mii cdc_acm nls_cp437 vfat fat sg sd_mod usb_storage scsi_mod libusual wlan_tkip binfmt_misc cpufreq_ondemand cpufreq_userspace cpufreq_powersave speedstep_centrino freq_table rfcomm l2cap bluetooth af_packet nfsd exportfs lockd sunrpc uinput capability commoncap ipv6 fglrx(P) video sbs ibm_acpi i2c_ec i2c_core dock button battery container ac asus_acpi nls_utf8 ntfs dm_mod md_mod nvram lp joydev tsdev pcmcia psmouse evdev irtty_sir sir_dev nsc_ircc irda parport_pc e1000 serio_raw crc_ccitt parport floppy snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore wlan_scan_sta pcspkr yenta_socket rsrc_nonstatic pcmcia_core ath_pci ath_rate_sample wlan ath_hal(P) snd_page_alloc shpchp pci_hotplug intel_agp agpgart xt_tcpudp xt_state ipt_REJECT xt_limit ipt_LOG ip_conntrack_ftp ip_conntrack nfnetlink iptable_filter ip_tables x_tables ext3 jbd mbcache ehci_hcd uhci_hcd usbcore ide_ge
Jan 14 23:38:30 atomheart kernel: eric ide_cd cdrom ide_disk piix generic thermal processor fan
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] CPU:    0
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] EIP:    0060:[kfree+121/144]    Tainted: P      VLI
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] EFLAGS: 00010006   (2.6.19.2 #1)
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] EIP is at kfree+0x79/0x90
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] eax: 40000060   ebx: ffffffe0   ecx: f04f8000   edx: c1609e20
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] esi: f04f1d8d   edi: 00000286   ebp: f04f8400   esp: cf05bd98
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] ds: 007b   es: 007b   ss: 0068
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] Process modprobe (pid: 15004, ti=cf05a000 task=ea892030 task.ti=cf05a000)
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] Stack: ffffffe0 f93eee60 f04f8006 f93ee25c 0d55af00 00000331 dfc204b0 a070dcd3
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]        00000001 00000003 c0396b84 00000286 c012cda0 f4b3e2dc ffffffff 00000286
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]        e6c62000 f93ff800 f04f8000 ce138d48 f4b3e000 f4b3e000 f88a3e7c f4b3e000
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] Call Trace:
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [pg0+955499100/1068811264] usbnet_probe+0x2cc/0x5e0 [usbnet]
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [lock_timer_base+32/80] lock_timer_base+0x20/0x50
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [pg0+943660668/1068811264] usb_resume_both+0x6c/0xe0 [usbcore]
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [pg0+943660938/1068811264] usb_autoresume_device+0x4a/0x60 [usbcore]
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [pg0+943663627/1068811264] usb_probe_interface+0x9b/0xe0 [usbcore]
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [really_probe+59/272] really_probe+0x3b/0x110
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [driver_probe_device+73/192] driver_probe_device+0x49/0xc0
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [__driver_attach+158/160] __driver_attach+0x9e/0xa0
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [bus_for_each_dev+58/96] bus_for_each_dev+0x3a/0x60
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [driver_attach+22/32] driver_attach+0x16/0x20
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [__driver_attach+0/160] __driver_attach+0x0/0xa0
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [bus_add_driver+123/416] bus_add_driver+0x7b/0x1a0
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [pg0+943662275/1068811264] usb_register_driver+0x83/0x100 [usbcore]
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [sys_init_module+347/7136] sys_init_module+0x15b/0x1be0
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [sysenter_past_esp+86/121] sysenter_past_esp+0x56/0x79
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  [atm_push_raw+43/48] atm_push_raw+0x2b/0x30
Jan 14 23:38:30 atomheart kernel: [ 3509.412000]  =======================
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] Code: 74 83 14 83 c0 01 89 03 57 9d 8b 1c 24 8b 74 24 04 8b 7c 24 08 83 c4 0c c3 8b 52 0c eb c4 89 c8 89 da e8 cb fe ff ff 8b 03 eb d4 <0f> 0b 52 02 ba 34 30 c0 eb b3 8d b6 00 00 00 00 8d bc 27 00 00
Jan 14 23:38:30 atomheart kernel: [ 3509.412000] EIP: [kfree+121/144] kfree+0x79/0x90 SS:ESP 0068:cf05bd98
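
The trace above prints each location twice: once in decimal inside brackets, e.g. [kfree+121/144], and once in hex, e.g. kfree+0x79/0x90 (0x79 = 121, 0x90 = 144). The first number is the offset into the function and the second is the function's length. Below is a minimal sketch for splitting the hex form into its parts. It assumes the input has exactly the shape symbol+0xOFF/0xLEN, and toy_parse_symbol is a hypothetical helper name, not part of any kernel tool.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (illustration only): split "symbol+0xOFF/0xLEN"
 * into symbol name, offset and length. `name` must have room for at
 * least 64 bytes. Returns 0 on success, -1 if the input does not match.
 */
static int toy_parse_symbol(const char *s, char *name,
                            unsigned long *off, unsigned long *len)
{
    assert(s && name && off && len);

    /* %63[^+] copies up to 63 characters that precede the '+';
     * %lx reads a hex number and accepts an optional 0x prefix. */
    if (sscanf(s, "%63[^+]+%lx/%lx", name, off, len) != 3)
        return -1;
    return 0;
}
```

Fed the EIP location from the trace, toy_parse_symbol("kfree+0x79/0x90", ...) fills in the name "kfree", offset 121 and length 144.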
Comment 6 Niel Lambrechts 2007-01-15 04:05:06 UTC
If I make the following change, my system no longer crashes. I'm not convinced 
that this actually addresses the root cause, but perhaps it can help someone:

--- slab.c.old  2006-11-29 23:57:37.000000000 +0200
+++ slab.c      2007-01-15 13:15:26.000000000 +0200
@@ -591,7 +591,8 @@
 {
        if (unlikely(PageCompound(page)))
                page = (struct page *)page_private(page);
-       BUG_ON(!PageSlab(page));
+       /* BUG_ON(!PageSlab(page)); */
+       VM_BUG_ON(atomic_read(&page->_count) == 0);
        return (struct kmem_cache *)page->lru.next;
 }

Regards,
Niel Lambrechts
Comment 7 Niel Lambrechts 2007-01-15 04:19:14 UTC
Ignore that last patch, I get a hard crash when I unplug the cable.
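
A likely reason why commenting out the check only moves the crash: the function patched in comment 6 returns page->lru.next, and that field only holds a cache pointer when the page really belongs to the slab allocator. Below is a rough userspace sketch of that lookup. Everything prefixed toy_ is hypothetical and made up for illustration. It is not kernel code, and it returns NULL where the kernel would hit BUG_ON(!PageSlab(page)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model; none of these names exist in the kernel. */
#define TOY_PAGE_SHIFT 12   /* 4 KiB pages, as on i386 */
#define TOY_NR_PAGES   4

struct toy_cache {
    const char *name;
};

struct toy_page {
    int is_slab;              /* stands in for the PG_slab page flag */
    struct toy_cache *cache;  /* stands in for page->lru.next */
};

static struct toy_page toy_mem_map[TOY_NR_PAGES];

/* Mark one toy page as owned by a slab cache. */
static void toy_mark_slab(size_t pfn, struct toy_cache *cache)
{
    assert(pfn < TOY_NR_PAGES);
    toy_mem_map[pfn].is_slab = 1;
    toy_mem_map[pfn].cache = cache;
}

/* Map a toy "address" to its page descriptor. */
static struct toy_page *toy_virt_to_page(uintptr_t addr)
{
    size_t pfn = (size_t)(addr >> TOY_PAGE_SHIFT);

    assert(pfn < TOY_NR_PAGES);
    return &toy_mem_map[pfn];
}

/*
 * Find the cache that owns the object at addr. A pointer that did not
 * come from a slab page has no meaningful cache pointer, so the toy
 * version reports NULL instead of trusting whatever is stored there.
 */
static struct toy_cache *toy_page_get_cache(uintptr_t addr)
{
    struct toy_page *page = toy_virt_to_page(addr);

    if (!page->is_slab)
        return NULL;
    return page->cache;
}
```

After toy_mark_slab(1, &cache), a lookup of 0x1040 returns &cache, while an address on an unmarked page such as 0x2000 yields NULL. With the check removed, the free path would carry on with whatever value happens to sit in that field, which is in line with the hard crash seen later.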
Comment 8 Niel Lambrechts 2007-01-16 12:10:56 UTC
Good news! I just compiled 2.6.20-rc5, and I'm able to connect my N73 in 'PC 
Suite' mode without any crash. I could even perform dialup successfully across 
the USB cable:
dmesg output:
[  360.580000] cdc_acm 2-2:1.8: ttyACM0: USB ACM device
[  360.636000] usbcore: registered new interface driver cdc_acm
[  360.636000] drivers/usb/class/cdc-acm.c: v0.25:USB Abstract Control Model 
driver for USB modems and ISDN adapters
[  485.792000] cdc_acm 2-2:1.8: ttyACM0: USB ACM device

Hope this helps someone!
Comment 9 Natalie Protasevich 2007-07-31 00:51:27 UTC
Sounds like for Niel the bug got resolved.
Dylan, Juan, have you tested with recent kernels, 2.6.(22 or 23)+?
Thanks.
Comment 10 Dylan Martin 2007-09-07 09:45:57 UTC
Sorry!  I just don't have time any more.  I kept thinking maybe I
could find the time, but it's just not happening. 

I figured it would be better to respond with no information than just
leave it hanging...

Thanks!
-Dylan

Comment 11 Natalie Protasevich 2008-04-08 15:45:56 UTC
Yes, definitely :)
I am closing this bug, but if it re-appears with latest kernel, please reopen.
