Bug 10575 - WARNING: at mm/slub.c:2444
Summary: WARNING: at mm/slub.c:2444
Status: CLOSED OBSOLETE
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Slab Allocator
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-29 06:31 UTC by Peter Teoh
Modified: 2012-05-12 16:25 UTC

See Also:
Kernel Version: 2.6.25-sched-devel.git-x86-latest.git
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Peter Teoh 2008-04-29 06:31:32 UTC
kernel version:

cat include/config/kernel.release 
2.6.25-sched-devel.git-x86-latest.git

Distribution:

FC6

Hardware Environment:
Software Environment:
Problem Description:

Shutting down the system generated the following errors:

Apr 28 00:20:22 funnyman libvirtd: Shutting down on signal 15
Apr 28 00:20:25 funnyman kernel: sky2 eth0: Link is down.
Apr 28 00:20:25 funnyman xinetd[3373]: Exiting...
Apr 28 00:20:30 funnyman kernel: ------------[ cut here ]------------
Apr 28 00:20:30 funnyman kernel: WARNING: at mm/slub.c:2444 kmem_cache_destroy+0xfe/0x108()
Apr 28 00:20:30 funnyman kernel: Modules linked in: rfcomm hidp l2cap bluetooth button ext2 btrfs hfsplus usb_storage nls_utf8 bridge autofs4 nf_conntrack(-) xt_tcpudp x_tables sunrpc loop dm_multipath video output sbs sbshc battery ac ipv6 parport_pc lp parport snd_usb_audio snd_usb_lib snd_rawmidi snd_hwdep snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss sg firewire_ohci snd_mixer_oss snd_pcm firewire_core crc_itu_t snd_timer snd pata_jmicron soundcore serio_raw sky2 snd_page_alloc pcspkr i2c_i801 iTCO_wdt iTCO_vendor_support i2c_core floppy dm_snapshot dm_zero dm_mirror dm_mod ahci ata_generic ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: xt_state]
Apr 28 00:20:30 funnyman kernel: Pid: 11669, comm: modprobe Not tainted 2.6.25-sched-devel.git-x86-latest.git #1
Apr 28 00:20:30 funnyman kernel:  [<c042bad6>] warn_on_slowpath+0x46/0x56
Apr 28 00:20:30 funnyman kernel:  [<c0415a33>] ? apic_wait_icr_idle+0x16/0x1d
Apr 28 00:20:30 funnyman kernel:  [<c0415243>] ? __send_IPI_dest_field+0x50/0x54
Apr 28 00:20:30 funnyman kernel:  [<c04020e5>] ? send_IPI_mask+0xd/0xf
Apr 28 00:20:30 funnyman kernel:  [<c046773c>] ? get_pageblock_flags_group+0x50/0x6e
Apr 28 00:20:30 funnyman kernel:  [<c046777e>] ? get_pageblock_migratetype+0x24/0x27
Apr 28 00:20:30 funnyman kernel:  [<c0468472>] ? free_hot_page+0xf/0x11
Apr 28 00:20:30 funnyman kernel:  [<c0468494>] ? __free_pages+0x20/0x2b
Apr 28 00:20:30 funnyman kernel:  [<c047f471>] ? __free_slab+0xac/0xb4
Apr 28 00:20:30 funnyman kernel:  [<c0480754>] kmem_cache_destroy+0xfe/0x108
Apr 28 00:20:30 funnyman kernel:  [<f8d337c0>] nf_conntrack_cleanup+0x53/0x7a [nf_conntrack]
Apr 28 00:20:30 funnyman kernel:  [<f8d3766d>] nf_conntrack_standalone_fini+0x1c/0x1e [nf_conntrack]
Apr 28 00:20:30 funnyman kernel:  [<c044b56f>] sys_delete_module+0x177/0x1af
Apr 28 00:20:30 funnyman kernel:  [<c0472c00>] ? remove_vma+0x31/0x53
Apr 28 00:20:30 funnyman kernel:  [<c0473468>] ? do_munmap+0x182/0x19c
Apr 28 00:20:30 funnyman kernel:  [<c0404bae>] sysenter_past_esp+0x6a/0x90
Apr 28 00:20:30 funnyman kernel:  [<c0640000>] ? pci_scan_bridge+0x1dc/0x2eb
Apr 28 00:20:30 funnyman hcid[9436]: Got disconnected from the system message bus
Apr 28 00:20:30 funnyman kernel:  =======================
Apr 28 00:20:30 funnyman rpc.statd[2994]: Caught signal 15, un-registering and exiting.
Apr 28 00:20:30 funnyman kernel: ---[ end trace eb2ec02455daeda8 ]---
Apr 28 00:20:30 funnyman portmap[11769]: connect from 127.0.0.1 to unset(status): request from unprivileged port
Apr 28 00:20:30 funnyman pcscd: pcscdaemon.c:529:signal_trap() Preparing for suicide

The code around mm/slub.c:2444 is as follows:

   2433  * Close a cache and release the kmem_cache structure
   2434  * (must be used for caches created using kmem_cache_create)
   2435  */
   2436 void kmem_cache_destroy(struct kmem_cache *s)
   2437 {
   2438         down_write(&slub_lock);
   2439         s->refcount--;
   2440         if (!s->refcount) {
   2441                 list_del(&s->list);
   2442                 up_write(&slub_lock);
   2443                 if (kmem_cache_close(s))
   2444                         WARN_ON(1);
   2445                 sysfs_slab_remove(s);
   2446         } else
   2447                 up_write(&slub_lock);
   2448 }
   2449 EXPORT_SYMBOL(kmem_cache_destroy);

How to reproduce:

Not sure how, as it occurs during shutdown.
Comment 1 Anonymous Emailer 2008-04-29 08:23:04 UTC
Reply-To: akpm@linux-foundation.org

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 29 Apr 2008 06:31:36 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10575
> 
>            Summary: WARNING: at mm/slub.c:2444
>            Product: Memory Management
>            Version: 2.5
>      KernelVersion: 2.6.25-sched-devel.git-x86-latest.git
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Slab Allocator
>         AssignedTo: akpm@osdl.org
>         ReportedBy: htmldeveloper@gmail.com
> 
> 

Looks like nf_conntrack is destroying a slab cache which still has
live objects.

I think this came up a few days ago but I'm not sure if it was fixed?
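
The condition behind the warning can be sketched in userspace. This is a toy model only: `toy_cache`, `toy_alloc`, `toy_free` and `toy_destroy` are hypothetical names, not kernel APIs, and the `live` counter stands in for whatever kmem_cache_close() actually inspects. It mirrors the shape of the quoted kmem_cache_destroy(): drop a reference, and on the last reference report any objects that were never freed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a slab cache; not struct kmem_cache. */
struct toy_cache {
    const char *name;
    int refcount;   /* users of the cache, like s->refcount */
    int live;       /* objects allocated and not yet freed */
};

static void toy_alloc(struct toy_cache *c)
{
    c->live++;
}

static void toy_free(struct toy_cache *c)
{
    assert(c->live > 0);    /* freeing more than was allocated is a bug */
    c->live--;
}

/*
 * Drop one reference. Only the last reference closes the cache.
 * Returns the number of leaked objects; a non-zero value corresponds
 * to the case where the real code hits WARN_ON(1).
 */
static int toy_destroy(struct toy_cache *c)
{
    if (--c->refcount > 0)
        return 0;           /* still referenced elsewhere: nothing torn down */
    if (c->live != 0)
        fprintf(stderr, "toy %s: destroyed with %d live object(s)\n",
                c->name, c->live);
    return c->live;
}
```

Allocating twice, freeing once and then calling toy_destroy() on a cache with a single reference returns 1, which is the situation the trace suggests for the cache torn down by nf_conntrack_cleanup().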
Comment 2 Patrick McHardy 2008-04-29 12:15:29 UTC
Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> Looks like nf_conntrack is destroying a slab cache which still has
> live objects.
>
> I think this came up a few days ago but I'm not sure if it was fixed?

I believe Stephen fixed a use-after-free in bridging a few days ago,
are you referring to this? Otherwise a pointer would be appreciated.

In any case, htmldeveloper, could you provide some more information
about your setup, i.e. firewall rules, whether the unload happens while
the machine is under load, ...? Did you also notice the bug on kernel
versions other than sched-devel.git-x86-latest.git? Thanks.
Comment 3 Anonymous Emailer 2008-04-29 12:38:08 UTC
Reply-To: akpm@linux-foundation.org

On Tue, 29 Apr 2008 21:14:46 +0200
Patrick McHardy <kaber@trash.net> wrote:

> Andrew Morton wrote:
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> >
> > Looks like nf_conntrack is destroying a slab cache which still has
> > live objects.
> >
> > I think this came up a few days ago but I'm not sure if it was fixed?
> 
> I believe Stephen fixed a use-after-free in bridging a few days ago,
> are you referring to this? Otherwise a pointer would be appreciated.

<checks>

Sorry, I confused it with a similar-looking USB trace.  Pekka added some
additional debug at that site which might help here - it will tell us the
name of the slab cache:

void kmem_cache_destroy(struct kmem_cache *s)
{
	down_write(&slub_lock);
	s->refcount--;
	if (!s->refcount) {
		list_del(&s->list);
		up_write(&slub_lock);
		if (kmem_cache_close(s)) {
			printk(KERN_ERR "SLUB %s: %s called for cache that "
				"still has objects.\n", s->name, __func__);
			dump_stack();
		}
		sysfs_slab_remove(s);
	} else
		up_write(&slub_lock);
}
			
that was merged into mainline yesterday.

> In any case, htmldeveloper, could you provide some more information
> about your setup, i.e. firewall rules, does the unload happen during
> load, ...? Did you also notice the bug on other kernel versions than
> sched-devel.git-x86-latest.git? Thanks.
> 
Comment 4 Pekka Enberg 2008-04-29 12:41:26 UTC
Andrew Morton wrote:
> Sorry, I confused it with a similar-looking USB trace.  Pekka added some
> additional debug at that site which might help here - it will tell us the
> name of the slab cache:
> 
> void kmem_cache_destroy(struct kmem_cache *s)
> {
>       down_write(&slub_lock);
>       s->refcount--;
>       if (!s->refcount) {
>               list_del(&s->list);
>               up_write(&slub_lock);
>               if (kmem_cache_close(s)) {
>                       printk(KERN_ERR "SLUB %s: %s called for cache that "
>                               "still has objects.\n", s->name, __func__);
>                       dump_stack();
>               }
>               sysfs_slab_remove(s);
>       } else
>               up_write(&slub_lock);
> }
>                       
> that was merged into mainline yesterday.

Christoph added even nicer debugging code that dumps the objects in the 
cache:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=33b12c38134e95e5afa73214af6f49abd7b8418e
Comment 5 Pekka Enberg 2008-04-29 13:08:20 UTC
Andrew Morton wrote:
> On Tue, 29 Apr 2008 21:14:46 +0200
> Patrick McHardy <kaber@trash.net> wrote:
> 
>> Andrew Morton wrote:
>>> (switched to email.  Please respond via emailed reply-to-all, not via the
>>> bugzilla web interface).
>>>
>>> Looks like nf_conntrack is destroying a slab cache which still has
>>> live objects.
>>>
>>> I think this came up a few days ago but I'm not sure if it was fixed?
>> I believe Stephen fixed a use-after-free in bridging a few days ago,
>> are you referring to this? Otherwise a pointer would be appreciated.
> 
> <checks>
> 
> Sorry, I confused it with a similar-looking USB trace.  Pekka added some
> additional debug at that site which might help here - it will tell us the
> name of the slab cache:

Well, it's obviously nf_conntrack_cachep, but this is the second time
I've seen the SLUB WARN_ON trigger without being able to find anything
wrong with the code. Christoph, if you look at nf_conntrack_cleanup() in
net/netfilter/nf_conntrack_core.c:

  i_see_dead_people:
         nf_conntrack_flush();
         if (atomic_read(&nf_conntrack_count) != 0) {
                 schedule();
                 goto i_see_dead_people;
         }

Yeah, yikes, but in nf_conntrack_alloc() we do

         atomic_inc(&nf_conntrack_count);

before

         ct = kmem_cache_zalloc(nf_conntrack_cachep, GFP_ATOMIC);

So I don't see how we can call kmem_cache_destroy() on a cache that
still has unfreed objects in it... Can you take a look at this?

And oh, Peter, if you can trigger this with mainline, please do post the
oops. It should give us better information about what's happening.

		Pekka
Comment 6 Christoph Lameter 2008-04-29 16:00:34 UTC
On Tue, 29 Apr 2008, Pekka Enberg wrote:

> Well, it's obviously nf_conntrack_cachep but this is the second time I see
> the SLUB WARN_ON trigger but can't find anything wrong with the code.
> Christoph, if you look at nf_conntrack_cleanup() in
> net/netfilter/nf_conntrack_core.c:
> 
>  i_see_dead_people:
>         nf_conntrack_flush();
>         if (atomic_read(&nf_conntrack_count) != 0) {
>                 schedule();
>                 goto i_see_dead_people;
>         }

And are we sure that no additional items can be allocated after this?

>         ct = kmem_cache_zalloc(nf_conntrack_cachep, GFP_ATOMIC);
> 
> So I don't see how we can call kmem_cache_destroy() with unfree'd objects in
> it... Can you take a look at this?

Nothing jumps out but then there are numerous components involved.

> And oh, Peter, if you can trigger this with mainline, please do post the
> oops. It should give us better information about what's happening.

The new diagnostics should give us a lot more data to go on. Make sure you
boot with "slub_debug" on the kernel command line or build with
CONFIG_SLUB_DEBUG_ON enabled.

The dump will show where and when the remaining objects were allocated,
which should help resolve this issue.
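
The question above about allocations after the wait loop can be made concrete with a toy model. Nothing in this thread confirms that this is what happened; it is only a sketch of one hypothetical way a "counter is zero" check could pass and the cache still hold an object at destroy time. All names (`toy_ct`, `toy_ct_alloc`, `toy_ct_flush`, `toy_ct_cleanup`) and the `closed` flag are invented for illustration and do not exist in nf_conntrack.

```c
#include <assert.h>

/* Hypothetical stand-ins for nf_conntrack_count and the cache contents. */
struct toy_ct {
    int count;      /* plays the role of nf_conntrack_count */
    int live;       /* objects currently allocated from the cache */
    int closed;     /* non-zero once new allocations are refused */
};

/* The counter is bumped before the object exists, as in nf_conntrack_alloc(). */
static int toy_ct_alloc(struct toy_ct *t)
{
    if (t->closed)
        return -1;          /* allocation path already shut off */
    t->count++;
    t->live++;
    return 0;
}

/* Free every object, dropping the counter once per object. */
static void toy_ct_flush(struct toy_ct *t)
{
    while (t->live > 0) {
        t->live--;
        t->count--;
    }
}

/*
 * Flush until the counter reads zero, then "destroy" the cache.
 * late_alloc simulates an allocation that slips in between the final
 * counter check and the destroy. Returns the objects left at destroy time.
 */
static int toy_ct_cleanup(struct toy_ct *t, int close_first, int late_alloc)
{
    if (close_first)
        t->closed = 1;
    do {
        toy_ct_flush(t);
    } while (t->count != 0);
    if (late_alloc)
        toy_ct_alloc(t);
    return t->live;
}
```

In this model the counter check only proves that nothing is live at the instant it runs. With `close_first` set, the late allocation is refused and the cleanup returns 0; without it, the cleanup returns 1, which is the state the slub_debug dump would help confirm or rule out.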
Comment 7 Erik Andr 2010-01-06 12:19:41 UTC
Is this still an issue with 2.6.32?
