Bug 9973 - My mm-mystery-crash has now sneaked into mainline
Status: CLOSED CODE_FIX
Product: Memory Management
Classification: Unclassified
Component: Slab Allocator
Hardware: All Linux
Importance: P1 normal
Assigned To: Andrew Morton
Depends on:
Blocks: 9832
Reported: 2008-02-13 15:23 UTC by Rafael J. Wysocki
Modified: 2008-02-19 15:08 UTC

See Also:
Kernel Version: 2.6.25-rc1
Tree: Mainline
Regression: Yes


Attachments
OOPS from 2.6.25-rc2-mm1 (6.66 KB, text/plain)
2008-02-17 13:30 UTC, Torsten Kaiser

Description Rafael J. Wysocki 2008-02-13 15:23:07 UTC
Subject         : My mm-mystery-crash has now sneaked into mainline
Submitter       : "Torsten Kaiser" <just.for.lkml@googlemail.com>
Date            : 2008-02-11 22:46
References      : http://lkml.org/lkml/2008/2/11/424
Handled-By      : Stefan Richter <stefanr@s5r6.in-berlin.de>

This entry is being used for tracking a regression from 2.6.24.  Please don't
close it until the problem is fixed in the mainline or the report is rejected.
Comment 1 Stefan Richter 2008-02-15 01:11:49 UTC
Two corrections:
  - There is no evidence that this is a drivers/ieee1394 bug.
  - There isn't anybody handling this bug.
Comment 2 Rafael J. Wysocki 2008-02-17 12:25:34 UTC
References : http://lkml.org/lkml/2008/2/16/267
Comment 3 Torsten Kaiser 2008-02-17 13:29:01 UTC
OK, 2.6.25-rc2-mm1 also died, and this time it left a better OOPS. I will attach this info when I'm done with this comment.

The first reference is a summary of what I found. I also posted a message linking to all the crashes I reported on the lkml that I think are caused by this:
http://lkml.org/lkml/2008/2/13/452

I guess something is corrupting skbs or lists of skbs, and Stefan's ieee1394 is only hit because I'm using it as the main network on this system. But it also crashed with real Ethernet (tg3 driver).

What I don't really understand is that the skbs seem to come from SLUB, but turning on slub_debug via the kernel command line only slowed the system down (OK, that is normal) and made it more difficult to provoke the crash. It did not result in additional warnings or errors.
Comment 4 Torsten Kaiser 2008-02-17 13:30:20 UTC
Created attachment 14877
OOPS from 2.6.25-rc2-mm1
Comment 5 Torsten Kaiser 2008-02-19 11:31:19 UTC
The revert patch from http://lkml.org/lkml/2008/2/19/231 fixes this for me and seems to be the current best way to fix this.
