Bug 1426 - slab error in cache_free_debugcheck() ... memory outside object was overwritten [sd.c vs. MESH]
Status: CLOSED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: PPC-32
Hardware: i386 Linux
Importance: P2 normal
Assignee: Zwane Mwaikambo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-10-26 10:27 UTC by John Mock
Modified: 2005-08-07 13:40 UTC

See Also:
Kernel Version: 2.6.0-test9
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Patch to driver/scsi/sd.c workaround possible MESH byte-alignment problem. (1.16 KB, patch)
2003-10-26 10:35 UTC, John Mock
Full 'dmesg' output. (11.04 KB, text/plain)
2003-10-26 10:36 UTC, John Mock
.config file for linux-2.6.0-test9 (21.13 KB, text/plain)
2003-10-26 10:38 UTC, John Mock

Description John Mock 2003-10-26 10:27:03 UTC
Distribution: kernel.org

Hardware Environment:PowerMac 8500 [see .dmesg for details]
    00:0b.0 Host bridge: Apple Computer Inc. Bandit PowerPC host bridge (rev 03)
    00:0d.0 Ethernet controller: Linksys Network Everywhere Fast Ethernet 10/100 model NC100 (rev 11)
    00:0e.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2064W [Millennium] (rev 01)
    00:0f.0 USB Controller: OPTi Inc. 82C861 (rev 10)
    00:10.0 Class ff00: Apple Computer Inc. Grand Central I/O (rev 02)
    01:0b.0 Non-VGA unclassified device: Apple Computer Inc. Control Video
    01:0d.0 Class ff00: Apple Computer Inc. PlanB Video-In (rev 01)

Software Environment: Debian 'Testing' distribution

Problem Description:
    The SCSI driver emits one stack trace per disk drive during bootup.  Sample:

    SCSI device sda: 4226725 512-byte hdwr sectors (2164 MB)
    SCSI device sda: drive cache: write back
    slab error in cache_free_debugcheck(): cache `size-512(DMA)': double free, or memory outside object was overwritten
    Call trace:
     [c000956c] dump_stack+0x18/0x28
     [c003beac] __slab_error+0x2c/0x3c
     [c003e784] kfree+0x290/0x384
     [c00dbdbc] sd_revalidate_disk+0xb8/0x148
     [c00dc020] sd_probe+0x1d4/0x2c0
     [c00b9958] bus_match+0x50/0x8c
     [c00b9ad8] driver_attach+0x88/0xc8
     [c00b9e28] bus_add_driver+0x90/0xe0
     [c00ba288] driver_register+0x30/0x40
     [c00d6478] scsi_register_driver+0x1c/0x2c
     [c02275e4] init_sd+0x5c/0x70
     [c02125dc] do_initcalls+0x54/0xe0
     [c0003958] init+0x1c/0xc0
     [c0008cf0] kernel_thread+0x44/0x60
    cde0dc88: redzone 1: 0x0, redzone 2: 0x170fc2a5.
     sda: [mac] sda1 sda2 sda3 sda4 sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12
    Attached scsi disk sda at scsi0, channel 0, id 2, lun 0
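
For anyone decoding the "redzone" line in the sample above, here is a minimal userspace sketch of how a guard-word check of this kind reports damage. It is not the kernel's slab code: the flat layout and the helper names (guard_init, guard_check, simulate_underrun) are hypothetical, and RED_ACTIVE is assumed to be the 2.6-era slab debug marker, which matches the intact "redzone 2" value in the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RED_ACTIVE 0x170FC2A5u /* assumed marker for an in-use object */
#define OBJ_SIZE   512u        /* matches the size-512 cache in the trace */
#define GUARD      4u          /* one 32-bit guard word on each side */
#define SLAB_BYTES (GUARD + OBJ_SIZE + GUARD)

/* Read a 32-bit guard word without assuming pointer alignment. */
static uint32_t read_guard(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical: write both guard words and clear the object in between. */
static void guard_init(unsigned char *slab)
{
    uint32_t v = RED_ACTIVE;
    memcpy(slab, &v, sizeof v);
    memset(slab + GUARD, 0, OBJ_SIZE);
    memcpy(slab + GUARD + OBJ_SIZE, &v, sizeof v);
}

/* Returns a bitmask: bit 0 set if guard 1 is damaged, bit 1 if guard 2 is. */
static int guard_check(const unsigned char *slab)
{
    int bad = 0;
    if (read_guard(slab) != RED_ACTIVE)
        bad |= 1;
    if (read_guard(slab + GUARD + OBJ_SIZE) != RED_ACTIVE)
        bad |= 2;
    return bad;
}

/* Simulate a device zeroing n bytes immediately before the object. */
static void simulate_underrun(unsigned char *slab, size_t n)
{
    assert(n <= GUARD);
    memset(slab + GUARD - n, 0, n);
}
```

After simulate_underrun(slab, 4), guard 1 reads 0x0 and guard 2 still reads 0x170fc2a5, so guard_check() flags only the word in front of the object. That is the same pattern as the "redzone 1: 0x0, redzone 2: 0x170fc2a5" line above.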

I traced the problem to the vicinity of sd_read_capacity() in drivers/scsi/sd.c:
freeing and re-allocating the buffer before that call succeeds, but doing the
same after it fails.
'benh' says:
Comment 1 John Mock 2003-10-26 10:31:25 UTC
[oops, web interface went awry]

'benh' says:

    I know what's going on in sd_read_capacity(), and so far, it proved
    quite harmless on my 8500, it's the MESH DBDMA controller that doesn't
    like being passed a buffer that isn't aligned to a 16-byte boundary
    I'd say (or maybe a 32-byte boundary). It's erasing a few bytes before
    the buffer, thus trashing the pattern put there by the slab debugging
    code.

And indeed, forcing that alignment in sd_revalidate_disk() seems to fix the
problem, but that seems like a bad place to put something that is
platform-specific.  The following patch seems to fix it:

Comment 2 John Mock 2003-10-26 10:35:38 UTC
Created attachment 1205 [details]
Patch to driver/scsi/sd.c workaround possible MESH byte-alignment problem.

This probably isn't the best fix, but it solves the problem at least for the
upcoming 2.6.0 release.
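
For illustration only (this is not the attached patch), a minimal userspace sketch of the general idea of handing a controller an aligned buffer: over-allocate, then round the pointer up to the boundary. The 32-byte value is an assumption, since the exact MESH requirement is only guessed at above as 16 or maybe 32 bytes, and the names struct dma_buf, dma_buf_alloc and dma_buf_free are hypothetical. A kernel fix would use the kernel's own allocators rather than malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DMA_ALIGN 32u /* assumed boundary; the report says 16 or maybe 32 */

/* Round p up to the next multiple of align (align must be a power of two). */
static void *align_up(void *p, size_t align)
{
    uintptr_t addr = (uintptr_t)p;
    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)((addr + align - 1) & ~(uintptr_t)(align - 1));
}

/* Hypothetical wrapper: keeps the raw pointer so it can be freed later. */
struct dma_buf {
    void *raw;              /* what malloc() returned */
    unsigned char *aligned; /* what would be handed to the device */
};

/* Allocate size usable bytes starting on a DMA_ALIGN boundary. */
static int dma_buf_alloc(struct dma_buf *b, size_t size)
{
    /* Extra DMA_ALIGN - 1 bytes leave room to shift the start forward. */
    b->raw = malloc(size + DMA_ALIGN - 1);
    if (b->raw == NULL)
        return -1;
    b->aligned = align_up(b->raw, DMA_ALIGN);
    return 0;
}

/* Free via the raw pointer, never via the aligned one. */
static void dma_buf_free(struct dma_buf *b)
{
    free(b->raw);
    b->raw = NULL;
    b->aligned = NULL;
}
```

The caller passes b.aligned to the transfer and releases the memory with dma_buf_free(). Whether alignment alone is enough to stop the stray writes on MESH is exactly what remains unconfirmed here.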

Below are the full 'dmesg' output, showing the configuration, and the .config file.
Comment 3 John Mock 2003-10-26 10:36:31 UTC
Created attachment 1206 [details]
Full 'dmesg' output.
Comment 4 John Mock 2003-10-26 10:38:05 UTC
Created attachment 1207 [details]
.config file for linux-2.6.0-test9
