Distribution: kernel.org

Hardware Environment: PowerMac 8500 [see .dmesg for details]

00:0b.0 Host bridge: Apple Computer Inc. Bandit PowerPC host bridge (rev 03)
00:0d.0 Ethernet controller: Linksys Network Everywhere Fast Ethernet 10/100 model NC100 (rev 11)
00:0e.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2064W [Millennium] (rev 01)
00:0f.0 USB Controller: OPTi Inc. 82C861 (rev 10)
00:10.0 Class ff00: Apple Computer Inc. Grand Central I/O (rev 02)
01:0b.0 Non-VGA unclassified device: Apple Computer Inc. Control Video
01:0d.0 Class ff00: Apple Computer Inc. PlanB Video-In (rev 01)

Software Environment: Debian 'Testing' distribution

Problem Description: The SCSI driver produces one stack trace per disk drive during bootup. Sample:

SCSI device sda: 4226725 512-byte hdwr sectors (2164 MB)
SCSI device sda: drive cache: write back
slab error in cache_free_debugcheck(): cache `size-512(DMA)': double free, or memory outside object was overwritten
Call trace:
 [c000956c] dump_stack+0x18/0x28
 [c003beac] __slab_error+0x2c/0x3c
 [c003e784] kfree+0x290/0x384
 [c00dbdbc] sd_revalidate_disk+0xb8/0x148
 [c00dc020] sd_probe+0x1d4/0x2c0
 [c00b9958] bus_match+0x50/0x8c
 [c00b9ad8] driver_attach+0x88/0xc8
 [c00b9e28] bus_add_driver+0x90/0xe0
 [c00ba288] driver_register+0x30/0x40
 [c00d6478] scsi_register_driver+0x1c/0x2c
 [c02275e4] init_sd+0x5c/0x70
 [c02125dc] do_initcalls+0x54/0xe0
 [c0003958] init+0x1c/0xc0
 [c0008cf0] kernel_thread+0x44/0x60
cde0dc88: redzone 1: 0x0, redzone 2: 0x170fc2a5.
 sda: [mac] sda1 sda2 sda3 sda4 sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12
Attached scsi disk sda at scsi0, channel 0, id 2, lun 0

I traced the problem to the vicinity of sd_read_capacity() in drivers/scsi/sd.c: freeing and re-allocating the buffer succeeds before that call and fails after it.

'benh' says:
[oops, web interface went awry]

'benh' says: I know what's going on in sd_read_capacity(), and so far it has proved quite harmless on my 8500. I'd say it's the MESH DBDMA controller that doesn't like being passed a buffer that isn't aligned to a 16-byte boundary (or maybe a 32-byte boundary). It's erasing a few bytes before the buffer, thus trashing the pattern put there by the slab debugging code.

And indeed, forcing that alignment in sd_revalidate_disk() seems to fix the problem, but that seems like a bad place to put something that is platform-specific. The following patch seems to fix it:
Created attachment 1205: Patch to drivers/scsi/sd.c to work around a possible MESH byte-alignment problem.

This probably isn't the best fix, but it solves the problem, at least for the upcoming 2.6.0 release. The full 'dmesg' output showing the configuration, and the .config file, are attached below.
Created attachment 1206: Full 'dmesg' output.
Created attachment 1207: .config file for linux-2.6.0-test9
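
For anyone trying to follow the alignment explanation quoted from 'benh' above, here is a small userspace sketch of the suspected failure mode. It is not the patch from attachment 1205 and not kernel code. The names align_down(), align_up(), sloppy_dma_write(), redzone_survives(), RED_PATTERN and DMA_ALIGN are made up for illustration, the 32-byte boundary is only one of the two guesses mentioned, and the "controller" is just a model that starts its write at the aligned offset at or below the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RED_PATTERN 0x170fc2a5u /* value reported intact as "redzone 2" in the trace */
#define DMA_ALIGN   32u         /* assumption: the comment guesses 16 or 32 bytes */

/* Round off down to a multiple of align (align must be a power of two). */
size_t align_down(size_t off, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return off & ~(align - 1);
}

/* Round off up to a multiple of align (align must be a power of two). */
size_t align_up(size_t off, size_t align)
{
    return align_down(off + align - 1, align);
}

/*
 * Hypothetical model of the suspected controller behaviour: the transfer
 * starts at the aligned offset at or below the requested one, so any bytes
 * between that boundary and the buffer are zeroed as well.
 */
void sloppy_dma_write(uint8_t *mem, size_t off, size_t len)
{
    size_t start = align_down(off, DMA_ALIGN);

    memset(mem + start, 0, (off - start) + len);
}

/*
 * Place a 4-byte guard word directly before an object at obj_off, run the
 * modelled transfer into the object, and report whether the guard survived.
 */
int redzone_survives(size_t obj_off, size_t len)
{
    uint8_t mem[256];
    uint32_t red = RED_PATTERN, after;

    assert(obj_off >= sizeof red && obj_off + len <= sizeof mem);
    memset(mem, 0xaa, sizeof mem);
    memcpy(mem + obj_off - sizeof red, &red, sizeof red);

    sloppy_dma_write(mem, obj_off, len);

    memcpy(&after, mem + obj_off - sizeof red, sizeof after);
    return after == RED_PATTERN;
}
```

In this model, redzone_survives(36, 8) returns 0 because the guard word at offsets 32..35 is zeroed, which resembles the "redzone 1: 0x0" line in the report. Moving the object to align_up(36, DMA_ALIGN), i.e. offset 64, makes redzone_survives() return 1, which is the effect that forcing the buffer alignment is meant to have.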