Created attachment 21439 [details] Backtrace of the OOPS with 2.6.30-rc5 Hello, I get a kernel panic when data is read from an IDE CDROM with 2.6.29 or later kernels (testing 2.6.30-rc5 at the moment). I never had the issue with 2.6.28 and earlier kernels. I think I know the cause, though. My CD/DVD drive is pretty old and might be a bit faulty (or the kernel might sometimes be misinterpreting my drive). Kernels <= 2.6.28 would sometimes turn off DMA due to seek errors (if I recall correctly). However, I was able to turn DMA back on with hdparm, and I never really had read() failures. Now, with >= 2.6.29, I have never seen DMA being turned off on my CD/DVD drive; I guess the kernel just panics instead. So I suspect there is a regression in the latest kernels which triggers an unrecoverable panic instead of a drive reset. I got the attached backtrace (sorry, I couldn't do better than this picture) with 2.6.30-rc5 on a Debian unstable amd64 box while doing a simple: $ dd if=/dev/cdrom of=/dev/null bs=1M
Created attachment 21440 [details] Some information about my system/kernel configuration uname -a lsmod lspci -vv and kernel config
On Tuesday 19 May 2009 23:44:32 bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=13345 Please test the patch from: http://patchwork.kernel.org/patch/24790/
Unfortunately, the patch applied on top of 2.6.30-rc5 does not fix the problem.
Hi, can you also send the boot logs of a working kernel (<= 2.6.28) and of the borked one (30-rc5)? For the latter, it would be very useful to see the whole of the initial OOPS, which is cut off in the photo. Can you catch the output with a serial console (netconsole might do too if it's not too early in the boot process and the machine doesn't die before some data can be transferred over the network)? Also, on the 30-rc5 do objdump -d drivers/ide/ide-io.o > ide-io.dsm and make drivers/ide/ide-io.s and send me the .dsm and .s files. Thanks, Boris.
On Wednesday 20 May 2009 09:30:29 bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=13345 > --- Comment #3 from Modestas Vainius <modestas@vainius.eu> 2009-05-20 07:30:29 --- > Unfortunately, the patch applied on top of 2.6.30-rc5 does not fix the > problem. Just to be sure: is the OOPS identical to the previous one, or is it something new?
I'm really sorry, but I realized that I had not rebuilt the initrd when I installed the new kernel with the patch applied. I have now done that and can confirm that I no longer get the OOPS. Instead, the DMA turn-offs are back, just as they used to happen with <= 2.6.28 kernels: [ 663.442788] hdd: cdrom_decode_status: status=0x51 { DriveReady SeekComplete Error } [ 663.445040] hdd: cdrom_decode_status: error=0x40 <3>{ LastFailedSense=0x04 } [ 663.446768] ide: failed opcode was: unknown [ 663.454549] hdd: cdrom_decode_status: status=0x51 { DriveReady SeekComplete Error } [ 663.456681] hdd: cdrom_decode_status: error=0x40 <3>{ LastFailedSense=0x04 } [ 663.458541] ide: failed opcode was: unknown [ 663.464783] hdd: cdrom_decode_status: status=0x51 { DriveReady SeekComplete Error } [ 663.466847] hdd: cdrom_decode_status: error=0x40 <3>{ LastFailedSense=0x04 } [ 663.468764] ide: failed opcode was: unknown [ 663.473414] hdd: cdrom_decode_status: status=0x51 { DriveReady SeekComplete Error } [ 663.475365] hdd: cdrom_decode_status: error=0x40 <3>{ LastFailedSense=0x04 } [ 663.477259] ide: failed opcode was: unknown [ 663.477406] hdd: DMA disabled [ 663.520043] hdd: ATAPI reset complete So yes, I can confirm that the patch fixes the OOPS with 2.6.30-rc5 and 2.6.30-rc6-git5. I'm looking forward to seeing the patch included in 2.6.30-rc7 and 2.6.29.x; it looks pretty important to me. What is more, I "played" a bit with the IDE cabling and switched my CD/DVD drive to IDE master, and I can no longer reproduce the errors above (hence the annoying DMA turn-off is gone too). So it is good news on all fronts for me. I leave it to you to close the bug, but it is fully RESOLVED as far as I'm concerned.
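For readers who want to interpret the register values in the log excerpt above, here is a minimal sketch of how they break down. It assumes the standard ATA status register bit layout and the ATAPI convention that the upper four bits of the error register carry the sense key. Only the names DriveReady, SeekComplete and Error appear in the log itself; the remaining labels, and the helper names `decode_status` and `atapi_sense_key`, are illustrative and are not taken from the kernel source.

```python
# ATA status register bits (assumed standard layout). Only DriveReady,
# SeekComplete and Error are confirmed by the log excerpt; the other
# labels are illustrative.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


def atapi_sense_key(error):
    """Return the sense key held in the upper nibble of the ATAPI error byte."""
    return (error >> 4) & 0x0F


print(decode_status(0x51))    # ['DriveReady', 'SeekComplete', 'Error']
print(atapi_sense_key(0x40))  # 4
```

Sense key 0x04 is HARDWARE ERROR in the SCSI/ATAPI sense key table, which would be consistent with either a failing drive or a signalling problem on the cable.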
Can you send us the dmesg of the working kernel? I'd like to see exactly what the drive model is when it gets identified. Also, are you using a 40- or 80-wire IDE cable? You can recognize the 80-wire cable by the blue connector on the host side (the end that goes into the motherboard). @Bart: this sounds like another case of b0rked drive-side cable detection, from what I've seen so far and looking at Martin's error messages. Thanks, Boris.
Created attachment 21473 [details] dmesg of 2.6.30-rc6-git5 + patch with CDROM as hdc and hdd No errors with CDROM as hdc plugged to middle connector of the cable. DMA problems with CDROM as hdd plugged to end connector of the cable. I have not tested other combinations. My cable is 40-wire (all connectors are black, I counted the wires too).
On Thursday 21 May 2009 22:25:02 bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=13345 > --- Comment #8 from Modestas Vainius <modestas@vainius.eu> 2009-05-21 20:25:02 --- > Created an attachment (id=21473) > --> (http://bugzilla.kernel.org/attachment.cgi?id=21473) > dmesg of 2.6.30-rc6-git5 + patch with CDROM as hdc and hdd > > No errors with CDROM as hdc plugged to middle connector of the cable. > DMA problems with CDROM as hdd plugged to end connector of the cable. > I have not tested other combinations. > > My cable is 40-wire (all connectors are black, I counted the wires too). Sigh... it is detected as 80-wire... This is an nVidia PATA controller with broken cable detection. In ide we just use BIOS data; libata looks at both BIOS and ACPI data (though it may not actually help at all)... Anyway, the following change needs to be ported to amd74xx.c one day: "pata_amd: update mode selection for NV PATAs" (commit ce54d1616302117fa98513ae916bb3333e1c02ea)
The discussion with Nvidia some time back established that, for on-board Nvidia devices, the only method that would be reliable was the ACPI one.
In both cases? Or am I just lucky that I have not seen problems with the CD/DVD drive as hdc yet?
On Thursday 21 May 2009 23:04:19 bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=13345 > --- Comment #11 from Modestas Vainius <modestas@vainius.eu> 2009-05-21 21:04:19 --- > In both cases? Or I'm just lucky that I have not seen problems with CD/DVD > drive as hdc yet? In both cases -- the drive is always tuned to UDMA/66.
BIOS setup tells me "UltraDMA Mode 2" (UDMA/33) for the CD drive, so Linux is obviously misdetecting here. I'll probably have to get an 80-conductor cable, as hdparm -X udma2 does not seem to work :/
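The mismatch discussed in the last few comments (a 40-wire cable detected as 80-wire, so the drive is tuned to UDMA/66 instead of UDMA/33) can be illustrated with a small sketch of the usual rule that UDMA modes above 2 need an 80-conductor cable. This is only an illustration of that rule, not the logic of amd74xx.c or pata_amd; the function name `safe_udma_mode` and the table below are hypothetical, and host controller limits are ignored.

```python
# Nominal UDMA transfer rates in MB/s, indexed by UDMA mode number.
UDMA_RATE_MB_S = {0: 16.7, 1: 25.0, 2: 33.3, 3: 44.4, 4: 66.7, 5: 100.0, 6: 133.0}

# Highest UDMA mode generally considered safe on a 40-conductor cable.
MAX_MODE_40_WIRE = 2


def safe_udma_mode(drive_max_mode, cable_conductors):
    """Clamp the drive's best UDMA mode to what the cable can carry.

    Hypothetical helper: it only models the cable limit.
    """
    if drive_max_mode not in UDMA_RATE_MB_S:
        raise ValueError("unknown UDMA mode: %r" % (drive_max_mode,))
    if cable_conductors == 40:
        return min(drive_max_mode, MAX_MODE_40_WIRE)
    if cable_conductors == 80:
        return drive_max_mode
    raise ValueError("cable must have 40 or 80 conductors")


# A drive capable of UDMA mode 4 (UDMA/66) on a correctly detected 40-wire cable:
mode = safe_udma_mode(4, 40)
print(mode, UDMA_RATE_MB_S[mode])  # 2 33.3
```

If the cable is wrongly reported as 80-wire, the same call returns mode 4, which corresponds to the UDMA/66 tuning mentioned in the thread.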
Hullo. I ran into this the other day, & can report the bug is still present in 2.6.30-rc8. I tried to capture the console output on panic but: FATAL: Error inserting netconsole (/lib/modules/2.6.30-rc8/kernel/drivers/net/netconsole.ko): Operation not permitted (Yes, I was root.) I tried 2 drives, only one triggered the bug. This is a Lite-On SOHD-16P9S28C made in January 2005.
Ah, I can give chipset info: $ lspci|grep 'IDE\|SATA' 00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02) 00:1f.5 IDE interface: Intel Corporation 82801I (ICH9 Family) 2 port SATA IDE Controller (rev 02) 03:00.0 SATA controller: JMicron Technologies, Inc. 20360/20363 Serial ATA Controller (rev 03) 03:00.1 IDE interface: JMicron Technologies, Inc. 20360/20363 Serial ATA Controller (rev 03) The cable is 40-wire. I haven't tried the patch. If I feel like opening up my computer again in a few days I may try it.
On Tuesday 09 June 2009 23:04:07 bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=13345 > --- Comment #14 from Ethan Grammatikidis <eekee57@fastmail.fm> 2009-06-09 21:04:05 --- > Hullo. I ran into this the other day, & can report the bug is still present in > 2.6.30-rc8. Could it be bug #13399 instead of this one? [ This one should really be already fixed. ]
> Could it be bug #13399 instead of this one? > > [ This one should really be already fixed. ] It could well be, by the symptoms. I can't tell the difference. It's not fixed for that old drive of mine though. :) A different problem with the DVD-ROM drives was fixed between rc7 and rc8. Anything accessing my newer drive would freeze in an uninterruptible wait for IO under rc7, but works fine with rc8.
Correction: the newer drive does not work fine. After 2 days of uptime, processes accessing the drive freeze in an uninterruptible wait for IO. It's a buggy drive to be sure, but in 2.6.24/26 the worst it did was spam dmesg, & back in 2.6.15 it would stop working but not cause processes to freeze in an unkillable state.
2.6.30 final fixes my issue with my newer drive.