Most recent kernel where this bug did *NOT* occur:
Distribution: Debian "Etch" 4.0
Hardware Environment: Motherboard ASRock K7S41GX, 256 MB RAM, 40 GB HDD
Software Environment:
Problem Description:

I suspect that my HDD is not working 100% well, but that's not the problem. It throws some errors when DMA Mode 6 is on (the mode can be lowered in the BIOS, or DMA can be disabled with hdparm -d0 /dev/hda).

The thing is that Linux reports DMA problems on hda and then disables DMA on the WRONG device (it only disables hdb). Here's my kernel log:

Jun 2 12:50:40 localhost kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 2 12:50:40 localhost kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jun 2 12:50:40 localhost kernel: ide: failed opcode was: unknown
Jun 2 12:50:40 localhost kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 2 12:50:40 localhost kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jun 2 12:50:40 localhost kernel: ide: failed opcode was: unknown
Jun 2 12:50:40 localhost kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 2 12:50:40 localhost kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jun 2 12:50:40 localhost kernel: ide: failed opcode was: unknown
Jun 2 12:50:41 localhost kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 2 12:50:41 localhost kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jun 2 12:50:41 localhost kernel: ide: failed opcode was: unknown
Jun 2 12:50:41 localhost kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 2 12:50:41 localhost kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jun 2 12:50:41 localhost kernel: ide: failed opcode was: unknown
Jun 2 12:50:41 localhost kernel: hdb: DMA disabled
Jun 2 12:50:41 localhost kernel: ide0: reset: success

DMA on hdb is disabled and ide0 is reset. OK, but hda is the one that should have DMA disabled. hdb works fine and disabling DMA on it doesn't fix anything.
Also note that these errors are only reported in the kernel log; so far they haven't crashed or hung my system, but they do slow down my CD recorder.

Steps to reproduce: Have a buggy HDD, sorry :(

I also found that Bug #2477 is _similar_ (on DMA errors on _both_ drives, only the CD drive gets DMA disabled, while the HD still has DMA).
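The mismatch described in this report can be checked mechanically against a saved kernel log. The following is only a minimal sketch, assuming the log keeps the message format quoted above; the function names and the regular expressions are illustrative and are not part of any kernel or hdparm tooling.

```python
import re
from collections import Counter

# Sample lines copied from the kernel log quoted in this report.
LOG = """\
Jun 2 12:50:41 localhost kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jun 2 12:50:41 localhost kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jun 2 12:50:41 localhost kernel: ide: failed opcode was: unknown
Jun 2 12:50:41 localhost kernel: hdb: DMA disabled
Jun 2 12:50:41 localhost kernel: ide0: reset: success
"""

# Hypothetical patterns, written to match the lines above and nothing more.
ERROR_RE = re.compile(r"kernel: (hd[a-z]): dma_intr: error=(0x[0-9a-fA-F]+)")
DISABLED_RE = re.compile(r"kernel: (hd[a-z]): DMA disabled")


def summarize(log_text):
    """Return (dma_intr error count per device, devices whose DMA was disabled)."""
    errors = Counter()
    disabled = []
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            errors[match.group(1)] += 1
            continue
        match = DISABLED_RE.search(line)
        if match:
            disabled.append(match.group(1))
    return errors, disabled


def mismatched(log_text):
    """Devices that lost DMA without having reported any dma_intr error."""
    errors, disabled = summarize(log_text)
    return [dev for dev in disabled if errors[dev] == 0]


if __name__ == "__main__":
    errors, disabled = summarize(LOG)
    print("errors per device:", dict(errors))
    print("DMA disabled on:", disabled)
    print("disabled without errors:", mismatched(LOG))
```

Run against the sample, the script lists hdb as a device that lost DMA without reporting an error, which is the behaviour this report complains about.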
That's kinda funny, sorry ;) Are you able to confirm that more recent kernels have the same problem? 2.6.21? Thanks.
"Are you able to confirm that more recent kernels have the same problem? 2.6.21?" That's my latest Debian Distro's kernel. Now I'm downloading the latest from kernel.org I will kinda "hate" you because I will need to recompile many kernel libs (To eliminate any confussion, YES the bug still happens when only the necessary kernel libs are loaded) As far as the "buggy" HDD, I think it's not a Linux problem because M$ DOS has serious problems when DMA Mode 6 is enabled by the BIOS, problems which I didn't have a year ago. (Win XP & Linux work OK. As I said, Linux only reports hda errors in the log and thus disables hdb) Will contact you later when I finish testing the new kernel.
Here is some more info:

hdparm /dev/hda

/dev/hda:
 multcount    = 0 (off)
 IO_support   = 0 (default 16-bit)
 unmaskirq    = 0 (off)
 using_dma    = 1 (on)
 keepsettings = 0 (off)
 readonly     = 0 (off)
 readahead    = 256 (on)
 geometry     = 16383/255/63, sectors = 80418240, start = 0

hdparm /dev/hdb

/dev/hdb:
 IO_support   = 0 (default 16-bit)
 unmaskirq    = 0 (off)
 using_dma    = 0 (off)
 keepsettings = 0 (off)
 readonly     = 0 (off)
 readahead    = 256 (on)
 HDIO_GETGEO failed: Inappropriate ioctl for device

I haven't yet had the time to test a newer kernel, but I'm also looking forward to some kernel debugging. I've been looking in /usr/src/linux-source-2.6.18/drivers/ide and I hope that, by digging through the stack, I can find the source of the problem easily. But I've never debugged the kernel, only user applications. How do I enable kernel debugging, and which tool should I use for it? Thanks!

PS: I also found this in Kconfig:

config IDEDISK_MULTI_MODE
    bool "Use multi-mode by default"
    help
      If you get this error, try to say Y here:

      hda: set_multmode: status=0x51 { DriveReady SeekComplete Error }
      hda: set_multmode: error=0x04 { DriveStatusError }

      If in doubt, say N.

which is very similar to what I get; I'll check it later. However, what we're trying to find out is why hdb's DMA is disabled instead of hda's.
IDEDISK_MULTI_MODE is not related: the errors you are getting are BadCRCs, which are usually caused by bad cabling. The driver should recover cleanly from this situation, but it seems it doesn't.

It would be useful to see the full dmesg output (you may need to increase the CONFIG_LOG_BUF_SHIFT kernel config option, "General setup" -> "Kernel log buffer size (16 => 64KB, 17 => 128KB)", in order to capture all kernel messages).

I would _strongly_ advise against investing any time in debugging before trying 2.6.22-rc3 (that is, 2.6.22-rc3, not 2.6.21). There have been tons of IDE fixes since 2.6.18, and the latest -rc contains fixes specific to the SiS IDE driver.

If the latest and greatest kernel still doesn't work, then the best places to start looking are ide-iops.c:pre_reset() and ide-iops.c:ide_auto_reduce_xfer().
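To illustrate why the set_multmode message from Kconfig and the dma_intr errors in the log are different failures, the register values can be split into their bits. This is a rough sketch, not the driver's own decoding code: the bit-to-name tables are an assumption inferred from the values and names that appear in this thread (0x51, 0x84, 0x04), and any bit not seen here is printed as a raw hex placeholder.

```python
# Assumed bit names, inferred only from the messages quoted in this thread.
STATUS_NAMES = {0x40: "DriveReady", 0x10: "SeekComplete", 0x01: "Error"}
ERROR_NAMES = {0x80: "BadCRC", 0x04: "DriveStatusError"}


def decode(value, names):
    """Split an 8-bit register value into named flags, highest bit first."""
    flags = []
    for bit in (0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01):
        if value & bit:
            # Bits without a known name are shown as a hex placeholder.
            flags.append(names.get(bit, "bit_0x%02x" % bit))
    return flags


if __name__ == "__main__":
    print("status=0x51:", decode(0x51, STATUS_NAMES))
    print("error=0x84 (this report):", decode(0x84, ERROR_NAMES))
    print("error=0x04 (Kconfig help):", decode(0x04, ERROR_NAMES))
```

Both messages share status=0x51, but only the error value from this report has the 0x80 bit that the log prints as BadCRC; the Kconfig example carries DriveStatusError alone.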
Hi! I've tried 2.6.22-rc5 and the DMA problem persists (hdb is still the one whose DMA gets disabled). I don't know when I will be able to take a look at the code; I'm pretty busy right now.
Is this still an issue with a recent kernel (2.6.32)?
I'm no longer maintainer of the affected code.
I'm afraid that PC now has more serious HW problems (bad caps; I replaced them but it still doesn't work right), which makes it difficult to test. I can hardly keep it running for long without a complete freeze right now. I use it as a DVD player plugged into my TV until it completely dies...

The PC had serious HW problems, that's for sure; the actual bug in the kernel was that DMA on hdb (a CD-RW drive) was being disabled when the conflicting device was the hard drive at hda (a 40GB HDD). Note that the Linux kernel correctly identified the malfunctioning HDD; it just disabled DMA on the wrong device.

If I am able to install the latest kernel on that machine, I'll post the results here.