Latest working kernel version: 2.6.23
Earliest failing kernel version: 2.6.24-rc1
Distribution: N/A
Hardware Environment:
  MB : ASUS A7N8X-E deluxe
  DVD : PLEXTOR DVDR PX-740A (FW 1.02)
Software Environment: N/A

Problem Description:
During kernel boot, the DVD drive initialisation fails with these errors (2.6.25-rc5):

hda: status error: error=0xd0 { Busy }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: status error: error=0xd0 { Busy }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: status error: error=0xd0 { Busy }
ide: failed opcode was: unknown
hda: drive not ready for command

Steps to reproduce:
Plug a Plextor DVDR PX-740A in as primary IDE master.
Boot a kernel >= 2.6.24-rc1.

NB: if the Plextor is on primary IDE slave, there's no error.
(The jumpers are OK.)

I bisected to find where the bug was introduced, and this is the result:

b140b99c413ce410197cfcd4014e757cd745226a is first bad commit
commit b140b99c413ce410197cfcd4014e757cd745226a
Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date: Sat Oct 13 17:47:51 2007 +0200

    ide: change master/slave IDENTIFY order

    Need to probe slave device first to make it release PDIAG-
    (this is required for correct device side cable detection).

    Based on libata commit f31f0cc2f0b7527072d94d02da332d9bb8d7d94c.

    Thanks to Craig for testing this patch.

    Cc: Craig Block <chblock3@yahoo.com>
    Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

:040000 040000 fa17f9e66a6e551184b1ec00049cea6fb87f7a1c 89a175312afe235fbdb04a3a593a4b981bdd9188 M drivers
Created attachment 15256 [details] full 2.6.25-rc5 dmesg
Thanks, the drive survives the probe (is identified correctly) but fails on the first attempt to use it with ide-cd.

Could you also send a dmesg from 2.6.23 (so we can compare these two)?

| Steps to reproduce:
| plug a plextor DVDR PX-740A as primary ide master.
| boot a kernel >= 2.6.24-rc1
|
| NB: if the plextor is on primary ide slave, there's no error.
| (the jumpers are ok).

Please confirm that I understand this correctly - if the drive is plugged as master and configured by _jumpers_ as master it also fails?

[ Also: is the drive connected to the "other" cable end (_not_ in the middle) when used as master? ]

PS I'm cc:ed on all IDE bugs anyway so I've removed myself from cc:
| [ Also: is the drive connected to the "other" cable end (_not_ in the middle)
| when used as master? ]

One more thing - please also verify that the cable is plugged in correctly (not in the reverse order) - the connector farther from the middle one (the blue one) should go to the controller.

[ Sorry for asking so many stupid questions but I would really like to exclude a hardware configuration issue first. ]
> Please confirm that I get this correctly - if the drive is plugged as master
> and configured by _jumpers_ as master it also fails?
> [ Also: is the drive connected to the "other" cable end (_not_ in the middle)
> when used as master? ]

Yep!

Doesn't work:
IDE drive configured by jumper as master AND connected to the end of the cable.

Does work:
IDE drive configured by jumper as slave AND connected to the middle of the cable.

By the way, the presence of another drive on the cable doesn't seem to change anything.

[ I didn't try:
IDE drive configured by jumper as master AND connected to the middle of the cable.
IDE drive configured by jumper as slave AND connected to the end of the cable.
but I can try if you want. ]

And the cable I use is an 80-conductor one.
(In reply to comment #3)
> One more thing - please also verify that the cable is plugged correctly (not
> in the reverse order) - the connector farer from the middle one (blue one)
> should go to the controller.

Yes, it is.

> [ Sorry for asking so many stupid questions but I would really like to
> exclude hardware configuration issue first. ]

It's all right, everything should be double-checked.
Created attachment 15269 [details] full 2.6.23 dmesg
Thanks.

I've rechecked against the known ATAPI errata but so far I don't see a reason for the drive getting stuck - I wonder whether it could be controller related.

Could you try using the ide_pci_generic driver instead of amd74xx (you need to disable CONFIG_BLK_DEV_AMD74XX, enable CONFIG_BLK_DEV_GENERIC and boot the kernel with the "ide_pci_generic.all_generic_ide" option) and see if it helps?
Also, booting with "hdb=noprobe" should work around the problem (please try it).
Reply-To: dani@ngrt.de

On Sun, 16 Mar 2008 07:02:35 -0700 (PDT), bugme-daemon@bugzilla.kernel.org wrote:

>I've rechecked with the known ATAPI errata but so far I don't see a reason for
>drive getting stuck - I wonder whether it could be controller related.
>
>Could you try using ide_pci_generic driver instead of amd74xx (you need to
>disable CONFIG_BLK_DEV_AMD74XX, enable CONFIG_BLK_DEV_GENERIC and boot kernel
>with "ide_pci_generic.all_generic_ide" option) and see if it helps?

This problem is widespread! I am experiencing it - like other users of the Knoppix 5.3 live CD that came with a recent issue of c't magazine - here on an Nvidia MCP55 and a BenQ DW1655 which is master to a Plextor PX-708A. Booting off the BenQ fails whereas booting off the Plextor (slave) works.

Ciao,
Dani
MCP-55 controller again, hmm... - What error messages are you seeing? - If you boot from slave drive does the master drive work?
(In reply to comment #7)
> Could you try using ide_pci_generic driver instead of amd74xx (you need to
> disable CONFIG_BLK_DEV_AMD74XX, enable CONFIG_BLK_DEV_GENERIC and boot kernel
> with "ide_pci_generic.all_generic_ide" option) and see if it helps?

When I disable CONFIG_BLK_DEV_AMD74XX and enable CONFIG_BLK_DEV_GENERIC, the kernel boots without any error (whether I add ide_pci_generic.all_generic_ide or hdb=noprobe).
I have the same bug:
https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/181561
and
http://bugzilla.kernel.org/show_bug.cgi?id=9837

I don't think that this is a controller problem. I have 2 different PCs (one with an nVidia controller, one with an Intel controller), and the bug follows wherever I put my Plextor PX-740A (set as master, and alone on the IDE channel).
Setting it as slave temporarily resolves the problem.
*** Bug 9837 has been marked as a duplicate of this bug. ***
Richard, please send 'lspci -vvvxxx' outputs for:

- kernel using CONFIG_BLK_DEV_AMD74XX
- kernel using CONFIG_BLK_DEV_GENERIC + "ide_pci_generic.all_generic_ide"
  (+ dmesg output for this one)

and also:

- "bad" kernel using amd74xx
  (one with commit b140b99c413ce410197cfcd4014e757cd745226a)
- "good" kernel using amd74xx
  (one without commit b140b99c413ce410197cfcd4014e757cd745226a)

I'll see if there are any differences in the way the controller is programmed that are visible in PCI configuration space.

Patrice, please send dmesg output for the system with the Intel controller (for the "bad" kernel).

Thanks.
Created attachment 15296 [details] lspci -vvvxxx for 2.6.25-rc5 kernel CONFIG_BLK_DEV_AMD74XX=y
Created attachment 15297 [details] 2.6.25-rc5 dmesg with CONFIG_BLK_DEV_GENERIC=y + "ide_pci_generic.all_generic_ide"
Created attachment 15298 [details] 2.6.25-rc5 lspci with CONFIG_BLK_DEV_GENERIC=y + "ide_pci_generic.all_generic_ide"
Created attachment 15299 [details] dmesg 2.6.25-rc5 kernel CONFIG_BLK_DEV_AMD74XX=y without commit b140b99c413ce410197cfcd4014e757cd745226a
Created attachment 15300 [details] lspci 2.6.25-rc5 kernel CONFIG_BLK_DEV_AMD74XX=y without commit b140b99c413ce410197cfcd4014e757cd745226a
I think that's all of the attachments.
Reply-To: dani@ngrt.de

On Sun, 16 Mar 2008 08:25:05 -0700 (PDT), bugme-daemon@bugzilla.kernel.org wrote:

>http://bugzilla.kernel.org/show_bug.cgi?id=10239
>
>------- Comment #10 from bzolnier@gmail.com 2008-03-16 08:25 -------
>MCP-55 controller again, hmm...
>
>- What error messages are you seeing?

Lots of these:

hda: status error: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: status error: error=0x00 { }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: status error: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: status error: error=0x00 { }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: status error: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: status error: error=0x00 { }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: status error: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: status error: error=0x00 { }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: UDMA/33 mode selected
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete DataRequest }
hda: host max PIO5 wanted PIO255(auto-tune) selected PIO4
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete DataRequest }
hdb: UDMA/33 mode selected
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown

Full dmesg on request.

>- If you boot from slave drive does the master drive work?

No.

Ciao,
Dani
Created attachment 15314 [details] dmesg output on Intel (cdrom is working)

@Bartlomiej: I'm sorry, I don't see the bug on the Intel chipset.

Here is the information:

root@ubuntu:~# uname -r
2.6.24-12-generic
(Hardy system, up to date)
root@ubuntu:~# lspci
00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller (rev 02)
00:01.0 PCI bridge: Intel Corporation 82G33/G31/P35/P31 Express PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801I (ICH9 Family) 2 port SATA IDE Controller (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc RV370 [Sapphire X550 Silent]
01:00.1 Display controller: ATI Technologies Inc RV370 secondary [Sapphire X550 Silent]
02:00.0 Ethernet controller: Attansic Technology Corp. L1 Gigabit Ethernet Adapter (rev b0)
03:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 03)
03:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 03)
05:03.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev c0)
root@ubuntu:~#
downstream bug report: https://bugs.gentoo.org/show_bug.cgi?id=213615
Additional info sought at http://www.jmicron.com/Support_FAQ.html

Reason: Bug [probably] reproduced on an ASUS Maximus Extreme SE motherboard with the latest BIOS, with dual DVD drives connected to the JMicron controller on the motherboard, and a separate 3ware 9620 RAID controller for the HDs.

Bug reproduced on Etch, Sabayon, Knoppix 5.1.1. The HW setup works fine with another OS.

The bug exhibits itself as follows: a bit into the installation, the CD-ROM cannot be found, despite booting fine from the same device first.

Booting DVD: LiteOn DH-16D2S (IDE)
Second DVD: Samsung SMSHS203 (SATA) (? May be a 202, which is IDE)

Though the use of all_generic_ide may work (not yet tried), it seems there are some performance side effects:

"Q6: What are the differences between Legacy mode (IDE) and AHCI mode?
Ans: Legacy mode supports OS through legacy IDE driver. Most SATA functions are not supported in Legacy mode, like SATA II 3G, NCQ, HotPlug and etc. JMicron Technology Corp. delivers the worldwide first AHCI compliant eSATA controller and now most of the Operating Systems are "Native Support", which enables SATA II 3G, NCQ, and Hotplug on JMB36X SATA / eSATA controllers."

If I read this correctly, I will disable the SATA II speed enhancements (and then also on the nice RAID controller?) if I use all_generic_ide or ide_pci_generic.all_generic_ide. That will be a problem.

Workarounds found so far:
- Switch Slave/Master
- Use "ide_pci_generic.all_generic_ide"
- Remove the CD and insert a USB drive early in the process
- hdb=noprobe
- Use a network install
- Let the distribution retry the mounting multiple times, "it then resolves by itself"

I have not tried them yet, but thought a summary might aid those wiser than me.

Apologies if the description is not good enough. First-time writer here.
[ sorry for the delay ]

I investigated the lspci outputs sent by Richard:

--- lspci.amd74xx 2008-03-29 17:16:31.000000000 +0100
+++ 2.6.25-rc5-generic_ide.lspci 2008-03-29 17:17:06.000000000 +0100

the chunk corresponding to 00:09.0 IDE interface...

@@ -400,10 +400,10 @@
 20: 01 f0 00 00 00 00 00 00 00 00 00 00 43 10 11 0c
 30: 00 00 00 00 44 00 00 00 00 00 00 00 00 00 03 01
 40: 43 10 11 0c 01 00 02 00 00 00 00 00 00 09 00 00
-50: 03 f0 00 00 00 00 00 00 a8 20 a8 20 22 00 20 20
+50: 03 f0 00 00 00 00 00 00 a8 20 a8 20 66 00 20 20
                                         ^^

0x5c: 0x22 -> 0x66

0x5c == 0x4c (AMD_ADDRESS_SETUP) + 0x10 (for nVidia)

amd74xx.c::amd_set_speed():
...
	pci_read_config_byte(dev, AMD_ADDRESS_SETUP + offset, &t);
	t = (t & ~(3 << ((3 - dn) << 1))) | ((FIT(timing->setup, 1, 4) - 1) << ((3 - dn) << 1));
	pci_write_config_byte(dev, AMD_ADDRESS_SETUP + offset, t);
...

0x4c: Address Setup Time Register:

7:6 P0ADD Primary Drive 0 Address Setup Time
5:4 P1ADD Primary Drive 1 Address Setup Time
3:2 S0ADD Secondary Drive 0 Address Setup Time
1:0 S1ADD Secondary Drive 1 Address Setup Time

0x22 == 00100010b
0x66 == 01100110b

For some reason the BIOS sets a higher address setup time than the amd74xx driver does (when ide_pci_generic is used, timings are not programmed by the driver and the values left by the BIOS stay in effect).

The timing used by amd74xx is correct w.r.t. drive requirements, the ATA spec and the AMD datasheet, but it could be that nVidia hosts for some reason need the higher timing (or maybe nVidia has different programming requirements for this register).

Richard, could you try the attached patch?

[ There is also another change in PCI configuration space at offsets 0x8d-0x8e... ]
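For readers following the register arithmetic above, here is a minimal Python sketch (not driver code) that decodes the Address Setup Time register into its four 2-bit fields and mimics the read-modify-write from the quoted amd_set_speed() snippet. The field layout and the shift expression come from the comment above; the helper names, the assumption that FIT() is a simple clamp, and the reading of "stored value = setup clocks - 1" are illustrative assumptions.

```python
# Field names for drive numbers dn = 0..3, per the register table above:
# bits 7:6, 5:4, 3:2, 1:0.
FIELDS = ("P0ADD", "P1ADD", "S0ADD", "S1ADD")


def field_shift(dn):
    """Bit position of the 2-bit field for drive number dn: (3 - dn) << 1."""
    return (3 - dn) << 1


def decode_address_setup(value):
    """Split the 8-bit register value into its four 2-bit fields."""
    return {name: (value >> field_shift(dn)) & 3 for dn, name in enumerate(FIELDS)}


def fit(v, lo, hi):
    """Assumed behaviour of the kernel FIT() macro: clamp v into [lo, hi]."""
    return max(lo, min(v, hi))


def set_address_setup(reg, dn, setup_clocks):
    """Mimic the quoted read-modify-write: clear the field, store clocks - 1."""
    shift = field_shift(dn)
    cleared = reg & ~(3 << shift) & 0xFF
    return cleared | ((fit(setup_clocks, 1, 4) - 1) << shift)


def changed_fields(old, new):
    """Return {field: (old_value, new_value)} for fields that differ."""
    a, b = decode_address_setup(old), decode_address_setup(new)
    return {name: (a[name], b[name]) for name in FIELDS if a[name] != b[name]}


if __name__ == "__main__":
    print(decode_address_setup(0x22))
    print(decode_address_setup(0x66))
    print(changed_fields(0x22, 0x66))
```

Under these assumptions, the only fields that differ between 0x22 and 0x66 are the two drive 0 fields (P0ADD and S0ADD), which is consistent with the observation that the value seen with ide_pci_generic carries a longer address setup time for the master drive.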
Created attachment 15505 [details] amd74xx-fix-address-setup-time-for-nvidia-hosts.patch
Daniel, thanks for the link - now it is clear that it is not nVidia specific. However, I still wonder whether this is a generic IDE problem or whether the commit just uncovered some other problems - in any case I'm going to revert the patch for 2.6.25 (this has the cost of making cable detection less reliable...).

To Ubuntu developers: please help us by letting us know about problems early. I'm a bit unhappy with the fact that the issue was initially reported in the beginning of January and I learned about it 2 weeks ago (during the 2.6.25 stabilization phase)...
@Bartlomiej: I'm not an Ubuntu dev, but I reported this bug on kernel.org on 2008-01-28, after reporting it on Launchpad on 2008-01-09 (by then I suspected a kernel bug).

https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/181561
http://bugzilla.kernel.org/show_bug.cgi?id=9837

Best regards
@Patrice: Yeah, I know + thanks for reporting it. The unfortunate thing is that it was all the time under Product: Platform Specific/Hardware + Component: i386 (instead of IO/Storage + IDE). As a result it never reached linux-ide@ ML or me so we've learned about this bug entry only on 2008-03-16 when the link to the previous bug was mentioned under this bug.
@Patrice: PS could you please re-assign the bug to me.
I was quite puzzled why the same problem is not reported for the corresponding libata host drivers, as the "guilty" change was actually based on libata changes (+ the amd74xx and pata_amd host drivers are very similar nowadays).

It seems that commit f58229f8060055b08b34008ea08f31de1e2f003c ("libata-link: implement and use link/device iterators"), which went into 2.6.24, accidentally reverted the "guilty" change (thus making cable detection less reliable):

@@ -2134,18 +2132,16 @@ int ata_bus_probe(struct ata_port *ap)
 	/* after the reset the device state is PIO 0 and the controller
 	   state is undefined. Record the mode */
 
-	for (i = 0; i < ATA_MAX_DEVICES; i++)
-		ap->link.device[i].pio_mode = XFER_PIO_0;
+	ata_link_for_each_dev(dev, &ap->link)
+		dev->pio_mode = XFER_PIO_0;
 
 	/* read IDENTIFY page and configure devices. We have to do the identify
 	   specific sequence bass-ackwards so that PDIAG- is released by
 	   the slave device */
 
-	for (i = ATA_MAX_DEVICES - 1; i >= 0; i--) {
-		dev = &ap->link.device[i];
-
-		if (tries[i])
-			dev->class = classes[i];
+	ata_link_for_each_dev(dev, &ap->link) {
+		if (tries[dev->devno])
+			dev->class = classes[dev->devno];
 
 		if (!ata_dev_enabled(dev))
 			continue;

[ the code above should use ata_link_for_each_dev_reverse() instead ]

N.B. ata_eh_revalidate_and_attach() gets it right, so libata may also currently be affected by the problem, but it would be triggered only on suspend/resume or if somebody decides to rescan/plug in devices.

I guess that we want to fix libata but at the same time verify whether the issue that we've hit with IDE amd74xx/piix also happens with pata_amd/ata_piix?

Tejun?
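To make the ordering change in the hunk above concrete, here is a toy Python model (not libata code) of the two loops: the removed loop walks device numbers downwards so the slave is identified first, while the forward iterator that replaced it visits the master first. The constant value and the master/slave naming are assumptions for illustration only.

```python
# Assumption: a PATA link has two devices, devno 0 (master) and devno 1 (slave).
ATA_MAX_DEVICES = 2
NAMES = {0: "master", 1: "slave"}


def reverse_order(n=ATA_MAX_DEVICES):
    """Order of the removed loop: for (i = n - 1; i >= 0; i--)."""
    return list(range(n - 1, -1, -1))


def forward_order(n=ATA_MAX_DEVICES):
    """Order of a plain forward iteration over the link's devices."""
    return list(range(n))


def describe(order):
    """Translate device numbers into master/slave labels."""
    return [NAMES[devno] for devno in order]


if __name__ == "__main__":
    print("before:", describe(reverse_order()))
    print("after: ", describe(forward_order()))
```

The point of the model is only that the iterator swap silently flipped the IDENTIFY order, which is why the bracketed note above suggests the reverse iterator.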
The patch amd74xx-fix-address-setup-time-for-nvidia-hosts.patch doesn't seem to correct the bug. I'm attaching dmesg and lspci.
Created attachment 15507 [details] 2.6.25-rc5 dmesg with patch amd74xx-fix-address-setup-time-for-nvidia-hosts
Created attachment 15508 [details] 2.6.25-rc5 lspci with patch amd74xx-fix-address-setup-time-for-nvidia-hosts
Created attachment 15509 [details] libata-fix-cable-detection.patch

Thanks for testing, it looks more like a generic problem with some drives now...

Could you also check whether pata_amd with the attached patch works? You need to:

- disable IDE completely (CONFIG_IDE=n)
- enable libata (CONFIG_ATA=y) and pata_amd (CONFIG_PATA_AMD=y)
- enable SCSI disk (CONFIG_BLK_DEV_SD=y) and CD-ROM support (CONFIG_BLK_DEV_SR=y)

[ as a side-effect device names will change from /dev/hd* to /dev/sd* ]
The following bug report says that adding vga=788 to the boot parameters results in a normal boot:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=469189

I can confirm that, but I have no idea why it works that way.
Bartlomiej, thanks for finding the accidental order change, but ata_bus_probe() is deprecated and currently only used by sata_sx4 and SAS. None of the PATA drivers use that path anymore. Everything goes through ata_eh_revalidate_and_attach().

Richard, can you be persuaded into trying pata_amd?
with pata_amd, the dvd writer is working. (I applied libata-fix-cable-detection.patch anyway).
Created attachment 15537 [details] 2.6.25-rc5 dmesg with patch libata-fix-cable-detection.patch and pata_amd
Created attachment 15538 [details] 2.6.25-rc5 lspci with patch libata-fix-cable-detection.patch and pata_amd
So, pata_amd works fine. It doesn't look like the reversed IDENTIFY order is the actual culprit in the probing failure. pata_amd has been doing it that way for a long time now. It was converted to the new EH pretty early and has been using the reverse-order IDENTIFY since it was first applied to EH. I'll forward the fix for the obsolete path upstream.

Thanks.
Hi All!

Sorry for my ... and for my bad English :-)

Could you please look at this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=480679?

This bug was reported to Debian, but it is exactly what I have on my Slackware 12.1 with kernel 2.6.24.5, and I can't find it at http://bugzilla.kernel.org. But the key words of dmesg ("hdc: status timeout: status=0xd0 { Busy } ide: failed opcode was: unknown drive not ready for command") are equivalent to this topic.

Is Debian bug 480679 equivalent to kernel bug 10239, or should it be reported separately?

P.S. In addition to what is in the Debian bug report, I see the following on Slackware:
- not only hald-addon-storage, but also k3b can freeze the system;
- trying to renice the k3b process that eats ~99% CPU has no effect.
This bug was fixed months ago by reverting the change that caused the regression (sorry for the late update):

commit f367bed005b06db7067fc378a5f2253fac54e5d9
Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date: Sat Mar 29 19:48:21 2008 +0100

    Revert "ide: change master/slave IDENTIFY order"

    This reverts commit b140b99c413ce410197cfcd4014e757cd745226a.
    ...

[ also in the end it looked like a generic timing issue in IDE code triggered by my change - the underlying issue may have been fixed in newer kernels (there were a ton of fixes since then) so it would be useful if somebody tried to revert-the-revert and see if the kernel still breaks ]

Richard: thanks for all your help on this. Also, by accident I found a real problem with pata_amd while analyzing the lspci outputs from you:

amd74xx:

00: de 10 65 00 05 00 b0 00 a2 8a 01 01 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 f0 00 00 00 00 00 00 00 00 00 00 43 10 11 0c
30: 00 00 00 00 44 00 00 00 00 00 00 00 00 00 03 01
40: 43 10 11 0c 01 00 02 00 00 00 00 00 00 09 00 00
50: 03 f0 00 00 00 00 00 00 a8 20 a8 20 66 00 20 20

pata_amd:

00: de 10 65 00 05 00 b0 00 a2 8a 01 01 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 f0 00 00 00 00 00 00 00 00 00 00 43 f0 11 0c
30: 00 00 00 00 44 00 00 00 00 00 00 00 00 00 03 01
40: 43 f0 11 0c 01 00 02 00 00 00 00 00 00 09 00 00
50: 03 f0 00 00 00 00 00 00 99 20 99 20 22 00 20 20

pata_amd incorrectly programs FIFO settings at offset 0x41 instead of 0x51 (0x41 seems to be shadowed at 0x2d, so it also results in the wrong "Subsystem" of the PCI device being reported):

amd74xx:

Subsystem: ASUSTeK Computer Inc. Unknown device 0c11

pata_amd:

Subsystem: Unknown device f043:0c11

Tejun: feel free to fix (or forward to Alan) the above issue (I'm too busy with other stuff + amd74xx doesn't have the problem)

sergey: please try some recent kernel (preferably 2.6.26-rc6) and if the issue still happens there, open a new bug
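The comparison of the two configuration-space dumps above can be reproduced with a short Python sketch. It parses the "offset: bytes" rows as printed by lspci -xxx, lists the offsets that differ, and reads the 16-bit little-endian subsystem vendor ID at 0x2c-0x2d (the standard PCI header location). Only the rows that differ are included as sample data; the function and variable names are illustrative, not part of any tool.

```python
# Rows 0x20, 0x40 and 0x50 copied from the two dumps in the comment above.
AMD74XX_DUMP = """
20: 01 f0 00 00 00 00 00 00 00 00 00 00 43 10 11 0c
40: 43 10 11 0c 01 00 02 00 00 00 00 00 00 09 00 00
50: 03 f0 00 00 00 00 00 00 a8 20 a8 20 66 00 20 20
"""

PATA_AMD_DUMP = """
20: 01 f0 00 00 00 00 00 00 00 00 00 00 43 f0 11 0c
40: 43 f0 11 0c 01 00 02 00 00 00 00 00 00 09 00 00
50: 03 f0 00 00 00 00 00 00 99 20 99 20 22 00 20 20
"""


def parse_dump(text):
    """Parse 'OFFSET: b0 b1 ...' rows into a {offset: byte_value} mapping."""
    config = {}
    for line in text.strip().splitlines():
        base, _, data = line.partition(":")
        for i, byte in enumerate(data.split()):
            config[int(base, 16) + i] = int(byte, 16)
    return config


def diff_dumps(a, b):
    """Return (offset, value_in_a, value_in_b) for every differing offset."""
    common = sorted(a.keys() & b.keys())
    return [(off, a[off], b[off]) for off in common if a[off] != b[off]]


def subsystem_vendor(config):
    """Subsystem vendor ID: 16-bit little-endian value at offsets 0x2c-0x2d."""
    return config[0x2C] | (config[0x2D] << 8)


if __name__ == "__main__":
    amd74xx = parse_dump(AMD74XX_DUMP)
    pata_amd = parse_dump(PATA_AMD_DUMP)
    for off, old, new in diff_dumps(amd74xx, pata_amd):
        print(f"0x{off:02x}: 0x{old:02x} -> 0x{new:02x}")
    print(f"subsystem vendor: 0x{subsystem_vendor(amd74xx):04x} vs "
          f"0x{subsystem_vendor(pata_amd):04x}")
```

With the rows above, the differing offsets are 0x2d and 0x41 (the byte discussed in the comment) plus the timing bytes at 0x58, 0x5a and 0x5c, and the subsystem vendor reads 0x1043 versus 0xf043, matching the "f043:0c11" shown by lspci for pata_amd.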
Patrice: please close this bug