Most recent kernel where this bug did not occur: -/-
Distribution: Gentoo
Hardware Environment:
Software Environment:
System uname: 2.6.23 i686 Intel(R) Celeron(R) CPU 2.66GHz
Timestamp of tree: Sun, 28 Oct 2007 21:50:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.4 [enabled]
app-shells/bash: 3.2_p17
dev-lang/python: 2.4.4-r6
dev-python/pycrypto: 2.0.1-r6
dev-util/ccache: 2.4-r7
sys-apps/baselayout: 1.12.9-r2
sys-apps/sandbox: 1.2.18.1-r2
sys-devel/autoconf: 2.13, 2.61-r1
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils: 2.18-r1
sys-devel/gcc-config: 1.3.16
sys-devel/libtool: 1.5.24
virtual/os-headers: 2.6.22-r2
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=pentium4 -O2 -pipe -msse2 -mfpmath=sse -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"

Problem Description:
First of all: does this look like a hardware defect on the controller, or is it a bug? If you tend to think it is a hardware defect, please let me know and close this bug.

When I connect an HDD to a certain port (interface 3) of the HighPoint RocketRAID 454, the kernel shows the following:

ata5.00: ATA-7: Maxtor 5A250J0, RAM51VV0, max UDMA/133
ata5.00: 490234752 sectors, multi 16: LBA48
ata5.01: ATA-7: HDT722525DLAT80, V44OA96A, max UDMA/133
ata5.01: 488397168 sectors, multi 16: LBA48
ata5.00: limited to UDMA/33 due to 40-wire cable
ata5.01: limited to UDMA/33 due to 40-wire cable
Find mode for 12 reports A81F442
Find mode for 12 reports A81F442
Find mode for DMA 66 reports 120C8242
Find mode for DMA 66 reports 120C8242
ata5.00: configured for UDMA/33
ata5.01: failed to IDENTIFY (I/O error, err_mask=0x2)
ata5.01: revalidation failed (errno=-5)
ata5: failed to recover some devices, retrying in 5 secs
ata5.01: failed to IDENTIFY (I/O error, err_mask=0x2)
ata5.01: revalidation failed (errno=-5)
ata5.01: limiting speed to UDMA/33:PIO3
ata5: failed to recover some devices, retrying in 5 secs
ata5.01: failed to IDENTIFY (I/O error, err_mask=0x2)
ata5.01: revalidation failed (errno=-5)
ata5.01: disabled
ata5: failed to recover some devices, retrying in 5 secs
ata5.00: failed to IDENTIFY (I/O error, err_mask=0x40)
ata5.00: revalidation failed (errno=-5)
ata5: failed to recover some devices, retrying in 5 secs
Find mode for 12 reports A81F442
Find mode for DMA 66 reports 120C8242
ata5.00: configured for UDMA/33
ata5: EH pending after completion, repeating EH (cnt=4)

Connecting the cable to another port (interface 4) works fine:

ata6: PATA max UDMA/100 cmd 0x0001a400 ctl 0x0001a002 bmdma 0x00019808 irq 19
ata6.00: ATA-7: Maxtor 5A250J0, RAM51VV0, max UDMA/133
ata6.00: 490234752 sectors, multi 16: LBA48
ata6.01: ATA-7: HDT722525DLAT80, V44OA96A, max UDMA/133
ata6.01: 488397168 sectors, multi 16: LBA48
Find mode for 12 reports A81F442
Find mode for 12 reports A81F442
Find mode for DMA 69 reports 12848242
Find mode for DMA 69 reports 12848242
ata6.00: configured for UDMA/100
ata6.01: configured for UDMA/100

All other ports (1, 2, 4) are working fine; only port 3 is buggy.

Steps to reproduce:
Connect a UDMA/100-capable cable to interface 3 of the controller and boot.

Reproducible: Always (with one or with both disks)

Full dmesg here:
http://olausson.name/temp/HPT374_failed
http://olausson.name/temp/HPT374_worked

Thanks and regards
Bjoern
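For comparing the failing and working boots, a small script can condense the libata messages quoted above into a per-device summary. This is only a sketch: the function name `summarize_ata_log` and the sample text are made up for illustration, and the patterns cover just the three message types that appear in this report ("configured for", "failed to IDENTIFY", "disabled"), not everything libata can print.

```python
import re
from collections import defaultdict

# Patterns for the three libata message types quoted in this report.
# They are searched anywhere in the line, so a leading dmesg timestamp is tolerated.
CONFIGURED = re.compile(r"(ata\d+\.\d+): configured for (\S+)")
IDENTIFY_FAILED = re.compile(r"(ata\d+\.\d+): failed to IDENTIFY")
DISABLED = re.compile(r"(ata\d+\.\d+): disabled")

# Hypothetical excerpt assembled from lines in the report (not a full dmesg).
SAMPLE_LOG = """\
ata5.00: configured for UDMA/33
ata5.01: failed to IDENTIFY (I/O error, err_mask=0x2)
ata5.01: revalidation failed (errno=-5)
ata5.01: failed to IDENTIFY (I/O error, err_mask=0x2)
ata5.01: limiting speed to UDMA/33:PIO3
ata5.01: failed to IDENTIFY (I/O error, err_mask=0x2)
ata5.01: disabled
ata5.00: failed to IDENTIFY (I/O error, err_mask=0x40)
ata5.00: configured for UDMA/33
ata6.00: configured for UDMA/100
ata6.01: configured for UDMA/100
"""


def summarize_ata_log(text):
    """Map each device (e.g. 'ata5.01') to its last configured mode,
    the number of IDENTIFY failures, and whether it was disabled."""
    summary = defaultdict(
        lambda: {"mode": None, "identify_failures": 0, "disabled": False}
    )
    for line in text.splitlines():
        match = CONFIGURED.search(line)
        if match:
            summary[match.group(1)]["mode"] = match.group(2)
            continue
        match = IDENTIFY_FAILED.search(line)
        if match:
            summary[match.group(1)]["identify_failures"] += 1
            continue
        match = DISABLED.search(line)
        if match:
            summary[match.group(1)]["disabled"] = True
    return dict(summary)


if __name__ == "__main__":
    for device, info in sorted(summarize_ata_log(SAMPLE_LOG).items()):
        print(device, info)
```

Running it over both full dmesg files would make it easy to see at a glance that only the devices on ata5 show IDENTIFY failures, while the same drives on ata6 are configured for UDMA/100.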
Here's the hw setup:

H/W path            Device     Class       Description
======================================================
                               system      P4V88
/0                             bus         P4V88
/0/0                           memory      64KB BIOS
/0/3                           processor   Intel(R) Celeron(R) CPU 2.66GHz
/0/3/4                         memory      16KB L1 cache
/0/3/5                         memory      256KB L2 cache
/0/3/6                         memory      L3 cache
/0/1                           memory      896MB System memory
/0/100                         bridge      PT880 Host Bridge
/0/100/1                       bridge      VT8237 PCI Bridge
/0/100/1/0                     display     NV43 [GeForce 6200]
/0/100/a            scsi2      storage     HPT374
/0/100/a/0          /dev/sda   disk        189GB Maxtor 6Y200P0
/0/100/a/0/1        /dev/sda1  volume      189GB Linux raid autodetect partition
/0/100/a/1          /dev/sdb   disk        189GB Maxtor 6Y200P0
/0/100/a/1/1        /dev/sdb1  volume      189GB Linux raid autodetect partition
/0/100/a/2          /dev/sdc   disk        233GB Maxtor 6L250R0
/0/100/a/2/1        /dev/sdc1  volume      232GB Linux raid autodetect partition
/0/100/a/3          /dev/sdd   disk        233GB Maxtor 6L250R0
/0/100/a/3/1        /dev/sdd1  volume      232GB Linux raid autodetect partition
/0/100/a.1          scsi5      storage     HPT374
/0/100/a.1/0.0.0    /dev/sde   disk        233GB Maxtor 5A250J0
/0/100/a.1/0.0.0/1  /dev/sde1  volume      232GB Linux raid autodetect partition
/0/100/a.1/0.1.0    /dev/sdf   disk        232GB HDT722525DLAT80
/0/100/a.1/0.1.0/1  /dev/sdf1  volume      232GB Linux raid autodetect partition
/0/100/b                       multimedia  SB Live! EMU10k1
/0/100/b.1                     input       SB Live! Game Port
/0/100/c            eth0       network     DGE-528T Gigabit Ethernet Adapter
/0/100/d            wifi0      network     AR5212 802.11abg NIC
/0/100/f                       storage     VIA VT6420 SATA RAID Controller
/0/100/f.1          scsi6      storage     VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master
/0/100/f.1/0.0.0    /dev/sdg   disk        152GB Maxtor 6Y160P0
/0/100/f.1/0.0.0/1  /dev/sdg1  volume      47MB Linux raid autodetect partition
/0/100/f.1/0.0.0/2  /dev/sdg2  volume      972MB Linux swap / Solaris partition
/0/100/f.1/0.0.0/3  /dev/sdg3  volume      151GB Linux raid autodetect partition
/0/100/f.1/0        /dev/sdh   disk        152GB Maxtor 6Y160P0
/0/100/f.1/0/1      /dev/sdh1  volume      47MB Linux raid autodetect partition
/0/100/f.1/0/2      /dev/sdh2  volume      972MB Linux swap / Solaris partition
/0/100/f.1/0/3      /dev/sdh3  volume      151GB Linux raid autodetect partition
/0/100/f.1/1                   disk        ROM-DRIVE-52MAX
/0/100/10                      bus         VT82xxxxx UHCI USB 1.1 Controller
/0/100/10/1         usb2       bus         UHCI Host Controller
/0/100/10.1                    bus         VT82xxxxx UHCI USB 1.1 Controller
/0/100/10.1/1       usb3       bus         UHCI Host Controller
/0/100/10.2                    bus         VT82xxxxx UHCI USB 1.1 Controller
/0/100/10.2/1       usb4       bus         UHCI Host Controller
/0/100/10.3                    bus         VT82xxxxx UHCI USB 1.1 Controller
/0/100/10.3/1       usb5       bus         UHCI Host Controller
/0/100/10.4                    bus         USB 2.0
/0/100/10.4/1       usb1       bus         EHCI Host Controller
/0/100/11                      bridge      VT8237 ISA bridge [KT600/K8T800/K8T890 South]
/0/100/11.5                    multimedia  VT8233/A/8235/8237 AC97 Audio Controller
/0/100/12           eth1       network     VT6102 [Rhine-II]
/0/101                         bridge      PT880 Host Bridge
/0/102                         bridge      PT880 Host Bridge
/0/103                         bridge      PT880 Host Bridge
/0/104                         bridge      PT880 Host Bridge
/0/105                         bridge      PT880 Host Bridge
/1                  dummy0     network     Ethernet interface
Odd - it does look like one of your ports isn't properly wired (e.g. a loose pin), but it's always hard to be sure it's not a weird software bug. It could also be the cable, if you are moving drives between cables rather than cables between ports?
I guess it's the port of the controller. If I keep the cable and switch the port, everything works. If I switch the cable and keep the port, the bug appears. So there are two options: either the port of the controller is damaged, or there is a bug in the driver. Any ideas how I can eliminate one of the two?

One more thing I noticed: a Hitachi Deskstar disk and a Maxtor disk are attached. If I run Drive Fitness Test with the two disks connected to the damaged port, DFT hangs and does not recover. If I remove only the Hitachi disk, DFT detects all other drives.

Is there some way to test the controller port? Any ideas that help rule out or prove the hardware defect are welcome.

regards
Bjoern
I've been having a look over this. There are a small number of things that ports 3/4 do differently from ports 1/2. I've audited those and double-checked against the reference information, which is unfortunately a bit limited. From that I've got a patch you can try, which may help, hinder, or do nothing, but is worth trying, I think. Will attach it in a moment.
Created attachment 13400 [details] Proposed fixes for HPT37x problems
Created attachment 13406 [details] hpt_is_fixed

I attached the two disks back to the "faulty" port and they are working again. Compare the patched kernel output with the output in the first post. Thanks for the fix! Maybe this fix will find its way into 2.6.23.x? Currently I can't run 2.6.24 because the layer7 patches do not work on it yet.

ata5.00: ATA-7: Maxtor 5A250J0, RAM51VV0, max UDMA/133
ata5.00: 490234752 sectors, multi 16: LBA48
ata5.01: ATA-7: HDT722525DLAT80, V44OA96A, max UDMA/133
ata5.01: 488397168 sectors, multi 16: LBA48
Find mode for 12 reports A81F442
Find mode for 12 reports A81F442
Find mode for DMA 69 reports 12848242
Find mode for DMA 69 reports 12848242
ata5.00: configured for UDMA/100
ata5.01: configured for UDMA/100

I attached the full dmesg output.

Thanks
Bjoern
Will push into 2.6.24. It's not my call whether it ends up in 2.6.23.x, but it may well do if it shows no other problems.
I'll try to make the patch work against 2.6.23.1 to see if it causes any trouble.

thanks
Bjoern
Pushed upstream along with another cable fix Sergei noticed was needed
Thanks a lot!
Sorry to reopen this bug: the patch works for 2.6.23.1, but 2.6.24-rc5 does not work. I tried the latest kernel, 2.6.24-rc5, and got the following misbehavior. To make it short, the kernel "embezzled" (failed to detect) two 250GB Maxtor drives. After rebooting into the patched 2.6.23.1 kernel, everything works again. See the attached dmesg output for the 2.6.24-rc5 and 2.6.23.1 kernels.

regards
Bjoern
Created attachment 13999 [details] patched 2.6.23.1 dmesg
Created attachment 14000 [details] stock 2.6.24-rc5 dmesg
Any chance you can build -rc2 and let me know if that works (or ideally which -rc breaks it)?
Sure I can. I'll start with -rc1 and crawl upwards. Results should be here within a few hours ;-)

regards
Bjoern
Created attachment 14006 [details] dmesg rc1

The breakage appears between rc1 and rc2, and all later release candidates fail as well, so rc1 is the last working one.
Created attachment 14007 [details] dmesg rc2
Created attachment 14008 [details] dmesg rc3
Created attachment 14009 [details] dmesg rc4
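The release candidates here were tested by hand, one after another. With a longer list of candidates, the same question ("which is the first version that fails?") can be answered with fewer builds by binary search, which is also the idea behind `git bisect`. The following is a minimal sketch, assuming the failure is monotonic (once a version is bad, all later ones are bad). The names `first_bad`, `VERSIONS` and `REPORTED` are hypothetical, and the lookup table simply replays the results reported in this thread instead of building and booting kernels.

```python
def first_bad(versions, is_good):
    """Return the first version for which is_good() is False.

    Assumes versions are in release order and that every version after
    the first bad one is bad too. The first version must be good and
    the last one bad, otherwise there is nothing to search for.
    """
    lo, hi = 0, len(versions) - 1
    if not is_good(versions[lo]):
        raise ValueError("oldest version is already bad")
    if is_good(versions[hi]):
        raise ValueError("newest version is still good")
    # Invariant: versions[lo] is good, versions[hi] is bad.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Results as reported in this thread: rc1 works, rc2 through rc5 fail.
VERSIONS = ["2.6.24-rc1", "2.6.24-rc2", "2.6.24-rc3", "2.6.24-rc4", "2.6.24-rc5"]
REPORTED = {
    "2.6.24-rc1": True,
    "2.6.24-rc2": False,
    "2.6.24-rc3": False,
    "2.6.24-rc4": False,
    "2.6.24-rc5": False,
}

if __name__ == "__main__":
    print(first_bad(VERSIONS, lambda version: REPORTED[version]))
```

In practice `is_good` would stand for "build this kernel, boot it, and check that all drives are detected", so each call is expensive and keeping the number of calls low matters.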
Can you try reversing the following patch segment and let me know if this fixes it?
Created attachment 14077 [details] pata_hpt37x: Patch segment to reverse (patch -R)
Comment on attachment 14077 [details] pata_hpt37x: Patch segment to reverse (patch -R)

Wrong diff, sorry.
Created attachment 14078 [details] Correct diff, I hope

patch -R drivers/ata/pata_hpt37x.c < a1
Created attachment 14082 [details] dmesg 2.6.24-rc5_patched

Your patch did the trick. All drives were found. Thanks

16:42:08 [/usr/src/linux-2.6.24-rc5] root@enterprise $ patch -R drivers/ata/pata_hpt37x.c < ../a1.txt
patching file drivers/ata/pata_hpt37x.c
Hunk #1 succeeded at 359 (offset -2 lines).
Can you try the following change *instead*? I think this fixes the bug. If it does, I'll push it to Linus for 2.6.24 final; if not, I'll push the backout you've tested. Thanks a lot for the debugging.
Created attachment 14088 [details] Proposed fix
Will check it in a day or two. Thanks for the help
Created attachment 14108 [details] dmesg 2.6.24-rc5_proposed-fix

The fix works. Thanks a lot! I didn't stress test it; I just dropped to /bin/bb, dumped dmesg and rebooted into 2.6.23.1.

Have a nice Christmas time ;)

regards
Bjoern
Patch was applied as commit f941b168a4d7281bf49e166f2febc49470c0149f in Linus' tree.