Bug 9261 (hpt37x_UDMA-33)
Summary: | (pata hpt374) Mishandling of port 3/4 special cases ? | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Bjoern Olausson (lkmlist) |
Component: | Serial ATA | Assignee: | Alan (alan) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | bunk |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.24-rc5 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
Proposed fixes for HPT37x problems
hpt_is_fixed
patched 2.6.23.1 dmesg
stock 2.6.24-rc5 dmesg
dmesg rc1
dmesg rc2
dmesg rc3
dmesg rc4
pata_hpt37x: Patch segment to reverse (patch -R)
Correct diff I hope patch -R drivers/ata/pata_hpt37x < a1
demsg 2.6.24-rc5_patched
Proposed fix
dmesg 2.6.24-rc5_proposed-fix |
Description
Bjoern Olausson
2007-10-29 11:03:40 UTC
Here's the hw setup:

H/W path            Device     Class       Description
======================================================
                               system      P4V88
/0                             bus         P4V88
/0/0                           memory      64KB BIOS
/0/3                           processor   Intel(R) Celeron(R) CPU 2.66GHz
/0/3/4                         memory      16KB L1 cache
/0/3/5                         memory      256KB L2 cache
/0/3/6                         memory      L3 cache
/0/1                           memory      896MB System memory
/0/100                         bridge      PT880 Host Bridge
/0/100/1                       bridge      VT8237 PCI Bridge
/0/100/1/0                     display     NV43 [GeForce 6200]
/0/100/a            scsi2      storage     HPT374
/0/100/a/0          /dev/sda   disk        189GB Maxtor 6Y200P0
/0/100/a/0/1        /dev/sda1  volume      189GB Linux raid autodetect partition
/0/100/a/1          /dev/sdb   disk        189GB Maxtor 6Y200P0
/0/100/a/1/1        /dev/sdb1  volume      189GB Linux raid autodetect partition
/0/100/a/2          /dev/sdc   disk        233GB Maxtor 6L250R0
/0/100/a/2/1        /dev/sdc1  volume      232GB Linux raid autodetect partition
/0/100/a/3          /dev/sdd   disk        233GB Maxtor 6L250R0
/0/100/a/3/1        /dev/sdd1  volume      232GB Linux raid autodetect partition
/0/100/a.1          scsi5      storage     HPT374
/0/100/a.1/0.0.0    /dev/sde   disk        233GB Maxtor 5A250J0
/0/100/a.1/0.0.0/1  /dev/sde1  volume      232GB Linux raid autodetect partition
/0/100/a.1/0.1.0    /dev/sdf   disk        232GB HDT722525DLAT80
/0/100/a.1/0.1.0/1  /dev/sdf1  volume      232GB Linux raid autodetect partition
/0/100/b                       multimedia  SB Live! EMU10k1
/0/100/b.1                     input       SB Live! Game Port
/0/100/c            eth0       network     DGE-528T Gigabit Ethernet Adapter
/0/100/d            wifi0      network     AR5212 802.11abg NIC
/0/100/f                       storage     VIA VT6420 SATA RAID Controller
/0/100/f.1          scsi6      storage     VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master
/0/100/f.1/0.0.0    /dev/sdg   disk        152GB Maxtor 6Y160P0
/0/100/f.1/0.0.0/1  /dev/sdg1  volume      47MB Linux raid autodetect partition
/0/100/f.1/0.0.0/2  /dev/sdg2  volume      972MB Linux swap / Solaris partition
/0/100/f.1/0.0.0/3  /dev/sdg3  volume      151GB Linux raid autodetect partition
/0/100/f.1/0        /dev/sdh   disk        152GB Maxtor 6Y160P0
/0/100/f.1/0/1      /dev/sdh1  volume      47MB Linux raid autodetect partition
/0/100/f.1/0/2      /dev/sdh2  volume      972MB Linux swap / Solaris partition
/0/100/f.1/0/3      /dev/sdh3  volume      151GB Linux raid autodetect partition
/0/100/f.1/1                   disk        ROM-DRIVE-52MAX
/0/100/10                      bus         VT82xxxxx UHCI USB 1.1 Controller
/0/100/10/1         usb2       bus         UHCI Host Controller
/0/100/10.1                    bus         VT82xxxxx UHCI USB 1.1 Controller
/0/100/10.1/1       usb3       bus         UHCI Host Controller
/0/100/10.2                    bus         VT82xxxxx UHCI USB 1.1 Controller
/0/100/10.2/1       usb4       bus         UHCI Host Controller
/0/100/10.3                    bus         VT82xxxxx UHCI USB 1.1 Controller
/0/100/10.3/1       usb5       bus         UHCI Host Controller
/0/100/10.4                    bus         USB 2.0
/0/100/10.4/1       usb1       bus         EHCI Host Controller
/0/100/11                      bridge      VT8237 ISA bridge [KT600/K8T800/K8T890 South]
/0/100/11.5                    multimedia  VT8233/A/8235/8237 AC97 Audio Controller
/0/100/12           eth1       network     VT6102 [Rhine-II]
/0/101                         bridge      PT880 Host Bridge
/0/102                         bridge      PT880 Host Bridge
/0/103                         bridge      PT880 Host Bridge
/0/104                         bridge      PT880 Host Bridge
/0/105                         bridge      PT880 Host Bridge
/1                  dummy0     network     Ethernet interface

Odd - it does look like one of your ports isn't properly wired (e.g. a loose pin), but it's always hard to be sure it's not a weird software bug. Could it also be the cable, if you are moving drives between cables rather than cables between ports?

I guess it's the port of the controller...
If I keep the cable and switch the port, everything works.
If I switch the cable and keep the port --> bug.
So now there are two options: the port of the controller is damaged, or there is a bug in the driver. Any ideas how I can eliminate one of those two options?

One more thing I noticed: attached are a Hitachi DeskStar disk and a Maxtor disk. If I run Drive Fitness Test with the two disks connected to the damaged port, DFT hangs and will not recover. If I remove only the Hitachi disk, DFT detects all other drives.
Is there some way to test the controller port? Any ideas are welcome to eliminate or prove the hardware defect.
regards
Bjoern

I've been having a look over this. There are a small number of things that ports 3/4 do differently from ports 1/2. I've audited those and double-checked against the reference information, which is a bit limited unfortunately. From that I've got a patch you can try, which may help, hinder, or do nothing, but is worth trying I think.

Will attach it in a moment.

Created attachment 13400 [details]
Proposed fixes for HPT37x problems
Created attachment 13406 [details]
hpt_is_fixed
I attached the two disks back to the "faulty" port and they are working again.
Compare the patched kernel output below with the output in the first post.
Thanks for the fix!
Maybe this fix will find its way into 2.6.23.x?
Currently I can't run 2.6.24, because the layer7 patches do not work on it yet.
ata5.00: ATA-7: Maxtor 5A250J0, RAM51VV0, max UDMA/133
ata5.00: 490234752 sectors, multi 16: LBA48
ata5.01: ATA-7: HDT722525DLAT80, V44OA96A, max UDMA/133
ata5.01: 488397168 sectors, multi 16: LBA48
Find mode for 12 reports A81F442
Find mode for 12 reports A81F442
Find mode for DMA 69 reports 12848242
Find mode for DMA 69 reports 12848242
ata5.00: configured for UDMA/100
ata5.01: configured for UDMA/100
I attached the full dmesg output.
Thanks
Bjoern
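
To compare the excerpt above against another dmesg capture, a small script can pull out each drive's reported maximum mode and the mode it was finally configured for. This is only a sketch: the function name `summarize` and the regular expressions are illustrative assumptions based on the line shapes quoted in this comment, not part of the driver or of any kernel tool, and other kernels may format these lines differently.

```python
import re

# Assumed line shapes, taken from the excerpt in this report:
#   "ata5.00: ATA-7: Maxtor 5A250J0, RAM51VV0, max UDMA/133"
#   "ata5.00: configured for UDMA/100"
IDENTIFIED = re.compile(r"(ata\d+\.\d+): ATA-\d+: (.+?), .*max (\S+)$")
CONFIGURED = re.compile(r"(ata\d+\.\d+): configured for (\S+)$")


def summarize(dmesg_text):
    """Map each ataN.MM device to its model, max mode and configured mode."""
    devices = {}
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        found = IDENTIFIED.search(line)
        if found:
            dev, model, max_mode = found.groups()
            devices.setdefault(dev, {}).update(model=model, max=max_mode)
            continue
        found = CONFIGURED.search(line)
        if found:
            dev, mode = found.groups()
            devices.setdefault(dev, {})["configured"] = mode
    return devices


# The lines quoted in the comment above, used as sample input.
SAMPLE = """\
ata5.00: ATA-7: Maxtor 5A250J0, RAM51VV0, max UDMA/133
ata5.00: 490234752 sectors, multi 16: LBA48
ata5.01: ATA-7: HDT722525DLAT80, V44OA96A, max UDMA/133
ata5.01: 488397168 sectors, multi 16: LBA48
Find mode for 12 reports A81F442
Find mode for DMA 69 reports 12848242
ata5.00: configured for UDMA/100
ata5.01: configured for UDMA/100
"""

if __name__ == "__main__":
    for dev, info in sorted(summarize(SAMPLE).items()):
        print(dev, info.get("model"), info.get("max"), "->", info.get("configured"))
```

Running the same function over two saved dmesg files makes it easy to spot a drive that is missing or configured at a different mode in one of them.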
Will push into 2.6.24. Not my call if it ends up in 2.6.23.x, but it may well do if it shows no other problems.

I'll try to make the patch work against 2.6.23.1 to see if it causes any trouble...
thanks
Bjoern

Pushed upstream along with another cable fix Sergei noticed was needed.

Thanks a lot!

Sorry to reopen this bug: the patch works for 2.6.23.1, but 2.6.24-rc5 does not work.
I tried the latest kernel, 2.6.24-rc5, but got the following misbehavior: to make it short, the kernel failed to detect two 250GB Maxtor drives.
Rebooting the patched 2.6.23.1 kernel --> everything works again.
See the attached dmesg output for the 2.6.24-rc5 and 2.6.23.1 kernels.
regards
Bjoern

Created attachment 13999 [details]
patched 2.6.23.1 dmesg
Created attachment 14000 [details]
stock 2.6.24-rc5 dmesg
Any chance you can build -rc2 and let me know if that works (or ideally which -rc breaks it)?

Sure I can. I'll start with -rc1 and crawl upwards. Results should be here within a few hours ;-)
regards
Bjoern

Created attachment 14006 [details]
dmesg rc1
It breaks between rc1 and rc2, and all later release candidates fail as well,
so rc1 is the last working one.
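
The "start with -rc1 and crawl upwards" approach amounts to a linear scan for the first failing release. A minimal sketch of that bookkeeping follows; the function name `first_failing` and the data layout are illustrative assumptions, and the pass/fail values simply restate what is reported in this thread (rc1 works, rc2 through rc5 do not).

```python
def first_failing(results):
    """Scan (version, worked) pairs in release order.

    Returns (last_good, first_bad); either may be None if no such
    version exists in the list.
    """
    last_good = None
    for version, worked in results:
        if not worked:
            return last_good, version
        last_good = version
    return last_good, None


# Outcomes as reported in this bug: all drives detected (True) or not (False).
RESULTS = [
    ("2.6.24-rc1", True),
    ("2.6.24-rc2", False),
    ("2.6.24-rc3", False),
    ("2.6.24-rc4", False),
    ("2.6.24-rc5", False),
]

if __name__ == "__main__":
    good, bad = first_failing(RESULTS)
    print("last good:", good, "first bad:", bad)
```

With only five release candidates a linear scan is cheap; the regression is then known to lie in the changes between the last good and first bad versions.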
Created attachment 14007 [details]
dmesg rc2
Created attachment 14008 [details]
dmesg rc3
Created attachment 14009 [details]
dmesg rc4
Can you try reversing the following patch segment and let me know if this fixes it?

Created attachment 14077 [details]
pata_hpt37x: Patch segment to reverse (patch -R)
Comment on attachment 14077 [details]
pata_hpt37x: Patch segment to reverse (patch -R)
Wrong diff sorry
Created attachment 14078 [details]
Correct diff I hope
patch -R drivers/ata/pata_hpt37x < a1
Created attachment 14082 [details]
demsg 2.6.24-rc5_patched
Your patch did the trick.
All drives were found.
Thanks
16:42:08 [/usr/src/linux-2.6.24-rc5]
root@enterprise $ patch -R drivers/ata/pata_hpt37x.c < ../a1.txt
patching file drivers/ata/pata_hpt37x.c
Hunk #1 succeeded at 359 (offset -2 lines).
Can you try the following change *instead*? This, I think, fixes the bug. If it does, then I'll push that to Linus for 2.6.24 final; if not, I'll push the backout you've tested.

Thanks a lot for the debugging.

Created attachment 14088 [details]
Proposed fix
Will check it in a day or two. Thanks for the help.

Created attachment 14108 [details]
dmesg 2.6.24-rc5_proposed-fix
Fix works. Thanks a lot!
I didn't stress test. I just dropped to /bin/bb, dumped dmesg and rebooted 2.6.23.1
have a nice christmas time ;)
regards
Bjoern
Patch was applied as commit f941b168a4d7281bf49e166f2febc49470c0149f in Linus' tree. |