Bug 9309
| Summary: | Drive seagate ST380011AS needs to be blacklisted | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Hans-Joachim Baader (Hans-Joachim.Baader) |
| Component: | Serial ATA | Assignee: | Tejun Heo (htejun) |
| Status: | REJECTED UNREPRODUCIBLE | | |
| Severity: | normal | CC: | akpm, protasnb |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.22.11 | Subsystem: | |
| Regression: | --- | Bisected commit-id: | |
| Attachments: | Output of lspci -v, dmesg before patch, dmesg after patch | | |
Description
Hans-Joachim Baader
2007-11-05 10:31:19 UTC
Reply-To: akpm@linux-foundation.org

On Mon, 5 Nov 2007 10:31:20 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> (haven't checked yet if it works)

When you have done so, please send a tested patch to

Jeff Garzik <jeff@garzik.org>
Andrew Morton <akpm@linux-foundation.org>
linux-ide@vger.kernel.org

thanks.

A timeout tells us nothing, sorta like this bug report :)

There is definitely -zero- information implying that this should be added to an NCQ blacklist.

Furthermore, on hardware like Intel ICH, a timeout is the only way we have to know that a DMA error occurred (it only sends an interrupt on success; assumes OS will notice failure via timeout).

It always recovers from the timeout, so I don't see any problem here. Of course, a full dmesg and lspci might shed additional light.

The driver is complaining about missing CLO support, so the controller seems to be a non-Intel variant of AHCI, or if an Intel one, a pretty early one. Anyway, the most likely cause is hardware issues.

* Does 'smartctl -a /dev/sdX' indicate any problem?
* Perform common hardware debugging - swap / reseat cables, connect the hard drive to a separate power supply, etc.

We have several of these machines; only the two with the new ATA drivers cause problems. The others run a 2.6.13.5 kernel. smartctl doesn't indicate problems.

Suggested patch:

    --- drivers/ata/libata-core.c.orig 2007-10-22 11:34:23.000000000 +0200
    +++ drivers/ata/libata-core.c 2007-11-09 19:05:31.000000000 +0100
    @@ -3789,6 +3789,7 @@
     /* NCQ is broken */
     { "Maxtor 6L250S0", "BANC1G10", ATA_HORKAGE_NONCQ },
     { "Maxtor 6B200M0", "BANC1B10", ATA_HORKAGE_NONCQ },
    +{ "ST380011AS", "3.00", ATA_HORKAGE_NONCQ },
     /* NCQ hard hangs device under heavier load, needs hard power cycle */
     { "Maxtor 6B250S0", "BANC1B70", ATA_HORKAGE_NONCQ },
     /* Blacklist entries taken from Silicon Image 3124/3132

I attach lspci, dmesg output (before and after patch)

Created attachment 13482 [details]
Output of lspci -v
Created attachment 13483 [details]
dmesg before patch
Created attachment 13484 [details]
dmesg after patch
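
For readers unfamiliar with how a blacklist entry such as the one in the suggested patch takes effect, here is a minimal userspace sketch of a model/firmware lookup table. It is not the kernel's code: every `sketch_` name is hypothetical, the flag value is assumed, and matching is plain exact string comparison, which may differ from what libata actually does. It only illustrates the idea that a matching model and firmware revision yields a flag that turns NCQ off for that device.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative flag. The name echoes ATA_HORKAGE_NONCQ from the patch;
 * the numeric value here is an assumption made for this sketch. */
#define SKETCH_HORKAGE_NONCQ (1UL << 2)

/* Hypothetical table entry: model string, firmware revision, quirk flags. */
struct sketch_blacklist_entry {
    const char *model_num;  /* model string reported by the drive */
    const char *model_rev;  /* firmware revision, or NULL to match any */
    unsigned long horkage;  /* quirk flags applied on a match */
};

/* Entries copied from the patch context; the last real row is the
 * proposed addition. The table ends with an all-NULL sentinel. */
static const struct sketch_blacklist_entry sketch_blacklist[] = {
    { "Maxtor 6L250S0", "BANC1G10", SKETCH_HORKAGE_NONCQ },
    { "Maxtor 6B200M0", "BANC1B10", SKETCH_HORKAGE_NONCQ },
    { "ST380011AS",     "3.00",     SKETCH_HORKAGE_NONCQ },
    { NULL, NULL, 0 }
};

/* Return the quirk flags for a drive, or 0 if it is not listed.
 * Both model and revision must match exactly (assumption of this sketch). */
unsigned long sketch_lookup_horkage(const char *model, const char *rev)
{
    const struct sketch_blacklist_entry *e;

    for (e = sketch_blacklist; e->model_num != NULL; e++) {
        if (strcmp(e->model_num, model) != 0)
            continue;
        if (e->model_rev == NULL || strcmp(e->model_rev, rev) == 0)
            return e->horkage;
    }
    return 0;
}

/* Nonzero if NCQ may be used for this drive under the sketch table. */
int sketch_ncq_allowed(const char *model, const char *rev)
{
    return (sketch_lookup_horkage(model, rev) & SKETCH_HORKAGE_NONCQ) == 0;
}
```

Under this sketch, `sketch_ncq_allowed("ST380011AS", "3.00")` returns 0, while the same model with a different firmware revision is left alone. That is also why such an entry is a blunt tool: it disables NCQ for that drive behind every controller, including ones where the drive may work fine.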
I see, it's an ICH6. It could be that the cause is the ICH6, not the drive. Can you please connect another NCQ-capable drive to the controller and see if the same problem occurs? Or even better, connect the ST380011AS to another NCQ-capable controller and see whether it works. If the ICH6 AHCI turns out to be the culprit, we'll need to turn off NCQ support for the controller, not the drive. Thanks.

Hans, any update on this? Have you been able to try what was recommended in #8?

I'm sorry, I didn't have the right hardware combination for further tests. I'll try to find some next week.

No activity for months, closing. Please re-open if you get time.