Most recent kernel where this bug did not occur: 2.6.16.23
Distribution: Gentoo (Tested with git kernels, see below)
Hardware Environment: P4, Intel ICH6 (chipset rev. 3), 1 SATA disk

Problem Description: Kernel fails to detect SATA drive during boot.

[output snippet]
ata1: SRST failed (status 0xFF)
ata1: SRST failed (err_mask=0x100)
ata1: softreset failed, retrying in 5 secs
ata1: SRST failed (status 0xFF)
ata1: SRST failed (err_mask=0x100)
ata1: softreset failed, retrying in 5 secs
ata1: reset failed, giving up
[end]

git bisect says the following commit is responsible:

d133ecab8ff1233c2eb3ecb94f7956aa10002300 is first bad commit
commit d133ecab8ff1233c2eb3ecb94f7956aa10002300
Author: Tejun Heo <htejun@gmail.com>
Date: Wed Mar 1 01:25:39 2006 +0900

    [PATCH] ata_piix: reimplement piix_sata_probe()

Steps to reproduce: Reboot kernel including above mentioned commit.
Created attachment 9054: Output lspci -vvv (on working kernel)
Hello, We've been seeing a number of detection failure reports on various piix chips, and there have been several different attempts to fix them, but none has managed to fix all of them yet. The commit that broke your machine is one of those attempts, and it actually fixed problems on other machines. Can you please test 2.6.18 and report the boot dmesg? Thanks a lot for reporting the bug and taking the time to bisect it.
Hi. I've tested the 2.6.18 release, and still get the SRST error above. Unfortunately I've not (yet) had time to set up another box on the network to catch a netfeed of the bootlog, so I can't provide the boot output. I'll try to set up the other box tomorrow. Kind regards, and thanks for the reply! /Joakim
Can you test the patch in the following mail? You can use either git or download the full kernel tarball to get the modified kernel. http://article.gmane.org/gmane.linux.ide/13284
Hi. The patch doesn't help. The reported error is the same as originally reported. Used config: http://crafack.dk/kernel/config
Thanks for testing. Hmmm... the offending commit is related to PCS and the patch makes ata_piix not use PCS for device detection anymore. However, the SRST failure messages can be caused by another bug which is fixed by another patch, so you might see the error message for unoccupied ports. So...

* Can you post dmesg of successful boot on earlier kernel?
* Does the SRST failure occur on the port w/ device attached?

Thanks.
Hi. Dmesg for working kernel (2.6.16.something): http://crafack.dk/kernel/dmesg As far as I can see, the error occurs on ATA1, where the disk is normally found.
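For anyone comparing logs in a similar situation, the per-port check asked about above (does the SRST failure hit the port with the device attached?) can be done mechanically. The following is only a rough sketch: the function name `summarize_reset_failures` and the `SAMPLE` constant are made up for illustration, and the sample text is simply the snippet from the original report. It assumes the messages keep the `ataN: ...` form shown there.

```python
import re
from collections import defaultdict

# Matches libata-style lines such as "ata1: SRST failed (status 0xFF)".
# Assumption: messages keep the "ataN: <text>" shape seen in the report.
LINE_RE = re.compile(r"\b(ata\d+): (.*)")


def summarize_reset_failures(log_text):
    """Return {port: {"srst_failures": int, "gave_up": bool}}.

    Only ports that logged an SRST failure or a final
    "reset failed, giving up" message appear in the result.
    """
    summary = defaultdict(lambda: {"srst_failures": 0, "gave_up": False})
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        port, message = match.groups()
        if message.startswith("SRST failed"):
            summary[port]["srst_failures"] += 1
        elif message.startswith("reset failed, giving up"):
            summary[port]["gave_up"] = True
    return dict(summary)


# Hypothetical input: the output snippet quoted in the original report.
SAMPLE = """\
ata1: SRST failed (status 0xFF)
ata1: SRST failed (err_mask=0x100)
ata1: softreset failed, retrying in 5 secs
ata1: SRST failed (status 0xFF)
ata1: SRST failed (err_mask=0x100)
ata1: softreset failed, retrying in 5 secs
ata1: reset failed, giving up
"""

if __name__ == "__main__":
    for port, info in sorted(summarize_reset_failures(SAMPLE).items()):
        print(port, info)
```

Feeding it a full boot log rather than the snippet would show whether other ports (possibly unoccupied ones) report the same failure, which is the distinction that matters here.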
Can you please test 2.6.19-rc3-mm1? Sorry about all the trouble but this PCS problem is really puzzling and different generations of piix controllers show various behaviors. We're trying to solve the PCS problem but aren't there yet. Thanks.
Hi. Just tried with 2.6.19-rc3-git8, but that was a no go. Will give rc3-mm1 a go. I have some spare time in the coming days/weeks. Is there anything I could try (e.g. a specially instrumented driver) that could narrow the bug hunt?
Hi. Just tested on 2.6.19-rc3-mm1, with negative result. I also tried disabling "old" ATA (CONFIG_IDE), which caused renumbering of devices, but didn't work either. Please let me know how I can be of assistance (unfortunately I have no experience doing bughunts in the kernel).
All ata_piix detection bugs have been ironed out as of 2.6.20. We're defaulting to polling IDENTIFY and skipping 0xff wait. Closing as fixed. Please reopen if it's still broken for you.