Bug 10254 - SiS Pata: failed to IDENTIFY [...] retrying in 5 secs
Status: RESOLVED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA
Hardware: All Linux
Importance: P1 normal
Assignee: Tejun Heo
Reported: 2008-03-15 15:48 UTC by Jan Bücken
Modified: 2008-03-29 06:46 UTC
Kernel Version: 2.6.24.3
Regression: Yes


Attachments
probe-debug.patch (995 bytes, patch)
2008-03-18 19:59 UTC, Tejun Heo
relevant dmesg cutout (2.19 KB, text/plain)
2008-03-19 07:43 UTC, Jan Bücken
phantom-workaround.patch (1.60 KB, patch)
2008-03-20 19:44 UTC, Tejun Heo
dmesg with phantom-workaround patch (13.54 KB, text/plain)
2008-03-22 12:48 UTC, Jan Bücken

Description Jan Bücken 2008-03-15 15:48:03 UTC
Latest working kernel version: 2.6.22 (gentoo-r8)
Earliest failing kernel version: 2.6.24 (gentoo and vanilla 2.6.24.3 tested)
Distribution: gentoo
Hardware Environment: Not sure what to mention here: PC-32, Laptop: Medion MD5400
Software Environment: Not sure what to mention here.
Problem Description:

Hi,
I'm using gentoo-sources-2.6.24-r3 with SiS PATA support enabled in the
kernel. I know this tracker is for mainline kernels only, so I also tested this bug with the vanilla kernel 2.6.24.3 from the gentoo sources.

Now I get the following message on boot:
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs

The failure itself is ok, because there is no device there to detect. But it is
annoying: the message repeats three times, and the result is that I
need to wait 15 secs on boot.
With gentoo-sources-2.6.22-r8 this is no problem. The message "failed to
IDENTIFY" does appear there too, but the boot doesn't wait before retrying.

My question: Is it possible to set the delay to zero or to deactivate scanning
of some (ataX.YZ) ports (e.g. with a boot parameter)?
(I don't mean sda=noprobe. I tested it, and it seems to me that at the moment the kernel scans for the IDE/PATA hardware, the devices are not yet set up as sda / sdb.)

I believe the reason for this problem is that the CDROM in my notebook is
flashed to cable-select (BUT I'M NOT SURE IF it is cable-select... going to find this out...).

Greetings
Jan Bücken

In the following a cutout of dmesg which could be interesting:

pata_sis 0000:00:02.5: version 0.5.2
scsi0 : pata_sis
scsi1 : pata_sis
ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0x1000 irq 14
ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0x1008 irq 15
Marking TSC unstable due to: possible TSC halt in C2.
Time: acpi_pm clocksource has been installed.
Switched to high resolution mode on CPU 0
ata1.00: ATA-5: HITACHI_DK23DA-40, 00J0A0A1, max UDMA/100
ata1.00: 78140160 sectors, multi 16: LBA 
ata1.00: limited to UDMA/33 due to 40-wire cable
ata1.00: configured for UDMA/33
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.00: ATAPI: QSI CD-RW/DVD-ROM SBW-161, SX13, max UDMA/33
ata2.00: configured for UDMA/33

Steps to reproduce:
Unknown. Use an MD5400 or a SiS chipset???

------------------------
Computer programmers don't byte, they nibble a bit.
Comment 1 Tejun Heo 2008-03-17 23:23:09 UTC
Hmm... interesting bug.  What happens if you make the cdrom a slave?
Comment 2 Jan Bücken 2008-03-18 14:25:04 UTC
Hi,
I cannot test it easily, because it is the cdrom of a notebook and it has no jumpers. I haven't managed to flash it yet (no floppy, and I have to install DOS on a USB stick to use the firmware / flash tools; I'm working on it).
But for the moment I can tell the following:
I believe I was wrong: I had a look at what is printed on the cdrom, and it mentions how it was flashed: with SX13-M. It seems to me that M stands for Master, and hence it is not cable select.
BUT: I removed the cdrom completely: no problems on boot.
AND: It is a notebook, so it is impossible to put a second device (e.g. a slave) on this cable, because there is only one connector. Do you know what I mean?

greetings
Jan
Comment 3 Tejun Heo 2008-03-18 19:49:30 UTC
OIC, it's a laptop.  Hmmm... I'll prep a debug patch.
Comment 4 Tejun Heo 2008-03-18 19:59:02 UTC
Created attachment 15338
probe-debug.patch

Can you please apply the attached patch and post the resulting boot log?
Comment 5 Jan Bücken 2008-03-19 07:43:12 UTC
Created attachment 15345
relevant dmesg cutout

Hi,
attached is a cutout of the boot log (I used dmesg, right? I took the relevant part only).
Hope you have fun with it ;-)
Comment 6 Tejun Heo 2008-03-20 19:44:43 UTC
Created attachment 15359
phantom-workaround.patch

Please apply this patch and report the result.
Comment 7 Jan Bücken 2008-03-22 12:22:17 UTC
Hi,
I applied the patch to the "clean" vanilla 2.6.24.3 sources, meaning that I didn't apply the debug patch first. BUT: during "make" I get:

drivers/ata/libata-core.c: In function 'ata_dev_read_id':
drivers/ata/libata-core.c:1941: error: expected ';' before 'return'
make[2]: *** [drivers/ata/libata-core.o] Error 1
make[1]: *** [drivers/ata] Error 2
make: *** [drivers] Error 2

Should I apply this phantom patch on top of the debug patch above? I thought I have to apply patches to a "clean" version (also in the future)?
Comment 8 Jan Bücken 2008-03-22 12:31:08 UTC
I edited drivers/ata/libata-core.c on my own and inserted the missing ";". Now it seems to compile. I'll post the result when it's ready...
Comment 9 Jan Bücken 2008-03-22 12:48:00 UTC
Created attachment 15402
dmesg with phantom-workaround patch

I don't know if the patch affects more than the lines with "ata", hence I'm posting the whole dmesg output.

Happy Easter!

Jan
Comment 10 Tejun Heo 2008-03-22 23:05:24 UTC
Aieee, that was my mistake.  I added that printk at the last moment and apparently didn't compile-test it after that.  Sorry about that.

So, the patch works fine.  I'll forward it upstream.  Resolving as CODE_FIX.  Thanks.
Comment 11 Jan Bücken 2008-03-23 03:43:21 UTC
Great! Thanks for your work.

Hint: I don't know if it is still relevant, but I first posted this bug in the gentoo bugzilla because I tested this with the gentoo sources. There Gerard van Vuuren confirmed the problem and posted this:

"I am having the same problem with Gentoo-sources 2.6.24-r3 and with
vanilla-sources 2.6.25-rc6.
What is worse in my case is that the HD-Led remains lit!
My board is an Asus P5W DH Deluxe.
If you need more info I'll will supply that.
Gerard"

Don't know if he is right with his reason about HD-Led...

Link:
http://bugs.gentoo.org/show_bug.cgi?id=211369
Comment 12 Tejun Heo 2008-03-23 04:47:12 UTC
Ah... P5W DH Deluxe.  My personal favorite, with an endless stream of problems.  :-)  I thought I had worked around most of the problems, and it should work well if the controller is in ahci mode.  Can you please ask Gerard to file a bug report here and cc me?
Comment 13 Gerard van Vuuren 2008-03-27 06:45:03 UTC
Hi Tejun,
I changed the BIOS from "normal IDE" to AHCI. Grub errored out after that.
I found out that the drive letters had changed.
Before, I had sda and sdb, with sdc being the bogus drive.
Now I have sda, sdb (bogus) and sdc (was sdb).
I had to change my grub.conf file:

default 0
timeout 30
splashimage=(hd1,0)/grub/splash.xpm.gz

		
title=Gentoo-Linux-32
	kernel (hd1,0)/kernel-2.6.24-3 root=/dev/sdc2 vga=0x0F07


Note the (hd1,0). When I run rescuecd and do --device-map I get
hd0 /dev/sda
hd1 /dev/sdb
hd2 /dev/sdc
With this setup I can boot as always and the long wait is gone.
Also the HD-Led extinguishes when there is no disk activity.
Thanks for all the work you're doing!
Gerard.
Comment 14 Tejun Heo 2008-03-27 19:16:27 UTC
Hmmm... The bios is moving disks around depending on configuration?  That's interesting.  Maybe there's a way to work around it in ata_piix mode too, but I haven't found it yet.  Anyways, good to know that at least ahci mode works.
Comment 15 Gerard van Vuuren 2008-03-28 15:55:59 UTC
Hi Tejun,
I want to add the following:
The system I experienced the problem with is 32 bit.
I want to try 64 bit and fired up sysrescuecd in 64 bit mode.
When I look in /dev after booting, the two disks are in sda and sdb
again. IOW, when I run in 32 bit mode the bogus disk is sdb and
in 64 bit mode the bogus disk shows up as sdc!
Gerard.
Comment 16 Tejun Heo 2008-03-28 20:07:34 UTC
Can you post boot logs from both environments?  The difference is probably due to a difference in PCI scanning order.  It's nearly impossible to keep static device names (/dev/[hs]dX) permanent these days.  The whole system is much more dynamic now (as opposed to the days of IO ports and IRQs statically allocated to primary, secondary and tertiary IDE channels).  It's probably best to use other permanent names, e.g. UUID / volID.
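
To illustrate the suggestion of using permanent names: on udev-based systems, filesystem UUIDs are typically exposed as symlinks under /dev/disk/by-uuid, each pointing at whatever /dev/sdX node the partition received on this boot. The following is a minimal sketch in Python, assuming that directory layout; the function name `uuid_to_device` is hypothetical and the directory is a parameter so the sketch can be tried against any path.

```python
import os

def uuid_to_device(by_uuid_dir="/dev/disk/by-uuid"):
    """Map each filesystem UUID to the device node its symlink points at."""
    mapping = {}
    if not os.path.isdir(by_uuid_dir):
        # Assumption: the directory may be absent on systems without udev.
        return mapping
    for name in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, name)
        if os.path.islink(link):
            # realpath follows the symlink to the current device node.
            mapping[name] = os.path.realpath(link)
    return mapping

if __name__ == "__main__":
    for uuid, device in uuid_to_device().items():
        print(f"UUID={uuid} -> {device}")
```

The UUID stays the same even if the disk moves from sdb to sdc between boots, so it can be used instead of the device name, for example as a `UUID=...` entry in /etc/fstab.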
Comment 17 Gerard van Vuuren 2008-03-29 06:46:23 UTC
Well, I am sorry, but I booted sysrescuecd again in both 32 and 64 bit mode, and now
/dev shows the same. In both cases I have sda and sdc as the true disks and sdb as the
bogus one. Something wrong with my board?
I think I'll do as you suggested and use UUID or similar. I have to consult the docs
for that.
Thanks for your time.
Gerard.
