Latest working kernel version: 2.6.22 (gentoo-r8)
Earliest failing kernel version: 2.6.24 (gentoo and vanilla 2.6.24.3 tested)
Distribution: gentoo
Hardware Environment: Not sure what to mention here: PC-32, Laptop: Medion MD5400
Software Environment: Not sure what to mention here.
Problem Description:

Hi,

I'm using gentoo-sources-2.6.24-r3 with SiS PATA support enabled in the kernel:

I know, mainline kernels only, hence I tested this bug with the vanilla kernel 2.6.24.3 from the Gentoo sources:

Now I get the following message on boot:

ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs

This is OK, because there is no device that can be detected! But it is annoying, because the message repeats three times, so I have to wait 15 secs on boot.

With gentoo-sources-2.6.22-r8 this is no problem. The message "failed to IDENTIFY" does exist, but the boot doesn't wait before retrying.

My question: Is it possible to set the delay to zero or to deactivate scanning of some (ataX.YZ) ports (e.g. with a boot parameter)? (I don't mean sda=noprobe. I tested it, and it seems to me that at the moment the kernel scans for the IDE/PATA hardware, the devices do not yet exist as sda / sdb.)

I believe the reason for this problem is that the CD-ROM in my notebook is flashed to cable select (BUT I'M NOT SURE whether it is cable select... I'm going to find this out...).

Greetings
Jan Bücken

Below is an excerpt of dmesg which could be interesting:

pata_sis 0000:00:02.5: version 0.5.2
scsi0 : pata_sis
scsi1 : pata_sis
ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0x1000 irq 14
ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0x1008 irq 15
Marking TSC unstable due to: possible TSC halt in C2.
Time: acpi_pm clocksource has been installed.
Switched to high resolution mode on CPU 0
ata1.00: ATA-5: HITACHI_DK23DA-40, 00J0A0A1, max UDMA/100
ata1.00: 78140160 sectors, multi 16: LBA
ata1.00: limited to UDMA/33 due to 40-wire cable
ata1.00: configured for UDMA/33
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.00: ATAPI: QSI CD-RW/DVD-ROM SBW-161, SX13, max UDMA/33
ata2.00: configured for UDMA/33

Steps to reproduce: unknown, use a MD5400 or a SiS chipset???

------------------------
Computer programmers don't byte, they nibble a bit.
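For anyone who wants to quantify the same delay on their own machine, the waits announced in the log can be added up per port. The following is only a rough sketch: the function name `total_retry_delay` and the `SAMPLE` text are made up for illustration, and the pattern assumes the message wording shown in the dmesg excerpt above.

```python
import re

# Matches the libata retry message as it appears in the report above.
# search() is used so that an optional "[  12.345]" timestamp prefix is tolerated.
RETRY_RE = re.compile(r"(ata\d+): failed to recover some devices, retrying in (\d+) secs")


def total_retry_delay(dmesg_text):
    """Return {port: total announced wait in seconds} for each ATA port."""
    delays = {}
    for line in dmesg_text.splitlines():
        match = RETRY_RE.search(line)
        if match:
            port, secs = match.group(1), int(match.group(2))
            delays[port] = delays.get(port, 0) + secs
    return delays


# Hypothetical sample, copied from the lines in the report.
SAMPLE = """\
ata1.00: configured for UDMA/33
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.01: failed to IDENTIFY (I/O error, err_mask=0x1)
ata2: failed to recover some devices, retrying in 5 secs
ata2.00: configured for UDMA/33
"""

print(total_retry_delay(SAMPLE))  # {'ata2': 15}
```

Feeding it the output of dmesg (for example saved to a file first) gives the announced wait per port, which matches the 15 secs mentioned in the report.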
Hmm... interesting bug. What happens if you make the cdrom a slave?
Hi,

I cannot test it easily, because it is a CD-ROM in a notebook and has no jumpers. I haven't managed to flash it yet (no floppy, and I have to install DOS on a USB stick to use the firmware / flash tools; I'm working on it).

But at the moment I can tell you the following: I believe I was wrong. I had a look at what is printed on the CD-ROM, and it mentions how it was flashed: with SX13-M. It seems to me that M stands for Master, and hence it was not cable select.

BUT: I removed the CD-ROM completely: no problems on boot.
AND: It is a notebook, so it is impossible to put a second device (e.g. a slave) on this cable, because there is only one connector. Do you know what I mean?

Greetings
Jan
OIC, it's a laptop. Hmmm... I'll prep a debug patch.
Created attachment 15338
probe-debug.patch

Can you please apply the attached patch and post the resulting boot log?
Created attachment 15345
relevant dmesg cutout

Hi,

an excerpt of the boot log is attached (I took it from dmesg, right? I included the relevant part only). Hope you have fun with it ;-)
Created attachment 15359
phantom-workaround.patch

Please apply this patch and report the result.
Hi,

I applied the patch to the "clean" vanilla sources 2.6.24.3, meaning that I did not apply the debug patch first. BUT: during "make" I get:

drivers/ata/libata-core.c: In function 'ata_dev_read_id':
drivers/ata/libata-core.c:1941: error: expected ';' before 'return'
make[2]: *** [drivers/ata/libata-core.o] Error 1
make[1]: *** [drivers/ata] Error 2
make: *** [drivers] Error 2

Should I apply this phantom patch on top of the debug patch above? I thought I had to apply patches to a "clean" version (in future as well)?
I edited drivers/ata/libata-core.c myself and inserted the missing ";". Now it seems to compile. I'll post the result as soon as it's ready...
Created attachment 15402
dmesg with phantom-workaround patch

I don't know whether the patch affects more than the lines with "ata", hence I am posting the whole dmesg output.

Happy Easter!
Jan
Aieee, that was my mistake. I added that printk at the last moment and apparently didn't compile-test it after that. Sorry about that. So, the patch works fine. I'll forward it upstream. Resolving as CODE_FIX. Thanks.
Great! Thanks for your work.

Hint: I don't know if it is still relevant, but I first posted this bug in the Gentoo bugzilla, because I had tested it with the Gentoo sources. There, Gerard van Vuuren confirmed the problem and posted this:

"I am having the same problem with Gentoo-sources 2.6.24-r3 and with vanilla-sources 2.6.25-rc6. What is worse in my case is that the HD-Led remains lit! My board is an Asus P5W DH Deluxe. If you need more info I will supply that. Gerard"

I don't know whether he is right about the cause of the HD-Led behaviour...

Link: http://bugs.gentoo.org/show_bug.cgi?id=211369
Ah... P5W DH Deluxe. My personal favorite, the one with an endless stream of problems. :-) I thought I had worked around most of the problems, and it should work well if the controller is in ahci mode. Can you please ask Gerard to file a bug report here and cc me?
Hi Tejun,

I changed the BIOS from "normal IDE" to AHCI. Grub errored out after that. I found out that the drive letters had changed. Before, I had sda and sdb, with sdc being the bogus drive. Now I have sda, sdb (bogus) and sdc (was sdb). I had to change my grub.conf file:

default 0
timeout 30
splashimage=(hd1,0)/grub/splash.xpm.gz
title=Gentoo-Linux-32
kernel (hd1,0)/kernel-2.6.24-3 root=/dev/sdc2 vga=0x0F07

Note the (hd1,0). When I run rescuecd and do --device-map I get

hd0 /dev/sda
hd1 /dev/sdb
hd2 /dev/sdc

With this setup I can boot as always and the long wait is gone. Also, the HD-Led goes out when there is no disk activity.

Thanks for all the work you're doing!
Gerard.
Hmmm... The BIOS is moving disks around depending on configuration? That's interesting. Maybe there's a way to work around the problem in ata_piix mode too, but I haven't found it yet. Anyways, good to know that at least ahci mode works.
Hi Tejun,

I want to add the following: the system I experienced the problem with is 32 bit. I wanted to try 64 bit and fired up sysrescuecd in 64 bit mode. When I look in /dev after booting, the two disks are sda and sdb again. IOW, when I run in 32 bit mode the bogus disk is sdb, and in 64 bit mode the bogus disk shows up as sdc!

Gerard.
Can you post boot logs from both environments? The difference is probably due to a difference in PCI scanning order. It's nearly impossible to keep static device names (/dev/[hs]dX) permanent these days. The whole system is much more dynamic now (as opposed to the days of IO ports and IRQs statically allocated to primary, secondary and tertiary IDE channels). It's probably best to use other permanent names, i.e. UUID / volID, etc...
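As a rough illustration of the UUID suggestion: on systems where udev populates /dev/disk/by-uuid, each entry there is a symlink from a filesystem UUID to whatever device node the disk received on this boot. The sketch below just lists that mapping; the function name `uuid_map` is made up for this example, and it assumes the by-uuid directory exists (it returns an empty mapping otherwise).

```python
import os


def uuid_map(by_uuid_dir="/dev/disk/by-uuid"):
    """Return {uuid: resolved device path} for every entry in by_uuid_dir."""
    mapping = {}
    if not os.path.isdir(by_uuid_dir):
        # Assumption: no udev-managed directory here, so nothing to report.
        return mapping
    for name in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, name)
        # realpath() follows the symlink to the current /dev/sdXN node.
        mapping[name] = os.path.realpath(link)
    return mapping


if __name__ == "__main__":
    for uuid, device in uuid_map().items():
        print(f"UUID={uuid} -> {device}")
```

The UUID stays the same even if the disk moves from sdb to sdc between boots, so an entry of the form UUID=<value> in /etc/fstab keeps pointing at the same filesystem. Whether the same works for root= on the kernel command line depends on the boot setup, so that part is best checked against the distribution docs.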
Well, I am sorry, but I booted sysrescuecd again in both 32 and 64 bit mode, and now /dev shows the same in both. In both cases I have sda and sdc as the true disks and sdb as the bogus one. Is something wrong with my board? I think I'll do as you suggested and use UUID or similar. I have to consult the docs for that.

Thanks for your time.
Gerard.