Latest working kernel version: 2.6.24
Earliest failing kernel version: 2.6.26
Distribution: Ubuntu
Hardware Environment: Bug Filing FAQ is 404 not found, I don't know what I have to type
Software Environment: Bug Filing FAQ is 404 not found, I don't know what I have to type
Problem Description:
At bootup I end up in busybox and I see the following message at the top of the screen: "Gave up waiting for root device". Actually, I think that my hard drive "falls asleep" just after leaving grub. When I'm in busybox I need to unplug my hard drive (serial ATA) and plug it in again so that I can hear it restarting. After doing that I type exit in busybox and the boot process continues normally.

Dmesg shows me this:
[ 9.672007] ata3: link is slow to respond, please be patient (ready=0)
[ 14.320007] ata3: COMRESET failed (errno=-16)
[ 19.680006] ata3: link is slow to respond, please be patient (ready=0)
[ 24.328007] ata3: COMRESET failed (errno=-16)
[ 29.688007] ata3: link is slow to respond, please be patient (ready=0)
[ 59.092004] ata3: COMRESET failed (errno=-16)
[ 59.092004] ata3: limiting SATA link speed to 1.5 Gbps
[ 59.688009] ata3: SATA link down (SStatus 0 SControl 310)
[ 60.164017] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 60.196367] ata4.00: HPA detected: current 160834367, native 160836480
[ 60.196371] ata4.00: ATA-6: HDS722580VLSA80, V32OA6MA, max UDMA/100
[ 60.196373] ata4.00: 160834367 sectors, multi 16: LBA48

The COMRESET failures continue for as long as I don't unplug and replug my hard drive. I recently tried other distributions with the same kernel (Debian and the pmagic liveCD) and I get the same error, so I think this bug concerns the kernel.

I also have to tell you that it's a SATA II hard drive (3 Gbps) on an nForce 3 SATA I controller (1.5 Gbps). It appears that the controller does not fully support the hard drive (or the SATA I backward compatibility of the hard drive is malfunctioning, I don't know). But with older kernels it always worked without any problem.

I'm running an up-to-date Ubuntu Intrepid Ibex alpha, kernel 2.6.27-1.2 (I recently updated from 2.6.26 to 2.6.27 but the problem is the same).

Thanks

I don't know how to attach files here, so if you want the dmesg.log etc. files, I reported the bug on Launchpad, where you'll find them. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/256637
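Editorial aside: the SStatus and SControl values printed in the dmesg lines above are hexadecimal SATA status/control registers. The following is a minimal sketch, assuming the standard field layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11); the function and table names are illustrative and are not part of libata.

```python
# Sketch: split the SStatus/SControl values printed by libata into fields.
# Assumption: standard SATA layout, DET = bits 0-3, SPD = bits 4-7,
# IPM = bits 8-11. All names here are hypothetical helpers.

SSTATUS_DET = {
    0x0: "no device detected, PHY communication not established",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}

# In SStatus, SPD is the negotiated speed; in SControl it is the speed limit.
SPEED = {0x0: "none / no limit", 0x1: "1.5 Gbps", 0x2: "3.0 Gbps"}


def split_scr(value: int) -> dict:
    """Return the DET, SPD and IPM nibbles of a status/control register."""
    return {
        "det": value & 0xF,
        "spd": (value >> 4) & 0xF,
        "ipm": (value >> 8) & 0xF,
    }


def describe(sstatus_hex: str, scontrol_hex: str) -> str:
    """Describe one '(SStatus X SControl Y)' pair as printed in dmesg."""
    st = split_scr(int(sstatus_hex, 16))
    ct = split_scr(int(scontrol_hex, 16))
    return (
        f"link: {SSTATUS_DET.get(st['det'], 'reserved')}; "
        f"negotiated speed: {SPEED.get(st['spd'], 'other')}; "
        f"speed limit: {SPEED.get(ct['spd'], 'other')}"
    )


if __name__ == "__main__":
    print("ata3:", describe("0", "310"))    # the failing port
    print("ata4:", describe("113", "300"))  # the working port
```

Read this way, ata3 ends up with no device detected while limited to 1.5 Gbps, whereas ata4 has an established link at 1.5 Gbps with no limit configured.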
Marked as a regression Reassigned to ATA It's sata_nv. dmesg is here: http://launchpadlibrarian.net/16703605/dmesg.log
Don't think anything's changed at the sata_nv level that would cause this. Tejun, there were some reset changes in libata recently, weren't there?
Yes, libata is now defaulting to hardreset and early nv's seem to have problem with it. I sent the test patches a few times but haven't got enough response to commit it. So, let's do one more testing.
Created attachment 17525 [details] nv-nohrst.patch Can you test whether the attached patch fixes the problem? Thanks.
(In reply to comment #4)
> Created an attachment (id=17525) [details]
> nv-nohrst.patch
>
> Can you test whether the attached patch fixes the problem? Thanks.

Could you explain the procedure to follow in order to apply the patch?
First, build your own 2.6.26.3 and boot the system with it and check everything is as expected. Then, cd to the source tree and apply the patch by executing "patch -p1 < nv-nohrst.patch" and build the kernel again (just running make again will do) and test the new kernel.
The patch fixes the problem :D Thank you! Do you need any more info to make sure the patch worked as expected? Will this patch be included in the latest kernel? For info, I upgraded to 2.6.27 and the problem was the same; that is the kernel I patched.
Yes, already pending for 2.6.27 and I'll send it to -stable once it gets accepted upstream. Thanks.
This bug is happening again since my recent update from 2.6.27-4 to 2.6.27-7. I don't know if it's due to a change upstream or a change related to Ubuntu. 2.6.27-4 worked fine, but 2.6.27-6 and 2.6.27-7 suffer from the same bug. Has the patch been removed? Were there changes in sata_nv?
The problem is still not completely resolved. For generic and ck804, it works but something is still wrong with nf2/3 and I'm still working on it. Please wait a bit. Thanks.
I just wanted to point out that I have this exact problem on ck804 and kernels 2.6.26 and 2.6.27.1. So it's not exactly working there either. The patch above works for me too, except then I got a whole bunch of messages to syslog just saying ata1: EH complete and ata2: EH complete. Those are empty, my drives are on ata3 and ata4.
I just bought a nf2/3 board and am waiting for it to arrive. Please give me a few more days. Thanks.
Hi, has anything new happened concerning this bug? If I can be of any help, I have a system on which this bug occurs. I sent my dmesg and lspci output to the launchpad.net bug report already mentioned by François ( https://bugs.launchpad.net/debian/+source/linux/+bug/256637 ).
Which attachments in that bug report are yours? It seems to be discussing what are likely several unrelated problems. This bug is dealing with sata_nv and nForce2/3 chipsets.
Indeed, I forgot to specify that ;-). My username is "sym_zo" and my comment the following : https://bugs.launchpad.net/ubuntu/+source/linux/+bug/256637/comments/36 uploaded files : http://launchpadlibrarian.net/20681738/lspci_and_dmidecode.zip
So your machine is nForce4? That seems odd, I have that chipset and I've never run into that problem with any Fedora 2.6.27 kernel or with vanilla 2.6.28. Can you test with vanilla 2.6.27.10 or 2.6.28?
Ok, I'll try it out tomorrow (it's 02:30 AM in my timezone^^). I suppose there isn't an easier way than downloading from kernel.org and compiling ?
Sorry for the delay. I tested with a vanilla 2.6.28: I still get the error (and a busybox). The latest kernel installed on my machine that still boots is 2.6.24.
Can you please attach the failing log here?
(In reply to comment #16)
> So your machine is nForce4? That seems odd, I have that chipset and I've never
> run into that problem with any Fedora 2.6.27 kernel or with vanilla 2.6.28. Can
> you test with vanilla 2.6.27.10 or 2.6.28?

I also have nforce4 and am using sata_nv, but for me this problem doesn't happen in 2.6.27.8 or 2.6.28 anymore. As I recall, the problem first showed up in 2.6.26 and was also present in some earlier 2.6.27 kernels, that is, earlier than 2.6.27.8.
Tejun Heo >> Sorry for the delay. I'll do that ASAP. When you say "failing log", I suppose it is /var/log/dmesg you want ?
Yeap, dmesg output after the failure. Preferably w/ printk timestamp turned on.
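Editorial aside: with printk timestamps turned on, the reset failures in a log like the one in the original report can be summarized per port. The sketch below is only an illustration; the regular expressions are assumptions based on the log lines quoted in this bug, and the sample text is copied from the first comment. The errno value is looked up with Python's standard errno module, where 16 maps to EBUSY on Linux.

```python
import errno
import re
from collections import defaultdict

# Assumed line shape: "[  14.320007] ata3: COMRESET failed (errno=-16)"
LINE_RE = re.compile(r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<port>ata\d+): (?P<msg>.*)")
FAIL_RE = re.compile(r"COMRESET failed \(errno=(-?\d+)\)")


def summarize_reset_failures(log_text: str) -> dict:
    """Map each port to a list of (timestamp, errno name) for failed resets."""
    failures = defaultdict(list)
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a port-level libata line (e.g. "ata4.00: ...")
        f = FAIL_RE.search(m.group("msg"))
        if f:
            code = abs(int(f.group(1)))
            name = errno.errorcode.get(code, "unknown")
            failures[m.group("port")].append((float(m.group("ts")), name))
    return dict(failures)


# Sample lines taken from the original report in this bug.
SAMPLE = """\
[ 9.672007] ata3: link is slow to respond, please be patient (ready=0)
[ 14.320007] ata3: COMRESET failed (errno=-16)
[ 19.680006] ata3: link is slow to respond, please be patient (ready=0)
[ 24.328007] ata3: COMRESET failed (errno=-16)
[ 29.688007] ata3: link is slow to respond, please be patient (ready=0)
[ 59.092004] ata3: COMRESET failed (errno=-16)
[ 60.164017] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
"""

if __name__ == "__main__":
    for port, events in summarize_reset_failures(SAMPLE).items():
        for ts, name in events:
            print(f"{port}: reset failed at {ts:.3f}s ({name})")
```

Without timestamps the same lines would still match the failure pattern, but the spacing between retries would be lost, which is why the timestamped log is preferred.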
Created attachment 24521 [details] hdparm -I and lscpi -nn for a kernel w/o SATA problem As you requested TJ, here's the info on the last kernel I've been able to successfully install. All later kernels are not able to recognize my sata drives.
Created attachment 24553 [details] boot log w/ libata.force=nohrst Logs John sent me via email. Attaching here for later reference.
Created attachment 24554 [details] w/ kernel param "sata_nv.swncq=0"
Created attachment 24555 [details] w/ kernel param "sata_nv.swncq=0 libata.force=nohrst"
Created attachment 24556 [details] and without any param
The biggest related change since 2.6.24 would be restructuring of reset operations which happened between 2.6.24 and 25. Our current sequence should be basically the same as before. I currently have no idea what could be the difference. What is the mainboard? Can you please post the output of dmidecode? Thanks.
TJ, this happens on both my Windows Vista 64 Bit Ultimate box and the Linux box. Here's the dmidecode from my Linux Box john@johnsubuntu:~$ sudo dmidecode [sudo] password for john: # dmidecode 2.9 SMBIOS 2.4 present. 38 structures occupying 1145 bytes. Table at 0x000F0000. Handle 0x0000, DMI type 0, 24 bytes BIOS Information Vendor: Phoenix Technologies, LTD Version: 6.00 PG Release Date: 07/11/2007 Address: 0xE0000 Runtime Size: 128 kB ROM Size: 512 kB Characteristics: ISA is supported PCI is supported PNP is supported APM is supported BIOS is upgradeable BIOS shadowing is allowed Boot from CD is supported Selectable boot is supported BIOS ROM is socketed EDD is supported 5.25"/360 KB floppy services are supported (int 13h) 5.25"/1.2 MB floppy services are supported (int 13h) 3.5"/720 KB floppy services are supported (int 13h) 3.5"/2.88 MB floppy services are supported (int 13h) Print screen service is supported (int 5h) 8042 keyboard services are supported (int 9h) Serial services are supported (int 14h) Printer services are supported (int 17h) CGA/mono video services are supported (int 10h) ACPI is supported USB legacy is supported LS-120 boot is supported ATAPI Zip drive boot is supported BIOS boot specification is supported Targeted content distribution is supported Handle 0x0001, DMI type 1, 27 bytes System Information Manufacturer: NVIDIA Product Name: NFORCE 680i LT SLI Version: 2 Serial Number: 1 UUID: 6A97600D-034B-0400-0000-000000000000 Wake-up Type: Power Switch SKU Number: Family: Handle 0x0002, DMI type 2, 8 bytes Base Board Information Manufacturer: NVIDIA Product Name: NFORCE 680i LT SLI Version: 2 Serial Number: 1 Handle 0x0003, DMI type 3, 17 bytes Chassis Information Manufacturer: NVIDIA Type: Desktop Lock: Not Present Version: NFORCE 680i LT SLI Serial Number: Asset Tag: Boot-up State: Unknown Power Supply State: Unknown Thermal State: Unknown Security Status: Unknown OEM Information: 0x00000000 Handle 0x0004, DMI type 4, 35 bytes Processor 
Information Socket Designation: Socket 775 Type: Central Processor Family: Other Manufacturer: Intel ID: FB 06 00 00 FF FB EB BF Version: Intel(R) Core(TM)2 Quad CPU Voltage: 1.7 V External Clock: 336 MHz Max Speed: 200 MHz Current Speed: 3024 MHz Status: Populated, Enabled Upgrade: ZIF Socket L1 Cache Handle: 0x000A L2 Cache Handle: 0x000B L3 Cache Handle: Not Provided Serial Number: Asset Tag: Part Number: Handle 0x0005, DMI type 5, 24 bytes Memory Controller Information Error Detecting Method: None Error Correcting Capabilities: None Supported Interleave: One-way Interleave Current Interleave: One-way Interleave Maximum Memory Module Size: 32 MB Maximum Total Memory Size: 128 MB Supported Speeds: 70 ns 60 ns Supported Memory Types: Standard EDO Memory Module Voltage: 5.0 V Associated Memory Slots: 4 0x0006 0x0007 0x0008 0x0009 Enabled Error Correcting Capabilities: None Handle 0x0006, DMI type 6, 12 bytes Memory Module Information Socket Designation: A0 Bank Connections: 0 1 Current Speed: 10 ns Type: Other Installed Size: 1024 MB (Double-bank Connection) Enabled Size: 1024 MB (Double-bank Connection) Error Status: OK Handle 0x0007, DMI type 6, 12 bytes Memory Module Information Socket Designation: A1 Bank Connections: 2 3 Current Speed: 10 ns Type: Other Installed Size: 1024 MB (Double-bank Connection) Enabled Size: 1024 MB (Double-bank Connection) Error Status: OK Handle 0x0008, DMI type 6, 12 bytes Memory Module Information Socket Designation: A2 Bank Connections: 4 5 Current Speed: 10 ns Type: Other Installed Size: 1024 MB (Double-bank Connection) Enabled Size: 1024 MB (Double-bank Connection) Error Status: OK Handle 0x0009, DMI type 6, 12 bytes Memory Module Information Socket Designation: A3 Bank Connections: 6 7 Current Speed: 10 ns Type: Other Installed Size: 1024 MB (Double-bank Connection) Enabled Size: 1024 MB (Double-bank Connection) Error Status: OK Handle 0x000A, DMI type 7, 19 bytes Cache Information Socket Designation: Internal Cache 
Configuration: Enabled, Not Socketed, Level 1 Operational Mode: Write Back Location: Internal Installed Size: 32 KB Maximum Size: 32 KB Supported SRAM Types: Synchronous Installed SRAM Type: Synchronous Speed: Unknown Error Correction Type: None System Type: Instruction Associativity: 8-way Set-associative Handle 0x000B, DMI type 7, 19 bytes Cache Information Socket Designation: External Cache Configuration: Enabled, Not Socketed, Level 2 Operational Mode: Write Back Location: External Installed Size: 4096 KB Maximum Size: 4096 KB Supported SRAM Types: Synchronous Installed SRAM Type: Synchronous Speed: Unknown Error Correction Type: None System Type: Unified Associativity: 8-way Set-associative Handle 0x000C, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: PRIMARY IDE Internal Connector Type: On Board IDE External Reference Designator: Not Specified External Connector Type: None Port Type: Other Handle 0x000D, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: FDD Internal Connector Type: On Board Floppy External Reference Designator: Not Specified External Connector Type: None Port Type: 8251 FIFO Compatible Handle 0x000E, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: COM1 Internal Connector Type: 9 Pin Dual Inline (pin 10 cut) External Reference Designator: External Connector Type: DB-9 male Port Type: Serial Port 16450 Compatible Handle 0x000F, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: Keyboard Internal Connector Type: PS/2 External Reference Designator: External Connector Type: PS/2 Port Type: Keyboard Port Handle 0x0010, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: PS/2 Mouse Internal Connector Type: PS/2 External Reference Designator: External Connector Type: PS/2 Port Type: Mouse Port Handle 0x0011, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: Not Specified Internal 
Connector Type: None External Reference Designator: USB0 External Connector Type: Other Port Type: USB Handle 0x0012, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: USB1 External Connector Type: Other Port Type: USB Handle 0x0013, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: USB2 External Connector Type: Other Port Type: USB Handle 0x0014, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: USB3 External Connector Type: Other Port Type: USB Handle 0x0015, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: USB4 External Connector Type: Other Port Type: USB Handle 0x0016, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: Not Specified Internal Connector Type: None External Reference Designator: USB5 External Connector Type: Other Port Type: USB Handle 0x0017, DMI type 9, 13 bytes System Slot Information Designation: PCI0 Type: 32-bit PCI Current Usage: Available Length: Long ID: 1 Characteristics: 5.0 V is provided PME signal is supported Handle 0x0018, DMI type 9, 13 bytes System Slot Information Designation: PCI1 Type: 32-bit PCI Current Usage: Available Length: Long ID: 2 Characteristics: 5.0 V is provided PME signal is supported Handle 0x0019, DMI type 13, 22 bytes BIOS Language Information Installable Languages: 3 n|US|iso8859-1 n|US|iso8859-1 r|CA|iso8859-1 Currently Installed Language: n|US|iso8859-1 Handle 0x001A, DMI type 16, 15 bytes Physical Memory Array Location: System Board Or Motherboard Use: System Memory Error Correction Type: None Maximum Capacity: 2 GB Error Information Handle: Not Provided Number Of Devices: 4 
Handle 0x001B, DMI type 17, 27 bytes Memory Device Array Handle: 0x001A Error Information Handle: Not Provided Total Width: 128 bits Data Width: 128 bits Size: 1024 MB Form Factor: DIMM Set: None Locator: A0 Bank Locator: Bank0/1 Type: DRAM Type Detail: None Speed: 798 MHz (1.3 ns) Manufacturer: None Serial Number: None Asset Tag: None Part Number: None Handle 0x001C, DMI type 17, 27 bytes Memory Device Array Handle: 0x001A Error Information Handle: Not Provided Total Width: 128 bits Data Width: 128 bits Size: 1024 MB Form Factor: DIMM Set: None Locator: A1 Bank Locator: Bank2/3 Type: DRAM Type Detail: None Speed: 798 MHz (1.3 ns) Manufacturer: None Serial Number: None Asset Tag: None Part Number: None Handle 0x001D, DMI type 17, 27 bytes Memory Device Array Handle: 0x001A Error Information Handle: Not Provided Total Width: 128 bits Data Width: 128 bits Size: 1024 MB Form Factor: DIMM Set: None Locator: A2 Bank Locator: Bank4/5 Type: DRAM Type Detail: None Speed: 798 MHz (1.3 ns) Manufacturer: None Serial Number: None Asset Tag: None Part Number: None Handle 0x001E, DMI type 17, 27 bytes Memory Device Array Handle: 0x001A Error Information Handle: Not Provided Total Width: 128 bits Data Width: 128 bits Size: 1024 MB Form Factor: DIMM Set: None Locator: A3 Bank Locator: Bank6/7 Type: DRAM Type Detail: None Speed: 798 MHz (1.3 ns) Manufacturer: None Serial Number: None Asset Tag: None Part Number: None Handle 0x001F, DMI type 19, 15 bytes Memory Array Mapped Address Starting Address: 0x00000000000 Ending Address: 0x000FFFFFFFF Range Size: 4 GB Physical Array Handle: 0x001A Partition Width: 0 Handle 0x0020, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00000000000 Ending Address: 0x0003FFFFFFF Range Size: 1 GB Physical Device Handle: 0x001B Memory Array Mapped Address Handle: 0x001F Partition Row Position: 1 Handle 0x0021, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00040000000 Ending Address: 0x0007FFFFFFF Range 
Size: 1 GB Physical Device Handle: 0x001C Memory Array Mapped Address Handle: 0x001F Partition Row Position: 1 Handle 0x0022, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00080000000 Ending Address: 0x000BFFFFFFF Range Size: 1 GB Physical Device Handle: 0x001D Memory Array Mapped Address Handle: 0x001F Partition Row Position: 1 Handle 0x0023, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x000C0000000 Ending Address: 0x000FFFFFFFF Range Size: 1 GB Physical Device Handle: 0x001E Memory Array Mapped Address Handle: 0x001F Partition Row Position: 1 Handle 0x0024, DMI type 32, 11 bytes System Boot Information Status: No errors detected Handle 0x0025, DMI type 127, 4 bytes End Of Table john@johnsubuntu:~$
My Windows box is a EVGA X58 3x SLI and I can't get any kernel after 2.6.24-24 to recognize SATA drives on that system either. Before I forget, thanks so much for being willing to put in the time and effort to fix this!!!
I forgot to tell you, the board for the dmidecode above is an XFX 680i LT SLI
Created attachment 24672 [details] sata_nv-oh-my-god.patch Hmmmm.... I found a pretty similar board here (ASUSTek L1N64-SLI WS) which shows the same PCI ID for the SATA controller and I also have WD5000YS drive around. Unfortunately, it works fine here. I wonder what the difference could be. Can you please apply the attached patch and see how it works? Please attach the kernel log with the patch applied. Thanks.
OK, I'm happy to do that but I think I'd need pretty detailed instructions to do it. I'm really a newbie. I'm working with install CD's, do I simply add the patch as a kernel parameter during install? Or do I need to get kernel sources, apply the patch, compile, etc? Is there a way for you to remotely access my machine?
How about this? Can you set up your test kernel, then ftp it to me and I can try it out?
Ah.... alright. Let me prep something called kISO. It's a bare minimum installation media which can contain a new kernel and should be used in combination with actual full installation media.
Can you please try the following kISO? http://htj.dyndns.org/export/testing/sl112-x86_64-bko11445_dbg0/SL112-x86_64-bko11445_dbg0.iso For instructions on using kISO, see http://htj.dyndns.org/export/testing/SL103-kISO-doc.txt Please try to acquire the kernel boot log as you did with the installation media. Thanks.
OK.....is the kISO disk all I need or do I also need the SUSE 11.2 CD? The instructions aren't clear.
Created attachment 24735 [details] SL112-x86_64-bko11445_dbg.iso boot.msg
Same result, TJ. Don't give up!
Any news on this, TJ?
Sorry about the lack of response. I'm running out of ideas. The last kISO skips all hardreset related things including simple link resume, even then the drive fails to respond with DRDY to SRST. I think I'll have to compare 2.6.24 init path and try to find out what the difference is. The problem is that the code has changed a lot since then. Is it possible for you to set up an environment where you can test a patched kernel? Thanks.
The kISO worked fine if you can do that again or, I'm pretty good at following directions. Want to send me (ftp or whatever) a patched kernel and then phone me to talk me through it?
TJ, as an update, I just tried the Kubuntu 9.10 installer and had the exact same problem. Everything works fine until you get to the partitioning. The hard drives are not recognized. It's as though I unplugged them. Any luck comparing the 2.6.24 init path?
Sorry caught up doing other stuff. Will do it in a few days. Thanks.
Don't give up, TJ!
Alright, can you please test this one and report the kernel boot log? http://htj.dyndns.org/export/testing/sl112-x86_64-bko11445_dbg1/SL112-x86_64-bko11445_dbg1.iso
Created attachment 25094 [details] kISO (2) boot.msg results Here's the boot.msg, TJ
Created attachment 25095 [details] Install Screen Thought you'd like to see this, too.
Hmmm... the workaround didn't kick in. Strange. This is the machine you posted the dmidecode for, right?
Oops, strike that. I was looking at the boot log from the first kiso. The workaround kicked in but the detection failed the same. With the workaround applied, the behavior is very close to 2.6.24. I'll look again. :-(
What's the news, TJ??
Ummm.... are you interested in sending the board to me? I can buy it if it isn't too expensive. If it is, I can pay for the round-trip cost. Thanks.
Sure, I could send you the board. But I really do want it back. I'll be out of town for the next week, so I won't miss it too much. What address?
TJ, are we at a dead end on this?
Hmmm... looks that way for the moment. I'll see if there's anything else I can do remotely. Thanks.
How about changing some of the BIOS settings?
Maybe you could work through a Local Ubuntu Team? https://wiki.ubuntu.com/LoCoTeamList I don't know the guy but there's a "Nerdy Nick" here in Denver. Cheers!
Sorry about the long delay. Can you please test this kiso? http://htj.dyndns.org/export/testing/sl112-x86_64-bko11445_dbg2/SL112-x86_64-bko11445_dbg2.iso Thanks.
OK, tell me if I'm doing this right. I download the kiso and burn it as an iso. Then I boot the kiso on my Linux box and choose "Install" from the first menu. Right? Next, it wants me to remove the kiso disk and put in the Suse install disk, right?
You don't need openSUSE installation media. Just boot into rescue mode and fetch boot.msg from there.
Created attachment 25847 [details] Boot.msg file from 04/03/2010 On screen Error message - failed to detect SATA HD's
Okay, another miss. I'm afraid I don't have much left to try remotely at this point. :-(
I'm just a newbie, but would it do any good to compare the boot.msg from a kernel that boots OK with one that doesn't? Would that help to isolate the problem? Next question: what's a good Socket 775 SATA board that does work? Can you compare the boot.msg from that board to mine?
boot.msg wouldn't show any new information at this point. I'm quite lost as to where the difference is. :-( Short of testing things locally (and maybe try to hook it up w/ a bus tracer), I'm not sure what to do. Most 775 boards work fine. You're currently the only one reporting boot probing problems on sata_nv. Thanks.
The reason I asked about a board that you're sure works is that I've got two different 775 boards and both have the same problem. Do you know of a reasonably priced 775 board that works? I'd buy it just to get this behind me. Also, I beg to differ about my problem being unusual. I searched the net and the problem I'm having isn't unique. It's been going on for over a year. Every kernel since 2.6.24-24 has failed on my systems and on many others. Check it out. Google "errno=-16" http://search.yahoo.com/search?p=errno%3D-16&ei=UTF-8&fr=moz35 Any way, thanks for your efforts
Oh... sure, reset failures have been reported a lot. The thing is that the failures are caused by a lot of different reasons (IRQ delivery problems are the most common reason now) on a lot of different configurations, and you're currently the only one reporting a probe failure on sata_nv which hasn't been root caused yet. If you can point me to a reasonably priced 775 board which doesn't work, I'll be happy to get one and fix it. What is the other board you're having problems with? Thanks.
The other board is an EVGA X58 3x SLI running a Core i7 920 processor. Right now, I've got Windows 7 on that machine. Same issue... cannot find SATA drives. SRST fails errno=-16
That's an intel ich10. You're seeing probe failures on that board? Can you please post boot.msg from that machine? Thanks.
This has happened to me too, twice, with the 2.6.35.2 kernel, after ~12h and ~50h of uptime, but only with my new SSD with a Sandforce controller. It never happened before with my old, slow disk drive. The controller is an Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) in SATA mode.

[42004.832055] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[42004.832061] ata3.00: failed command: FLUSH CACHE
[42004.832070] ata3.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[42004.832071] res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
[42004.832076] ata3.00: status: { DRDY }
[42004.832081] ata3: hard resetting link
[42010.192060] ata3: link is slow to respond, please be patient (ready=0)
[42014.849045] ata3: COMRESET failed (errno=-16)
[42014.849058] ata3: hard resetting link
[42020.208053] ata3: link is slow to respond, please be patient (ready=0)
[42024.857047] ata3: COMRESET failed (errno=-16)
[42024.857060] ata3: hard resetting link
[42030.224024] ata3: link is slow to respond, please be patient (ready=0)
[42059.912068] ata3: COMRESET failed (errno=-16)
[42059.912082] ata3: limiting SATA link speed to 1.5 Gbps
[42059.912086] ata3: hard resetting link
[42064.940048] ata3: COMRESET failed (errno=-16)
[42064.940059] ata3: reset failed, giving up
[42064.940063] ata3.00: disabled
[42064.940069] ata3.00: device reported invalid CHS sector 0
[42064.940087] ata3: EH complete
[42064.940087] ata3: EH complete
[42064.940110] end_request: I/O error, dev sdb, sector 0
[42064.940172] Aborting journal on device dm-1-8.
[42064.940191] end_request: I/O error, dev sdb, sector 0
[42064.940281] Aborting journal on device dm-4-8.
[42064.940318] sd 2:0:0:0: [sdb] Unhandled error code
[42064.940321] sd 2:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[42064.940325] sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 f1 84 e0 00 00 20 00
[42064.940334] end_request: I/O error, dev sdb, sector 15828192
[42064.940346] sd 2:0:0:0: [sdb] Unhandled error code
[42064.940348] sd 2:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[42064.940351] sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 01 e4 4b 08 00 00 20 00
[42064.940359] end_request: I/O error, dev sdb, sector 31738632
[42064.940369] sd 2:0:0:0: [sdb] Unhandled error code
[42064.940371] sd 2:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
...
Thomas, can you please attach full kernel boot log (/var/log/boot.msg or the output of dmesg after boot)? But it looks like the device has shut down. More likely to be a device problem than anything else. Thanks.
Created attachment 28601 [details] Boot log + error messages It's the dmesg output from the day it happened. I was able to save it on another drive after the SSD stopped responding. I have sent a request to the manufacturer of the SSD too, as that was my first assumption as well. But I have now been running my system with the 2.6.32 kernel for two days, and there has been no connection timeout so far. That's not proof, though, as the error is not deterministic.
FLUSH_CACHE is a non-tagged nodata command, which means that no other command is in progress and all the host controller does is issue a single command packet to the device for the command. There isn't much the host can screw up for this type of command. For drives w/ rotating media, FLUSH_CACHE often spikes power consumption, and an inadequate power supply often shows up as FLUSH_CACHE timeouts. For SSDs, problems like this have usually been remedied by firmware updates on the drive side. In some rare cases, disabling NCQ seems to help too, for whatever reason. Thanks.
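Editorial aside: the "cmd" and "res" lines in the log above can be read mechanically. This is a rough sketch, assuming the first byte of the "cmd" taskfile dump is the ATA command opcode and the first byte of the "res" dump is the ATA status register; the lookup tables are deliberately partial and the helper names are made up for illustration.

```python
import re

# Partial opcode table; only the flush commands relevant to this thread.
ATA_COMMANDS = {0xE7: "FLUSH CACHE", 0xEA: "FLUSH CACHE EXT"}

# ATA status register bits, most significant first.
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x08, "DRQ"), (0x01, "ERR")]


def first_byte(line: str, keyword: str):
    """Return the first hex byte after 'cmd' or 'res' in a libata EH line."""
    m = re.search(rf"\b{keyword} ([0-9a-f]{{2}})/", line)
    return int(m.group(1), 16) if m else None


def decode_status(status: int) -> list:
    """List the names of the status bits that are set."""
    return [name for mask, name in STATUS_BITS if status & mask]


# Lines copied from the log posted earlier in this bug.
CMD_LINE = "[42004.832070] ata3.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0"
RES_LINE = "[42004.832071] res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)"

if __name__ == "__main__":
    opcode = first_byte(CMD_LINE, "cmd")
    status = first_byte(RES_LINE, "res")
    print("command:", ATA_COMMANDS.get(opcode, hex(opcode)))
    print("status bits:", decode_status(status))
```

For the lines quoted here, the opcode decodes to FLUSH CACHE and the status to DRDY alone, which agrees with the "failed command" and "status" lines libata printed itself.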
Thanks a lot for the help. I will try disabling NCQ for as long as there is no firmware update available.