Bug 32682

Summary: libata freeze randomly hangs disk I/O
Product: IO/Storage Reporter: Robin Rainton (robin)
Component: Serial ATA    Assignee: Jeff Garzik (jgarzik)
Status: CLOSED UNREPRODUCIBLE    
Severity: high CC: alan, bjoernv, robin, rtguille, tj
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.38.2 Subsystem:
Regression: No Bisected commit-id:
Attachments: Full dmesg output
lspci -vvv
fdisk -lu
dmesg output with libata.force=1.5Gbps

Description Robin Rainton 2011-04-05 06:19:18 UTC
There seem to be numerous issues on this topic but none mention DMA as far as I can tell.

I have seen this problem with the following kernels:

2.6.28.10
2.6.35.9
2.6.38.2

This is on a 64-bit CentOS 5.5 distro. Some parts of the distro are not compatible with the newer kernels, but I was experimenting to see if the problem was corrected in a newer kernel. Sadly, it seems not. Hardware in the system is a GeForce 8200 motherboard (6 x SATA) with a SiI 3114 PCI card (4 x SATA).

The SiI card has 2 x 200GB Seagate drives in RAID 1 (boot + root).

The GeForce has 6 x 500GB Samsung drives in RAID 5.

I have seen the following types of message in relation to every drive in this box, i.e. on both controllers. While some people suspect these errors are caused by faulty hardware, there is no way that every drive and every port can be bad. The system has a 450W PSU, but estimates from the UPS it's connected to indicate a power draw of well under 200W (the CPU in this system is an AMD 4850e).

Moreover, hardware problems can be excluded as the fault seems to be fixed by adding the following during boot as a kernel argument: "libata.dma=0"
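Whether a boot argument like this actually reached the driver can be double-checked after boot. A quick sketch (which parameters show up under sysfs depends on the kernel build):

```shell
# Arguments the kernel was actually booted with:
cat /proc/cmdline
# Current values of libata's readable module parameters, if the
# directory exists on this build:
grep -r . /sys/module/libata/parameters/ 2>/dev/null || true
```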

The timing of the fault is seemingly random. System load does not seem to play a part.

This problem manifests itself as a lockup/freeze of processes that are performing/waiting for disk I/O (those not accessing disks continue just fine). Inspecting logs after such an event one sees the below.

Although the logging covers only a few seconds, the I/O hang appears to last a lot longer (30 seconds or more).

Note that in the system above, with 10 SATA ports and 8 drives, 'ata7' could just as well be any other port number.

Apr  4 03:09:11 plex kernel: ata7.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen
Apr  4 03:09:11 plex kernel: ata7.00: failed command: READ FPDMA QUEUED
Apr  4 03:09:11 plex kernel: ata7.00: cmd 60/50:00:4d:14:bb/00:00:0a:00:00/40 tag 0 ncq 40960 in
Apr  4 03:09:11 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  4 03:09:11 plex kernel: ata7.00: status: { DRDY }
Apr  4 03:09:11 plex kernel: ata7.00: failed command: READ FPDMA QUEUED
Apr  4 03:09:11 plex kernel: ata7.00: cmd 60/80:08:cd:12:bb/00:00:0a:00:00/40 tag 1 ncq 65536 in
Apr  4 03:09:11 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  4 03:09:11 plex kernel: ata7.00: status: { DRDY }
Apr  4 03:09:11 plex kernel: ata7.00: failed command: READ FPDMA QUEUED
Apr  4 03:09:11 plex kernel: ata7.00: cmd 60/80:10:4d:13:bb/00:00:0a:00:00/40 tag 2 ncq 65536 in
Apr  4 03:09:11 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  4 03:09:11 plex kernel: ata7.00: status: { DRDY }
Apr  4 03:09:11 plex kernel: ata7.00: failed command: READ FPDMA QUEUED
Apr  4 03:09:11 plex kernel: ata7.00: cmd 60/08:18:8d:00:dd/00:00:0a:00:00/40 tag 3 ncq 4096 in
Apr  4 03:09:11 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  4 03:09:11 plex kernel: ata7.00: status: { DRDY }
Apr  4 03:09:11 plex kernel: ata7: hard resetting link
Apr  4 03:09:11 plex kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr  4 03:09:11 plex kernel: ata7.00: configured for UDMA/133
Apr  4 03:09:11 plex kernel: ata7.00: device reported invalid CHS sector 0
Apr  4 03:09:11 plex last message repeated 3 times
Apr  4 03:09:11 plex kernel: ata7: EH complete

Earlier kernels had slightly different output:

Apr  3 04:54:05 plex kernel: ata6.00: NCQ disabled due to excessive errors
Apr  3 04:54:05 plex kernel: ata6.00: exception Emask 0x0 SAct 0x7ff SErr 0x0 action 0x6 frozen
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/68:00:65:c0:ac/00:00:01:00:00/40 tag 0 ncq 53248 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/a0:08:2d:bd:ac/00:00:01:00:00/40 tag 1 ncq 81920 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/00:10:cd:bd:ac/01:00:01:00:00/40 tag 2 ncq 131072 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/00:18:cd:c0:ac/01:00:01:00:00/40 tag 3 ncq 131072 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/00:20:cd:be:ac/01:00:01:00:00/40 tag 4 ncq 131072 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/98:28:cd:bf:ac/00:00:01:00:00/40 tag 5 ncq 77824 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/f0:30:cd:c1:ac/00:00:01:00:00/40 tag 6 ncq 122880 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/10:38:bd:c2:ac/00:00:01:00:00/40 tag 7 ncq 8192 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/00:40:cd:c2:ac/01:00:01:00:00/40 tag 8 ncq 131072 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/28:48:cd:c3:ac/00:00:01:00:00/40 tag 9 ncq 20480 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6.00: cmd 60/60:50:cd:bc:ac/00:00:01:00:00/40 tag 10 ncq 49152 in
Apr  3 04:54:05 plex kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  3 04:54:05 plex kernel: ata6.00: status: { DRDY }
Apr  3 04:54:05 plex kernel: ata6: hard resetting link
Apr  3 04:54:06 plex kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr  3 04:54:06 plex kernel: ata6.00: configured for UDMA/133
Apr  3 04:54:06 plex kernel: ata6: EH complete
Apr  3 04:54:06 plex kernel: sd 5:0:0:0: [sdf] 976773168 512-byte hardware sectors: (500 GB/465 GiB)
Apr  3 04:54:06 plex kernel: sd 5:0:0:0: [sdf] Write Protect is off
Apr  3 04:54:06 plex kernel: sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Comment 1 Robin Rainton 2011-04-05 08:41:21 UTC
Correction, the "libata.dma=0" argument does not correct this problem with Kernel 2.6.28.10.

I will double check the others.
Comment 2 Robin Rainton 2011-04-05 10:05:49 UTC
Further correction, the "libata.dma=0" argument does not correct this problem with Kernel 2.6.35.9 either.
Comment 3 Robin Rainton 2011-04-05 21:40:25 UTC
Even further correction: having previously run for quite some time, the 2.6.38.2 kernel hung a few times overnight, so the "libata.dma=0" argument does not correct this problem with that kernel either.

Apologies for the initially incorrect title, then.

This problem is therefore more serious than I thought, as I cannot find any workaround for these hangs.

I still stand by what I said about the hardware being reliable though - the 'frozen' messages are spread over all SATA ports and drives which simply cannot all be faulty.
Comment 4 Tejun Heo 2011-04-06 10:22:27 UTC
First of all, please always attach full dmesg output, including the boot and error messages, when reporting kernel bugs, preferably with CONFIG_PRINTK_TIME enabled.  It might look boring and repetitive, but it actually contains a lot of valuable information.

I can't tell much from the single selected failure, but it's most likely a power-related issue.  Get an extra PSU - just a cheap one - power it up separately, move half of the drives to it, and see whether the pattern of errors changes.

Thanks.
Comment 5 Robin Rainton 2011-04-06 11:40:33 UTC
Created attachment 53642 [details]
Full dmesg output

There are 3 separate occurrences of the problem here on 3 different ports.

This happened overnight when there was very little load on the system.
Comment 6 Robin Rainton 2011-04-06 11:47:17 UTC
As requested, I have attached the dmesg output.

I am convinced this is not hardware related as discussed.

I have also experimented with disabling NCQ on the drives that support it (the 6 in RAID5 on the GeForce controller) and interestingly the problem has not resurfaced since doing that.

Will report back in a few days to confirm if the system stays error free with this.
Comment 7 Tejun Heo 2011-04-06 11:59:56 UTC
AFAICS, nothing precludes a hardware issue at this point.  NCQ might affect the issue, as would libata.force=1.5Gbps.  When you have an underlying hardware problem, especially a power-related one, random tweaks often change the result and aren't too useful for cause elimination.  It's far easier to keep the software-visible configuration identical, change the underlying hardware configuration, and see how the error pattern changes.  Oh, it will also be helpful to know whether there are failures from the Silicon Image controller.

Thanks.
Comment 8 Robin Rainton 2011-04-06 12:38:47 UTC
I based the statement that both controllers are affected on the fact that grepping /var/log/messages for the fault shows it on all ports. E.g. (duplicates removed for brevity; kernel boots included to show versions):

Mar 28 17:44:53 plex kernel: Linux version 2.6.28.10
Mar 29 10:59:30 plex kernel: ata6.00: exception Emask 0x0 SAct 0x1ff SErr 0x0 action 0x6 frozen
Apr  3 04:23:08 plex kernel: ata2.00: exception Emask 0x0 SAct 0x7ff SErr 0x0 action 0x6 frozen
Apr  3 04:27:27 plex kernel: ata4.00: exception Emask 0x0 SAct 0x3fff SErr 0x0 action 0x6 frozen
Apr  3 04:31:46 plex kernel: ata3.00: exception Emask 0x0 SAct 0xffff SErr 0x0 action 0x6 frozen
Apr  3 04:32:57 plex kernel: ata1.00: exception Emask 0x0 SAct 0x7ff SErr 0x0 action 0x6 frozen
Apr  3 04:39:55 plex kernel: ata5.00: exception Emask 0x0 SAct 0x1fff SErr 0x0 action 0x6 frozen
Apr  3 07:21:23 plex kernel: Linux version 2.6.35.9
Apr  3 08:27:05 plex kernel: ata7.00: exception Emask 0x0 SAct 0xfff SErr 0x0 action 0x6 frozen
Apr  3 09:27:01 plex kernel: ata9.00: exception Emask 0x0 SAct 0x3ff SErr 0x0 action 0x6 frozen
Apr  3 15:46:16 plex kernel: ata10.00: exception Emask 0x0 SAct 0x1fff SErr 0x0 action 0x6 frozen
Apr  3 16:41:54 plex kernel: Linux version 2.6.38.2
Apr  5 23:13:23 plex kernel: ata8.00: exception Emask 0x0 SAct 0x1ff SErr 0x0 action 0x6 frozen
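As an aside, the SAct value in each of these exception lines is a bitmask of the NCQ tags still outstanding, so counting its set bits gives how many commands were queued when the port froze. A quick way to decode one:

```shell
# SAct is a bitmask of in-flight NCQ tags; count the set bits.
# 0x1ff (from the ata8 line above) has 9 bits set -> 9 queued commands.
sact=0x1ff
count=0
for (( v = sact; v > 0; v >>= 1 )); do
    count=$(( count + (v & 1) ))
done
echo "SAct $sact: $count NCQ tags in flight"
```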

However, the fact that all 10 ports are mentioned when only 8 drives are present in the box is suspicious!

I then went back and checked the recovery message that follows each failure (SATA link up) and noticed that on each occasion the speed was 3.0Gbps. This indicates that only the Samsung drives (and therefore the GeForce ports) are affected (the SiI/Seagate combination runs at 1.5Gbps).

It would appear that, during testing of the different kernels, different versions sometimes assign the 'ata' numbers to the controllers in a different order at boot. This may be down to the order the modules are loaded in the initrd image.

Checking the boot logs above, I discovered that when the 2.6.28.10 kernel boots, the GeForce controller is initialised first. Note that the fault occurred on ports 1-6 with that kernel.

Then, while testing other kernels, I discovered that the SiI controller is initialised first. Note that the fault occurs only on ports 7, 8, 9 and 10 (so far) on later kernels; ports above 4 again indicate the GeForce controller.

Again I apologise for the confusion. My statement that all ports on both controllers are affected now seems incorrect.

I will stick to the current kernel and boot image (2.6.38.2) and advise if any more faults occur with NCQ disabled.
Comment 9 Reartes Guillermo 2011-04-06 18:23:24 UTC
Try booting with kernel parameter 'pcie_aspm=off'

before rebooting:
# lspci -vvv > lspci-vvv_2.6.XX_aspm-default.out
then reboot with pcie_aspm=off
# lspci -vvv > lspci-vvv_2.6.XX_aspm-off.out

Try to reproduce the errors. I hope it helps.

----

In my case (on F13 and F14, and only once with F15), active state power management on the SATA controller:
* nVidia Corporation MCP78S [GeForce 8200] AHCI Controller [10de:0ad4] (rev a2)

generated these types of errors among others, even I/O errors.
Disabling pcie_aspm seems to fix it for F13 and F14. F15 seems not to suffer the issue, though I did see a SATA error ONCE; I still use F15 (it seems to selectively disable ASPM for that controller).

I tested Slackware, with kernels compiled by me, and found that the same kernel version (for example 2.6.33) downloaded and compiled on slack64 13.1 on the same hw worked fine, while F13 (2.6.33.XX) on the same hw needed pcie_aspm disabled to work around a bunch of SATA errors. And the strangest thing of all is that I compared:
*1 The output of lspci -vvv in Slackware64 (2.6.33) with and without ASPM
*2 The output of lspci -vvv in F13 (2.6.33.??) with and without ASPM

In *1, diff showed only that ASPM was disabled or enabled - OK.
In *2, diff showed that, in addition to *1, there were other differences between an ASPM-enabled and an ASPM-disabled kernel:
 > PCIe link speed differences... why?
 > other +/- differences elsewhere.

In the end pcie_aspm=off is an effective workaround, but I don't know why...
Slackware64 worked flawlessly with pcie_aspm enabled.


------------
The kernel seems to printk some messages in parallel; it was a hellish experience to read these logs with so many disks. Is there a way to serialize at least the SATA disk detection messages?
Comment 10 Reartes Guillermo 2011-04-06 21:24:08 UTC
Is this your current config? (if wrong, please correct me):

Kernel: 2.6.38.2.el5-local
Command line: ro root=/dev/r1/plex.root libata.dma=0

Controller #0:
sata_sil 0000:01:09.0: version 2.4 [ LOCATION: 0000:01:09.0 ] 
 * ATA1, ATA2, ATA3, ATA4
 * DISKS: ATA1, ATA2
 * HPA disks: ATA2

Controller #1:
libata version 3.00 loaded. [ LOCATION: 0000:00:09.0 ]
 * ATA5, ATA6, ATA7, ATA8, ATA9, ATA10
 * HPA disks: ATA5, ATA7, ATA9

Controller #0:
scsi 0:0:0:0 > sda > ATA1 > "ATA ST3200822AS 3.01 PQ: 0 ANSI: 5" > 390721968 512-byte logical blocks
scsi 1:0:0:0 > sdb > ATA2 > "ATA ST3200822AS 3.01 PQ: 0 ANSI: 5" > 390719855 512-byte logical blocks
scsi 2:0:0:0 EMPTY
scsi 3:0:0:0 EMPTY

Controller #1:
scsi 4:0:0:0 > sdc > ATA5  > "ATA SAMSUNG HD501LJ CR10 PQ: 0 ANSI: 5" > 976771055 512-byte logical blocks
scsi 5:0:0:0 > sdd > ATA6  > "ATA SAMSUNG HD502HJ 1AJ1 PQ: 0 ANSI: 5" > 976773168 512-byte logical blocks
scsi 6:0:0:0 > sde > ATA7  > "ATA SAMSUNG HD501LJ CR10 PQ: 0 ANSI: 5" > 976771055 512-byte logical blocks
scsi 7:0:0:0 > sdf > ATA8  > "ATA SAMSUNG HD502HJ 1AJ1 PQ: 0 ANSI: 5" > 976773168 512-byte logical blocks
scsi 8:0:0:0 > sdg > ATA9  > "ATA SAMSUNG HD501LJ CR10 PQ: 0 ANSI: 5" > 976771055 512-byte logical blocks
scsi 9:0:0:0 > sdh > ATA10 > "ATA SAMSUNG HD501LJ CR10 PQ: 0 ANSI: 5" > 976773168 512-byte logical blocks

HPA:
 * ata2.00: HPA detected: current 390719855, native 390721968
 * ata5.00: HPA detected: current 976771055, native 976773168
 * ata7.00: HPA detected: current 976771055, native 976773168
 * ata9.00: HPA detected: current 976771055, native 976773168

------------------------

Can you post the output of:

(list all partitions on all disks)
# fdisk -lu

# lspci -vvv

and the content of /var/log/messages and a current dmesg (with default boot parameters).

Not all of your 500GB disks are the same USABLE size.
Not all of your 200GB disks are the same USABLE size.
So I don't know whether this is intentional (configured this way) or not.
Comment 11 Robin Rainton 2011-04-06 21:35:53 UTC
Created attachment 53702 [details]
lspci -vvv
Comment 12 Robin Rainton 2011-04-06 21:40:03 UTC
Created attachment 53712 [details]
fdisk -lu

The system boots from sda1 or sdb1.

Don't worry about the boot flags on the other first partitions - I had a mirror across all 6 x 500GB drives set up the same way when that array was on its own in another system and used for boot.

To clarify:

RAID1 (boot): sda1, sdb1
RAID1 (root inside LVM + others): sda2, sdb2
RAID1 (unused boot): sdc1 - sdh1
RAID5 (LVM with many volumes): sdc2 - sdh2
Comment 13 Robin Rainton 2011-04-06 21:44:38 UTC
Attached lspci & fdisk as requested. It is impractical to attach /var/log/messages as it is full of DHCPD and SNMP requests and is massive. I can try to grep out this noise if dmesg & the other attachments aren't enough.

dmesg is unchanged since the attachment yesterday.

After disabling NCQ the freeze has not happened since:

Apr  5 23:15:24 plex kernel: ata5.00: exception Emask 0x0 SAct 0x1ff SErr 0x0 action 0x6 frozen

I will leave the system running for several more days before trying other kernel arguments suggested here to try and prove (or not) it is stable right now.
Comment 14 Reartes Guillermo 2011-04-06 22:52:07 UTC
HPA Sectors: 2113 (1081856 Bytes / ~1 Megabyte)

sda > 390721968 sectors > UNPARTITIONED Sectors 5104 > HPA Sectors NONE
sdb > 390719855 sectors > UNPARTITIONED Sectors 2991 > HPA Sectors 2113 


sdc > 976771055 Sectors > UNPARTITIONED Sectors 2991 > HPA Sectors 2113 
sdd > 976773168 Sectors > UNPARTITIONED Sectors 5104 > HPA Sectors NONE
sde > 976771055 Sectors > UNPARTITIONED Sectors 2991 > HPA Sectors 2113 
sdf > 976773168 Sectors > UNPARTITIONED Sectors 5104 > HPA Sectors NONE
sdg > 976771055 Sectors > UNPARTITIONED Sectors 2991 > HPA Sectors 2113
sdh > 976773168 Sectors > UNPARTITIONED Sectors 5104 > HPA Sectors NONE

I have googled that some mainboards reserve those 2113 sectors in the HPA
when 'hardware RAID/fakeraid' has been used on the drives. Hardware RAID
controllers may also use the HPA.

I don't know whether this is the cause of the errors - probably not, because
all partitions fit inside the partitionable space. If there is no true HPA
on each of those drives (sdc, sde, sdg and sdb), there is another bug/issue,
probably unrelated. Check whether those drives either have a valid HPA (and
someone forgot to erase it before partitioning) or have no HPA and this is a
different/separate bug.

-----

YOU:

00:09.0 SATA controller: nVidia Corporation MCP78S [GeForce 8200] AHCI Controller (rev a2) (prog-if 01 [AHCI 1.0])
	Subsystem: ABIT Computer Corp. Unknown device 1c44
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 0 (750ns min, 250ns max)
	Interrupt: pin A routed to IRQ 40
	Region 0: I/O ports at 09f0 [size=8]
	Region 1: I/O ports at 0bf0 [size=4]
	Region 2: I/O ports at 0970 [size=8]
	Region 3: I/O ports at 0b70 [size=4]
	Region 4: I/O ports at dc00 [size=16]
	Region 5: Memory at fe026000 (32-bit, non-prefetchable) [size=8K]
	Capabilities: [44] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [8c] #12 [0010]
	Capabilities: [b0] Message Signalled Interrupts: 64bit+ Queue=0/3 Enable+
		Address: 00000000fee0200c  Data: 4169
	Capabilities: [ec] HyperTransport: MSI Mapping

ME:

00:09.0 SATA controller: nVidia Corporation MCP78S [GeForce 8200] AHCI Controller (rev a2) (prog-if 01 [AHCI 1.0])
        Subsystem: ASUSTeK Computer Inc. M3N72-D
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0 (750ns min, 250ns max)
        Interrupt: pin A routed to IRQ 40
        Region 0: I/O ports at c080 [size=8]
        Region 1: I/O ports at c000 [size=4]
        Region 2: I/O ports at bc00 [size=8]
        Region 3: I/O ports at b880 [size=4]
        Region 4: I/O ports at b800 [size=16]
        Region 5: Memory at fbd7a000 (32-bit, non-prefetchable) [size=8K]
        Capabilities: [44] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [8c] SATA HBA v1.0 InCfgSpace
        Capabilities: [b0] MSI: Enable+ Count=1/8 Maskable- 64bit+
                Address: 00000000fee0100c  Data: 4159
        Capabilities: [ec] HyperTransport: MSI Mapping Enable+ Fixed+
        Kernel driver in use: ahci

If it is the same controller (but a different board vendor), then the workaround
is booting with pcie_aspm=off. It works for me with F13 and F14.
I never installed CentOS or RHEL on this machine.
I am currently running F15 Alpha w/ kernel 2.6.38.2-9.fc15.x86_64 (without the workaround).

Also, for me it was easy to reproduce the problem, for example with my
two disks:

# dd if=/dev/sda of=/dev/null &
# dd if=/dev/sdb of=/dev/null &
# find / &>/dev/null &

That saturates a single disk (a read-only test is HIGHLY recommended), addressing sdX directly rather than going through the md device. Be careful with the RAID5.
Comment 15 Reartes Guillermo 2011-04-06 22:55:26 UTC
Just in case:

Bug 611350 - ATA frozen , random pauses, every disk (workaround: pcie_aspm=off)
Comment 16 Reartes Guillermo 2011-04-06 22:56:32 UTC
Sorry, here is an improved version of the post:

From: https://bugzilla.redhat.com/

Bug 611350 - ATA frozen , random pauses, every disk (workaround: pcie_aspm=off)
Comment 17 Robin Rainton 2011-04-16 04:57:21 UTC
After this week I am convinced this has to do with NCQ.

After the last hang I disabled NCQ on all drives that support it (the 6-drive RAID 5 array in this system) with:

echo 1 > /sys/block/sdX/device/queue_depth
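Applied to all six array members at once, that is a short loop (a sketch; sdc through sdh are the device names listed earlier in this report - adjust before use):

```shell
# Setting queue_depth to 1 disables NCQ on each RAID-5 member drive.
# Device names are this system's layout; requires root.
for d in sdc sdd sde sdf sdg sdh; do
    echo 1 > /sys/block/$d/device/queue_depth
done
```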

Since then the system has been running all week without the problem.

I rebooted this morning and left queue_depth untouched (the value was 31). The freeze occurred again after around 6 hours of uptime.

I have now disabled NCQ once again as this appears to be a viable workaround.

This was all on kernel 2.6.38.2
Comment 18 Tejun Heo 2011-04-22 22:32:04 UTC
Robin, as I wrote before, correlation doesn't indicate causation.  It could be a viable workaround for your case but that's as far as it goes.  If you're interested, remove the NCQ workaround and boot with libata.force=1.5Gbps and see whether the problem goes away.

Thanks.
Comment 19 Robin Rainton 2011-04-22 22:45:52 UTC
I tried booting with libata.force=1.5Gbps but dmesg reports the drives are connected at 3.0Gbps.

Is that what you expect? Attached is a dmesg output for this.
Comment 20 Robin Rainton 2011-04-22 22:46:40 UTC
Created attachment 55082 [details]
dmesg output with libata.force=1.5Gbps
Comment 21 Reartes Guillermo 2011-04-22 23:01:44 UTC
Use either
 libata.force=1.5G
or
 libata.force=1.5
Comment 22 Tejun Heo 2011-04-22 23:03:04 UTC
That's the initrd failing to pass the libata parameter on to the respective module.  I don't know why some distro initrds still don't implement this. :-(

Anyways, you'll need to edit some module config and regenerate the initrd so that it specifies the parameter to libata directly.  Which distro are you on?
Comment 23 Tejun Heo 2011-04-22 23:04:13 UTC
Reartes, it is "libata.force=1.5Gbps".
Comment 24 Robin Rainton 2011-04-22 23:10:16 UTC
This is with CentOS 5.6.

I don't see why one has to edit the init script to pass an argument to a module - surely that is a bug in its own right?

Anyhow, that aside, what does one need to do to the init script? I assume I have to put something on the end of insmod? For ahci? Please confirm.
Comment 25 Tejun Heo 2011-04-22 23:23:06 UTC
Yeah, that's a bug in the initrd.  Please report it to the distro.  I don't know about CentOS, but in most cases it comes down to adding a file under /etc/modprobe.d and regenerating the initrd.  mkinitrd will pick up the parameter from there and include it inside the initrd.
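On such a setup that comes down to something like the following (a sketch; the file name is arbitrary and the exact mkinitrd invocation varies by distro and kernel version - run as root and keep a backup of the old image):

```shell
# Persist the module parameter so mkinitrd carries it into the image:
echo 'options libata force=1.5Gbps' > /etc/modprobe.d/libata.conf
# Rebuild the initrd for the running kernel:
mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)
```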
Comment 26 Reartes Guillermo 2011-04-22 23:28:23 UTC
I know, but these are easier to type.

http://www.mjmwired.net/kernel/Documentation/kernel-parameters.txt

The VAL specifies the configuration to force.
As long as there's no ambiguity shortcut notation is allowed.
For example, both 1.5 and 1.5G would work for 1.5Gbps.
The following configurations can be forced.

@Tejun

> Robin, as I wrote before, correlation doesn't indicate causation.  It could
> be
> a viable workaround for your case but that's as far as it goes.  If you're
> interested, remove the NCQ workaround and boot with libata.force=1.5Gbps and
> see whether the problem goes away.

I do fully agree with that statement. 

In my case (Nvidia [10de:0ad4]), pcie_aspm=off works with Fedora's kernel.
I never experienced the problem with Slackware 13.1, with either the stock
kernel or a kernel downloaded by me with pcie_aspm manually enabled.

I noticed that with NCQ forced off the messages diminished, so I ended up
disabling everything - PIO MODE! - and some messages still appeared,
until the pcie_aspm=off workaround.

I do not think (but I have no proof) that there is a problem with the
kernel.org kernel. It seems specific to Fedora / RHEL6 beta1 (the only RHEL
tested), at least in my case.

The tests I performed with Slackware userspace and my own downloaded,
configured and compiled kernels showed no issues at all. Nothing.

The most bizarre thing of all is that Fedora does strange things with the
PCI/PCIe bus when pcie_aspm=off is set. I am no PCI/PCIe guru, but it is
strange, and I really think (with no proof) that pcie_aspm=off fixes the
problem by accident. I still don't know the true cause, but at least with
F15 I have only experienced one message.

@Robin
Is your nVidia SATA controller the same as mine?

Mine is [10de:0ad4].

check with
# lspci -nn | grep SATA

> Yeah, that's a bug of the initrd.  Please report to the distro.  I don't know
> about centos but in most cases it comes down to adding a file under
> /etc/modprobe.d and regenerating the initrd.  mkinitrd will pick up the
> parameter from there and include it inside initrd.

That seems probable.
Try the shorter forms anyway; they might work.
Comment 27 Robin Rainton 2011-04-23 00:39:11 UTC
# lspci -nn | grep SATA
00:09.0 SATA controller [0106]: nVidia Corporation MCP78S [GeForce 8200] AHCI Controller [10de:0ad4] (rev a2)
01:09.0 RAID bus controller [0104]: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller [1095:3114] (rev 02)
Comment 28 Robin Rainton 2011-04-23 01:24:56 UTC
I'm not sure the shorter forms of the argument will work, looking here:

http://lxr.linux.no/linux+v2.6.38/drivers/ata/libata-core.c#L6293

Later I will manually add "force=1.5Gbps" to the libata insmod command in the RD.
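For reference, the edit amounts to appending the parameter to the existing insmod line inside the unpacked initrd's init script (the module path shown here is an assumption based on typical images; use whatever path the image already contains):

```shell
# Inside the initrd's init script, before:
#   insmod /lib/libata.ko
# after:
insmod /lib/libata.ko force=1.5Gbps
```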

BTW, this is a stock kernel compiled by myself, not from any distro.

I don't see how there is an initrd bug here either. When one specifies arguments on the Grub command line, the kernel should take note of them - that's nothing to do with the RD, is it?
Comment 29 Robin Rainton 2011-05-06 06:26:58 UTC
Further update on this:

After manually adding "force=1.5Gbps" to the libata insmod command in the RD, the system has been stable for almost 2 weeks now.

So there now appear to be 2 solutions to this problem:

1. Disable NCQ
2. force=1.5Gbps
Comment 30 Tejun Heo 2011-05-06 06:49:00 UTC
To me, it definitely looks like a link layer problem which gets masked if the transfer rate and/or power draw isn't high enough.  Most likely a power-related issue, but it could be cabling.  Please prepare a separate PSU, connect all the hard drives to it, and see whether the problem goes away.  The two PSUs don't have to share a common ground; SATA links don't make direct electrical connections and don't require a common ground between host and device.  There really isn't much that can be debugged from software at this point.

Thanks.