Kernel Bug Tracker – Bug 15561
SCSI Generic READ_10 to SATA fails when starting multiple processes
Last modified: 2013-12-10 18:14:21 UTC
Created attachment 25572
aborted sg_io_hdr and kernel logs for various kernels
Issuing many concurrent READ_10 commands via the sg driver to SATA
drives causes the commands to be aborted for no apparent reason. I
can reproduce the problem within a few seconds on multiple
known good machines and drives over a wide range of kernels.
I queue 16 concurrent 64k reads to each of eight SATA drives with
eight separate processes which start at roughly the same time. At least
one, and typically several, of these processes see errors in the kernel
log (a reset of the associated SATA bus) and have commands returned as
task aborted.
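For reference, each 64k read above maps to a single 10-byte READ_10 CDB. The following is a minimal sketch of that CDB layout only, assuming 512-byte logical blocks (so 128 blocks per read); the helper name build_read10 is hypothetical and is not taken from the reproducer.

```python
import struct

READ_10 = 0x28  # SCSI READ(10) operation code


def build_read10(lba, num_blocks):
    """Return the 10-byte READ(10) CDB for num_blocks starting at lba."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("READ(10) LBA must fit in 32 bits")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("READ(10) transfer length must fit in 16 bits")
    # Layout: opcode, flags, LBA (big-endian 32-bit), group number,
    # transfer length in blocks (big-endian 16-bit), control.
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, num_blocks, 0)


BLOCK_SIZE = 512  # assumption: 512-byte logical blocks
cdb = build_read10(lba=0, num_blocks=64 * 1024 // BLOCK_SIZE)
```

With a different logical block size the transfer length changes accordingly; the CDB layout itself stays the same.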
Perhaps this is a clue to what is going on: even with just one
drive, driver_duration shows that reads take far longer than normal
(greater than 10ms) when first starting to queue I/O, after which
performance is more like what one would expect
from a SATA disk drive. This slow start is exhibited on both ARM and
x86_64 architectures although with only one drive I've never seen an
Older x86_64 kernels are less verbose in the kernel log and report with
fixed sense instead of sense descriptors, but the same ATA event is
occurring. See attachment for typical sg_io_hdr and kernel logs.
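Because the sense data comes back in fixed format on older kernels and descriptor format on newer ones, comparing results across kernels means reading the sense key from a different byte in each case. Below is a minimal sketch of that, assuming the sense buffer is available as raw bytes; the function name sense_key is hypothetical, and the sample buffers are made up for illustration rather than taken from the attachment.

```python
ABORTED_COMMAND = 0x0B  # SCSI sense key: ABORTED COMMAND


def sense_key(sense):
    """Extract the sense key from fixed or descriptor format sense data."""
    if not sense:
        return None
    response_code = sense[0] & 0x7F
    if response_code in (0x70, 0x71):  # fixed format: key in byte 2
        return sense[2] & 0x0F if len(sense) > 2 else None
    if response_code in (0x72, 0x73):  # descriptor format: key in byte 1
        return sense[1] & 0x0F if len(sense) > 1 else None
    return None  # unknown or vendor-specific response code


# Illustrative buffers only (not from the attached logs).
fixed_example = bytes([0x70, 0x00, 0x0B, 0x00, 0x00, 0x00, 0x00, 0x0A])
descriptor_example = bytes([0x72, 0x0B, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])

same_event = sense_key(fixed_example) == sense_key(descriptor_example)
```

Both sample buffers decode to the same sense key, which is the sense in which the two report formats can describe the same underlying event.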