Bug 15562 - SCSI Generic block io queueing can lock up
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: All Linux
Importance: P1 normal
Assigned To: linux-scsi@vger.kernel.org
Reported: 2010-03-17 22:14 UTC by Mike Hayward
Modified: 2015-02-19 15:41 UTC

Kernel Version: 2.6.22-2.6.32
Tree: Mainline
Regression: No


Description Mike Hayward 2010-03-17 22:14:03 UTC
When queueing, write() can occasionally return ENOMEM or EBUSY.  The
SCSI GENERIC HOWTO indicates that ENOMEM can be returned for indirect io
and that it is extremely rare.  However, I can typically cause it within
an hour even for direct io, which shouldn't need to copy memory.  The
EBUSY return is not even a documented error in these circumstances.

Regardless of which error is received, retrying never succeeds and
the fd is wedged at that point.  With EBUSY I've seen several
concurrent processes, each running against a different sg block device,
fail simultaneously, after which write() never queues another command.

This happens when there is plenty of swap and only 20% of ram is "used",
with the rest occupied by buffer cache.

There are no errors logged by the driver.  Here is an example of the
offending sg_io_hdr (all values are in hex):

interface_id    S
dxfer_direction fffffffd  (SG_DXFER_FROM_DEV)
cmd_len         a         (it's a READ 10)
mx_sb_len       fc
iovec_count     0
dxfer_len       200000
dxferp          1c1f400
cmdp            896518
sbp             896528
timeout         20000
flags           1         (SG_FLAG_DIRECT_IO)
pack_id         0
usr_ptr         8964e0
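
For readers who want to reproduce the request, the following is a minimal sketch of how a header like the one above might be built and queued with write() on an sg fd. The helper names (build_read10, fill_read_hdr, queue_cmd) are hypothetical, and the LBA and block count are placeholders because the report does not give the CDB contents. The field values come from the dump: 0xfffffffd is SG_DXFER_FROM_DEV (-3), 0xfc is 252 sense bytes, 0x200000 is a 2 MiB transfer, and 0x20000 is the timeout, which the sg v3 interface takes in milliseconds.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <scsi/sg.h>

/* Fill a READ(10) CDB: opcode 0x28, big-endian 32-bit LBA in bytes 2-5,
 * big-endian 16-bit block count in bytes 7-8. */
void build_read10(unsigned char cdb[10], uint32_t lba, uint16_t nblocks)
{
    memset(cdb, 0, 10);
    cdb[0] = 0x28;
    cdb[2] = (unsigned char)(lba >> 24);
    cdb[3] = (unsigned char)(lba >> 16);
    cdb[4] = (unsigned char)(lba >> 8);
    cdb[5] = (unsigned char)lba;
    cdb[7] = (unsigned char)(nblocks >> 8);
    cdb[8] = (unsigned char)nblocks;
}

/* Populate an sg_io_hdr with the same settings as the dump in the report.
 * Only pointers and lengths are stored here; nothing is transferred yet. */
void fill_read_hdr(struct sg_io_hdr *h, unsigned char *cdb,
                   void *buf, unsigned int len,
                   unsigned char *sense, unsigned char sense_len)
{
    memset(h, 0, sizeof(*h));
    h->interface_id = 'S';
    h->dxfer_direction = SG_DXFER_FROM_DEV;  /* -3, shown as fffffffd */
    h->cmd_len = 10;                         /* READ(10) */
    h->mx_sb_len = sense_len;
    h->iovec_count = 0;
    h->dxfer_len = len;
    h->dxferp = buf;
    h->cmdp = cdb;
    h->sbp = sense;
    h->timeout = 0x20000;                    /* milliseconds */
    h->flags = SG_FLAG_DIRECT_IO;
    h->pack_id = 0;
}

/* Queue the command by writing the whole header to the sg fd.
 * Returns 0 on success or a negative errno value such as -ENOMEM or
 * -EBUSY. Treating a short write as -EIO is this sketch's assumption. */
int queue_cmd(int fd, const struct sg_io_hdr *h)
{
    ssize_t n = write(fd, h, sizeof(*h));
    if (n == (ssize_t)sizeof(*h))
        return 0;
    return n < 0 ? -errno : -EIO;
}
```

A caller would open the sg device, call build_read10() and fill_read_hdr() with a suitably aligned 2 MiB buffer, then call queue_cmd() and later read() the completed header back from the same fd. A negative return from queue_cmd() corresponds to the failing write() described above.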

Comment 1 xerofoify 2014-06-24 17:07:47 UTC
This bug is against an obsolete kernel. Please test a newer
kernel to see if it is fixed.
Cheers, Nick

Comment 2 Alan 2015-02-19 15:41:36 UTC
This bug relates to a very old kernel. Closing as obsolete.
