Bug 15562

Summary: SCSI Generic block io queueing can lock up
Product: IO/Storage
Reporter: Mike Hayward (mh-linux-kernel)
Component: SCSI
Assignee: linux-scsi (linux-scsi)
Status: RESOLVED OBSOLETE
Severity: normal
CC: alan, xerofoify
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.22-2.6.32
Subsystem:
Regression: No
Bisected commit-id:

Description Mike Hayward 2010-03-17 22:14:03 UTC
When queueing commands, write() can occasionally return ENOMEM or
EBUSY.  The SCSI GENERIC HOWTO indicates that ENOMEM can be returned
for indirect I/O and that it is extremely rare; however, I can
typically trigger it within an hour even for direct I/O, which
shouldn't need a memory copy.  EBUSY is not even a documented error
in these circumstances.

Regardless of which error is received, retrying never succeeds and
the fd is wedged from that point on.  With EBUSY I've seen several
concurrent processes running against different sg block devices fail
simultaneously, after which write() never queues a command again.

This happens when there is plenty of swap and only 20% of RAM is
"used", with the rest occupied by buffer cache.

There are no errors logged by the driver.  Here is an example of the
offending sg_io_hdr (all values are in hex):

interface_id    S
dxfer_direction fffffffd  (SG_DXFER_FROM_DEV)
cmd_len         a         (it's a READ 10)
mx_sb_len       fc
iovec_count     0
dxfer_len       200000
dxferp          1c1f400
cmdp            896518
sbp             896528
timeout         20000
flags           1         (SG_FLAG_DIRECT_IO)
pack_id         0
usr_ptr         8964e0
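
For anyone trying to reproduce this, the header above can be rebuilt
roughly as follows. This is only a sketch, not the reporter's actual
code: fill_read10() and queue_cmd() are hypothetical helper names, and
the LBA and block count are placeholders because the report does not
give the CDB contents. The field values (direction, cmd_len, timeout,
flags) are taken from the dump.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <scsi/sg.h>

/* Hypothetical helper: build a READ(10) sg_io_hdr like the one in the
 * dump.  lba and blocks are placeholders; the report does not give them. */
static void fill_read10(struct sg_io_hdr *hdr, unsigned char cdb[10],
                        unsigned char *sense, unsigned char sense_len,
                        void *buf, unsigned int buf_len,
                        unsigned int lba, unsigned short blocks)
{
    memset(cdb, 0, 10);
    cdb[0] = 0x28;                          /* READ(10) opcode */
    cdb[2] = (unsigned char)(lba >> 24);    /* LBA, big-endian */
    cdb[3] = (unsigned char)(lba >> 16);
    cdb[4] = (unsigned char)(lba >> 8);
    cdb[5] = (unsigned char)lba;
    cdb[7] = (unsigned char)(blocks >> 8);  /* transfer length, big-endian */
    cdb[8] = (unsigned char)blocks;

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id    = 'S';
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;  /* -3, i.e. fffffffd */
    hdr->cmd_len         = 10;
    hdr->mx_sb_len       = sense_len;          /* fc in the dump */
    hdr->iovec_count     = 0;
    hdr->dxfer_len       = buf_len;            /* 200000 in the dump */
    hdr->dxferp          = buf;
    hdr->cmdp            = cdb;
    hdr->sbp             = sense;
    hdr->timeout         = 0x20000;            /* milliseconds, as in the dump */
    hdr->flags           = SG_FLAG_DIRECT_IO;
    hdr->pack_id         = 0;
}

/* Hypothetical helper: queue the command with write() on an open sg fd.
 * Returns 0 on success or -errno, so the caller can tell ENOMEM and
 * EBUSY apart from other failures. */
static int queue_cmd(int sg_fd, const struct sg_io_hdr *hdr)
{
    ssize_t n;

    do {
        n = write(sg_fd, hdr, sizeof(*hdr));
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -errno;
    return n == (ssize_t)sizeof(*hdr) ? 0 : -EIO;
}
```

A caller would open the /dev/sg node read-write, call queue_cmd() in a
loop, and later read() the completed header back. On an affected
kernel, the symptom would be queue_cmd() returning -ENOMEM or -EBUSY
on every retry.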

Comment 1 xerofoify 2014-06-24 17:07:47 UTC
This bug is against an obsolete kernel. Please test a newer
kernel to see if it is fixed.
Cheers, Nick

Comment 2 Alan 2015-02-19 15:41:36 UTC
This bug relates to a very old kernel. Closing as obsolete.