Bug 14325 - sbp2_scsi_abort causes stall in disk I/O
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: IEEE1394
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_ieee1394
URL: https://bugs.launchpad.net/ubuntu/+so...
Keywords:
Depends on:
Blocks:
 
Reported: 2009-10-05 03:56 UTC by Ethan
Modified: 2013-12-10 16:50 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.31 Ubuntu
Subsystem:
Regression: No
Bisected commit-id:



Description Ethan 2009-10-05 03:56:49 UTC
Recently I had trouble with seemingly random stalls in video playback.  I tracked this back to an I/O stall on the external FireWire drive, triggered when hdparm queries the drive (hdparm -i /dev/sdb) so that an ACPI script (/usr/lib/pm-utils/power.d/95hdparm-apm) can check whether it can reset the drive's APM policy.  (Why this has to happen periodically instead of being event-driven I don't know; I have already reported that to the script writers in the pm-utils package: https://bugs.launchpad.net/ubuntu/+source/pm-utils/+bug/440338 )

But it occurs to me that the hdparm call probably shouldn't stall access to the drive at all.  Perhaps there is something you can do about this, or maybe not.

When using the old sbp2 kernel module for access, I would get messages like this in /var/log/messages:
Oct 1 23:13:42 ginger kernel: [260307.000180] ieee1394: sbp2: aborting sbp2 command
Oct 1 23:13:42 ginger kernel: [260307.000213] sd 2:0:0:0: [sdb] CDB: ATA command pass through(16): 85 08 2e 00 00 00 00 00 00 00 00 00 00 40 a1 00
Oct 1 23:13:42 ginger kernel: [260362.000084] ieee1394: sbp2: aborting sbp2 command
Oct 1 23:13:42 ginger kernel: [260362.000103] sd 2:0:0:0: [sdb] CDB: ATA command pass through(16): 85 08 2e 00 00 00 00 00 00 00 00 00 00 40 ec 00
Oct 1 23:13:42 ginger kernel: [260370.000194] ieee1394: sbp2: aborting sbp2 command
Oct 1 23:13:42 ginger kernel: [260370.000221] sd 2:0:0:0: [sdb] CDB: ATA command pass through(16): 85 08 2e 00 00 00 00 00 00 00 00 00 00 40 a1 00
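
For reference, the CDBs in these messages can be picked apart by hand.  The following is only a rough sketch, assuming the ATA PASS-THROUGH (16) layout from the SCSI/ATA Translation spec (opcode 0x85, protocol in byte 1, transfer flags in byte 2, ATA command in byte 14).  The function and table names are made up for illustration, and the table only covers the two opcodes seen above.

```python
# Hypothetical helper: decode the ATA PASS-THROUGH (16) CDBs from the log.
# The byte layout is assumed from the SAT specification, not from this report.

ATA_COMMANDS = {
    0xA1: "IDENTIFY PACKET DEVICE",
    0xEC: "IDENTIFY DEVICE",
}

def decode_ata_pass_through_16(cdb_hex):
    """Return a few interesting fields of a 16-byte ATA pass-through CDB."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not an ATA PASS-THROUGH (16) CDB")
    command = cdb[14]
    return {
        "protocol": (cdb[1] >> 1) & 0x0F,          # 4 means PIO Data-In
        "t_dir_from_device": bool(cdb[2] & 0x08),  # data flows device -> host
        "ata_command": command,
        "ata_command_name": ATA_COMMANDS.get(command, "unknown"),
    }

for line in ("85 08 2e 00 00 00 00 00 00 00 00 00 00 40 a1 00",
             "85 08 2e 00 00 00 00 00 00 00 00 00 00 40 ec 00"):
    print(decode_ata_pass_through_16(line))
```

If that layout assumption holds, the aborted commands are the two ATA identify requests, both set up as PIO reads from the device.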

When using firewire-sbp2, the stall is significantly shorter, but still an issue during video playback.  Messages like the following are given:
Oct  4 21:30:58 ginger kernel: [ 6129.000158] firewire_sbp2: fw1.0: sbp2_scsi_abort
Oct  4 21:30:58 ginger kernel: [ 6183.000119] firewire_sbp2: fw1.0: sbp2_scsi_abort
Oct  4 21:30:58 ginger kernel: [ 6191.000126] firewire_sbp2: fw1.0: sbp2_scsi_abort
Oct  4 21:30:58 ginger kernel: [ 6246.000133] firewire_sbp2: fw1.0: sbp2_scsi_abort
Oct  4 21:30:58 ginger kernel: [ 6254.000131] firewire_sbp2: fw1.0: sbp2_scsi_abort
Oct  4 21:30:58 ginger kernel: [ 6296.552545] eth0: increased tx threshold, txcfg 0xd0f0100e.
Oct  4 21:30:58 ginger kernel: [ 6309.000150] firewire_sbp2: fw1.0: sbp2_scsi_abort
Oct  4 21:30:58 ginger kernel: [ 6317.000105] firewire_sbp2: fw1.0: sbp2_scsi_abort
Oct  4 21:30:58 ginger kernel: [ 6624.000128] firewire_sbp2: fw1.0: sbp2_scsi_abort
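
To see how regularly the aborts come in, the gaps between them can be computed from the kernel's bracketed uptime stamps (the syslog wall-clock prefix is identical on every line, so it is no help here).  This is just a quick sketch; the function name and regular expression are my own and match only the two message formats quoted in this report.

```python
import re

# Matches "[ 6129.000158] ... sbp2_scsi_abort" as well as the old driver's
# "[260307.000180] ... aborting sbp2 command".
ABORT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+.*(?:sbp2_scsi_abort|aborting sbp2 command)"
)

def abort_intervals(log_lines):
    """Return the seconds between consecutive abort messages."""
    stamps = []
    for line in log_lines:
        match = ABORT_RE.search(line)
        if match:
            stamps.append(float(match.group(1)))
    return [round(later - earlier, 3)
            for earlier, later in zip(stamps, stamps[1:])]

sample = [
    "Oct  4 21:30:58 ginger kernel: [ 6129.000158] firewire_sbp2: fw1.0: sbp2_scsi_abort",
    "Oct  4 21:30:58 ginger kernel: [ 6183.000119] firewire_sbp2: fw1.0: sbp2_scsi_abort",
    "Oct  4 21:30:58 ginger kernel: [ 6191.000126] firewire_sbp2: fw1.0: sbp2_scsi_abort",
    "Oct  4 21:30:58 ginger kernel: [ 6296.552545] eth0: increased tx threshold, txcfg 0xd0f0100e.",
]
print(abort_intervals(sample))
```

Unrelated lines such as the eth0 message are skipped, so the whole of /var/log/messages can be fed in as-is.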

Off topic: Ubuntu 9.10 still defaults to the older drivers, by the way; it took some digging to realize that new drivers are coming out and to give them a shot.  (Ubuntu does install both binaries.)  Might you suggest that they switch the default over?  Otherwise, let me know if there are any data integrity issues if I stick with the new driver for my drive!  Your site makes it sound like the new drivers are mature enough for use, or at least no buggier than the old drivers.  Thanks!
Comment 1 Stefan Richter 2009-10-06 11:26:25 UTC
changing kernel version datum from
"2.6.31-11-generic #38-Ubuntu SMP Fri Oct 2 11:55:55 UTC 2009 i686 GNU/Linux"
to something shorter for narrower table columns in bug lists
Comment 2 Anonymous Emailer 2009-10-08 17:22:55 UTC
Reply-To: stefanr@s5r6.in-berlin.de

> Recently I had trouble with seemingly random stalls in video playback.
> Tracked this back to I/O stall on the external firewire drive due to
> hdparm being queried (hdparm -i /dev/sdb)
[...]
> Oct 1 23:13:42 ginger kernel: [260307.000180] ieee1394: sbp2: aborting sbp2 command
> Oct 1 23:13:42 ginger kernel: [260307.000213] sd 2:0:0:0: [sdb] CDB: ATA command pass through(16): 85 08 2e 00 00 00 00 00 00 00 00 00 00 40 a1 00

The bug title mistakes cause and effect.  The PC sends a command to the
device, the device doesn't finish it before a time-out decided by the
higher SCSI levels, and the SCSI core instructs sbp2 to abort the command.

The causes are:

(1.) hdparm generates a command which the device doesn't support.  I
don't know whether hdparm checks if a command is supported before
sending it and, if it does, whether the device mistakenly indicates
support for it.  What is logged in dmesg when the device is plugged in
and discovered by sbp2 and the SCSI stack (i.e. the inquiry response)?

(2.) When a device receives a command which it doesn't implement, it
should immediately return an appropriate error status.  Your device,
however, simply does nothing, so the client has to wait idly until the
timeout expires.

The latter bug of the device might perhaps be mitigated if the sbp2 or
firewire-sbp2 driver allowed several commands to be issued in parallel.
The sbp2 driver actually implements this, but in a buggy way.  You can
try it by means of the module parameter serialize_io=0.  This may do
more harm than good, though, because sbp2 is not stable in this mode
(bug 1872) and because the device could be confused even more if the
unsupported command overlaps transactions of supported commands.
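
Before and after experimenting with serialize_io, it may help to check which value the loaded module is actually using.  A small sketch, assuming the parameter is exported read-only under /sys/module/sbp2/parameters/ (the usual sysfs location for module parameters; whether sbp2 exposes it there is an assumption, and the helper name is made up):

```python
from pathlib import Path

def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return a module parameter's current value as a string, or None if the
    module is not loaded or the parameter is not exported via sysfs."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None

# Prints None when sbp2 is not loaded or the file does not exist.
print(read_module_param("sbp2", "serialize_io"))
```

A result of None here only means the value could not be read; it says nothing about which mode the driver is running in.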

(The RFE to add an equivalent of serialize_io=0 of sbp2 to
firewire-sbp2, but without the bugs of sbp2, is logged at
http://ieee1394.wiki.kernel.org/index.php/To_Do#firewire-sbp2.)

> Ubuntu 9.10 is still set to default to the older drivers btw, took
> some digging to realize you have new drivers coming out and to give them a
> shot.  (Ubuntu does install both binaries)

Isn't 9.10 the first Ubuntu release which provides the new drivers?

> Might suggest that they switch the default over?

I recommend doing so (after perusal of
http://ieee1394.wiki.kernel.org/index.php/Juju_Migration).  As far as I
can tell, the new drivers are ready for the mainstream, and they should
be especially attractive to Ubuntu because they eliminate the user
permissions problem of /dev/raw1394.

> Otherwise, let me know if there's any data integrity issues if I
> stick with the new driver for my drive!

There are none; SBP-2 support is among the most mature features of the
new stack.  To my knowledge, it is actually more robust than SBP-2
support in the old stack.
Comment 3 Stefan Richter 2009-12-30 14:16:18 UTC
Note to self:  This report is about a Macally G-S350SUA which, according to some web shops, is based on an Initio INIC1530 (IEEE 1394a/USB 2.0 to IDE bridge) combined with a SunplusIT SPIF223 (IDE to SATA bridge).
