Most recent kernel where this bug did not occur:
Distribution: Fedora Core 5 (with fedora and vanilla kernels)
Hardware Environment: P3-S 1400MHz, 512MB, intel chipset

02:09.0 FireWire (IEEE 1394): Texas Instruments TSB12LV23 IEEE-1394 Controller (prog-if 10 [OHCI])
        Subsystem: Texas Instruments Unknown device 8010
        Flags: bus master, medium devsel, latency 32, IRQ 21
        Memory at ec800000 (32-bit, non-prefetchable) [size=2K]
        Memory at ec000000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [44] Power Management version 1

Problem Description:

The Initio 2430 is a FW800 sbp2/3 controller, of course backward compatible to FW100/200/400. It is currently connected in FW400 mode, with a FW400<->FW800 cable (unit date 2005).

When the device is plugged in, the following output is produced:

scsi3 : SBP-2 IEEE-1394
ieee1394: sbp2: Workarounds for node 0-00:1023: 0x2 (firmware_revision 0x000242, vendor_id 0x001010, model_id 0x000000)
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: Node 0-00:1023: Max speed [S400] - Max payload [2048]
  Vendor: Initio    Model: SP2014N    Rev: 2.42
  Type:   Direct-Access               ANSI SCSI revision: 00
SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
sda: Write Protect is off
sda: Mode Sense: 86 0b 00 02
sda: missing header in MODE_SENSE response
SCSI device sda: drive cache: write back
SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
sda: Write Protect is off
sda: Mode Sense: 86 0b 00 02
sda: missing header in MODE_SENSE response
SCSI device sda: drive cache: write back
 sda: unknown partition table
sd 3:0:0:0: Attached scsi disk sda

Note the "missing header in MODE_SENSE response", which does not sound too good.

Anyway, the problem occurs when transferring data to the device. Sometimes the copy locks up with:

ieee1394: sbp2: sbp2util_node_write_no_wait failed.
ieee1394: sbp2: aborting sbp2 command
sd 3:0:0:0: command: cdb[0]=0x2a: 2a 00 11 35 18 d8 00 00 f8 00
ieee1394: sbp2: sbp2util_node_write_no_wait failed.

The copy then restarts and, it seems, there is no data corruption.

From what I have read, this seems to be "common" and is sometimes blamed on the firmware of the unit. Serializing or not makes little difference; it only changes whether the errors appear in sequence or overlapped. Disabling workarounds does not help. Initio does not seem to have firmware updates for this unit.

Together with the other, I'm planning to test this thing under Windows and see what it says...

Steps to reproduce:
cp some_files /mnt/initio2430, dmesg...
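
For readers unfamiliar with the "command: cdb[0]=0x2a" line in the log above, the ten bytes can be decoded by hand. The following is a minimal sketch, assuming the standard SCSI 10-byte READ/WRITE command layout and the 512-byte sector size reported by the kernel; the function name decode_rw10 is made up for illustration and is not part of any kernel or library API.

```python
import struct

# Opcode names for the two 10-byte block commands handled by this sketch.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_rw10(cdb_hex, block_size=512):
    """Decode a 10-byte SCSI READ(10)/WRITE(10) CDB given as a hex string.

    block_size is an assumption taken from the "512-byte hdwr sectors"
    line in the report; it is not encoded in the CDB itself.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    # Big-endian layout: opcode, flags, 32-bit LBA, group number,
    # 16-bit transfer length in blocks, control byte.
    opcode, flags, lba, group, blocks, control = struct.unpack(">BBIBHB", cdb)
    return {
        "opcode": OPCODES.get(opcode, hex(opcode)),
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * block_size,
    }


if __name__ == "__main__":
    # The command that was aborted in the report.
    print(decode_rw10("2a 00 11 35 18 d8 00 00 f8 00"))
```

Under these assumptions the aborted command is an ordinary WRITE(10) of 248 blocks, which fits the observation that the failure shows up while copying data to the disk.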
This is a known problem in sbp2. Whenever "sbp2util_node_write_no_wait failed" appears in the log, it almost certainly means that sbp2 was unable to acquire a free transaction label. I plan to rework the sbp2 routine which emits that message so that it can sleep until a transaction label becomes available. This will take some time but I'm on it. Until then you could check whether a different filesystem makes this condition less likely.
http://sourceforge.net/mailarchive/forum.php?thread_id=25299507&forum_id=5389

The "missing header in MODE_SENSE response" business is just a sign of firmware flaws of the Initio bridge. I have an INIC-2430 based disk too which has the same flaw. Linux SCSI core was hardened to work around that, so it's no problem anymore. And it's unrelated to the "sbp2util_node_write_no_wait failed" bug.
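
To illustrate the difference between failing immediately and sleeping until a transaction label is free, here is a small user-space model. It is only a sketch and not the sbp2 code or the eventual patch: the class TlabelPool and its methods are hypothetical names, and the pool size of 64 is an assumption based on IEEE 1394 transaction labels being 6-bit values.

```python
import threading

NUM_TLABELS = 64  # assumption: 6-bit IEEE 1394 transaction label space


class TlabelPool:
    """Hypothetical model of a fixed pool of transaction labels."""

    def __init__(self, size=NUM_TLABELS):
        self._free = list(range(size))
        self._cond = threading.Condition()

    def try_get(self):
        """Non-blocking allocation: returns None if no label is free.

        This models the "no wait" path that gives up when the pool
        is exhausted.
        """
        with self._cond:
            return self._free.pop() if self._free else None

    def get(self, timeout=None):
        """Blocking allocation: sleep until a label is released.

        Returns None only if the optional timeout expires first.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: bool(self._free), timeout=timeout):
                return None
            return self._free.pop()

    def put(self, label):
        """Release a label and wake one waiter, if any."""
        with self._cond:
            self._free.append(label)
            self._cond.notify()
```

In this model, a burst of writes that exhausts the pool makes try_get() return None right away, whereas get() simply waits for an outstanding transaction to complete and hand its label back. A sleeping variant like this can only be called from a context that is allowed to block, which may be one reason the rework is not trivial.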
Side note. As reported in http://bugzilla.kernel.org/show_bug.cgi?id=6947#c13 an INIC-1530 is affected too, and comment #1 refers to an OXFW922 bridge. My plan as stated in the mailarchive may take more time, therefore I will try to come up with a simpler temporary solution that is quicker to implement and can be merged sooner.
Created attachment 8766
[RFT PATCH 2.6.17.x] ieee1394: sbp2: handle "sbp2util_node_write_no_wait failed"
Created attachment 8787
[PATCH 2.6.18-rc4-mm1 2/8] ieee1394: sbp2: handle "sbp2util_node_write_no_wait failed"

This improved update has been posted to lkml. I will try to get it, together with the patches it directly depends on, into Linux 2.6.18(-rcX).
The fix went into Linux 2.6.18-git16.