Distribution: gentoo
Hardware Environment: athlon-tbird 1.3ghz
Software Environment: kernel 2.6.1
Problem Description:

dmesg output:
sbp2: $Rev$ Ben Collins <bcollins@debian.org>
scsi1 : SCSI emulation for IEEE-1394 SBP-2 Devices
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: sbp2: Node 0-00:1023: Max speed [S400] - Max payload [2048]
Vendor: Apple Model: iPod Rev: 1.21
Type: Direct-Access ANSI SCSI revision: 02
SCSI device sda: 19531260 512-byte hdwr sectors (10000 MB)
sda: test WP failed, assume Write Enabled
sda: asking for cache data failed
sda: assuming drive cache: write through
/dev/scsi/host1/bus0/target0/lun0: p1 p2
Attached scsi removable disk sda at scsi1, channel 0, id 0, lun 0
Attached scsi generic sg1 at scsi1, channel 0, id 0, lun 0, type 0
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0b f2 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0c 72 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0c f2 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0d 72 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0d f2 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0e 72 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0e f2 00 00 80 00

Steps to reproduce:
modprobe sbp2
mount /dev/sda2 /mnt/ipod
copy files to /mnt/ipod

The above worked flawlessly on a 2.4.18 kernel.
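For readers of the log: opcode 0x2a is SCSI WRITE(10), so each aborted command in the dmesg output above is a write. The following is a minimal sketch for decoding those lines, assuming the standard WRITE(10) layout (bytes 2-5 are the big-endian LBA, bytes 7-8 the transfer length in blocks) and the 512-byte sector size reported for sda. The helper name decode_write10 is made up for illustration and is not part of any tool.

```python
# Decode the 10-byte CDBs printed in the "aborting sbp2 command" lines.
# Assumes the standard SCSI WRITE(10) command descriptor block layout.

def decode_write10(cdb_text, block_size=512):
    """Return (lba, blocks, nbytes) for a WRITE(10) CDB given as hex text."""
    cdb = bytes(int(tok, 16) for tok in cdb_text.split())
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # transfer length in blocks
    return lba, blocks, blocks * block_size


# CDBs copied from the dmesg output in the report.
ABORTED = [
    "0x2a 00 00 02 0b f2 00 00 80 00",
    "0x2a 00 00 02 0c 72 00 00 80 00",
    "0x2a 00 00 02 0c f2 00 00 80 00",
    "0x2a 00 00 02 0d 72 00 00 80 00",
    "0x2a 00 00 02 0d f2 00 00 80 00",
    "0x2a 00 00 02 0e 72 00 00 80 00",
    "0x2a 00 00 02 0e f2 00 00 80 00",
]

for text in ABORTED:
    lba, blocks, nbytes = decode_write10(text)
    print(f"WRITE(10) lba={lba} blocks={blocks} bytes={nbytes}")
```

Decoded this way, the aborted commands are seven back-to-back 128-block (64 KiB) writes to consecutive LBAs, which matches the "copy files" step of the reproduction.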
Created attachment 2305 [details] dmesg output log I had this problem with 2.6.1 too. I can connect to the iPod with 2.6.3 now, but the connection breaks after a short time and the iPod is reconnected as sdc while it was sdb before, and so on. This leaves my gtkpod app waiting forever, since I can't manage to umount the lost sdb2 connection and remount sdc2 on /mnt/ipod. I have compiled ieee1394, ohci1394, sbp2 and sd_mod directly into the kernel. If you need the kernel config or something else, just ask. By the way, all this was working in 2.4.20+, but getting the device connected was far more work, so thanks for that part of your work!
Created attachment 4968 [details] From syslog: Linux 2.6.11.6, Audigy, Oxford 911 ATA adapter, nastiness Messages from the current session of 2.6.11.6, playing with Firewire drive and Audigy.
Comment on attachment 4968 [details] From syslog: Linux 2.6.11.6, Audigy, Oxford 911 ATA adapter, nastiness I'm finding very similar problems with my Alpha, Audigy, Oxford 911 ATA adapter, and Linux 2.6.11.6, except that I get a slightly different set of errors. In this case, too, Linux 2.4 had no discernible problem. Also, the same drive has no problem with my brother's Athlon, a more normal Firewire chip, and Linux 2.6.10.
Thomas and "R", please update to Linux 2.6.12 and/or try the suggestions at http://www.linux1394.org/faq.php#sbp2abort The problem reported by Elias (bug 2278) is different because its cause is a fast succession of bus resets.
FYI, sbp2 was switched to a safer default mode in Linux 2.6.14-rc3. But the underlying cause of the problem class "works in 2.4 but not 2.6" has not been identified yet AFAIK.
*** Bug 6093 has been marked as a duplicate of this bug. ***
Some relevant patches have recently been submitted to Andrew Morton's -mm patchset. I have also made them available as patches for released kernels at http://me.in-berlin.de/~s5r6/linux1394/updates/. Some of these fixes may be released with Linux 2.6.18. Code inspection has shown that more changes are needed. Also, one of my SBP-2 devices (a TI StorageLynx based HDD) still shows command abortions if sbp2 is loaded with serialize_io=0. (serialize_io=1 has been the default since Linux 2.6.14.) All other HDDs and CD/DVD-RWs I have access to work OK. I hope to get the missing changes implemented over the next few weeks.
PS: The patches I referred to reside in "v146_experimental" or later versions at http://me.in-berlin.de/~s5r6/linux1394/updates/.
I believe the following needs to be done to resolve this (assuming that the devices and their firmware are bug-free):
- If the device has the "ordered" flag set in its Logical_Unit_Number ROM entry, completion of a task means completion of all previous tasks.
- If src==1 in a status block, the DMA mapping of the related ORB must not be unmapped and reused until status for a subsequent ORB is received.
- I suspect all of the protocol handling should be moved out of atomic context into a kernel thread (using the kthread API or workqueue API), so that all uses of sbp2util_node_write_no_wait() can be replaced by true transactions. This would be a continuation of the solution to bug 6948.
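To make the first two points above concrete, here is a toy model of the bookkeeping they imply. It is only a sketch under stated assumptions and is not sbp2 code: the class OrbTracker, its methods, and the integer sequence numbers are all invented for illustration, and "released" merely stands in for "DMA mapping may be unmapped and reused".

```python
# Toy model (not driver code) of the two ORB completion rules listed above.
# All names here are hypothetical; ORBs are identified by submission order.

class OrbTracker:
    def __init__(self, ordered):
        self.ordered = ordered   # "ordered" flag of the logical unit
        self.next_seq = 0
        self.in_flight = []      # submitted ORBs without a completion yet
        self.deferred = []       # completed with src == 1, mapping still held
        self.completed = []      # ORBs whose task is considered finished
        self.released = []       # ORBs whose DMA mapping may be reused

    def submit(self):
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight.append(seq)
        return seq

    def on_status(self, seq, src):
        idx = self.in_flight.index(seq)  # raises ValueError if unknown
        # Rule 1: on an ordered device, status for one task implies that
        # all previously submitted tasks are complete as well.
        done = self.in_flight[: idx + 1] if self.ordered else [seq]
        self.in_flight = [s for s in self.in_flight if s not in done]

        # Rule 2, second half: status for a subsequent ORB has arrived,
        # so earlier ORBs that were held back may now be released.
        still_held = []
        for held in self.deferred:
            if seq > held:
                self.released.append(held)
            else:
                still_held.append(held)
        self.deferred = still_held

        for s in done:
            self.completed.append(s)
            # Rule 2, first half: keep the mapping of an ORB whose own
            # status block carried src == 1.
            if s == seq and src == 1:
                self.deferred.append(s)
            else:
                self.released.append(s)


tracker = OrbTracker(ordered=True)
a, b, c = tracker.submit(), tracker.submit(), tracker.submit()
tracker.on_status(c, src=1)   # completes a, b and c; c stays mapped
d = tracker.submit()
tracker.on_status(d, src=0)   # status for a later ORB releases c
print(tracker.completed, tracker.released)
```

The third point (moving protocol handling out of atomic context) is a structural change to the driver and is not modelled here.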
Setting "Severity=low" to reflect the diminishing practical impact of this bug since Linux 2.6.14.
Thomas and others, is the problem still present with the current kernel (hopefully containing the patches that Stefan mentioned in #7-8)? Thanks.
Natalie, thanks for the heads up. I don't have an iPod, but there have been no bug reports of this kind for a year or so, because sbp2 was changed to safer defaults in Linux 2.6.14. For a non-default mode of operation though (sbp2 module loaded with parameter serialize_io=0), the following remarks from comment #9 still need to be addressed:
- If the device has the "ordered" flag set in its Logical_Unit_Number ROM entry, completion of a task means completion of all previous tasks.
- If src==1 in a status block, the DMA mapping of the related ORB must not be unmapped and reused until status for a subsequent ORB is received.
See also the "TODO" comment at the top of drivers/ieee1394/sbp2.c: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/ieee1394/sbp2.c;h=a81ba8fca0db168314a2ce21aee0cecb0fd4f567#l38
PS: The patches mentioned in earlier comments were merged into mainline Linux soon after those comments.
The problem does not occur if sbp2 is used with default parameters. Also, the alternative firewire-sbp2 driver does not feature this bug. There are currently no resources to fix sbp2 when used with the non-default parameter serialize_io=0.
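For anyone who wants to confirm which mode a running system uses, a small check along these lines may help. It is a sketch that assumes sbp2 exports serialize_io read-only under /sys/module/sbp2/parameters/, which depends on the kernel version and on how the parameter is declared; if the file is absent, the function simply reports that the value is unknown. The function name is invented for illustration.

```python
# Report whether sbp2 runs in its serialized (default, safe) mode.
# Assumption: the parameter is visible in sysfs at the path below.
from pathlib import Path

SYSFS_PARAM = Path("/sys/module/sbp2/parameters/serialize_io")


def serialize_io_enabled(param_path=SYSFS_PARAM):
    """Return True or False for the parameter, or None if it is unreadable."""
    try:
        raw = Path(param_path).read_text().strip()
    except OSError:
        return None  # module not loaded, or parameter not exported
    # Integer parameters read back as 0/1, boolean ones as N/Y.
    return raw in ("1", "Y", "y")


if __name__ == "__main__":
    state = serialize_io_enabled()
    if state is None:
        print("serialize_io: unknown (sbp2 not loaded or parameter not exported)")
    elif state:
        print("serialize_io=1: default mode, not affected by this bug")
    else:
        print("serialize_io=0: non-default mode, command aborts are possible")
```

If the non-default mode turns out to be active, reloading the module without the serialize_io=0 option restores the default.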