|Summary:||sbp2 serialize_io=0 buggy (was: data corruption on ipod using sbp2 module)|
|Product:||Drivers||Reporter:||Thomas Margraf (ug57txm)|
|Component:||IEEE1394||Assignee:||Stefan Richter (stefanr)|
|Severity:||low||CC:||kenny, kevkim55, protasnb|
|Bug Depends on:|
dmesg output log
From syslog: Linux 184.108.40.206, Audigy, Oxford 911 ATA adapter, nastiness
Description Thomas Margraf 2004-01-15 05:27:10 UTC
Distribution: gentoo
Hardware Environment: athlon-tbird 1.3ghz
Software Environment: kernel 2.6.1

Problem Description:
dmesg output:
sbp2: $Rev$ Ben Collins <firstname.lastname@example.org>
scsi1 : SCSI emulation for IEEE-1394 SBP-2 Devices
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: sbp2: Node 0-00:1023: Max speed [S400] - Max payload
  Vendor: Apple Model: iPod Rev: 1.21
  Type: Direct-Access ANSI SCSI revision: 02
SCSI device sda: 19531260 512-byte hdwr sectors (10000 MB)
sda: test WP failed, assume Write Enabled
sda: asking for cache data failed
sda: assuming drive cache: write through
 /dev/scsi/host1/bus0/target0/lun0: p1 p2
Attached scsi removable disk sda at scsi1, channel 0, id 0, lun 0
Attached scsi generic sg1 at scsi1, channel 0, id 0, lun 0, type 0
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0b f2 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0c 72 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0c f2 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0d 72 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0d f2 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0e 72 00 00 80 00
ieee1394: sbp2: aborting sbp2 command 0x2a 00 00 02 0e f2 00 00 80 00

Steps to reproduce:
modprobe sbp2
mount /dev/sda2 /mnt/ipod
copy files to /mnt/ipod

The above worked flawlessly on a 2.4.18 kernel.
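The ten bytes printed after "aborting sbp2 command" in the dmesg output above are the SCSI command descriptor block (CDB) of the aborted request. As a rough aid for reading them, here is a minimal Python sketch that decodes a 10-byte READ(10)/WRITE(10) CDB. The function name decode_rw10 is made up for illustration, and the 512-byte sector size is taken from the "512-byte hdwr sectors" line of the log.

```python
import struct

# Opcodes of the two 10-byte block commands handled by this sketch.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_rw10(cdb_text, sector_size=512):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as hex tokens.

    decode_rw10 is a hypothetical helper, not part of any driver or tool.
    """
    # int(tok, 16) accepts both "2a" and "0x2a".
    cdb = bytes(int(tok, 16) for tok in cdb_text.split())
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    # Layout: opcode, flags, 32-bit LBA, group, 16-bit length, control,
    # all big-endian.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {
        "command": OPCODES.get(opcode, "opcode 0x%02x" % opcode),
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * sector_size,
    }


if __name__ == "__main__":
    for line in (
        "0x2a 00 00 02 0b f2 00 00 80 00",
        "0x2a 00 00 02 0c 72 00 00 80 00",
    ):
        print(decode_rw10(line))
```

Read this way, every aborted command in the log is a WRITE(10) of 0x80 = 128 sectors, and each starts 0x80 sectors after the previous one, so the aborts hit a run of back-to-back sequential writes.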
Comment 1 elias 2004-03-09 14:01:33 UTC
Created attachment 2305 [details] dmesg output log
I had this problem with 2.6.1 too. I can connect to the iPod with 2.6.3 now, but the connection breaks after a short time and the iPod is reconnected at sdc while it was on sdb before, and so on. This leaves my gtkpod app waiting forever, since I can't manage to umount the lost sdb2 connection in order to remount sdc2 at /mnt/ipod. I have compiled ieee1394, ohci1394, sbp2 and sd_mod directly into the kernel. If you need the kernel config or something else, just ask. By the way, all this was working in 2.4.20+, but getting the device connected was far more work, so thanks for that part of your work!
Comment 2 R 2005-04-21 11:31:54 UTC
Created attachment 4968 [details] From syslog: Linux 220.127.116.11, Audigy, Oxford 911 ATA adapter, nastiness Messages from the current session of 18.104.22.168, playing with Firewire drive and Audigy.
Comment 3 R 2005-04-21 11:41:13 UTC
Comment on attachment 4968 [details] From syslog: Linux 22.214.171.124, Audigy, Oxford 911 ATA adapter, nastiness
I'm finding very similar problems with my Alpha, Audigy, and Oxford 911 ATA adapter under Linux 126.96.36.199, except that I get a slightly different set of errors. In this case, too, Linux 2.4 had no discernible problem. Also, the same drive has no problem with my brother's Athlon, a more normal Firewire chip, and Linux 2.6.10.
Comment 4 Stefan Richter 2005-07-24 03:17:37 UTC
Thomas and "R", please update to Linux 2.6.12 and/or try the suggestions at http://www.linux1394.org/faq.php#sbp2abort The problem reported by Elias (bug 2278) is different because its cause is a fast succession of bus resets.
Comment 5 Stefan Richter 2005-10-01 01:55:35 UTC
FYI, sbp2 was switched to a safer default mode in Linux 2.6.14-rc3. But the underlying cause of the problem class "works in 2.4 but not 2.6" has not been identified yet AFAIK.
Comment 6 Stefan Richter 2006-04-23 10:46:23 UTC
*** Bug 6093 has been marked as a duplicate of this bug. ***
Comment 7 Stefan Richter 2006-08-16 00:48:04 UTC
Some relevant patches were recently submitted to Andrew Morton's -mm patchset. I have also made them available as patches for released kernels at http://me.in-berlin.de/~s5r6/linux1394/updates/. Some of these fixes may be released with Linux 2.6.18. Code inspection has shown that more changes are needed. Also, one of my SBP-2 devices (a TI StorageLynx based HDD) still shows command aborts if sbp2 is loaded with serialize_io=0. (serialize_io=1 has been the default since Linux 2.6.14.) All other HDDs and CD/DVD-RWs I have access to work OK. I hope to implement the missing changes within the next few weeks.
Comment 8 Stefan Richter 2006-08-16 00:54:06 UTC
PS: The patches I referred to reside in "v146_experimental" or later versions at http://me.in-berlin.de/~s5r6/linux1394/updates/.
Comment 9 Stefan Richter 2006-11-01 14:41:27 UTC
I believe the following needs to be done to resolve this (assuming that the devices and their firmware are bug-free):
- If the device has the "ordered" flag set in its Logical_Unit_Number ROM entry, completion of a task means completion of all previous tasks.
- If src==1 in a status block, the related ORB DMA must not be unmapped and reused until status for a subsequent ORB is received.
- I suspect all of the protocol handling should be moved out of atomic context into a kernel thread (using the kthread API or workqueue API), so that all uses of sbp2util_node_write_no_wait() can be replaced by true transactions. This would be a continuation of the solution to bug 6948.
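To make the first two rules above concrete, here is a toy Python model of the bookkeeping they imply. It is only a sketch of the rules as worded in this comment, not code from sbp2: the class OrbTracker, its attributes, and the use of plain sequence numbers in place of ORBs are all invented for illustration.

```python
class OrbTracker:
    """Toy model of the two completion rules; names are hypothetical."""

    def __init__(self, ordered):
        self.ordered = ordered  # "ordered" flag of the logical unit
        self.next_seq = 0
        self.pending = []       # submitted ORBs without status yet
        self.completed = []     # tasks reported as complete
        self.unmapped = []      # ORBs whose DMA mapping was released
        self.held = []          # completed ORBs whose mapping is retained

    def submit(self):
        """Queue a new ORB and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append(seq)
        return seq

    def status(self, seq, src):
        """Process a status block for ORB `seq` with source field `src`."""
        if seq not in self.pending:
            raise ValueError("status for unknown ORB %d" % seq)
        # Rule 2, second half: status for a subsequent ORB has arrived,
        # so mappings retained for earlier ORBs may now be released.
        self.unmapped.extend(h for h in self.held if h < seq)
        self.held = [h for h in self.held if h >= seq]
        # Rule 1: on an ordered unit, this status also completes every
        # task submitted before it.
        done = [s for s in self.pending if s <= seq] if self.ordered else [seq]
        for s in done:
            self.pending.remove(s)
            self.completed.append(s)
            if s == seq and src == 1:
                # Rule 2, first half: keep this ORB mapped for now.
                self.held.append(s)
            else:
                self.unmapped.append(s)


if __name__ == "__main__":
    t = OrbTracker(ordered=True)
    a, b, c = t.submit(), t.submit(), t.submit()
    t.status(b, src=0)   # completes a and b
    t.status(c, src=1)   # completes c, but keeps it mapped
    d = t.submit()
    t.status(d, src=0)   # releases c, then completes d
    print(t.completed, t.unmapped, t.held)
```

The third item in the list, moving protocol handling out of atomic context, is a structural change to the driver and is not modelled here.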
Comment 10 Stefan Richter 2006-12-05 12:06:08 UTC
Setting "Severity=low" to reflect the diminishing practical impact of this bug since Linux 2.6.14.
Comment 11 Natalie Protasevich 2007-09-04 00:51:34 UTC
Thomas and others, is the problem still present with the current kernel (hopefully containing the patches that Stefan mentioned in #7-8)? Thanks.
Comment 12 Stefan Richter 2007-09-04 01:42:41 UTC
Natalie, thanks for the heads up. I don't have an iPod, but there haven't been any bug reports of this kind for a year or so, because sbp2 was changed to safer defaults in Linux 2.6.14. For a non-default mode of operation though (sbp2 module loaded with parameter serialize_io=0), the following remarks from comment #9 still need to be addressed:
- If the device has the "ordered" flag set in its Logical_Unit_Number ROM entry, completion of a task means completion of all previous tasks.
- If src==1 in a status block, the related ORB DMA must not be unmapped and reused until status for a subsequent ORB is received.
See also the "TODO" comment at the top of drivers/ieee1394/sbp2.c: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/ieee1394/sbp2.c;h=a81ba8fca0db168314a2ce21aee0cecb0fd4f567#l38
Comment 13 Stefan Richter 2007-09-04 01:45:39 UTC
PS: The patches mentioned in earlier comments were merged into mainline Linux soon after those comments were posted.
Comment 14 Stefan Richter 2008-02-19 12:00:44 UTC
The problem does not occur if sbp2 is used with default parameters. Also, the alternative firewire-sbp2 driver does not feature this bug. There are currently no resources to fix sbp2 when used with the non-default parameter serialize_io=0.
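For anyone unsure which mode their sbp2 module is running in, the following Python sketch reads the current serialize_io value. It assumes the parameter is exported read-only under /sys/module/sbp2/parameters/, which depends on how the driver declares it and has not been checked against every kernel version. The helper name serialize_io_enabled is invented for this example.

```python
from pathlib import Path

# Assumed sysfs location of the parameter; it exists only if sbp2 is
# loaded (or built in) and exports the parameter with read permission.
PARAM_PATH = Path("/sys/module/sbp2/parameters/serialize_io")


def serialize_io_enabled(path=PARAM_PATH):
    """Return True or False for the current setting, None if unreadable.

    Hypothetical helper; accepts both integer-style ("1"/"0") and
    boolean-style ("Y"/"N") parameter values.
    """
    try:
        raw = Path(path).read_text().strip()
    except OSError:
        return None
    if raw in ("1", "Y", "y"):
        return True
    if raw in ("0", "N", "n"):
        return False
    return None


if __name__ == "__main__":
    state = serialize_io_enabled()
    if state is None:
        print("serialize_io could not be read (is sbp2 loaded?)")
    elif state:
        print("serialize_io=1: default mode, not affected by this bug")
    else:
        print("serialize_io=0: non-default mode described in this bug")
```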