Bug 1872
| Summary: | sbp2 serialize_io=0 buggy (was: data corruption on ipod using sbp2 module) | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Thomas Margraf (ug57txm) |
| Component: | IEEE1394 | Assignee: | Stefan Richter (stefanr) |
| Status: | REJECTED WILL_NOT_FIX | | |
| Severity: | low | CC: | kenny, kevkim55, protasnb |
| Priority: | P2 | | |
| Hardware: | i386 | | |
| OS: | Linux | | |
| Kernel Version: | all | Subsystem: | |
| Regression: | --- | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 10046 | | |
| Attachments: | dmesg output log; From syslog: Linux 2.6.11.6, Audigy, Oxford 911 ATA adapter, nastiness | | |
Description
Thomas Margraf
2004-01-15 05:27:10 UTC
Created attachment 2305
dmesg output log
I had this problem with 2.6.1 too. I can connect to the iPod with 2.6.3 now, but
the connection breaks after a short time and the iPod is reconnected as sdc
while it was sdb before, and so on. This leaves my gtkpod app waiting forever,
since I can't unmount the lost sdb2 connection in order to remount sdc2 at
/mnt/ipod.
I have compiled ieee1394, ohci1394, sbp2 and sd_mod directly into the kernel. If
you need the kernel config or something else, just ask.
By the way all this was working in 2.4.20+ but getting the device connected was
far more work, so thanks for that part of your work!
Created attachment 4968
From syslog: Linux 2.6.11.6, Audigy, Oxford 911 ATA adapter, nastiness
Messages from the current session of 2.6.11.6, playing with Firewire drive and
Audigy.
Comment on attachment 4968
From syslog: Linux 2.6.11.6, Audigy, Oxford 911 ATA adapter, nastiness
I'm finding very similar problems with my Alpha and Audigy and Oxford 911 ATA
adapter, and Linux 2.6.11.6. Except, I get another slightly different set of
errors.
In this case, too, Linux 2.4 had no discernable problem. Also, the same drive
has no problem with my brother's Athlon and a more normal Firewire chip and
Linux 2.6.10.
Thomas and "R", please update to Linux 2.6.12 and/or try the suggestions at http://www.linux1394.org/faq.php#sbp2abort
The problem reported by Elias (bug 2278) is different because its cause is a fast succession of bus resets.

FYI, sbp2 was switched to a safer default mode in Linux 2.6.14-rc3. But the underlying cause of the problem class "works in 2.4 but not 2.6" has not been identified yet AFAIK.

*** Bug 6093 has been marked as a duplicate of this bug. ***

Some relevant patches have recently been submitted to Andrew Morton's -mm patchset. I also made them available as patches for released kernels at http://me.in-berlin.de/~s5r6/linux1394/updates/. Some of these fixes may be released with Linux 2.6.18.
Code inspection has shown that more changes are needed. Also, one of my SBP-2 devices (a TI StorageLynx based HDD) still shows command abortions if sbp2 is loaded with serialize_io=0. (serialize_io=1 is the default since Linux 2.6.14.) All other HDDs and CD/DVD-RWs I have access to work OK. I hope to get the missing changes implemented during the next weeks.

PS: The patches I referred to reside in "v146_experimental" or later versions at http://me.in-berlin.de/~s5r6/linux1394/updates/.

I believe the following needs to be done to resolve this (assuming that the devices and their firmware are bug-free):
- If the device has the "ordered" flag set in its Logical_Unit_Number ROM entry, completion of a task means completion of all previous tasks.
- If src==1 in a status block, the related ORB DMA must not be unmapped and reused until status for a subsequent ORB is received.
- I suspect all of the protocol handling should be moved out of atomic context into a kernel thread (using the kthread API or workqueue API), so that all uses of sbp2util_node_write_no_wait() can be replaced by true transactions. This would be a continuation of the solution to bug 6948.
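
The first two points in the list above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not code from drivers/ieee1394/sbp2.c: the names `orb_model`, `lu_model`, `submit_orb()` and `handle_status()` are hypothetical, "DMA mapped" is reduced to a boolean, and ORBs are assumed to be tracked in submission order. It shows implicit completion of earlier tasks on an "ordered" logical unit, and deferred unmapping of an ORB whose status block carried src==1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ORBS 8

/* Hypothetical bookkeeping for one command ORB (not the sbp2.c layout). */
struct orb_model {
    bool completed;    /* task is finished from the initiator's view */
    bool dma_mapped;   /* ORB memory is still mapped for the device */
    bool hold_mapping; /* status had src==1: keep the mapping for now */
};

/* Hypothetical per-logical-unit state, ORBs stored oldest first. */
struct lu_model {
    bool ordered;      /* "ordered" flag from the Logical_Unit_Number entry */
    size_t count;      /* number of ORBs submitted so far */
    struct orb_model orbs[MAX_ORBS];
};

/* Queue a new ORB; returns its index, or -1 if the model is full. */
static int submit_orb(struct lu_model *lu)
{
    assert(lu != NULL);
    if (lu->count >= MAX_ORBS)
        return -1;
    lu->orbs[lu->count] = (struct orb_model){ .dma_mapped = true };
    return (int)lu->count++;
}

/* Mark an ORB complete; unmap it right away unless it must be held. */
static void complete_orb(struct orb_model *orb, bool hold)
{
    orb->completed = true;
    orb->hold_mapping = hold;
    if (!hold)
        orb->dma_mapped = false;
}

/* Process a status block that refers to ORB `idx` and carries `src`. */
static void handle_status(struct lu_model *lu, size_t idx, unsigned src)
{
    size_t i;

    assert(lu != NULL);
    if (idx >= lu->count)
        return;

    for (i = 0; i < idx; i++) {
        struct orb_model *prev = &lu->orbs[i];

        if (prev->completed && prev->hold_mapping) {
            /* Status for a subsequent ORB arrived: safe to unmap now. */
            prev->hold_mapping = false;
            prev->dma_mapped = false;
        } else if (!prev->completed && lu->ordered) {
            /* Ordered unit: this status implies all earlier tasks are done. */
            complete_orb(prev, false);
        }
    }

    /* With src==1 the ORB stays mapped until a later status is seen. */
    complete_orb(&lu->orbs[idx], src == 1);
}
```

For example, with three ORBs queued on an ordered unit, a status for the second ORB with src==1 completes the first two tasks but leaves the second ORB mapped; a later status for the third ORB then releases the second ORB's mapping. On a unit without the "ordered" flag, the model leaves earlier uncompleted ORBs pending until their own status arrives.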
Setting "Severity=low" to reflect the diminishing practical impact of this bug since Linux 2.6.14.

Thomas and others,
Is the problem still present with a current kernel (hopefully containing the patches that Stefan mentioned in #7-8)? Thanks.

Natalie, thanks for the heads up. I don't have an iPod, but there haven't been any bug reports of this kind for a year or so, because sbp2 was changed to safer defaults in Linux 2.6.14.
For a non-default mode of operation though (sbp2 module loaded with parameter serialize_io=0), the following remarks from comment #9 still need to be addressed:
- If the device has the "ordered" flag set in its Logical_Unit_Number ROM entry, completion of a task means completion of all previous tasks.
- If src==1 in a status block, the related ORB DMA must not be unmapped and reused until status for a subsequent ORB is received.
See also the "TODO" comment at the top of drivers/ieee1394/sbp2.c: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/ieee1394/sbp2.c;h=a81ba8fca0db168314a2ce21aee0cecb0fd4f567#l38

PS: The patches mentioned in earlier comments were merged into mainline Linux soon after those comments.

The problem does not occur if sbp2 is used with default parameters. Also, the alternative firewire-sbp2 driver does not have this bug. There are currently no resources to fix sbp2 when used with the non-default parameter serialize_io=0.