Bug 7771

Summary: ieee1394, nodemgr: config ROM scanning fails too early
Product: Drivers Reporter: Stefan Richter (stefanr)
Component: IEEE1394 Assignee: Stefan Richter (stefanr)
Status: REJECTED WILL_NOT_FIX    
Severity: low CC: protasnb, rdunlap
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: all Subsystem:
Regression: --- Bisected commit-id:
Bug Depends on:    
Bug Blocks: 10046    

Description Stefan Richter 2007-01-05 01:59:36 UTC
SBP-2 clause 7.1 says: If a target has insufficient information available after
power reset, it responds to configuration ROM requests at CSR offset 0x400 with
a data value of 0 or acknowledges these requests with ack_tardy. This condition
may last up to 5 seconds. If the target becomes responsive within those 5
seconds, it shall _not_ announce availability of the ROM by a bus reset.

Problem: ieee1394's nodemgr gives up on ROM reading after it got ack_tardy 3
times within 0.92 seconds after bus reset or if it got a data value of 0. It
therefore fails to detect the ROM (i.e. identity and capabilities) of a node
unless a bus reset follows (forced by the target or manually by the user).
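
The difference between the two retry policies can be sketched in a small user-space model. This is not nodemgr code: the enum, the function names, and the uniform polling interval are assumptions made for illustration, and the simulated target only ever answers with ack_tardy until its ROM is ready.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome codes for one quadlet read at CSR offset 0x400. */
enum rom_read_result {
    ROM_READ_OK,        /* non-zero data value returned */
    ROM_READ_ZERO,      /* data value 0: ROM not ready yet */
    ROM_READ_ACK_TARDY  /* request acknowledged with ack_tardy */
};

#define ROM_READY_TIMEOUT_MS 5000u /* "up to 5 seconds" per SBP-2 clause 7.1 */

/* Policy described in this report: stop at the third ack_tardy, or on a 0. */
bool legacy_should_retry(enum rom_read_result res, unsigned tardy_seen)
{
    return res == ROM_READ_ACK_TARDY && tardy_seen < 3;
}

/* Policy suggested by clause 7.1: keep retrying both cases for 5 seconds. */
bool spec_should_retry(enum rom_read_result res, uint32_t ms_since_reset)
{
    if (res != ROM_READ_ZERO && res != ROM_READ_ACK_TARDY)
        return false;
    return ms_since_reset < ROM_READY_TIMEOUT_MS;
}

/*
 * Simulate polling a target whose ROM becomes readable ready_at_ms after
 * the bus reset. Returns true if the ROM is detected without a further
 * bus reset. The fixed step_ms between reads is an assumption; the real
 * retry timing differs.
 */
bool scan_detects_rom(uint32_t ready_at_ms, uint32_t step_ms, bool legacy)
{
    unsigned tardy_seen = 0;

    assert(step_ms > 0); /* otherwise simulated time would never advance */
    for (uint32_t now = 0; ; now += step_ms) {
        enum rom_read_result res =
            now >= ready_at_ms ? ROM_READ_OK : ROM_READ_ACK_TARDY;

        if (res == ROM_READ_OK)
            return true;
        tardy_seen++;
        if (legacy ? !legacy_should_retry(res, tardy_seen)
                   : !spec_should_retry(res, now))
            return false;
    }
}
```

In this model, a target that needs 2 seconds and is polled every 300 ms is found under the 5-second policy but missed under the three-strikes policy.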

This bug has low priority because all targets known to me force a bus reset,
seemingly regardless of the 5 seconds criterion.
Comment 1 Stefan Richter 2007-01-05 02:08:54 UTC
Requirements for a solution:

1. Get rid of the extra layer of indirection that the IEEE 1394-agnostic design
of the IEEE 1212 library (csr1212.[ch]) forces onto the IEEE 1394 stack.

2. Modify the nodemgr_host_thread context to keep retrying on 0-value returns
and ack_tardy acknowledgments for at least 5 seconds, while remaining responsive
to a bus reset during such a retry loop. Among other things, this might require
getting rid of the global nodemgr_serialize mutex.
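
A rough user-space sketch of what requirement 2 asks for: a retry loop that waits up to 5 seconds but bails out as soon as the bus generation changes. Everything here (struct sim_bus, sim_sleep, sim_read_quadlet, wait_for_rom) is a hypothetical stand-in for illustration, not an existing ieee1394 interface, and the simulated clock replaces real sleeping.

```c
#include <assert.h>
#include <stdint.h>

#define ROM_READY_TIMEOUT_MS 5000u /* window from SBP-2 clause 7.1 */

enum scan_status { SCAN_ROM_READY, SCAN_TIMED_OUT, SCAN_BUS_RESET };

/* Hypothetical simulated bus; a real stack would query hardware state. */
struct sim_bus {
    uint32_t now_ms;          /* simulated time since the last bus reset */
    uint32_t generation;      /* incremented on every bus reset */
    uint32_t rom_ready_at_ms; /* when the target starts returning data */
    uint32_t reset_at_ms;     /* when a bus reset occurs (UINT32_MAX: never) */
};

/* Advance simulated time; bump the generation if a reset falls inside. */
void sim_sleep(struct sim_bus *bus, uint32_t ms)
{
    uint32_t before = bus->now_ms;

    bus->now_ms += ms;
    if (before < bus->reset_at_ms && bus->now_ms >= bus->reset_at_ms)
        bus->generation++;
}

/* Read the first ROM quadlet: 0 until ready, then a placeholder value. */
uint32_t sim_read_quadlet(const struct sim_bus *bus)
{
    return bus->now_ms >= bus->rom_ready_at_ms ? 0x1u : 0u;
}

/* Poll in short steps so that a bus reset is noticed within poll_ms. */
enum scan_status wait_for_rom(struct sim_bus *bus, uint32_t poll_ms)
{
    const uint32_t start_generation = bus->generation;
    const uint32_t start_ms = bus->now_ms;

    assert(poll_ms > 0); /* a zero step would spin forever */
    for (;;) {
        if (bus->generation != start_generation)
            return SCAN_BUS_RESET; /* stale scan: caller must restart */
        if (sim_read_quadlet(bus) != 0)
            return SCAN_ROM_READY;
        if (bus->now_ms - start_ms >= ROM_READY_TIMEOUT_MS)
            return SCAN_TIMED_OUT;
        sim_sleep(bus, poll_ms);
    }
}
```

The generation check at the top of each iteration is what keeps the loop responsive; a loop that sleeps while holding a global lock could not be interrupted this way, which is presumably why the mutex comes up above.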
Comment 2 Natalie Protasevich 2007-07-04 16:55:29 UTC
Stefan,
Sounds like this is your to-do item. It would be nice to have a "new feature suggestion" or "rework" category. Maybe it should be marked "Will fix later", depending on how soon you plan to get to it. Please also set an approximate target kernel level so that it is searchable etc.
Regards, -- Natalie.
Comment 3 Stefan Richter 2007-07-05 11:46:35 UTC
I did a few first steps for item 1 from comment #1 and this has been merged, and even a little bit for item 2 will be merged in the next window.  But since the new firewire stack went mainline recently, my plan is now to wait for initial testing experience with the new drivers, then switch this bug to WILL_NOT_FIX (except in the unlikely case that the new drivers turn out to be a catastrophic failure).  In the meantime, the WILL_FIX_LATER flag is indeed quite right.

Regarding further categorization of this bug:  For my personal needs as subsystem maintainer, the current data fields are entirely sufficient.  But I guess some tags like you mentioned would be good for those who want a qualified overall picture of all open bugs.

Regarding estimated dates/target releases for fixes:  I generally can't supply them.  I try to fix high-profile bugs and regressions ASAP; the rest has to wait until I've got the time and mood or somebody helps.  I.e. WILL_FIX_LATER is somewhat applicable to many of my bugs.  (We have got years-old known bugs in the IEEE 1394 subsystem.  But the intent is that the new firewire stack fixes most or all of them and also lowers maintenance burden in the long run, so that we don't get into the same position again anytime soon.)
Comment 4 Stefan Richter 2008-02-19 12:11:01 UTC
There are currently no resources to fix this in drivers/ieee1394/.
drivers/firewire/ does not feature this problem.