Subject    : 2.6.25-rc2: ohci1394 problem
Submitter  : Thomas Meyer <thomas@m3y3r.de>
Date       : 2008-02-20 08:47
References : http://lkml.org/lkml/2008/2/20/58
Handled-By : Stefan Richter <stefanr@s5r6.in-berlin.de>

This entry is being used for tracking a regression from 2.6.24. Please don't close it until the problem is fixed in the mainline or the report is rejected.
Summary of latest feedback (http://lkml.org/lkml/2008/2/23/244 + http://lkml.org/lkml/2008/2/23/259):
- The presence of both ohci1394 and firewire-ohci does not seem to be the problem.
- Several or all of ohci1394's MMIO reads return ~0 (all bits set to one). Therefore it is unlikely that the problem is caused by the IEEE1394 subsystem.
- Is this really a regression relative to 2.6.24?
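For readers unfamiliar with the ~0 observation: a 32-bit register read of all ones usually suggests the read never reached working device registers, which is why suspicion moves away from the IEEE1394 code itself. Below is a minimal sketch of that check, assuming the raw read values have already been collected somehow. The helper name `looks_unresponsive` and the sample values are hypothetical and not taken from the report.

```python
ALL_ONES_32 = 0xFFFFFFFF  # what C's ~0 looks like in a 32-bit register


def looks_unresponsive(reads):
    """Return True if every sampled 32-bit read came back as all ones.

    An empty sample proves nothing, so it is reported as False.
    """
    return len(reads) > 0 and all(
        (value & ALL_ONES_32) == ALL_ONES_32 for value in reads
    )


# Hypothetical sample values for illustration only.
print(looks_unresponsive([0xFFFFFFFF, 0xFFFFFFFF]))  # every read is ~0
print(looks_unresponsive([0xFFFFFFFF, 0x00010010]))  # one plausible value
```

A single all-ones read can be a legitimate register value, so the sketch only flags the case where every sampled read looks that way.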
Thomas, can you verify if 2.6.24 works correctly, please?
Thomas' response in http://marc.info/?l=linux-kernel&m=120396387509950 :
- The problem was apparently caused by a mixup when building the kernel; it went away after "make clean".
- In any case, 2.6.24 was not affected.
New status: Bug reappeared with a make distclean; make on v2.6.25-rc6-14-gbde4f8f.
Created attachment 15396 [details] config used
Created attachment 15397 [details] dmesg
$ git describe
v2.6.25-rc6-243-g028011e
Handled-By : Nobody
I'm confused about this. I looked at the original threads, and what really stands out to me is that the original reporter had two drivers loaded for the same hardware (firewire-ohci and ohci1394.) *In the best case* there is a fundamental race condition there, meaning unpredictable behaviour would be the norm.
Can someone publish the "lspci -vv" for the affected system?
Reply-To: stefanr@s5r6.in-berlin.de

H. Peter Anvin wrote at http://bugzilla.kernel.org/show_bug.cgi?id=10080#c9 :
> I'm confused about this. I looked at the original threads, and what really
> stands out to me is that the original reporter had two drivers loaded for the
> same hardware (firewire-ohci and ohci1394.) *In the best case* there is a
> fundamental race condition there, meaning unpredictable behaviour would be
> the norm.

Hmm, right -- I didn't see this until now. Today's dmesg:
http://bugzilla.kernel.org/attachment.cgi?id=15397&action=view

[ 1.236587] firewire_ohci: Failed to remap registers
[ 243.640549] ohci1394: fw-host0: Get PHY Reg timeout
(etc.)

However, the two drivers for the same device don't seem to be the problem. Looks like firewire-ohci was attempted to be bound to the controller much earlier than ohci1394. The error message means that firewire-ohci's pci_request_region() succeeded but pci_iomap() failed, hence the pci_driver.probe failed, hence firewire-ohci wasn't bound to the device, hence subsequent loading of ohci1394 (manually, I presume) was a valid action.

IOW firewire-ohci was indeed already loaded, but not bound to the device because of the .probe failure; and ohci1394 was loaded much later.

Same thing in the report in February:
http://lkml.org/lkml/2008/2/23/244

[ 1.326958] firewire_ohci: Failed to remap registers
[ 856.943807] ohci1394: fw-host0: Get PHY Reg timeout
(here: ohci1394 manually loaded by insmod)

(Let's see if bugme-daemon captures this...)
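The ordering argument above rests on the dmesg timestamps. It can be checked mechanically with a rough sketch like the following, which uses only the two lines quoted from today's dmesg. The helper name `first_message_times` and the regular expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# The two lines quoted from the dmesg attachment in the comment above.
DMESG = """\
[ 1.236587] firewire_ohci: Failed to remap registers
[ 243.640549] ohci1394: fw-host0: Get PHY Reg timeout
"""

# "[ <seconds> ] <driver>: <message>" as printed with kernel timestamps on.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\w+):\s+(.*)$")


def first_message_times(log):
    """Map each driver prefix to the timestamp of its first logged line."""
    first = {}
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            timestamp, driver, _message = match.groups()
            first.setdefault(driver, float(timestamp))
    return first


times = first_message_times(DMESG)
gap = times["ohci1394"] - times["firewire_ohci"]
print(f"firewire_ohci first logged at {times['firewire_ohci']:.6f} s")
print(f"ohci1394 first logged at {times['ohci1394']:.6f} s ({gap:.1f} s later)")
```

A gap of minutes between the two first messages fits the reading that firewire-ohci failed its probe at boot and ohci1394 was loaded by hand much later, rather than both drivers racing for the device.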
Proposed patch: http://lkml.org/lkml/2008/3/22/175
Created attachment 15417 [details] lspci -vv
Linus' patch as per comment #12 has been committed: commit b9e76a00749521f2b080fa8a4fb15f66538ab756 (I suppose it still needs to be tested by Thomas)
Additional proposed patch by Ingo: http://lkml.org/lkml/2008/3/25/32
Created attachment 15433 [details] dmesg 2.6.24
Created attachment 15434 [details] lspci -vv 2.6.24
Created attachment 15435 [details] lsmod 2.6.24
Fixed by: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=12c22d6ef299ccf0955e5756eb57d90d7577ac68