Bug 36622

Summary: fw-ohci: Pinnacle MovieBoard unsupported (was: panic in at_context_transmit after "Register access failure")
Product: Drivers
Component: IEEE1394
Status: CLOSED WILL_NOT_FIX
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.32, 3.0
Subsystem:
Regression: No
Bisected commit-id:
Reporter: Stefan Richter (stefanr)
Assignee: drivers_ieee1394
CC: alan, andrea.vai, florian
Attachments: panic screenshot
             session with device plugged in shortly after initialization
             session without anything plugged in

Description Stefan Richter 2011-06-03 22:48:41 UTC
Reported by Joel Bourrigaud on 31 May 2011,
http://thread.gmane.org/gmane.linux.kernel.firewire.devel/14982

>>>
i have encoutered the attached kernel panic error. It is written to 
notify you then i do as a newbie. I have connected my DV video camera to 
capture the video from my second boot W7. it was working good and then i 
restart my pc on my first boot LINUX mint. as shown in attached picture 
a blue screen of death (from linux) also call kernel panic appears. I 
found the work around:
shutdown the PC
disconnected the 1394 cable
power on the PC.
<<<

From the screenshot:

Kernel panic - not syncing: Fatal exception in interrupt
Pid: 0, comm: swapper Tainted: G   D  2.6.32-5-686 #1
Call Trace:
 [...] ? panic+0x38/0xe6
 [...] ? oops_end+0x91/0x9d
 [...] ? no_context+0x105/0x10e
 [...] ? __bad_area_nosemaphore+0x115/0x11d
 [...] ? up+0x9/0x2a
 [...] ? release_console_sem+0x174/0x1a2
 [...] ? do_page_fault+0x0/0x307
 [...] ? bad_area_nosemaphore+0xa/0xc
 [...] ? error_code+0x73/0x78
 [...] ? at_context_transmit+0xf1/0x480 [firewire_ohci]
 [...] ? tasklet_action+0x67/0xad
 [...] ? __do_softirq+0xaa/0x156
 [...] ? do_softirq+0x31/0x3c
 [...] ? irq_exit+0x26/0x58
 [...] ? do_IRQ+0x78/0x89
 [...] ? common_interrupt+0x30/0x38
 [...] ? mwait_idle+0x62/0x6c
 [...] ? cpu_idle+0x89/0xa5
firewire_ohci: Register access failure - please notify linux1394-devel@lists.sf.net
 [...] ? start_kernel+0x318/0x31d

Comment 1 Stefan Richter 2011-06-03 22:56:09 UTC
Created attachment 60702
panic screenshot

Comment 2 Stefan Richter 2011-07-09 22:12:45 UTC
Report from Bjørn Forbord:
http://marc.info/?l=linux1394-devel&m=130626533011925

Reproduced by me:
http://marc.info/?l=linux1394-devel&m=131024849331607

Comment 3 Stefan Richter 2011-07-09 22:24:46 UTC
Created attachment 65122
session with device plugged in shortly after initialization

Comment 4 Stefan Richter 2011-07-09 22:25:25 UTC
Created attachment 65132
session without anything plugged in

Comment 5 Stefan Richter 2011-07-09 22:34:45 UTC
Provisional patch to disable firewire-ohci probe on Pinnacle cards:
http://marc.info/?l=linux1394-devel&m=131025022232574
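
The idea of refusing to bind at probe time can be sketched in user space as follows. This is only a Python model under stated assumptions, not the kernel change itself: `probe`, `ProbeRefused`, and the "bound" return value are hypothetical names, and only the vendor ID 11bd is taken from the lspci output quoted in Comment 7 of this report.

```python
# User-space model of a probe-time vendor check. Not kernel code.
# PINNACLE_VENDOR_ID comes from the lspci output in this report (11bd);
# every other name here is hypothetical.
PINNACLE_VENDOR_ID = 0x11BD


class ProbeRefused(Exception):
    """Raised when the modelled driver declines to bind to a device."""


def probe(vendor_id: int, device_id: int) -> str:
    """Refuse Pinnacle devices up front; pretend to bind to anything else."""
    if vendor_id == PINNACLE_VENDOR_ID:
        raise ProbeRefused(
            f"{vendor_id:04x}:{device_id:04x}: Pinnacle card, not binding"
        )
    return "bound"
```

Refusing early means that none of the later code paths (interrupt handler, AT context tasklet) ever run for this card, which is how a check of this kind would avert the panic without fixing the underlying problem.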

Comment 6 Florian Mickler 2011-07-12 09:08:07 UTC
A patch referencing this bug report has been merged in Linux v3.0-rc7:

commit 7f7e37115a8b6724f26d0637a04e1d35e3c59717
Author: Stefan Richter <stefanr@s5r6.in-berlin.de>
Date:   Sun Jul 10 00:23:03 2011 +0200

    firewire: ohci: do not bind to Pinnacle cards, avert panic

Comment 7 Stefan Richter 2011-07-15 07:18:20 UTC
Judging by its PHY vendor/model ID 00000e:086613, the OHCI-1394 part of the Pinnacle MovieBoard appears to be derived from the Fujitsu MB86613:
http://www.fujitsu.com/downloads/MICRO/fma/pdf/mb86613l_um.pdf
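
How the ID string points at that chip can be shown with a few lines of Python. This is a minimal sketch: the function name `split_phy_id` is made up, and the only input is the ID string from this report. The observation that the model field's hex digits match the chip number is presumably the reasoning behind the statement above.

```python
# Split a "vendor:model" PHY ID string into its two 24-bit fields.
# The string format is taken from this report; the function name is made up.
def split_phy_id(phy_id: str) -> tuple[int, int]:
    vendor_hex, model_hex = phy_id.split(":")
    return int(vendor_hex, 16), int(model_hex, 16)


vendor, model = split_phy_id("00000e:086613")
# The hexadecimal digits of the model field spell out the chip number.
chip_hint = f"MB{model:x}"
```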

The Pinnacle device additionally contains a proprietary(?) analogue video capture part though:

0c:07.0 Multimedia controller [0480]: Pinnacle Systems Inc. AV/DV Studio Capture Card [11bd:bede]
        Subsystem: Pinnacle Systems Inc. Device [11bd:0023]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort+ <MAbort+ >SERR- <PERR- INTx-
        Latency: 64 (2000ns min, 4000ns max), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 10
        Region 0: Memory at fb7ff000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [40] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

0c:07.1 FireWire (IEEE 1394) [0c00]: Pinnacle Systems Inc. Device [11bd:0015] (prog-if 10 [OHCI])
        Subsystem: Pinnacle Systems Inc. Device [11bd:0023]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64 (8000ns min, 20000ns max), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 10
        Region 0: Memory at fb7fe800 (32-bit, non-prefetchable) [size=2K]
        Capabilities: [44] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Kernel modules: firewire-ohci

Clemens Ladisch wrote:
> That controller's datasheet says that ack_tardy is an extension that
> indicates that "a packet was received while the sytem power was off".
>
> Section 7.4.4 says that the driver initialization must clear the PME
> enable bit and acknowledge any PME that happened before.  I'd guess this
> interrupt storm happens because the PME status is never cleared.

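According to lspci, though, both PME-Enable and PME status are already clear (see "Status: D0 NoSoftRst- PME-Enable- ... PME-" in the output above).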

> So much for that theory ...
> 
> But the datasheet says the MB86613 has PM version 1 and doesn't have D1,
> so it looks as if Pinnacle has implemented its own PM stuff.  
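
For reference, the lspci "Status:" line under the Power Management capability is a decoding of the 16-bit PM control/status register. A minimal sketch of that decoding follows. The bit positions are given as I understand them from the PCI Power Management specification and should be checked against it; `decode_pmcsr` and the dictionary keys are hypothetical names.

```python
# Decode a 16-bit PCI power management control/status register value.
# Bit positions follow the PCI Power Management specification as I
# understand it; function and key names are hypothetical.
POWER_STATES = ("D0", "D1", "D2", "D3hot")


def decode_pmcsr(value: int) -> dict:
    return {
        "power_state": POWER_STATES[value & 0x3],   # bits 1:0
        "no_soft_reset": bool(value & (1 << 3)),    # bit 3
        "pme_enable": bool(value & (1 << 8)),       # bit 8
        "data_select": (value >> 9) & 0xF,          # bits 12:9
        "data_scale": (value >> 13) & 0x3,          # bits 14:13
        "pme_status": bool(value & (1 << 15)),      # bit 15
    }
```

A value with all of these fields zero corresponds to what lspci prints for this card: D0, NoSoftRst-, PME-Enable-, DSel=0, DScale=0, PME-.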

As for the way forward, I think I should try moving the regAccessFail check out of the IRQ handler and into all call sites that access SClk domain registers.
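
What checking at the call site could look like is sketched below as a toy Python model, not as driver code. `FakeController`, `checked_read`, and the register name `example_sclk_reg` are hypothetical. The model assumes that an SClk-domain access without a running SClk sets regAccessFail in IntEvent and yields an unusable value.

```python
# Toy model of checking regAccessFail at the call site. Not kernel code.
REG_ACCESS_FAIL = 0x00040000  # regAccessFail bit of the IntEvent register


class FakeController:
    """Hypothetical stand-in for an OHCI controller."""

    def __init__(self, sclk_running: bool = True):
        self.sclk_running = sclk_running
        self.int_event = 0
        self.sclk_regs = {"example_sclk_reg": 0x1234}

    def read_sclk(self, name: str) -> int:
        # Assumption: an access without SClk sets regAccessFail and
        # returns an unusable value.
        if not self.sclk_running:
            self.int_event |= REG_ACCESS_FAIL
            return 0
        return self.sclk_regs[name]


def checked_read(ctl: FakeController, name: str):
    """Read an SClk-domain register; return None if the access failed."""
    value = ctl.read_sclk(name)
    if ctl.int_event & REG_ACCESS_FAIL:
        ctl.int_event &= ~REG_ACCESS_FAIL  # acknowledge the failure here
        return None
    return value
```

The point of this arrangement is that the caller learns immediately that its own access failed and can retry or bail out, instead of an unrelated interrupt handler noticing the bit later.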

Comment 8 Stefan Richter 2012-04-02 05:39:16 UTC
Patch "firewire: ohci: handle register access failure in SClk domain":
http://marc.info/?l=linux1394-devel&m=133332120525432

This does not fix the card yet:
http://marc.info/?l=linux1394-devel&m=133332196725635

Comment 9 Florian Mickler 2012-04-04 14:57:47 UTC
A patch referencing a commit referencing this bug report has been merged in Linux v3.4-rc1:

commit 98466cc4502b3171f1bdc146db0d2106fcbc3f4f
Author: Stefan Richter <stefanr@s5r6.in-berlin.de>
Date:   Sun Mar 4 14:24:31 2012 +0100

    firewire: tone down some diagnostic log messages

Comment 10 Stefan Richter 2012-06-13 16:51:13 UTC
Clemens pointed out that the card sets several bogus interrupt event bits when it starts its IRQ storm.  The presence of the regAccessFail bit is likely purely random.  Therefore, implementation of regAccessFail handling won't be sufficient or even relevant to getting this card to work.
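
To see which event bits accompany regAccessFail in a logged IntEvent value, a small decoder helps. The sketch below lists only a subset of bits, with values as I recall them from the OHCI specification (worth double-checking against it); `decode_int_event` is a hypothetical helper, and the sample values in the usage note are made up rather than taken from this card.

```python
# Name the bits set in an OHCI IntEvent value. Bit values as I recall them
# from the OHCI specification; only a subset is listed.
INT_EVENT_BITS = {
    0x00000001: "reqTxComplete",
    0x00000002: "respTxComplete",
    0x00000100: "postedWriteErr",
    0x00010000: "selfIDComplete",
    0x00020000: "busReset",
    0x00040000: "regAccessFail",
    0x00080000: "phy",
    0x00400000: "cycleLost",
    0x00800000: "cycleInconsistent",
    0x01000000: "unrecoverableError",
    0x02000000: "cycleTooLong",
}


def decode_int_event(value: int) -> list[str]:
    """Return the names of known bits set in value, lowest bit first."""
    return [name for bit, name in sorted(INT_EVENT_BITS.items()) if value & bit]
```

For example, the made-up value 0x00060000 decodes to busReset plus regAccessFail. When many unrelated bits show up together in one event word, a single bit such as regAccessFail is not a reliable indicator on its own.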

I currently have no further ideas on what to try with this card, hence I am setting this bug to WILL_NOT_FIX. It could be revisited if anybody comes up with a great suggestion.