Bug 194913

Summary: Unstable issues with FW643 and firewire_ohci
Product: Drivers Reporter: Julian (dev)
Component: IEEE1394Assignee: drivers_ieee1394
Status: RESOLVED UNREPRODUCIBLE    
Severity: high    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.9.13-rt12 Subsystem:
Regression: No Bisected commit-id:

Description Julian 2017-03-18 02:41:55 UTC
Hi,
Switching on or pluging in this firewire device (saphire pro 40 audio interface)
I randomly get the following errors. Note I unbinded the Texas firewire controller however I do not get this error with the Texas controller.

I'm already loading kernel with options firewire_ohci quirks=0x10 as MSI errors fill the kernel log.

Unable to unload firewire_ohci modules. Unable to clean shut down forced hard reset required (SysReq u/b/o).

If you require anything else let me know.
Regards,
Julian.

=============================================================================
device:
=============================================================================
83:00.0 FireWire (IEEE 1394): LSI Corporation FW643 [TrueFire] PCIe 1394b Controller (rev 08) (prog-if 10 [OHCI])
        Subsystem: Device 5901:1101
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 256 bytes
        Interrupt: pin A routed to IRQ 16
        NUMA node: 1
        Region 0: Memory at cdefe000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME+
        Capabilities: [4c] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [60] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 256 bytes, MaxReadReq 2048 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [140 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                        Status: NegoPending- InProgress-
                VC1:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable- ID=0 ArbSelect=Fixed TC/VC=00
                        Status: NegoPending- InProgress-
        Capabilities: [170 v1] Device Serial Number 01-08-00-00-00-01-8e-26
        Kernel modules: firewire_ohci

=============================================================================
Kernel Traces:
=============================================================================
Mar 18 13:18:55 venus kernel: firewire_ohci 0000:06:05.0: removed fw-ohci device
Mar 18 13:20:51 venus kernel: firewire_core 0000:83:00.0: phy config: new root=ffc1, gap_count=5
Mar 18 13:20:51 venus kernel: firewire_ohci 0000:83:00.0: failed to read phy reg 5
Mar 18 13:20:51 venus kernel: CPU: 2 PID: 9009 Comm: kworker/2:3 Tainted: P          IO    4.9.13-rt12 #1
Mar 18 13:20:51 venus kernel: Hardware name: Supermicro X8DAH/X8DAH, BIOS 2.1        12/30/2011
Mar 18 13:20:51 venus kernel: Workqueue: events bm_work [firewire_core]
Mar 18 13:20:51 venus kernel:  ffffc90017827d08 ffffffff813602f2 0000000000000067 ffff88144c4f4000
Mar 18 13:20:51 venus kernel:  ffffc90017827d30 ffffffffa01015cd 0000000000000000 0000000000000005
Mar 18 13:20:51 venus kernel:  0000000000000040 ffffc90017827d60 ffffffffa0101650 ffff88144c4f4648
Mar 18 13:20:51 venus kernel: Call Trace:
Mar 18 13:20:51 venus kernel:  [<ffffffff813602f2>] dump_stack+0x4d/0x6b
Mar 18 13:20:51 venus kernel:  [<ffffffffa01015cd>] read_phy_reg+0x8d/0xa0 [firewire_ohci]
Mar 18 13:20:51 venus kernel:  [<ffffffffa0101650>] update_phy_reg+0x20/0x50 [firewire_ohci]
Mar 18 13:20:51 venus kernel:  [<ffffffffa01016be>] ohci_update_phy_reg+0x3e/0x60 [firewire_ohci]
Mar 18 13:20:51 venus kernel:  [<ffffffffa00add7c>] bm_work+0x1dc/0x470 [firewire_core]
Mar 18 13:20:51 venus kernel:  [<ffffffffa00adb4e>] ? fw_schedule_bm_work+0x3e/0x90 [firewire_core]
Mar 18 13:20:51 venus kernel:  [<ffffffffa00b138b>] ? fw_device_update+0x2b/0x30 [firewire_core]
Mar 18 13:20:51 venus kernel:  [<ffffffff810740a5>] process_one_work+0x1d5/0x4d0
Mar 18 13:20:51 venus kernel:  [<ffffffff810743eb>] worker_thread+0x4b/0x510
Mar 18 13:20:51 venus kernel:  [<ffffffff810743a0>] ? process_one_work+0x4d0/0x4d0
Mar 18 13:20:51 venus kernel:  [<ffffffff81079dca>] kthread+0xca/0xe0
Mar 18 13:20:51 venus kernel:  [<ffffffff81079d00>] ? kthread_park+0x60/0x60
Mar 18 13:20:51 venus kernel:  [<ffffffff81725622>] ret_from_fork+0x22/0x30
Mar 18 13:20:55 venus kernel: firewire_ohci 0000:83:00.0: isochronous cycle inconsistent
Mar 18 13:20:55 venus kernel: firewire_ohci 0000:83:00.0: DMA context still active (0x00000400)

=============================================================================
Attempts to force module unloading. (modprobe -r)
=============================================================================
Mar 18 13:27:33 venus kernel: firewire_ohci 0000:83:00.0: failed to read phy reg 4
Mar 18 13:27:33 venus kernel: CPU: 16 PID: 9955 Comm: rmmod Tainted: P          IO    4.9.13-rt12 #1
Mar 18 13:27:33 venus kernel: Hardware name: Supermicro X8DAH/X8DAH, BIOS 2.1        12/30/2011
Mar 18 13:27:33 venus kernel:  ffffc90017043c68 ffffffff813602f2 0000000000000067 ffff88144c4f4000
Mar 18 13:27:33 venus kernel:  ffffc90017043c90 ffffffffa01015cd 00000000000000c0 0000000000000004
Mar 18 13:27:33 venus kernel:  0000000000000000 ffffc90017043cc0 ffffffffa0101650 ffff88144c4f4648
Mar 18 13:27:33 venus kernel: Call Trace:
Mar 18 13:27:33 venus kernel:  [<ffffffff813602f2>] dump_stack+0x4d/0x6b
Mar 18 13:27:33 venus kernel:  [<ffffffffa01015cd>] read_phy_reg+0x8d/0xa0 [firewire_ohci]
Mar 18 13:27:33 venus kernel:  [<ffffffffa0101650>] update_phy_reg+0x20/0x50 [firewire_ohci]
Mar 18 13:27:33 venus kernel:  [<ffffffffa01016be>] ohci_update_phy_reg+0x3e/0x60 [firewire_ohci]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad490>] fw_core_remove_card+0xa0/0x170 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad000>] ? 0xffffffffa00ad000
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad010>] ? dummy_read_phy_reg+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad040>] ? dummy_send_response+0x20/0x20 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad020>] ? dummy_update_phy_reg+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad050>] ? dummy_send_request+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad060>] ? dummy_cancel_packet+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad070>] ? dummy_enable_phys_dma+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad090>] ? dummy_allocate_iso_context+0x20/0x20 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad0a0>] ? dummy_start_iso+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad0b0>] ? dummy_set_iso_channels+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad0c0>] ? dummy_queue_iso+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad0d0>] ? dummy_flush_queue_iso+0x10/0x10 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa01047a0>] pci_remove+0x60/0x2c0 [firewire_ohci]
Mar 18 13:27:33 venus kernel:  [<ffffffff817250f7>] ? rt_spin_unlock+0x27/0x40
Mar 18 13:27:33 venus kernel:  [<ffffffff813adb39>] pci_device_remove+0x39/0xc0
Mar 18 13:27:33 venus kernel:  [<ffffffff81489de1>] __device_release_driver+0xa1/0x160
Mar 18 13:27:33 venus kernel:  [<ffffffff8148a9c4>] driver_detach+0xb4/0xc0
Mar 18 13:27:33 venus kernel:  [<ffffffff81489818>] bus_remove_driver+0x58/0xd0
Mar 18 13:27:33 venus kernel:  [<ffffffff8148b22c>] driver_unregister+0x2c/0x50
Mar 18 13:27:33 venus kernel:  [<ffffffff813ac8b2>] pci_unregister_driver+0x32/0xa0
Mar 18 13:27:33 venus kernel:  [<ffffffffa0105a83>] fw_ohci_cleanup+0x10/0x58d [firewire_ohci]
Mar 18 13:27:33 venus kernel:  [<ffffffff810d5ea2>] SyS_delete_module+0x192/0x260
Mar 18 13:27:33 venus kernel:  [<ffffffff810014d8>] ? exit_to_usermode_loop+0xf8/0x100
Mar 18 13:27:33 venus kernel:  [<ffffffff817253e0>] entry_SYSCALL_64_fastpath+0x13/0x94
=============================================================================
Mar 18 13:27:33 venus kernel: firewire_ohci 0000:83:00.0: failed to read phy reg 1
Mar 18 13:27:33 venus kernel: CPU: 16 PID: 9609 Comm: kworker/16:1 Tainted: P          IO    4.9.13-rt12 #1
Mar 18 13:27:33 venus kernel: Hardware name: Supermicro X8DAH/X8DAH, BIOS 2.1        12/30/2011
Mar 18 13:27:33 venus kernel: Workqueue: firewire br_work [firewire_core]
Mar 18 13:27:33 venus kernel:  ffffc9001762fd80 ffffffff813602f2 0000000000000067 ffff88144c4f4000
Mar 18 13:27:33 venus kernel:  ffffc9001762fda8 ffffffffa01015cd ffff88144c4f4648 ffff88144c4f4000
Mar 18 13:27:33 venus kernel:  0000000000000001 ffffc9001762fdd0 ffffffffa010160e 0000000000000000
Mar 18 13:27:33 venus kernel: Call Trace:
Mar 18 13:27:33 venus kernel:  [<ffffffff813602f2>] dump_stack+0x4d/0x6b
Mar 18 13:27:33 venus kernel:  [<ffffffffa01015cd>] read_phy_reg+0x8d/0xa0 [firewire_ohci]
Mar 18 13:27:33 venus kernel:  [<ffffffffa010160e>] ohci_read_phy_reg+0x2e/0x50 [firewire_ohci]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00b499b>] fw_send_phy_config+0xab/0xc0 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffffa00ad2f9>] br_work+0x49/0xb0 [firewire_core]
Mar 18 13:27:33 venus kernel:  [<ffffffff810740a5>] process_one_work+0x1d5/0x4d0
Mar 18 13:27:33 venus kernel:  [<ffffffff810743eb>] worker_thread+0x4b/0x510
Mar 18 13:27:33 venus kernel:  [<ffffffff810743a0>] ? process_one_work+0x4d0/0x4d0
Mar 18 13:27:33 venus kernel:  [<ffffffff81079dca>] kthread+0xca/0xe0
Mar 18 13:27:33 venus kernel:  [<ffffffff81079d00>] ? kthread_park+0x60/0x60
Mar 18 13:27:33 venus kernel:  [<ffffffff81725622>] ret_from_fork+0x22/0x30
Comment 1 Stefan Richter 2017-03-19 13:52:39 UTC
Does it happen on a kernel without -rt patch too?

(I never saw this with my FW643 rev07 and rev08 cards myself, tested in an AM2+/AM780G and an LGA1150/C226 PC, always using mainline preempt kernels.)
Comment 2 Julian 2017-03-19 20:25:15 UTC
This error hasn't shown up since I set quirks=0 (default) and the PCI latency Timer from 128 to 32.
The I've had with this driver existed on vanilla 4.9.10.

I'll attempt to duplicate a consistent error and will use the vanilla kernel (4.9.13).
Comment 3 Julian 2017-03-20 08:44:16 UTC
I'm going to close this as resolved. with quirks=0x10 or quirks=0. Setting pci latency in bios I am unable to replicate this error with the kernels I'm currently using. For now i'm considering an external hardware fault (device or controller).
Can't for the life of me replicate it.