I'm getting a sporadic lockup in io_watchdog_func. It's probably related to keyboard input, at least happens when using the keyboard. System load doesn't have to be high. I was finally able to kdump 8GB vmcore. Using gdb it reveals [Switching to thread 6 (LWP 21136)] (gdb) bt #0 io_watchdog_func (_ohci=18446612135731167280) at drivers/usb/host/ohci-hcd.c:761 #1 0xffffffff810cf668 in call_timer_fn (fn=0xffffffff81394b10 <io_watchdog_func>, data=3416953312, timer=<optimized out>) at kernel/time/timer.c:1141 #2 0xffffffff810cf910 in __run_timers (base=<optimized out>) at kernel/time/timer.c:1213 #3 run_timer_softirq (h=0xffff8800cbaa8030) at kernel/time/timer.c:1404 #4 0xffffffff8108446b in __do_softirq () at kernel/softirq.c:269 #5 0xffffffff81084776 in invoke_softirq () at kernel/softirq.c:346 #6 irq_exit () at kernel/softirq.c:387 #7 0xffffffff8102f335 in exiting_irq () at ./arch/x86/include/asm/apic.h:660 #8 smp_apic_timer_interrupt (regs=0xffff8800cbaa8030) at arch/x86/kernel/apic/apic.c:945 #9 <signal handler called> #10 0xffffffffffffff10 in ?? () #11 0x0000000000000000 in ?? () (gdb) p ed $1 = (struct ed *) 0xdead0000001000c0 That matches the panic printed crash backtrace, where $rdx and $r12 are 0xdead... rdx 0xdead000000100100 -2401263026317557504 r12 0xdead0000001000c0 -2401263026317557568 Unfortunately I've only got them as picture The crash backtrace also hinted at cpuidle_enter_state [Switching to thread 5 (process 1)] #0 0xffffffffa0001ecf in ?? () (gdb) bt #0 0xffffffffa0001ecf in ?? () #1 0x0000000000000018 in irq_stack_union () #2 0xffff8800cacc3400 in ?? () #3 0xffffffffa0005090 in ?? () #4 0x000027be538d61f9 in ?? () #5 0xffffffff81803ec8 in init_thread_union () #6 0xffffffff813bf687 in cpuidle_enter_state (dev=0xffff88022370c8a8, drv=0xffff88022370c800, index=2) at drivers/cpuidle/cpuidle.c:125 Backtrace stopped: frame did not save the PC ### hardware used ### Board is: GA-78LMT-USB3 CPU is: AMD FX(tm)-6300 Six-Core Processor # lspci 00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] RS780 Host Bridge 00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge (ext gfx port 0) 00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780/RS880 PCI to PCI bridge (PCIE port 0) 00:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge (PCIE port 2) 00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller 00:12.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller 00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller 00:13.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller 00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 3c) 00:14.1 IDE interface: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller 00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller 00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge 00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barts PRO [Radeon HD 6850] 01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Barts HDMI Audio [Radeon HD 6800 Series] 02:00.0 USB controller: VIA Technologies, Inc. Device 3483 (rev 01) 03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06) 04:06.0 Network controller: Techsan Electronics Co Ltd B2C2 FlexCopII DVB chip / Technisat SkyStar2 DVB card (rev 02) # lsusb Bus 009 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 008 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 006 Device 002: ID 1e7d:2c2e ROCCAT Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 005 Device 002: ID 046a:0023 Cherry GmbH CyMotion Master Linux Keyboard G230 Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 001 Device 002: ID 2109:3431 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub (2109:3431 is a bit wierd, should be another root hub)
Created attachment 163681 [details] lspci verbose
Created attachment 163691 [details] lsusb_verbose For 2109:3431 there's "bcdUSB 2.10" and kernel prints "[ 1195.581234] hub 1-1:1.0: hub_port_status failed (err = -110)" when querying.
Created attachment 163721 [details] crash log extracted with crash tool's log With help of the crash tool I was able to acquire the crash log. Looks like list corruption.
I'm by no means a kernel dev, but could it be a locking/irq issue in ohci_irq()? ohci_irq() calls ohci_work(), which calls finish_urb(), which states: " * PRECONDITION: ohci lock held, irqs blocked" But ohci_irq() seems to only spin_[un]lock(), not spin_[un]lock_irq[save|restore](). All other functions that call ohci_work() do at least spin_[un]lock_irq. So irqs could still be enabled and possibly the event triggered twice, thus the double list add?
On Sun, Jan 18, 2015 at 12:56:41PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=91511 > > Bug ID: 91511 > Summary: Sporadic crash/lockup in ohci-hcd io_watchdog_func() Please send to the linux-usb@vger.kernel.org mailing list.
Fixed by 7d8021c9 "USB: OHCI: Fix race between ED unlink and URB submission"