Bug 91511

Summary: Sporadic crash/lockup in ohci-hcd io_watchdog_func()
Product: Drivers Reporter: Heiko (lil_tux)
Component: USBAssignee: Greg Kroah-Hartman (greg)
Status: RESOLVED CODE_FIX    
Severity: normal    
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 3.18.3-4.1.3 Subsystem:
Regression: No Bisected commit-id:
Attachments: lspci verbose
lsusb_verbose
crash log extracted with crash tool's log

Description Heiko 2015-01-18 12:56:41 UTC
I'm getting a sporadic lockup in io_watchdog_func. It's probably related to keyboard input, at least happens when using the keyboard. System load doesn't have to be high.

I was finally able to kdump 8GB vmcore. Using gdb it reveals

[Switching to thread 6 (LWP 21136)]
(gdb) bt
#0  io_watchdog_func (_ohci=18446612135731167280) at drivers/usb/host/ohci-hcd.c:761
#1  0xffffffff810cf668 in call_timer_fn (fn=0xffffffff81394b10 <io_watchdog_func>, data=3416953312, timer=<optimized out>) at kernel/time/timer.c:1141
#2  0xffffffff810cf910 in __run_timers (base=<optimized out>) at kernel/time/timer.c:1213
#3  run_timer_softirq (h=0xffff8800cbaa8030) at kernel/time/timer.c:1404
#4  0xffffffff8108446b in __do_softirq () at kernel/softirq.c:269
#5  0xffffffff81084776 in invoke_softirq () at kernel/softirq.c:346
#6  irq_exit () at kernel/softirq.c:387
#7  0xffffffff8102f335 in exiting_irq () at ./arch/x86/include/asm/apic.h:660
#8  smp_apic_timer_interrupt (regs=0xffff8800cbaa8030) at arch/x86/kernel/apic/apic.c:945
#9  <signal handler called>
#10 0xffffffffffffff10 in ?? ()
#11 0x0000000000000000 in ?? ()

(gdb) p ed
$1 = (struct ed *) 0xdead0000001000c0

That matches the panic printed crash backtrace, where $rdx and $r12 are 0xdead...

rdx            0xdead000000100100       -2401263026317557504
r12            0xdead0000001000c0       -2401263026317557568

Unfortunately I've only got them as picture

The crash backtrace also hinted at cpuidle_enter_state

[Switching to thread 5 (process 1)]
#0  0xffffffffa0001ecf in ?? ()
(gdb) bt
#0  0xffffffffa0001ecf in ?? ()
#1  0x0000000000000018 in irq_stack_union ()
#2  0xffff8800cacc3400 in ?? ()
#3  0xffffffffa0005090 in ?? ()
#4  0x000027be538d61f9 in ?? ()
#5  0xffffffff81803ec8 in init_thread_union ()
#6  0xffffffff813bf687 in cpuidle_enter_state (dev=0xffff88022370c8a8, drv=0xffff88022370c800, index=2) at drivers/cpuidle/cpuidle.c:125
Backtrace stopped: frame did not save the PC



### hardware used ###

Board is: GA-78LMT-USB3
CPU is: AMD FX(tm)-6300 Six-Core Processor

# lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] RS780 Host Bridge
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge (ext gfx port 0)
00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780/RS880 PCI to PCI bridge (PCIE port 0)
00:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge (PCIE port 2)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 3c)
00:14.1 IDE interface: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller
00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Barts PRO [Radeon HD 6850]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Barts HDMI Audio [Radeon HD 6800 Series]
02:00.0 USB controller: VIA Technologies, Inc. Device 3483 (rev 01)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
04:06.0 Network controller: Techsan Electronics Co Ltd B2C2 FlexCopII DVB chip / Technisat SkyStar2 DVB card (rev 02)


# lsusb 
Bus 009 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 008 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 006 Device 002: ID 1e7d:2c2e ROCCAT 
Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 002: ID 046a:0023 Cherry GmbH CyMotion Master Linux Keyboard G230
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID 2109:3431  
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

(2109:3431 is a bit wierd, should be another root hub)
Comment 1 Heiko 2015-01-18 12:58:13 UTC
Created attachment 163681 [details]
lspci verbose
Comment 2 Heiko 2015-01-18 13:03:26 UTC
Created attachment 163691 [details]
lsusb_verbose

For 2109:3431 there's "bcdUSB               2.10" and kernel prints
"[ 1195.581234] hub 1-1:1.0: hub_port_status failed (err = -110)"
when querying.
Comment 3 Heiko 2015-01-18 15:08:58 UTC
Created attachment 163721 [details]
crash log extracted with crash tool's log

With help of the crash tool I was able to acquire the crash log. Looks like list corruption.
Comment 4 Heiko 2015-01-18 17:53:54 UTC
I'm by no means a kernel dev, but could it be a locking/irq issue in ohci_irq()? ohci_irq() calls ohci_work(), which calls finish_urb(), which states:
" * PRECONDITION:  ohci lock held, irqs blocked"

But ohci_irq() seems to only spin_[un]lock(), not spin_[un]lock_irq[save|restore](). All other functions that call ohci_work() do at least spin_[un]lock_irq. So irqs could still be enabled and possibly the event triggered twice, thus the double list add?
Comment 5 Greg Kroah-Hartman 2015-01-19 02:01:53 UTC
On Sun, Jan 18, 2015 at 12:56:41PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=91511
> 
>             Bug ID: 91511
>            Summary: Sporadic crash/lockup in ohci-hcd io_watchdog_func()

Please send to the linux-usb@vger.kernel.org mailing list.
Comment 6 Heiko 2015-07-27 06:39:25 UTC
Fixed by 7d8021c9 "USB: OHCI: Fix race between ED unlink and URB submission"