Bug 76991

Summary: xHCI regression for VIA USB 3.0 controller in handle_cmd_completion
Product: Drivers
Component: USB
Reporter: aew9thae
Assignee: Greg Kroah-Hartman (greg)
CC: szg00000
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.15.0-rc7 (cd79bde29f00a346eec3fe17c1c5073c37ed95e7)
Subsystem:
Regression: No
Bisected commit-id:

Description aew9thae 2014-05-28 12:41:57 UTC
(I originally sent this to linux-usb mailing list, but for some reason it didn't seem to have been received there. It's copied here.)

My VIA USB 3.0 controller has stopped working in recent kernels. During boot,
dmesg shows a WARNING stack trace at drivers/usb/host/xhci-ring.c:1615 
containing handle_cmd_completion+0xdc7/0x1000. 

USB ports become unusable - mouse, keyboard and fdisking mass storage 
devices all spew error messages and stack traces followed by logs of 
device resets.

Bisecting drivers/usb/host between 3.10 and 3.14 points to 20e7acb13ff48,
whose commit message suggests that the VIA controller in question may not
be compliant with xHCI spec rev 1.0.

I'm not sure what Linux's policy on regressions for non-compliant hardware
is, but it would save me the trouble of patching and building a kernel each
time if you could revert this commit or otherwise fix the problem.

Thanks

--- Bisection details --- 

# git bisect log
git bisect start '--' 'drivers/usb/host/'
# good: [8bb495e3f02401ee6f76d1b1d77f3ac9f079e376] Linux 3.10
git bisect good 8bb495e3f02401ee6f76d1b1d77f3ac9f079e376
# bad: [455c6fdbd219161bd09b1165f11699d6d73de11c] Linux 3.14
git bisect bad 455c6fdbd219161bd09b1165f11699d6d73de11c
# good: [40b3dc6da05c4ac0e317723a22eaa807c4b98648] usb: pci-quirks: amd_chipset_sb_type_init() can be static
git bisect good 40b3dc6da05c4ac0e317723a22eaa807c4b98648
# bad: [9b547a882e9ffec67bb41a4e66b4bcc0e91a2737] usb: r8a66597-hcd: Convert to clk_prepare/unprepare
git bisect bad 9b547a882e9ffec67bb41a4e66b4bcc0e91a2737
# good: [a393a807d0c805e7c723315ff0e88a857055e9c6] USB: EHCI: start new isochronous streams ASAP
git bisect good a393a807d0c805e7c723315ff0e88a857055e9c6
# bad: [a2cdc3432c361bb885476d1c625e22b518e0bc07] usb: xhci: remove the unused ->address field
git bisect bad a2cdc3432c361bb885476d1c625e22b518e0bc07
# bad: [20e7acb13ff48fbc884d5918c3697c27de63922a] xhci: use completion event's slot id rather than dig it out of command
git bisect bad 20e7acb13ff48fbc884d5918c3697c27de63922a
# good: [d194c031994d3fc1038fa09e9e92d9be24a21921] xhci: correct the usage of USB_CTRL_SET_TIMEOUT
git bisect good d194c031994d3fc1038fa09e9e92d9be24a21921
# good: [b244b431f89e152dd4bf35d71786f1c0eb8cba7e] xhci: refactor TRB_ENABLE_SLOT case into function
git bisect good b244b431f89e152dd4bf35d71786f1c0eb8cba7e
# good: [9b3103ac9d19525781c297c4fb1e544e077c8901] xhci: refactor TRB_ADDR_DEV case into function
git bisect good 9b3103ac9d19525781c297c4fb1e544e077c8901
# first bad commit: [20e7acb13ff48fbc884d5918c3697c27de63922a] xhci: use completion event's slot id rather than dig it out of command

# git show 20e7acb13ff48fbc884d5918c3697c27de63922a
commit 20e7acb13ff48fbc884d5918c3697c27de63922a
Author: Xenia Ragiadakou <burzalodowa@gmail.com>
Date:   Mon Sep 9 13:29:50 2013 +0300

 xhci: use completion event's slot id rather than dig it out of command
  
 Since the slot id retrieved from the Reset Device TRB matches the slot id in
 the command completion event, which is available, there is no need to determine
 it again.
 This patch removes the uneccessary reassignment to slot id and adds a WARN_ON
 in case the two Slot ID fields differ (although according xhci spec rev1.0
 they should not differ).
    
 Signed-off-by: Xenia Ragiadakou <burzalodowa@gmail.com>
 Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index e3b61b8..88939b7 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1547,9 +1547,9 @@ bandwidth_change:
            xhci_handle_cmd_reset_ep(xhci, event, xhci->cmd_ring->dequeue);
            break;
    case TRB_TYPE(TRB_RESET_DEV):
+           WARN_ON(slot_id != TRB_TO_SLOT_ID(
+                   le32_to_cpu(xhci->cmd_ring->dequeue->generic.field[3])));
            xhci_dbg(xhci, "Completed reset device command.\n");
-           slot_id = TRB_TO_SLOT_ID(
-                   le32_to_cpu(xhci->cmd_ring->dequeue->generic.field[3]));
            virt_dev = xhci->devs[slot_id];
            if (virt_dev)
                    handle_cmd_in_cmd_wait_list(xhci, virt_dev, event);
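
To make the check in the diff concrete, here is a minimal userspace sketch of the comparison the new WARN_ON performs. It assumes the slot ID sits in bits 31:24 of both the completion event's flags dword and the command TRB's field[3], which is the layout the kernel's TRB_TO_SLOT_ID() macro extracts. The names SKETCH_TRB_TO_SLOT_ID, slot_ids_match and pick_slot_id are hypothetical and exist only for this illustration; pick_slot_id is one possible workaround, not the driver's actual fix.

```c
#include <stdint.h>

/* Slot ID occupies bits 31:24 of the dword. This mirrors the bit layout
 * used by the kernel's TRB_TO_SLOT_ID(); it is re-declared under a
 * different name so the sketch builds outside the kernel tree. */
#define SKETCH_TRB_TO_SLOT_ID(p) (((p) >> 24) & 0xffu)

/* Hypothetical helper: returns 1 when the completion event and the
 * command TRB carry the same slot ID, 0 otherwise. A 0 here corresponds
 * to the condition under which the WARN_ON in the diff fires. */
int slot_ids_match(uint32_t event_flags, uint32_t cmd_field3)
{
	return SKETCH_TRB_TO_SLOT_ID(event_flags) ==
	       SKETCH_TRB_TO_SLOT_ID(cmd_field3);
}

/* Hypothetical fallback: if the event reports slot 0, use the slot ID
 * from the command TRB instead (the behaviour before the bisected commit). */
unsigned int pick_slot_id(uint32_t event_flags, uint32_t cmd_field3)
{
	unsigned int ev_slot = SKETCH_TRB_TO_SLOT_ID(event_flags);

	return ev_slot ? ev_slot : SKETCH_TRB_TO_SLOT_ID(cmd_field3);
}
```

With made-up values, slot_ids_match(0x00000000, 0x03000000) returns 0 and pick_slot_id(0x00000000, 0x03000000) returns 3. If the VIA controller leaves the event's slot ID at 0, as the "disabled slot 0" message in the log suggests, this is the kind of mismatch that would trigger the warning.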


--- Problem Details ---

# uname -a
Linux godel 3.15.0-rc7-Saran-00040-gcd79bde #15 SMP PREEMPT \
Tue May 27 23:18:08 EDT 2014 x86_64 GNU/Linux

# git rev-parse HEAD 
cd79bde29f00a346eec3fe17c1c5073c37ed95e7

# lspci | grep VIA
02:00.0 USB controller: VIA Technologies, Inc. Device 3483 (rev 01)

# lsusb -v
(...Gets stuck trying to list the following: )
Bus 001 Device 002: ID 2109:3431  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.10
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x2109 
  idProduct          0x3431 
  bcdDevice            4.20
  iManufacturer           0 
  iProduct                1 USB2.0 Hub
  iSerial                 0 
  bNumConfigurations      1
...
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength           42
  bNumDeviceCaps          3
  USB 2.0 Extension Device Capability:
    bLength                 7
    bDescriptorType        16
    bDevCapabilityType      2
    bmAttributes   0x00000002
      Link Power Management (LPM) Supported
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x000e
      Device can operate at Full Speed (12Mbps)
      Device can operate at High Speed (480Mbps)
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   1
      Lowest fully-functional device speed is Full Speed (12Mbps)
    bU1DevExitLat           4 micro seconds
    bU2DevExitLat         231 micro seconds
  Container ID Device Capability:
    bLength                20
    bDescriptorType        16
    bDevCapabilityType      4
    bReserved               0
    ContainerID             {5cf3ee30-d507-4925-b001-802d79434c30}
[ Stuck ]

# tailf /var/log/everything.log
...
xhci_hcd 0000:02:00.0: Reset device command completion for disabled slot 0
hub 1-1:1.0: hub_port_status failed (err = -110)
xhci_hcd 0000:02:00.0: Timeout while waiting for reset device command
usb 2-2: reset SuperSpeed USB device number 12 using xhci_hcd
xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8804f553b240
xhci_hcd 0000:02:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff8804f553b288
xhci_hcd 0000:02:00.0: Trying to add endpoint 0x81 without dropping it.
usb 2-2: Busted HC?  Not enough HCD resources for old configuration.
...
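
For reading the "err = -110" in the log above: kernel messages print negated errno values, and on Linux x86 errno 110 is ETIMEDOUT, which fits the "Timeout while waiting for reset device command" line that follows. A small sketch for decoding such values in userspace; describe_kernel_err is a hypothetical helper name, and the numeric value of ETIMEDOUT can differ on other platforms.

```c
#include <errno.h>
#include <string.h>

/* Kernel log lines show negative errno values. Flip the sign if needed
 * and ask libc for the matching description. */
const char *describe_kernel_err(int err)
{
	return strerror(err < 0 ? -err : err);
}
```

Calling describe_kernel_err(-110) on a typical Linux system yields the libc description for ETIMEDOUT.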

# dmesg 
(relevant stuff)

WARNING: CPU: 6 PID: 0 at drivers/usb/host/xhci-ring.c:1615 \
              handle_cmd_completion+0xdc7/0x1000 [xhci_hcd]()
Modules linked in: ata_generic usb_storage pata_acpi btrfs \
              xor ohci_pci ohci_hcd ehci_pci xhci_hcd ehci_hcd \
              pata_atiixp crc32c_intel usbcore usb_common floppy \
              raid6_pq sd_mod crc_t10dif crct10dif_common ahci \
              libahci libata scsi_mod
CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.15.0-rc7-Saran-00040-gcd79bde #15
Hardware name: Gigabyte Technology Co., Ltd. GA-78LMT-USB3/GA-78LMT-USB3\
              , BIOS FA 04/23/2013
 0000000000000009 ffff88052ed83d48 ffffffff814b65a9 0000000000000000
 ffff88052ed83d80 ffffffff81065dad 0000000000000000 0000000000000003
 ffff88051033f6e0 ffff88051033f080 ffff8805103c8000 ffff88052ed83d90
Call Trace:
 <IRQ>  [<ffffffff814b65a9>] dump_stack+0x4d/0x6f
 [<ffffffff81065dad>] warn_slowpath_common+0x7d/0xa0
 [<ffffffff81065e8a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa03d6077>] handle_cmd_completion+0xdc7/0x1000 [xhci_hcd]
 [<ffffffff810a34ad>] ? enqueue_task_fair+0x10d/0x5b0
 [<ffffffff8109b285>] ? sched_clock_cpu+0xb5/0xe0
 [<ffffffffa03d6beb>] xhci_irq+0x5db/0x1ec0 [xhci_hcd]
 [<ffffffff81095549>] ? ttwu_do_wakeup+0x19/0xf0
 [<ffffffff81097e2f>] ? try_to_wake_up+0x1ff/0x2e0
 [<ffffffffa03d84e1>] xhci_msi_irq+0x11/0x20 [xhci_hcd]
 [<ffffffff810c126e>] handle_irq_event_percpu+0x3e/0x1f0
 [<ffffffff810c145d>] handle_irq_event+0x3d/0x60
 [<ffffffff810c3ff6>] handle_edge_irq+0x66/0x130
 [<ffffffff81016a5e>] handle_irq+0x1e/0x40
 [<ffffffff814c59cd>] do_IRQ+0x4d/0xe0
 [<ffffffff814bbc2d>] common_interrupt+0x6d/0x6d
 <EOI>  [<ffffffff81050196>] ? native_safe_halt+0x6/0x10
 [<ffffffff8101deef>] default_idle+0x1f/0x100
 [<ffffffff8101e86f>] arch_cpu_idle+0xf/0x20
 [<ffffffff810aba18>] cpu_startup_entry+0x258/0x490
 [<ffffffff81043224>] start_secondary+0x1f4/0x280
---[ end trace d0b3dfbd98479c47 ]---

Comment 1 Greg Kroah-Hartman 2014-05-28 15:43:13 UTC
On Wed, May 28, 2014 at 12:41:57PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> (I originally sent this to linux-usb mailing list, but for some reason it
> didn't seem to have been received there. It's copied here.)

Please disable HTML and resend it. We don't do USB bugs in bugzilla,
sorry.