Bug 115741 - XHCI is slow during boot (bios/efi) and leaves many dmesg messages
Summary: XHCI is slow during boot (bios/efi) and leaves many dmesg messages
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB (show other bugs)
Hardware: x86-64 Linux
: P1 normal
Assignee: Greg Kroah-Hartman
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-04-01 11:29 UTC by Oliver
Modified: 2021-11-05 12:38 UTC (History)
13 users (show)

See Also:
Kernel Version: 3.16 - 4.5
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Dmesg output with broken usb (70.50 KB, text/x-log)
2016-04-01 11:29 UTC, Oliver
Details
lots of xhci spewing with usb debug output enabled (1.35 KB, text/x-log)
2016-04-01 11:29 UTC, Oliver
Details
dmesg on 4.4.0-rc2 with broken usb output (69.84 KB, text/x-log)
2016-04-01 11:31 UTC, Oliver
Details

Description Oliver 2016-04-01 11:29:19 UTC
Created attachment 211371 [details]
Dmesg output with broken usb

I have a Apple Inc. MacBookPro11,1 (with the most recent 'bios': BIOS MBP111.88Z.0138.B16.1509081438 09/08/2015).

At the beginning, USB worked normally. After a while (and after newer kernel versions released by debian?) things started to act strangely. For one, the bios/efi boot takes a very long time (probably due to the same reason I describe later) just to get to the bootloader/grub. Likley resetting and probing for USB ports/mass storage. When grub finally pops up, I can use the (internal USB based keyboard) normally to select a grub entry etc.

Booting the kernel then works reasonably fine, until it loads the xhci module.
It spews some messages in dmesg (taking some 15 seconds) and only then, the keyboard starts to work again.

The log is filled with messages like:
[    7.248479] xhci_hcd 0000:00:14.0: Command completion event does not match command
[    7.248495] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[   12.256347] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[   12.256363] usb 1-2: hub failed to enable device, error -62
[   17.264166] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
(followed by USB hub/device enumeration)

I've tried several combinations and quirks, updating to the latest rc kernels since 3.16 (am on 4.5.0 right now) and it only seems to get worse.

Last year, on the 3.x series of kernels occasionally after a reboot the 'bios' would go through quickly and fine and also no problems loading the module and logging in. But now it always fails.

Additionally it (may or may not) seems to cause the internal usb card reader to not even show up almost all of the time, though under OSX it works fine. There is/was a known issue with this cardreader where it would disappear after a suspend.

Adding various seemingly related intel usb3 quirks I had no change, as I think all of them are already applied to this chipset.

I'm guessing that somehow the usb chipset has some configuration option miss-set (which persists over reboots/power down) and the driver doesn't quite understand it.

Unfortunately it seems that this chipset does not work in pure USB2.0 (ehci) mode and needs the xhci module to work at all, so even falling to USB2 is no option. Also disconnecting all USB perhipials is nearly impossible as the touchpad, bluetooth cardreader and keyboard are internally all wired to USB.

I'm attaching 3 dmesg logs with various kernels and levels of debugging information. I tried to google for errors from these logs, but to no avail.
Comment 1 Oliver 2016-04-01 11:29:52 UTC
Created attachment 211381 [details]
lots of xhci spewing with usb debug output enabled
Comment 2 Oliver 2016-04-01 11:31:29 UTC
Created attachment 211391 [details]
dmesg on 4.4.0-rc2 with broken usb output
Comment 3 Greg Kroah-Hartman 2016-04-01 14:52:53 UTC
On Fri, Apr 01, 2016 at 11:29:19AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=115741
> 
>             Bug ID: 115741
>            Summary: XHCI is slow during boot (bios/efi) and leaves many
>                     dmesg messages

Please send to the linux-usb@vger.kernel.org mailing list.
Comment 4 Felipe Ortiz 2016-05-26 05:17:46 UTC
Hi I have a very similar issue I use Debian with kernel 4.5 in a macbook pro 12.1
boot take a long time before show grub, but when it appear the keyboard is almost useless... you have a "windows time" where the keyboard respond this "windows" are random, they come and go, the best opportunity is press enter key before grub pops up (with this boot with default entry) this erratic behavior is present in all "special boot options" of the macbook pro (see https://support.apple.com/en-us/HT201255) the touch pad have the same problems. One of this special boot options (pressing "D") is a hardware diagnostics in this test my card reader show an error VDC001. On OSX the card reader work fine, but Debian cannot see it.

Olivier can you do a hardware test? turn on your Mac with the D key pressed.
Comment 5 Oliver 2016-05-26 08:46:15 UTC
I have run the diagnostic mode a few times before, but never found an issue there. I'll run it again tonight and report if there is a problem. E.g. no news means diagnostic thinks everything is okay :)

The problem still persists btw, and the linux-sub mailinglist has gotten little attention ...

http://thread.gmane.org/gmane.linux.usb.general/139697
Comment 6 Oliver 2016-06-06 15:31:39 UTC
I ran the diagnostic mode a few times since, and as expected, no problems whatsoever.

Because of some rework I ran from an USB driver (ubuntu 16.04) and it also had the same problems. I'm running 4.6.1 right now (from git) and still the same issue.

On the ML nothing is happening in this area ...
Comment 7 robin.marlow 2016-07-22 02:40:19 UTC
I have the same problems with long wait from chime to bootmanager (refind) and then long waits between keypresses.  Occasionally after a reboot it works normally,  but never cold.  Did any of you find a solution to this?

I'm using Gentoo with kernel 4.4.1 (have tried 4.6 but was no better)
Comment 8 Oliver 2016-07-26 08:11:03 UTC
Problem persists. Meanwhile I now also have a MacBookPro12,1 and so far I don't have the issue.

I would not be supprised (but this is a pure guess) that one commit of the kernel changed some setting in the hardware/chipset which causes this strange delay and may need to be 'fixed back'. Reason I recon this, is even an older live disk, thus with an older kernel, still shows this problem.

Or the hardware in that gen macbook is simply bad.
Comment 9 Thomas Lefebvre 2016-08-22 22:57:28 UTC
I am experiencing the same exact behavior as comments 4 and 7.
It's a MacbookPro12,1 and it does this everytime on cold boot and on reboots from Mac OS X. It does not show this behavior when rebooting from Linux.
It seems that neither my SD card reader is working from Linux nor from Mac OS X anymore.
I am really willing to help finding the root cause of this. Should you need something to help troubleshoot, please let me know.
I am starting to wonder if this is a software issue or an hardware issue.
Comment 10 Maël Lavault 2017-01-17 09:28:14 UTC
I also experience the same issue on a MacBookPro12,1 with Fedora 25 workstation (kernel 4.9.3-200).
Comment 11 auxsvr 2017-03-24 07:51:06 UTC
Similar issue (too slow to boot, error messages with XHCI) on a Macbook pro 12,1 with openSUSE Tumbleweed (linux 4.10.1). Apple's diagnostic test indicates no hardware issue.
Comment 12 Oliver 2017-03-24 10:09:25 UTC
A small update, the new MacBookPro12,1 has even become worse then before. (I still have the older one, which also still has the issue).

Initially it was fine, but as time went on, it became worse and worse. Booting now takes a couple of minutes.

First the time from chime to systemd-boot takes forever. Changing the menu is almost impossible as it takes forever for the (usb) keyboard to go through the menu. It probably is constantly restarting/timing out.

Once it is passed that, it is reasonably okay.

After susppend it's pretty much the same, where it sometimes works reasonably fast, other times can take a minute for USB to start working again.

(log after a reasonable fast working USB):
[156101.125279] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[156101.333046] usb 2-3: device not accepting address 31, error -62
[156106.760787] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[156112.132656] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[156112.340618] usb 2-3: device not accepting address 32, error -62
[156112.488616] usb 1-1: new high-speed USB device number 29 using xhci_hcd
[156112.628974] usb 1-1: New USB device found, idVendor=0451, idProduct=8142
[156112.628978] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=1
[156112.628979] usb 1-1: SerialNumber: 54012851532D
[156112.629338] hub 1-1:1.0: USB hub found
[156112.629394] hub 1-1:1.0: 4 ports detected
[156118.020473] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[156123.396195] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[156123.604194] usb 2-3: device not accepting address 33, error -62
[156123.632194] usb usb2-port3: unable to enumerate USB device
[156123.632270] xhci_hcd 0000:00:14.0: Stop endpoint command completion for disabled slot 15


I do not have OSX installed on the NVMe disk anymore, but on an external USB3 drive, using the built in bootloader (via option/alt) is as tedious as systemd-boot (or grub2).

The bootloader is still the stock one, I have the one partition, that is fat formatted, where I have systemd-boot installed on.

Booting OSX from the external drive takes quite a while to get past the first bit, but is fine after the OS loads its usb driver. I dont' do this often, but haven't noticed any slowdowns or strange USB behaviour there. I'll double check and will respond if it is slow there too.

As a consequence, the built in card-reader does not work in linux (works fine in OSX), which isn't fully surprising as it is a USB based one.

So somehow, it seems linux is breaking the XHCI controller which is persistent through power off.
Comment 13 Oliver 2017-03-24 10:10:45 UTC
and since i can't edit, I'm running debian's 4.9.0-1 kernel at the moment, nothing special, no special options.

Also I boot OSX occasionally to check for firmware updates etc.
Comment 14 Harald Rudell 2017-08-14 23:38:09 UTC
this xhci_hcd still bad
it keeps turning off usb and with that the console keyboard

Now after a suspend-immediate resume 4.10.0-32-generic:
xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command.
xhci_hcd 0000:00:14.0: Assuming host is dying, halting host.
xhci_hcd 0000:00:14.0: HC died; cleaning up

After that only the power button works until Apple Rescue Mode, macOS, Apple Hardware Test, back to Linux

History tells us this is not going to be fixed any time soon.

Can we REMOVE the code path leading to "Assuming host is dying, halting host." Halting the host is ALWAYS a bad idea.

On most laptops halting means no network, no keyboard, no mouse.
On all desktops I have halting means no keyboard no mouse

And xhci_hcd keeps shutting devices off at random.
Comment 15 Oliver 2017-08-15 07:50:08 UTC
Just to confirm, I also still have issues with usb3/xhci on the macbook. Most of the time i don't notice it much. Suspend-resume fails once a week maybe due to xhci.

My logs are generally full of xhci errors and warnings. But keyboard/mouse tend to keep working generally.

The only thing that's really amazing, is if I connect an android phone and try to adb to it, the entire xhci driver crashes and never returns. Keyboard and mouse are then dead too (obviously).
Comment 16 Daniel Wolff 2018-04-25 03:45:35 UTC
I also have a MacbookPro 11,1, and even with the latest firmware, the problem is still here, unfortunately.

BIOS MBP111.88Z.0145.B00.1802091341 02/09/2018

Bit by bit, I have exactly the same problem as Oliver, from comment #1.
Comment 17 steven3k+kernelorg 2019-10-29 15:07:54 UTC
I experiencing the same problem as comment 4,7,9, and 12.
The MacOS hardware test reports the SD card reader as not working.

I am using Debian stable with kernel 4.19.67.

I don't have Mac OS installed but booting into either the online or the offline recovery doesn't fix the problem (contrarily to some other reports I found on other forums). 

I can also confirm that, somehow, using adb with an Android phone often leads to unresponsive USB, including keyboard and touchpad, as reported by comment 15.
Comment 18 Benjamin Asbach 2020-09-07 19:17:39 UTC
Maybe you could try some more recent kernel. I'm currently running 5.7.19 without having the timeouts.
Comment 19 Tobias Bachmann 2021-11-05 10:54:24 UTC
Starting with kernel 5.14.14 I can't boot anymore. XHCI dies early on and won't allow me to decrypt my LUKS encrypted root partition.
Comment 20 gregkh 2021-11-05 12:38:58 UTC
On Fri, Nov 05, 2021 at 10:54:24AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=115741
> 
> --- Comment #19 from Tobias Bachmann (tobachmann@gmx.de) ---
> Starting with kernel 5.14.14 I can't boot anymore. XHCI dies early on and
> won't
> allow me to decrypt my LUKS encrypted root partition.

Different issue, should be fixed in the next 5.14.y release:
	https://lore.kernel.org/r/42bcbea6-5eb8-16c7-336a-2cb72e71bc36@redhat.com

Note You need to log in before you can comment on or make changes to this bug.