This is the first time it's happened to me. My USB mouse just died. The system was more or less idle, then I got these messages from the kernel:

[109773.985092] xhci_hcd 0000:c3:00.3: xHCI host not responding to stop endpoint command
[109773.998577] xhci_hcd 0000:c3:00.3: xHCI host controller not responding, assume dead
[109773.998622] xhci_hcd 0000:c3:00.3: HC died; cleaning up
[109773.998668] xhci_hcd 0000:c3:00.3: Timeout while waiting for stop endpoint command
[109773.998740] usb 1-2: USB disconnect, device number 2
[109774.032612] usb 1-3: USB disconnect, device number 3
[109774.033087] usb 1-4: USB disconnect, device number 4

This has never happened before with any of the previous kernels: 6.9, 6.10, 6.11, 6.12. Now on 6.13.4 this happened a few minutes after the system resumed. That looks like a major regression. The kernel didn't try anything.

Unbinding and binding the USB endpoint in /sys using this script has fixed the mouse but I never had to do that before: https://unix.stackexchange.com/a/704342

My lspci:

c3:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller
c3:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h/19h/1ah HD Audio Controller
c3:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Phoenix CCP/PSP 3.0 Device
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Data Fabric; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Data Fabric; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Data Fabric; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Data Fabric; Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Data Fabric; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Data Fabric; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Data Fabric; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Data Fabric; Function 7
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Dummy Host Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Dummy Host Bridge
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Dummy Host Bridge
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Dummy Host Bridge
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Dummy Host Bridge
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Phoenix IOMMU
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
c3:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor (rev 63)
01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
c4:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Phoenix Dummy Function
c5:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Phoenix Dummy Function
02:00.0 Non-Volatile memory controller: Micron Technology Inc 3400 NVMe SSD [Hendrix]
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 19h USB4/Thunderbolt PCIe tunnel
00:04.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 19h USB4/Thunderbolt PCIe tunnel
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Phoenix GPP Bridge
00:02.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] Phoenix GPP Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Internal GPP Bridge to Bus [C:A]
00:08.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Internal GPP Bridge to Bus [C:A]
00:08.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Phoenix Internal GPP Bridge to Bus [C:A]
c4:00.1 Signal processing controller: Advanced Micro Devices, Inc. [AMD] AMD IPU Device
c3:00.7 Signal processing controller: Advanced Micro Devices, Inc. [AMD] Sensor Fusion Hub
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 71)
c3:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b9
c3:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15ba
c5:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15c0
c5:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15c1
c5:00.5 USB controller: Advanced Micro Devices, Inc. [AMD] Pink Sardine USB4/Thunderbolt NHI controller #1
c5:00.6 USB controller: Advanced Micro Devices, Inc. [AMD] Pink Sardine USB4/Thunderbolt NHI controller #2
c3:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Phoenix1 (rev d4)

My lsusb:

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 0489:e0f2 Foxconn / Hon Hai Wireless_Device
Bus 001 Device 003: ID 06cb:00f0 Synaptics, Inc.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 002: ID 0408:545f Quanta Computer, Inc. HP 5MP Camera
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 008 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub

I'm not bisecting this issue because so far it's happened just once and I've no idea how to trigger it. Yet it has never happened before with previous kernels.
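For readers who hit the same "HC died" state: the unbind/bind workaround mentioned above can be sketched roughly as follows. This is not the linked script, only a minimal illustration of the general approach, assuming the dead controller is the PCI function named in the log (0000:c3:00.3 here) and that it is driven by xhci_hcd. The function name rebind_xhci and the SYSFS override are hypothetical; the override exists only so the function can be tried against a scratch directory without root.

```sh
#!/bin/sh
# Sketch: detach a PCI xHCI controller from xhci_hcd and attach it again.
# rebind_xhci and SYSFS are illustrative names, not part of any existing tool.
rebind_xhci() {
    dev="$1"                                   # e.g. 0000:c3:00.3
    drv="${SYSFS:-/sys}/bus/pci/drivers/xhci_hcd"

    # Writing the PCI address to "unbind" releases the device from the driver.
    printf '%s' "$dev" > "$drv/unbind" || return 1
    sleep 1
    # Writing it to "bind" makes the driver probe the device again.
    printf '%s' "$dev" > "$drv/bind"
}

# Usage (as root):
#   rebind_xhci 0000:c3:00.3
```

This only recovers the controller after the fact; it does nothing about whatever made the host stop responding in the first place.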
I'm utterly confused as to why the kernel reported "xHCI host not responding to stop endpoint command". I didn't do anything at the time. Wasn't even using the mouse. Something funky is going on with 6.13.
This was reported in the SUSE bug tracker earlier: https://bugzilla.suse.com/show_bug.cgi?id=1236992

I don't see it reported here, but the issue is not new. Yet I see no patches queued for 6.13.5.
The SUSE issue is seemingly unrelated, please dismiss.
This just happened again:

[161470.836493] PM: resume devices took 0.547 seconds
[161470.836720] OOM killer enabled.
[161470.836721] Restarting tasks ... done.
[161470.839715] random: crng reseeded on system resumption
[161470.845090] PM: suspend exit
[161471.322491] cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: DSP1: Firmware: 400a4 vendor: 0x2 v0.43.1, 2 algorithms
[161471.324469] cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: DSP1: cirrus/cs35l41-dsp1-spk-prot-103c8b72.bin: v0.43.1
[161471.324480] cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: DSP1: spk-prot: D:\Amp Tuning\HP\840\0930\103C8B45_220930.bin
[161471.403951] cs35l41-hda i2c-CSC3551:00-cs35l41-hda.1: Calibration applied: R0=10446
[161471.407392] cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: Calibration applied: R0=10526
[161471.432157] cs35l41-hda i2c-CSC3551:00-cs35l41-hda.1: Firmware Loaded - Type: spk-prot, Gain: 17
[161471.433916] cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: Firmware Loaded - Type: spk-prot, Gain: 17
[161471.523827] hp_wmi: Unknown event_id - 131073 - 0x0
[162644.637587] xhci_hcd 0000:c3:00.3: xHCI host not responding to stop endpoint command
[162644.651068] xhci_hcd 0000:c3:00.3: xHCI host controller not responding, assume dead
[162644.651076] xhci_hcd 0000:c3:00.3: HC died; cleaning up
[162644.651099] xhci_hcd 0000:c3:00.3: Timeout while waiting for stop endpoint command
[162644.651102] usb 1-2: USB disconnect, device number 4
[162644.678374] usb 1-3: USB disconnect, device number 2
[162644.678748] usb 1-4: USB disconnect, device number 3

Shortly after resume all the USB ports are disabled. I'm reverting back to Linux 6.11. I cannot use my device like this.
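To put a number on the delay between resume and the controller dying, the dmesg timestamps can be subtracted. A minimal sketch, assuming dmesg-style input with a leading "[seconds.micros]" field; the function name resume_to_failure is made up for illustration:

```sh
#!/bin/sh
# Sketch: print the seconds elapsed between the most recent
# "PM: suspend exit" line and each later "xHCI host not responding to
# stop endpoint command" line, reading dmesg-style text on stdin.
resume_to_failure() {
    awk '
        # Extract the numeric timestamp from a leading "[  12345.678901]".
        function ts(line) {
            sub(/^\[ */, "", line)
            sub(/\].*/, "", line)
            return line + 0
        }
        /PM: suspend exit/ { resume = ts($0); seen = 1 }
        /xHCI host not responding to stop endpoint command/ && seen {
            printf "%.1f\n", ts($0) - resume
        }
    '
}

# Usage:
#   dmesg | resume_to_failure
```

For the excerpt above the two timestamps are 161470.845090 and 162644.637587, a gap of roughly 1174 seconds, provided nothing relevant was cut from the log between those lines.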
6.13 has a lot of changes related to endpoint stopping:

e21ebe51af68 xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic
474538b8dd1c usb: xhci: Avoid queuing redundant Stop Endpoint commands
484c3bab2d5d usb: xhci: Fix TD invalidation under pending Set TR Dequeue
42b758137601 usb: xhci: Limit Stop Endpoint retries

Endpoints are stopped in order to cancel transfers, before suspend, and to soft reset an endpoint after clearing a halt.

I understand that bisecting an issue like this that triggers rarely isn't an option, but can I ask you to try running 6.13 with xhci dynamic debug enabled?

mount -t debugfs none /sys/kernel/debug
echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control

and send dmesg after the issue is triggered. It could reveal a bit more about what's going on.
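The same steps wrapped so they can be re-run after every boot, since dynamic debug settings written this way do not survive a reboot. This is only a convenience sketch: the function name enable_xhci_debug and the DEBUGFS override are hypothetical, and the mount is skipped when the control file is already present.

```sh
#!/bin/sh
# Sketch: enable xhci_hcd and usbcore dynamic debug messages.
# enable_xhci_debug and DEBUGFS are illustrative names only.
enable_xhci_debug() {
    dbg="${DEBUGFS:-/sys/kernel/debug}"
    ctl="$dbg/dynamic_debug/control"

    # Mount debugfs only if the control file is not visible yet.
    [ -e "$ctl" ] || mount -t debugfs none "$dbg" || return 1

    echo 'module xhci_hcd =p' > "$ctl" &&
    echo 'module usbcore =p' > "$ctl"
}

# Usage (as root), then save the log once the controller has died:
#   enable_xhci_debug
#   dmesg > xhci-debug.log
```

The output file name xhci-debug.log is arbitrary.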
(In reply to Mathias Nyman from comment #5)
> I understand that bisecting an issue like this that triggers rarely isn't an
> option, but can I ask you to try running 6.13 with xhci dynamic debug
> enabled.

Will do as soon as possible. Thanks a lot!
Which exact versions were you running successfully and for how long?

These patches listed by Mathias are instant first suspects, but they were all backported to v6.12.7 in December. Most of them also to v6.11.11 in early December and later in January to some LTS series.

Any chance that hibernation is indeed a (delayed) trigger and you weren't doing it as often in the past?

Did you come across similar reports from stable kernel branches this year?
> Which exact versions were you running successfully and for how long?

Kernel 6.12.14 that I was running earlier didn't have this issue. Used software suspend/resume multiple times successfully.

> Any chance that hibernation is indeed a (delayed) trigger and you weren't
> doing it as often in the past?

Not using hibernation, just software suspend. I've not changed anything software-wise except installing a new kernel on this laptop.

> Did you come across similar reports from stable kernel branches in this year?

I've Googled a couple of times already for this exact error message and nothing turned up.