Bug 207065
Summary: | C-media USB audio device stops working from 5.2.0-rc3 onwards | ||
---|---|---|---|
Product: | Drivers | Reporter: | Florian Meyer (florianmey) |
Component: | USB | Assignee: | Default virtual assignee for Drivers/USB (drivers_usb) |
Status: | NEW --- | ||
Severity: | normal | CC: | hawk.it, jamielennox, leho, mail, mathias.nyman, miren_radia, o.freyermuth |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | >=5.2.0 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
Output of lsusb -v
dmesg output with usb debug enabled Content of /sys/kernel/debug/tracing/trace testpatch that doesn't clear TT buffer after protocol STALL Fix typo in attachment 288293 Dmesg output with both patches applied Content of /sys/kernel/debug/tracing/trace with both patches applied Full dmesg output with both patches applied DMESG log from todays lockup. Older dmesg log with debug information added Content of /sys/kernel/debug/tracing/trace Content of /sys/kernel/debug/tracing/trace |
Description
Florian Meyer
2020-04-02 15:57:49 UTC
Created attachment 288161 [details]
dmesg output with usb debug enabled
As per Mathias Nyman's comment over at the other bug report I'm also attaching the requested debug logs from this:
mount -t debugfs none /sys/kernel/debug
echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
< connect the logitec device >
Send output of dmesg
Send content of /sys/kernel/debug/tracing/trace
Created attachment 288163 [details]
Content of /sys/kernel/debug/tracing/trace
There is also another bug which was bisected to the same commit under https://bugzilla.kernel.org/show_bug.cgi?id=206897 But that one is not related to USB audio devices. Also there has been another bug related to this commit from last year, which was discussed and fixed here: https://www.spinics.net/lists/linux-usb/msg182970.html Thanks, traces show its related to Clearing TT buffer after a STALL on endpoint 0. The first stall looks like a protocol stall, not a function stall, meaning that endpoint isn't really halted, just that the device does not support the request in the control transfer. Anyway, xhci starts clearing what it assumes is a halted endpoint, including clearing the hub TT buffer. Specs are a bit unclear if TT should be cleared in this case, or at least I couldn't find it. I'll create a tespatch that doesn't clear TT buffer on protocol stalls, and add excessive tracing and debugging to it. Created attachment 288293 [details]
testpatch that doesn't clear TT buffer after protocol STALL
Does this help? Could you share similar logs and traces with this applied?
Thanks
On Thu, 9 Apr 2020 bugzilla-daemon@bugzilla.kernel.org wrote: > --- Comment #4 from Mathias Nyman (mathias.nyman@linux.intel.com) --- > Thanks, traces show its related to Clearing TT buffer after a STALL on > endpoint 0. > > The first stall looks like a protocol stall, not a function stall, meaning > that > endpoint isn't really halted, just that the device does not support the > request in the control transfer. > > Anyway, xhci starts clearing what it assumes is a halted endpoint, > including clearing the hub TT buffer. > > Specs are a bit unclear if TT should be cleared in this case, > or at least I couldn't find it. TTs should be cleared when an error occurs in the protocol. STALL isn't an error; a real error would be a timeout or CRC mismatch or URB cancellation, things like that -- things that might leave the TT in a busy state (because it hasn't sent its final status back to the host) when it ought to be idle. Sending a STALL isn't a protocol error and it does clear the TT status. This is discussed (not as explicitly as one might want) in sections 11.17.3 and 11.17.5 of the USB-2.0 specification. Alan Stern Created attachment 288331 [details] Fix typo in attachment 288293 [details] bug 206897 is resolved with the patch in attachment 288293 [details], there is a typo in the patch. This patch just fixes the typo. Apply in order 0001-xhci-testpatch-Don-t-clear-TT-buffer-on-ep0-protocol.patch 0002-Fix-a-typo-in-prevoius-patch.patch > TTs should be cleared when an error occurs in the protocol. STALL
> isn't an error; a real error would be a timeout or CRC mismatch or URB
> cancellation, things like that -- things that might leave the TT in a
> busy state (because it hasn't sent its final status back to the host)
> when it ought to be idle. Sending a STALL isn't a protocol error and
> it does clear the TT status.
>
> This is discussed (not as explicitly as one might want) in sections
> 11.17.3 and 11.17.5 of the USB-2.0 specification.
>
Thanks for the clarification, makes sense.
With both patches applied the sound system is working. Thanks a lot! I will attach the requested logs with the patch applied as well. Created attachment 288447 [details]
Dmesg output with both patches applied
Created attachment 288449 [details]
Content of /sys/kernel/debug/tracing/trace with both patches applied
Gzipped because the file was too big to attach here.
Created attachment 288523 [details]
Full dmesg output with both patches applied
I just realized that the last dmesg log was truncated for some reason. This one is complete.
So I'm not entirely sure if this is related to my previous problem but I'm now encountering a different kind of USB problems. USB equipment is still the same as described in the first message. The new problem leads to all USB devices getting disconnected, USB audio as well as mouse. The USB keyboard continues working because I guess it is working via PS2 internally? The problem has been occurring pretty much since the previous issue was fixed but only sporadically. Today it occurred for the first time in ~4 weeks and sometimes it happened several times in a day. The last messages in the dmesg log are: [ 7075.685892] xhci_hcd 0000:2a:00.3: xHCI host not responding to stop endpoint command. [ 7075.685895] xhci_hcd 0000:2a:00.3: USBSTS: [ 7075.701768] xhci_hcd 0000:2a:00.3: xHCI host controller not responding, assume dead [ 7075.701775] xhci_hcd 0000:2a:00.3: HC died; cleaning up [ 7075.701808] usb 3-1: USB disconnect, device number 2 [ 7075.766108] usb 3-2: USB disconnect, device number 3 [ 7075.766110] usb 3-2.4: USB disconnect, device number 4 Created attachment 292705 [details]
DMESG log from todays lockup.
Created attachment 292707 [details]
Older dmesg log with debug information added
Log created as per the previous instructions (message 2). I just enabled the settings on system start in the hopes of capturing the hang so the files are quite big.
Created attachment 292709 [details]
Content of /sys/kernel/debug/tracing/trace
Created as per previous comment. Uploaded at:
Created attachment 292711 [details]
Content of /sys/kernel/debug/tracing/trace
Unpacked file is ~200mb. Ran for the entirety of the systems uptime to be able to catch the bug.
|