Bug 218544 - not enough bandwidth, synaptics hi-res audio duplex audio
Status: RESOLVED ANSWERED
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: i386 Linux
Importance: P3 normal
Assignee: Default virtual assignee for Drivers/USB
 
Reported: 2024-03-01 15:15 UTC by Ian Malone
Modified: 2024-03-15 12:50 UTC

Regression: No

Attachments
lsusb_-v_output (14.28 KB, text/plain)
2024-03-01 16:20 UTC, Ian Malone
sys_kernel_debug_usb_devices_contents (7.03 KB, text/plain)
2024-03-01 16:21 UTC, Ian Malone
/sys/kernel/debug/usb/devices other devices disabled (6.44 KB, text/plain)
2024-03-01 17:14 UTC, Ian Malone
/sys/kernel/debug/usb/devices other devices and hid disabled (6.44 KB, text/plain)
2024-03-05 17:14 UTC, Ian Malone
wireplumber rule for 16bit input on conexant/synaptics hi res audio (218 bytes, text/plain)
2024-03-15 12:50 UTC, Ian Malone

Description Ian Malone 2024-03-01 15:15:41 UTC
I have a USB to 3.5mm adapter which seems not to work in
duplex mode on USB 2.0 systems, possibly due to a bandwidth calculation
bug. The same adapter on the same machine works correctly when booted into Windows 7.

Alsa info output:
http://alsa-project.org/db/?f=1b9970a5f7264bd8af263d4ba6e4559be06f6be4

The device is an Anker USB-C to 3.5mm audio dongle (lsusb:
Conexant Systems (Rockwell), Inc. Hi-Res Audio) which I've used for
some time on my phone (Android with USB-3.2). On trying to use it with
an older T420 laptop recently with only USB-2.0 ports I discovered it
will not work in duplex mode. Input-only and output-only profiles work
(tested recording and playback with audacity), but with duplex no
sound is recorded (Fedora 39, pipewire). This is easily reproduced by
looking at the pavucontrol volume monitor for Output Devices: if I
switch the device to Analog or Digital Input in Configuration, then the
Input Devices level monitor for it shows activity when I speak into or
tap the microphone. With duplex selected there is no activity, and the
level monitor bar may or may not be displayed. I can switch between the two
behaviours by changing the profile. Various applications such as
Audacity and Zoom appear to hang when accessing this microphone with
the duplex profile set. I've used pipewire configuration to force the
format to 16LE only (playback and recording), but this has not helped.

In dmesg this error appears when this happens (microphone opened, for
example by pavucontrol):
[  294.825544] usb 1-1.1: cannot submit urb 0, error -28: not enough bandwidth
(T420, Fedora 39, kernel 6.7.5)

The T420 has USB 2 Type A ports, so a Type C to Type A adapter is
needed, but so far as I can tell it's a passive device. I've also been able to use the dongle on a newer laptop with USB 3.2 Type A and Type C ports, where
duplex does work (F37, kernel 6.5.12), although dmesg there still shows errors:
[ 9173.386998] usb 3-1: 1:1: usb_set_interface failed (-28)
[ 9173.387110] usb 3-1: Not enough bandwidth for new device state.
[ 9173.387113] usb 3-1: Not enough bandwidth for altsetting 1
(Some of the Type A ports can be used without these errors; this seems to be the case if the controller isn't shared with any other devices.)


I've tried building a fresh kernel and modifying various
defines in sound/usb/card.h (currently MAX_PACKS 4 and
MAX_PACKS_HS (MAX_PACKS * 4), compared to 6 and *8) but have not hit on a
winning formula yet. I'm currently successfully using a different USB adaptor on the same system. It might be relevant that the capture interface for that device offers a largest packet size of 96 (44.1 & 48 kHz, S16_LE mono), versus 288 for the one that doesn't work (44.1 & 48 kHz, S24_3LE stereo). I can see in pcm.c:find_format() that the format with the largest packet size gets selected; I'm not sure if there's any fallback process if this fails.

With Analog Stereo Input profile:
$ cat /proc/asound/card1/stream0 
Synaptics Hi-Res Audio at usb-0000:00:1a.0-1.1, full speed : USB Audio

Playback:
  Status: Stop
  Interface 2
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x01 (1 OUT) (ADAPTIVE)
    Rates: 8000, 16000, 32000, 44100, 48000, 96000
    Bits: 16
    Channel map: FL FR
  Interface 2
    Altset 2
    Format: S24_3LE
    Channels: 2
    Endpoint: 0x01 (1 OUT) (ADAPTIVE)
    Rates: 44100, 48000, 96000
    Bits: 24
    Channel map: FL FR

Capture:
  Status: Running
    Interface = 1
    Altset = 2
    Packet Size = 288
    Momentary freq = 48000 Hz (0x30.0000)
  Interface 1
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x81 (1 IN) (ASYNC)
    Rates: 44100, 48000
    Bits: 16
    Channel map: FL FR
  Interface 1
    Altset 2
    Format: S24_3LE
    Channels: 2
    Endpoint: 0x81 (1 IN) (ASYNC)
    Rates: 44100, 48000
    Bits: 24
    Channel map: FL FR


With the Analog Stereo Duplex profile, the capture stream shows as stopped:
$ cat /proc/asound/card1/stream0 
Synaptics Hi-Res Audio at usb-0000:00:1a.0-1.1, full speed : USB Audio

Playback:
  Status: Running
    Interface = 2
    Altset = 2
    Packet Size = 432
    Momentary freq = 48000 Hz (0x30.0000)
  Interface 2
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x01 (1 OUT) (ADAPTIVE)
    Rates: 8000, 16000, 32000, 44100, 48000, 96000
    Bits: 16
    Channel map: FL FR
  Interface 2
    Altset 2
    Format: S24_3LE
    Channels: 2
    Endpoint: 0x01 (1 OUT) (ADAPTIVE)
    Rates: 44100, 48000, 96000
    Bits: 24
    Channel map: FL FR

Capture:
  Status: Stop
  Interface 1
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x81 (1 IN) (ASYNC)
    Rates: 44100, 48000
    Bits: 16
    Channel map: FL FR
  Interface 1
    Altset 2
    Format: S24_3LE
    Channels: 2
    Endpoint: 0x81 (1 IN) (ASYNC)
    Rates: 44100, 48000
    Bits: 24
    Channel map: FL FR
Comment 1 Takashi Iwai 2024-03-01 15:19:02 UTC
It's rather a core USB problem, likely an issue about the bandwidth management in the controller driver.  Reassigned.
Comment 2 Alan Stern 2024-03-01 15:58:24 UTC
Can you attach the output from "lsusb -v" for this device?  And also the contents of /sys/kernel/debug/usb/devices?
Comment 3 Ian Malone 2024-03-01 16:20:14 UTC
Created attachment 305936
lsusb_-v_output

# lsusb -v -s 001:005
Comment 4 Ian Malone 2024-03-01 16:21:06 UTC
Created attachment 305937
sys_kernel_debug_usb_devices_contents

/sys/kernel/debug/usb/devices
Comment 5 Alan Stern 2024-03-01 16:48:04 UTC
The devices file shows that the Synaptics audio device is sharing the same bus with a Broadcom Bluetooth device.  Maybe if you disabled that device the audio would work better.  Try doing:

  echo 0 >/sys/bus/usb/devices/1-1.4/bConfigurationValue

and then trying the audio.

There's also a video camera on that bus, but since it runs at high speed rather than full speed, it probably isn't interfering significantly.  If you want, you can try disabling it also by issuing the same command with 1-1.6 in place of 1-1.4.
Comment 6 Ian Malone 2024-03-01 17:14:37 UTC
Created attachment 305938
/sys/kernel/debug/usb/devices other devices disabled

Disabling the onboard camera, bluetooth and (other bus, but for good measure) qmi_wwan:
# echo 0 > /sys/bus/usb/devices/1-1.4/bConfigurationValue
# echo 0 > /sys/bus/usb/devices/1-1.6/bConfigurationValue
# echo 0 > /sys/bus/usb/devices/2-1.4/bConfigurationValue

(/sys/kernel/debug/usb/devices attached.) Sadly, the same problem. I've also tried blacklisting the modules for the other devices (uvcvideo, btusb, qcserial, qmi_wwan) with no change. The rear port appears to be on the other bus, but using that also has the same result:
# lsusb -t
/:  Bus 001.Port 001: Dev 001, Class=root_hub, Driver=ehci-pci/3p, 480M
    |__ Port 001: Dev 002, If 0, Class=Hub, Driver=hub/6p, 480M
        |__ Port 004: Dev 003, 12M
        |__ Port 006: Dev 004, 480M
/:  Bus 002.Port 001: Dev 001, Class=root_hub, Driver=ehci-pci/3p, 480M
    |__ Port 001: Dev 002, If 0, Class=Hub, Driver=hub/8p, 480M
        |__ Port 001: Dev 004, If 0, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 001: Dev 004, If 1, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 001: Dev 004, If 2, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 001: Dev 004, If 3, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 004: Dev 003, 480M

(All three devices are onboard, so difficult to remove)
Comment 7 Alan Stern 2024-03-01 19:09:35 UTC
There's also the usbhid interface on the audio device, probably used for a volume control or something like that.  Maybe unbinding it too will help.  You can try it, anyway, just to see what happens:

  echo 1-1.1:1.3 >/sys/bus/usb/drivers/usbhid/unbind

(or with the device plugged into the rear port, use 2-1.1:1.3).

Without going into any deeper testing, I can summarize the problem for you.  Basically, the ehci-hcd driver in Linux has trouble utilizing the entire available bandwidth when low- or full-speed (1.5 or 12 Mb/s) devices are connected via a USB-2 hub.  That's your situation; the hub is the 1-1 (or 2-1) device, number 002 on both buses.

At one time Intel's chipsets would attach a single onboard hub directly to the EHCI controller and then connect all the downstream USB ports through that hub.  This is what your laptop has.  Even earlier Intel chipsets worked differently; they connected each downstream USB port through a switch which would send high-speed signals to the EHCI controller and low/full-speed signals to a companion UHCI controller.  Motherboards using that scheme didn't suffer from this bandwidth problem unless the user connected a full/low-speed device via an external USB-2 hub.

The reason for the problem is that the design of USB 2.0 and the EHCI controller hardware make it quite complicated to handle the packet scheduling when translating between two different speeds on the same bus.  The driver uses an incomplete and imperfect algorithm which can handle the simplest cases okay but is not adequate for situations requiring a higher percentage of the total bandwidth, especially when different transfer types (bulk, interrupt, and isochronous) are mixed.

Improving the driver to make it more capable would require a tremendous amount of work, and for very little return since nowadays computers use xHCI USB controllers rather than EHCI.  Only legacy systems dating from the time of your T420 laptop or earlier would derive any benefit, and then only in situations involving multiple devices with high bandwidth requirements.

I hope this explanation makes sense to you.
Comment 8 Ian Malone 2024-03-02 13:02:06 UTC
Thanks, I don't have access to the machine this weekend but will test that next week.
I don't really need to get this working, but it does slightly bug me that what should be the simplest situation (single full speed device given we've disabled all others) apparently never worked properly. (Presumably this device, although FS, is just a little more demanding than most were.) I stumbled across a scheduler patch that Monty wrote around 2006 which looks like it was never adopted, I might see if it can still be applied. http://web.mit.edu/xiphmont/Public/kernel/
Comment 9 Alan Stern 2024-03-02 15:34:42 UTC
xiphmont's web page was written 17 years ago, so it is incredibly out of date.  However, if you want to use it as a base for improving the current driver, I'll be happy to review your code.
Comment 10 Ian Malone 2024-03-05 17:14:46 UTC
Created attachment 305965
/sys/kernel/debug/usb/devices other devices and hid disabled

You are of course right, the patch can't be easily adapted to apply against the current driver and there are too many incompatibilities with memory management and the rest of the USB system for it to be trivial to drop in the whole 2.6.18 host controller. I'm not sure it was ever really submitted, which is a pity as it looks like it implemented FSTN handling that never otherwise got added. I might fiddle with it a bit more to see if it can be built just to see if it would have helped.

Meanwhile, I've tried disabling the HID as well (/sys/kernel/debug/usb/devices attached). This still doesn't work (same "cannot submit urb 0, error -28: not enough bandwidth"). It does puzzle me a bit: we're now down to a single FS device on the hub. While I can understand that scheduling LS/FS onto HS is complicated, I'd have thought this issue would have popped up frequently enough when these laptops were common that it would have been addressed back then. Is there any other information I can extract to find out what's going on with the scheduler? The following are the FS/LS portions of /sys/kernel/debug/usb/ehci/0000:00:1a.0/bandwidth for a good device in out, in and duplex modes, and for the problematic device:

good device out
TT 2-1 port 0  FS/LS bandwidth allocation (us per frame)
    482  482  482  482  482  482  482  482
FS/LS budget (us per microframe)
 0:   24   0 125 125 125  83   0   0
 8:   24   0 125 125 125  83   0   0
16:   24   0 125 125 125  83   0   0
24:   24   0 125 125 125  83   0   0
32:   24   0 125 125 125  83   0   0
40:   24   0 125 125 125  83   0   0
48:   24   0 125 125 125  83   0   0
56:   24   0 125 125 125  83   0   0
2-1.1 ep 82:    24 @  0.0+1 mask 1c01
2-1.1 ep 01:   458 @  0.2+1 mask 003c

good device in
TT 2-1 port 0  FS/LS bandwidth allocation (us per frame)
    109  109  109  109  109  109  109  109
FS/LS budget (us per microframe)
 0:   24   0   0  85   0   0   0   0
 8:   24   0   0  85   0   0   0   0
16:   24   0   0  85   0   0   0   0
24:   24   0   0  85   0   0   0   0
32:   24   0   0  85   0   0   0   0
40:   24   0   0  85   0   0   0   0
48:   24   0   0  85   0   0   0   0
56:   24   0   0  85   0   0   0   0
2-1.1 ep 82:    24 @  0.0+1 mask 1c01
2-1.1 ep 81:    85 @  0.3+1 mask e008

good device duplex
TT 2-1 port 0  FS/LS bandwidth allocation (us per frame)
    567  567  567  567  567  567  567  567
FS/LS budget (us per microframe)
 0:   24  85 125 125 125  83   0   0
 8:   24  85 125 125 125  83   0   0
16:   24  85 125 125 125  83   0   0
24:   24  85 125 125 125  83   0   0
32:   24  85 125 125 125  83   0   0
40:   24  85 125 125 125  83   0   0
48:   24  85 125 125 125  83   0   0
56:   24  85 125 125 125  83   0   0
2-1.1 ep 82:    24 @  0.0+1 mask 1c01
2-1.1 ep 01:   458 @  0.2+1 mask 003c
2-1.1 ep 81:    85 @  0.1+1 mask 3802

bad device in
TT 2-1 port 0  FS/LS bandwidth allocation (us per frame)
    273  273  273  273  273  273  273  273
FS/LS budget (us per microframe)
 0:   39   0 125 109   0   0   0   0
 8:   39   0 125 109   0   0   0   0
16:   39   0 125 109   0   0   0   0
24:   39   0 125 109   0   0   0   0
32:   39   0 125 109   0   0   0   0
40:   39   0 125 109   0   0   0   0
48:   39   0 125 109   0   0   0   0
56:   39   0 125 109   0   0   0   0
2-1.1 ep 84:    39 @  0.0+1 mask 1c01
2-1.1 ep 81:   234 @  0.2+1 mask f004

bad device out
TT 2-1 port 0  FS/LS bandwidth allocation (us per frame)
    497  497  497  497  497  497  497  497
FS/LS budget (us per microframe)
 0:   39   0 125 125 125  83   0   0
 8:   39   0 125 125 125  83   0   0
16:   39   0 125 125 125  83   0   0
24:   39   0 125 125 125  83   0   0
32:   39   0 125 125 125  83   0   0
40:   39   0 125 125 125  83   0   0
48:   39   0 125 125 125  83   0   0
56:   39   0 125 125 125  83   0   0
2-1.1 ep 84:    39 @  0.0+1 mask 1c01
2-1.1 ep 01:   458 @  0.2+1 mask 003c

bad device duplex
TT 2-1 port 0  FS/LS bandwidth allocation (us per frame)
    497  497  497  497  497  497  497  497
FS/LS budget (us per microframe)
 0:   39   0 125 125 125  83   0   0
 8:   39   0 125 125 125  83   0   0
16:   39   0 125 125 125  83   0   0
24:   39   0 125 125 125  83   0   0
32:   39   0 125 125 125  83   0   0
40:   39   0 125 125 125  83   0   0
48:   39   0 125 125 125  83   0   0
56:   39   0 125 125 125  83   0   0
2-1.1 ep 84:    39 @  0.0+1 mask 1c01
2-1.1 ep 01:   458 @  0.2+1 mask 003c

It looks like there's an extra 234us to accommodate for input to work, I'm guessing there are restrictions on where that can go. Is it plausible that if a lower bandwidth mode is requested from the device it would work? That's essentially what I was wondering about with respect to the snd-usb-audio module before this was moved over to usb.
Comment 11 Alan Stern 2024-03-06 20:37:31 UTC
The most obvious difference is that the "good" device requires only 85 us/frame for its audio-in channel whereas the "bad" device requires 234 us/frame.  This is the difference between 16-bit and 24-bit captures that you mentioned originally.  (The 39 vs. 24 for the interrupt endpoint wouldn't have a significant effect.)  There could be other factors in play, but that difference is likely enough to tip the balance.

The times in the bandwidth file indicate that the audio subsystem is using the device's higher bandwidth setting.  Using the lower bandwidth setting instead could well make a difference.  I don't know how to tell the sound interface to do this; maybe Takashi can say.
Comment 12 Ian Malone 2024-03-08 14:04:36 UTC
Okay, I think we may have reached a dead end. Using wireplumber (creating rules in ~/.config/wireplumber/main.lua.d) it's possible to manipulate the audio formats that pipewire will use for a device, so I can independently request the 16 bit mode for the input and output streams. The bandwidth profiles for those are as follows (wMaxPacketSize is for the in/out interface descriptor with the corresponding bBitResolution as reported by lsusb -v):

in (48kHz)
16bit, expected wMaxPacketSize 192bytes
bandwidth: 1-1.1 ep 81:   159 @  0.2+1 mask f004
uframes 125  34

24bit, expected wMaxPacketSize 288bytes
bandwidth: 1-1.1 ep 81:   234 @  0.2+1 mask f004
uframes 125 109

out (48kHz)
16bit, expected wMaxPacketSize 768bytes (?!)
bandwidth: 1-1.1 ep 01:   608 @  0.1+1 mask 003e
uframes 125 125 125 125 108

24bit, expected wMaxPacketSize 458bytes
bandwidth: 1-1.1 ep 01:   458 @  0.2+1 mask 003c
uframes 125 125 125  83

There's also the HID endpoint (unbinding doesn't seem to remove the bandwidth usage) expected wMaxPacketSize 35bytes:
1-1.1 ep 84:    39 @  0.0+1 mask 1c01

The bandwidth to wMaxPacketSize ratio is approximately the same for all streams (1.2-1.3; 1.11 for the HID, I guess due to slightly different overheads).
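
The microsecond figures in the bandwidth file can be reproduced from the packet sizes. The sketch below is a Python transcription, from memory, of the full-speed branch of usb_calc_bus_time() in drivers/usb/core/hcd.c, plus the round-up to whole microseconds that ehci-sched.c appears to apply. The constants, the 666 ns hub think time, and the helper names (bit_time, fs_bus_time_us) are assumptions for illustration and should be checked against the kernel tree in use. The 24-bit output size is taken as 576 bytes, the value used in the packet-duration table later in this comment.

```python
# Sketch of the full-speed bus-time estimate. Constants are recalled from
# usb_calc_bus_time() in drivers/usb/core/hcd.c and are assumptions.

BW_HOST_DELAY_NS = 1000   # assumed host controller delay
TT_THINK_TIME_NS = 666    # assumed think time of the hub's transaction translator


def bit_time(nbytes):
    """Bit count for nbytes, including worst-case bit stuffing (7/6 expansion)."""
    return 7 * 8 * nbytes // 6


def fs_bus_time_us(nbytes, is_input, isoc):
    """Estimated microseconds per frame for one full-speed periodic packet."""
    data_ns = 8354 * (31 + 10 * bit_time(nbytes)) // 1000
    if isoc:
        overhead_ns = 7268 if is_input else 6265
    else:  # interrupt endpoint
        overhead_ns = 9107
    total_ns = overhead_ns + BW_HOST_DELAY_NS + TT_THINK_TIME_NS + data_ns
    return -(-total_ns // 1000)  # round up to whole microseconds


for label, size, is_in, isoc in [
    ("in 16bit", 192, True, True),
    ("in 24bit", 288, True, True),
    ("out 16bit", 768, False, True),
    ("out 24bit", 576, False, True),
    ("HID", 35, True, False),
]:
    print(f"{label:10s} {size:4d} bytes -> {fs_bus_time_us(size, is_in, isoc)} us")
```

Under these assumptions the sketch gives 159, 234, 608, 458 and 39 us, matching the bandwidth lines above, and it also gives 85 us for a 96-byte capture packet like the one on the other adaptor.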

Following the rules that ehci-sched.c sets out, this can't be met:
max_tt_usecs[] = { 125, 125, 125, 125, 125, 125, 30, 0 };
and:
/* special case for isoc transfers larger than 125us:
 * the first and each subsequent fully used uframe
 * must be empty, so as to not illegally delay
 * already scheduled transactions
 */

The minimum bandwidth configuration is:
out(24b) 125 125 125  83
in (16b) 125  34
hid      39

And there is no way to block them such that 30 in microframe 7 isn't exceeded.
125 125 125  83 125  34 39 0 etc.

Unless it's legal to schedule the hid into microframe 6 after the audio input as its final microframe is not fully used?


A final point of interest is 16bit output, wMaxPacketSize 768bytes. 24 bit output has allowed frequencies 44.1kHz, 48kHz, 96kHz, 2 channels. 16 bit has 8kHz, 16kHz, 32kHz, 44.1kHz, 48kHz, 96kHz. Input 24 and 16 bit have only 48kHz and 44.1kHz, 2 channels.
wMaxPacketSize / (Max sampling frequency * sample bytes * channels )
in16b  192 / (48kHz * 2 * 2) = 1ms
in24b  288 / (48kHz * 3 * 2) = 1ms
out16b 768 / (96kHz * 2 * 2) = 2ms
out24b 576 / (96kHz * 3 * 2) = 1ms

Out 16 bit mode claims to accept 2ms packets (but still interval 1). I'm wondering if this is just an error in the device reported capability (or maybe it can buffer?). Do isochronous outputs have to use the full max packet size?
Comment 13 Alan Stern 2024-03-08 15:22:31 UTC
This raises an obvious question: What is the point of supporting 96 kHz operation on the output channel but only going up to 48 kHz on the input channel?  That's weird.

That maxpacket value for the output channel does look very strange.  To support 96 kHz operation, 16-bits should use 384 bytes and 24-bits should use 576.  (You wrote 458 but that was obviously a typo, copying the number in the line below.)  The 768 value just seems wrong.

I don't believe the device does 2-ms buffering.  And no, isochronous packets do not have to use the full maxpacket size, but there's no reason to set the maxpacket size larger than necessary.  I bet that 768 really ought to be 384, and it's a mistake in the device's firmware.
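
The expected maxpacket values can be checked with a few lines of arithmetic. This is only a sketch: it assumes one packet per 1 ms full-speed frame and ignores any extra headroom a device might add, and bytes_per_frame and alt_settings are hypothetical names. The reported sizes and formats are the ones quoted in comment 12.

```python
def bytes_per_frame(rate_hz, sample_bytes, channels):
    """Payload bytes needed per 1 ms full-speed frame."""
    return rate_hz * sample_bytes * channels // 1000


# name: (reported wMaxPacketSize, max rate in Hz, bytes per sample, channels)
alt_settings = {
    "in16b": (192, 48000, 2, 2),
    "in24b": (288, 48000, 3, 2),
    "out16b": (768, 96000, 2, 2),
    "out24b": (576, 96000, 3, 2),
}

for name, (reported, rate, width, channels) in alt_settings.items():
    needed = bytes_per_frame(rate, width, channels)
    print(f"{name}: needs {needed} bytes/frame, reports {reported} "
          f"({reported / needed:.0f}x)")
```

Only the 16-bit output setting reports twice what its highest rate needs, which is consistent with 768 being a firmware mistake for 384.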

I don't understand why unbinding fails to remove the HID's endpoint bandwidth usage.  That might be a bug.

While I haven't looked at the details of microframe scheduling in a long time, I don't think it would be valid to schedule the interrupt endpoint to start in microframe 6.  Certainly if it were valid, it would require the use of FSTN nodes, which the driver does not support.

Given that the device will require the entire 96 kHz output bandwidth even when it's running at only 48 kHz, scheduling is bound to be difficult or impossible.  Things would be easier if there were separate alternate settings for 48 kHz and 96 kHz operation.

Part of the scheduling problem arises because it's generally better to put the isochronous packets in the earlier microframes and the interrupt packets after them.  However, the driver schedules each endpoint for the earliest feasible position, and apparently the interrupt endpoint gets started first.  That's why it ends up in its non-optimal position at the beginning of the frame.

In theory it's possible to change the 16-bit output maxpacket value in the kernel, setting it to 384.  I don't know that this would be a good idea in general, but you could try it for your own use.  It's not clear that you would want to spend the time and effort to do this, however.
Comment 14 Ian Malone 2024-03-08 16:47:01 UTC
I wrote the following before seeing your most recent reply, but it's a bit long to rewrite...

Not suggesting this as a patch, but it turns out that if you flip the order that microframes get assigned then it will all get packed in:

--- /tmp/drivers/usb/host/ehci-sched.c  2024-03-04 18:25:48.000000000 +0000
+++ linux-6.7.5-200.prom.fc39.x86_64/drivers/usb/host/ehci-sched.c      2024-03-08 14:09:08.085984284 +0000
@@ -868,11 +868,13 @@
 
                for (i = qh->ps.bw_period; i > 0; --i) {
                        frame = ++ehci->random_frame & (qh->ps.bw_period - 1);
-                       for (uframe = 0; uframe < 8; uframe++) {
+                       for (uframe = 7; ; uframe--) {
                                status = check_intr_schedule(ehci,
                                                frame, uframe, qh, &c_mask, tt);
                                if (status == 0)
                                        goto got_it;
+                               if(uframe == 0)
+                                       break;
                        }
                }

With the 24bit interfaces (default behaviour, no forcing formats in pipewire, the bluetooth device still enabled):
FS/LS budget (us per microframe)
 0:  125 109 125 125 125 125  21   0
 8:  125 109 125 125 125 125  21   0
16:  125 109 125 125 125 125  21   0
24:  125 109 125 125 125 125  21   0
32:  125 109 125 125 125 125  21   0
40:  125 109 125 125 125 125  21   0
48:  125 109 125 125 125 125  21   0
56:  125 109 125 125 125 125  21   0
1-1.1 ep 84:    39 @  0.5+1 mask 8020
1-1.4 ep 81:    24 @  0.5+1 mask 8020
1-1.1 ep 01:   458 @  0.2+1 mask 003c
1-1.1 ep 81:   234 @  0.0+1 mask 3c01

I guess this is because of an asymmetry (after your reply: probably related to what you mention about interrupt packets in later microframes): microframe-spanning transfers start with full microframes but will usually finish on partially filled ones. On top of this the seventh uframe is allowed only 30us, so filling from the back prevents the first microframe being partially occupied by transfers and forcing any microframe-spanning transfers forward a frame. This causes the space single microframe transfers can fit into to become a little more fragmented than it would otherwise be. In the case where:
39   0   0   0   0   0   0   0
is in place then adding a couple of larger transfers:

39  125 109 0   0   0    0   0

39  125 109 125 125 125  83x 0
                          ^ no longer fits (<=30)
It's still not optimal: which terminating partial microframe is best put where will depend on the smaller transfers to be fitted in, but there's one less gap.


The 16bit output (608us per frame) still won't work in duplex, but this is unsurprising as the limit is 6*125us+30us=780us, while the 608us output plus the smallest input (159us) and the 39us HID comes to 806us, so there's no way to fit it. I do wonder if snd-usb-audio would be able to help there by using a smaller packet size in the output streams.

The bluetooth controller is 12Mbps too and wants to open extra endpoints if anything connects, so I'm actually better off plugging into the other bus, but at least this is due to absolute bandwidth limits. It's actually possible to get away with duplex bluetooth audio (mSBC) and usb audio from this device on the same bus if I use the 16 bit input format:
TT 1-1 port 0  FS/LS bandwidth allocation (us per frame)
    713  713  713  713  713  713  713  713
FS/LS budget (us per microframe)
 0:  125  67 125 125 125 125  21   0
 8:  125  67 125 125 125 125  21   0
16:  125  67 125 125 125 125  21   0
24:  125  67 125 125 125 125  21   0
32:  125  67 125 125 125 125  21   0
40:  125  67 125 125 125 125  21   0
48:  125  67 125 125 125 125  21   0
56:  125  67 125 125 125 125  21   0
1-1.4 ep 81:    24 @  0.5+1 mask 8020
1-1.1 ep 84:    39 @  0.5+1 mask 8020
1-1.1 ep 01:   458 @  0.2+1 mask 003c
1-1.1 ep 81:   159 @  0.0+1 mask 3c01
1-1.4 ep 83:    17 @  0.1+1 mask 3802
1-1.4 ep 03:    16 @  0.1+1 mask 0002
(Let's take a moment to admire what a compressed codec can do.)

Not sure if this is a good idea or actually legal by the USB spec of course...
(My knowledge of which is limited to a recent skim through the EHCI specification, although I think from fig 4-17 start splits are issued the microframe before the transfer starts: a transfer starting in microframe 6 has its start-split in microframe 5, although I'm not clear if it then spans the frame boundary. If it fits in microframe 6 on its own does it use a complete-split instead? And if starting closer to the front then it should be fine.)

(Also after your reply: I suppose the different rate support is intended for playback only modes? Although 96kHz makes more sense recording than playing anyway...)
Comment 15 Alan Stern 2024-03-08 17:28:36 UTC
Some years ago I did try allocating interrupt transfers from the end of frame backwards, but decided against it in the end -- I don't remember why.  It certainly helps in your case, so maybe that decision should be reconsidered.

Maybe the reason was that the absence of FSTN nodes makes interrupt transfers near the end of the frame less reliable.  If any unexpected delays should push the transfer back a few hundred microseconds, there wouldn't be enough complete-splits to guarantee it could finish correctly.  In the examples you give above, 1-1.4 ep 81 and 1-1.1 ep 84 each have only one complete-split (only one bit set in the high-order byte of the mask), whereas the spec says there should be enough complete-splits for the entire LS/FS packet plus two extra.
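
The mask column in the bandwidth file can be read with a small helper. This is a sketch that assumes the usual EHCI layout, with the start-split mask in the low byte and the complete-split mask in the high byte, and bit n standing for microframe n; decode_mask is a hypothetical name.

```python
def decode_mask(mask):
    """Split a 16-bit mask into (start-split uframes, complete-split uframes)."""
    smask = mask & 0xFF          # low byte: start-splits (assumed layout)
    cmask = (mask >> 8) & 0xFF   # high byte: complete-splits
    starts = [uf for uf in range(8) if smask & (1 << uf)]
    completes = [uf for uf in range(8) if cmask & (1 << uf)]
    return starts, completes


for mask in (0x8020, 0x1C01, 0x003C, 0x3C01):
    starts, completes = decode_mask(mask)
    print(f"mask {mask:04x}: start-splits {starts}, complete-splits {completes}")
```

Read this way, mask 8020 has a start-split in microframe 5 and a single complete-split in microframe 7, while the default placement 1c01 has three complete-splits (microframes 2 to 4), and the isochronous OUT mask 003c has start-splits only.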

snd-usb-audio using a smaller packet size for the output streams wouldn't help the scheduling; the scheduler has to assume that each endpoint will use the maximum packet size allowed (i.e., the maxpacket value).

The reason for scheduling isochronous transfers earlier than interrupt transfers has to do with the way transaction translators in hubs behave.  I forget the details (it's described in the USB-2 spec), but there's some scenario in which they will lose data if an isochronous packet comes after an interrupt packet in the same microframe.

Scheduling interrupt transfers late in the frame _is_ legal according to the spec, so long as it is done properly.  And in theory the driver could rebalance the schedule, changing which microframes are assigned to each endpoint, as new endpoints are added.  But that would add another whole new level of complexity to the driver and I never implemented it.  Besides, without FSTN nodes you still wouldn't be able to get the full benefit.
Comment 16 Ian Malone 2024-03-08 23:35:05 UTC
Thanks for looking into it. I think we can close this then, if I manage to test the packet size fix I'll find out where the correct place to submit it as a device quirk is, by itself that might be enough with the current scheduler. Unlikely I'll get into porting over that old scheduler reimplementation, I don't really need it, it seems few people have, and it almost certainly requires a deeper understanding of the protocol than I've picked up so far.
Comment 17 Ian Malone 2024-03-15 12:50:10 UTC
Created attachment 305995
wireplumber rule for 16bit input on conexant/synaptics hi res audio

Hi, I'll close this (wasn't sure what resolution to put, but doesn't matter much). Some final observations though in case they help anyone else.

It turned out to be possible to use the device in 16 bit input and 24 bit output without kernel modification. I'd thought this didn't work, but it turns out to be an interaction between wireplumber and pavucontrol that breaks it: changing device profiles (duplex/in/out) in pavucontrol changes the device interface back to 24bit requiring a restart of wireplumber to reapply the 16 bit rule for the input, this messed up some of my testing. I couldn't see this happening normally as the connection failing means the format can't be seen in pw-top. However if the 16bit rule for the input is present then restarting wireplumber after changing the profile does work ("systemctl --user restart wireplumber").

The bandwidth profile then looks like this (bluetooth device 1-1.4 on bus too):
TT 1-1 port 0  FS/LS bandwidth allocation (us per frame)
    680  680  680  680  680  680  680  680
FS/LS budget (us per microframe)
 0:   63 125 125 125 125 117   0   0
 8:   63 125 125 125 125 117   0   0
16:   63 125 125 125 125 117   0   0
24:   63 125 125 125 125 117   0   0
32:   63 125 125 125 125 117   0   0
40:   63 125 125 125 125 117   0   0
48:   63 125 125 125 125 117   0   0
56:   63 125 125 125 125 117   0   0
1-1.4 ep 81:    24 @  0.0+1 mask 1c01
1-1.1 ep 84:    39 @  0.0+1 mask 1c01
1-1.1 ep 01:   458 @  0.2+1 mask 003c
1-1.1 ep 81:   159 @  0.1+1 mask 7802


The pipewire lua rule for this is attached.

(I must have misunderstood the scheduler comment about >125us transfers needing to start on a fresh microframe, since the unmodified scheduler seems to be combining the 458 and 159us transfers, going by the budget; I'm not sure how to interpret the mask information.)

Although the two channel input takes more bandwidth than single channel it does appear to fit. The two other devices I've got have identical chipsets, so hard to draw wide conclusions, but they only have 16bit single channel input, I suspect that's more common, but hard to find reliable information on this type of device. Fully duplex 24 bit mode doesn't work with the current scheduler, we already knew that. Packed as above it would run over to 67us in microframe 7 and only 30us are allowed there (and possibly an issue with where split-completes sit?).
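
The budget lines in the bandwidth file can be modelled by letting each endpoint's time spill forward from its starting microframe, 125 us at a time. The sketch below is a simplified model of how the debug file seems to build that table, not the scheduler itself: tt_budget and over_limit are hypothetical names, the per-microframe limits are the max_tt_usecs values quoted from ehci-sched.c in comment 12, and the real driver applies further checks (complete-split placement, for example) that are not modelled here.

```python
UFRAME_US = 125
# Per-microframe FS/LS limits as quoted from ehci-sched.c in comment 12.
MAX_TT_USECS = [125, 125, 125, 125, 125, 125, 30, 0]


def tt_budget(endpoints):
    """endpoints: list of (usecs, start_uframe). Returns the 8-entry budget line."""
    budget = [0] * 8
    for usecs, start in endpoints:
        carry = usecs
        for uf in range(start, 8):
            carry += budget[uf]
            if carry <= UFRAME_US:
                budget[uf] = carry
                break
            # Microframe is full: cap it and push the remainder forward.
            budget[uf] = UFRAME_US
            carry -= UFRAME_US
    return budget


def over_limit(budget):
    """Microframes (0-based) whose budget exceeds the TT limit."""
    return [uf for uf in range(8) if budget[uf] > MAX_TT_USECS[uf]]


# 16-bit input (159 us) with 24-bit output: the working configuration above.
working = tt_budget([(24, 0), (39, 0), (458, 2), (159, 1)])
# Same placement with the 24-bit input (234 us) instead.
duplex24 = tt_budget([(24, 0), (39, 0), (458, 2), (234, 1)])

print(working, over_limit(working))
print(duplex24, over_limit(duplex24))
```

With these inputs the model reproduces the 63 125 125 125 125 117 0 0 line for the working case, and for 24-bit duplex it leaves 67 us in the seventh microframe (index 6), above the 30 us allowed there.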

Hacking the sound/usb driver module to force the 16bit output mode to maximum packet size 384 works fine (if forcing wireplumber to 16bit everything then duplex now works and sound quality fine). I'll report that as a sound/usb bug; it's possibly addressable as a device quirk, although I couldn't get that working on my own and had to resort to a brute force hack:
--- sound/usb.orig/stream.c     2024-03-08 10:19:27.430507385 +0000
+++ sound/usb/stream.c  2024-03-12 16:13:43.212737555 +0000
@@ -690,6 +690,10 @@
        fp->ep_attr = get_endpoint(alts, 0)->bmAttributes;
        fp->datainterval = snd_usb_parse_datainterval(chip, alts);
        fp->protocol = protocol;
+       if(le16_to_cpu(get_endpoint(alts, 0)->wMaxPacketSize)==768){
+         get_endpoint(alts, 0)->wMaxPacketSize = cpu_to_le16(384);
+         usb_audio_err_ratelimited(chip,"overwrote in stream");
+       }
        fp->maxpacksize = le16_to_cpu(get_endpoint(alts, 0)->wMaxPacketSize);
        fp->channels = num_channels;
        if (snd_usb_get_speed(chip->dev) == USB_SPEED_HIGH)

(Overwriting the usb driver's wMaxPacketSize rather than sound/usb's structure; the latter doesn't seem to work.)
