Bug 199327
Description
Sylvain
2018-04-08 22:46:25 UTC
Created attachment 275173 [details]
Include_Linux_log2.patch
Created attachment 275175 [details]
makefile_pie.patch
Created attachment 275177 [details]
tools_include_linux_log2.patch
Created attachment 275179 [details]
emerge-info.txt
Created attachment 275181 [details]
alsa-info.txt
Created attachment 275183 [details]
ver_linux.txt
Created attachment 275185 [details]
cpuinfo.txt
Created attachment 275187 [details]
ioports.txt
Created attachment 275189 [details]
iomem.txt
Created attachment 275191 [details]
interrupts.txt
Created attachment 275193 [details]
lspci.txt
Created attachment 275195 [details]
scsi.txt
Created attachment 275197 [details]
devices.txt
The problem occurs because the endpoint 0x81 (acting as implicit feedback ep) was already started for the output stream as sync_endpoint (use_count = 1). If the driver then tries to set the format for the input stream for endpoint 0x81, this will fail because the endpoint is already running (use_count = 1 via output stream), however, the SUBSTREAM_FLAG_DATA_EP_STARTED flag is not set for this stream. The endpoint 0x81 can no longer be stopped and reports -EBUSY. What is the correct way for the driver to handle such circumstances? Does this possibly affect other devices with implicit feedback quirk? A quick-fix would be to force the stop of endpoints via stop_endpoints() in pcm.c in line 268: ``` if (test_and_clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags)) snd_usb_endpoint_stop(subs->data_endpoint); ``` change to: ``` if (test_and_clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags) || (subs->data_endpoint && subs->data_endpoint->use_count != 0)) snd_usb_endpoint_stop(subs->data_endpoint); ``` I previously compiled the gentoo kernel surces 4.20.16 and the problem does not seem to be present. I did not check the vanilla sources 4.20.16 but previously Gentoo sources (post bisect till somewhere before 4.20.16) had the issue same as the vanilla sources. (In reply to Sylvain from comment #15) I use Debian with kernel 4.19.0. After your comment I tested Kernel 5.3.0 which has the same error for me: $ pulseaudio -v I: [pulseaudio] (alsa-lib)pcm_hw.c: SNDRV_PCM_IOCTL_PREPARE failed (-16) I: [pulseaudio] (alsa-lib)pcm_hw.c: SNDRV_PCM_IOCTL_PREPARE failed (-16) I: [pulseaudio] (alsa-lib)pcm_hw.c: SNDRV_PCM_IOCTL_PREPARE failed (-16) I: [pulseaudio] (alsa-lib)pcm_hw.c: SNDRV_PCM_IOCTL_PREPARE failed (-16) I: [pulseaudio] (alsa-lib)pcm_hw.c: SNDRV_PCM_IOCTL_PREPARE failed (-16) I: [pulseaudio] alsa-util.c: snd_pcm_hw_params failed: Device or resource busy ... $ sudo dmesg usb 1-1: Unable to change format on ep #81: already in use usb 1-1: Unable to change format on ep #81: already in use usb 1-1: Unable to change format on ep #81: already in use ... However, I use the Behringer UFX1604, but I think it is almost identical to the UFX1204. Could it be that you have temporarily updated alsa-lib or pulseaudio and this solved your problem? As far as I understand the code, nothing has changed significantly in this regard on the snd-usb-audio driver. alsa-lib 1.1.8 $ pulseaudio --version pulseaudio 12.2 Is your UFX1604 1397:0001? There is a patch that makes UFX1604 behavior compatible with UFX1204, which will land into upstream 5.3-rc7. In short, it's: --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -339,6 +339,7 @@ static int set_sync_ep_implicit_fb_quirk(struct snd_usb_substream *subs, ep = 0x81; ifnum = 2; goto add_sync_ep_from_ifnum; + case USB_ID(0x1397, 0x0001): /* Behringer UFX1604 */ case USB_ID(0x1397, 0x0002): /* Behringer UFX1204 */ ep = 0x81; ifnum = 1; About the full duplex behavior. Basically you can't change the capture stream status (format, rate, etc) during the implicit fb usage. A quick workaround would be just to skip instead of -EBUSY, something like: --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -858,10 +858,10 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep, int err; if (ep->use_count != 0) { - usb_audio_warn(ep->chip, + usb_audio_dbg(ep->chip, "Unable to change format on ep #%x: already in use\n", ep->ep_num); - return -EBUSY; + return 0; /* FIXME: assuming the unchanged condition */ } /* release old buffers, if any */ But this may result in inconsistent state, so we'd need a restriction via hw_constraint as a proper fix, I guess. (In reply to Takashi Iwai from comment #17) Thank you for your message. > Is your UFX1604 1397:0001? There is a patch that makes UFX1604 behavior > compatible with UFX1204, which will land into upstream 5.3-rc7. > In short, it's: > ... Yes, that is the USB device id. I know about the patch because I had created the corresponding bug report and have this patch in my code. Exactly this implicit_fb_quirk leads to the problem of this bug report in the way pulseaudio acts with the devices. > About the full duplex behavior. Basically you can't change the capture > stream status (format, rate, etc) during the implicit fb usage. > A quick workaround would be just to skip instead of -EBUSY, something like: > ... > But this may result in inconsistent state, so we'd need a restriction via > hw_constraint as a proper fix, I guess. So the problem seems to be the use of shared endpoints between interfaces and potentially can affect all implicit_fb_quirk devices ?! UFX1204/1604 uses: ``` ifnum 1 (out): data EP: 0x02 sync EP: 0x81 ifnum 2 (in): data EP: 0x81 ``` This special case should be recognizable in order to handle set_format differently. I came to two possible ways to deal with it: 1. Stop endpoints if use_count 1 to set the format (potentially interrupting already created stream?). 2. Do not set the format if endpoint use_count 1 (assuming the unchanged condition), so your solution. In this case, is it anyway hard request that the shared endpoint must have the same format? I do not know enough about the USB spec and what the more correct way is atm. It's possible to add some hw_constraint to restrict the parameters dynamically depending on the counterpart stream state. I'm going to cook that once after confirming that the culprit is really that. That is, if the simple fix I suggested in comment 17 really helps, I'll start looking at it. (In reply to senderberlin from comment #16) My mistake, Please disregard my comment #15, the issue is still present Yes, the fix from comment #17 seems to tackle the problem. Quick tests Behringer UFX1604: without fix: Pulseaudio: has only playback or capture device Reaper via Pulseaudio: only in or out works, not both at the same time Reaper via ALSA: in and out at the same time (full duplex) ok Renoise via ALSA: only in or out are acivatable with fix: PulseAudio: device is listed as Full Duplex Reaper via Pulseaudio: in and out at the same worked glitchy, most time reaper freezes (maybe settings in pa) Reaper via ALSA: in and out at the same time ok Renoise via ALSA: in and out at least activatable at the same time with fix (but freezes when trying to stream audio when both active) Renoise log: Renoise LOG> ALSA: Opening ALSA Playback Device 'hw:1,0 (USB Audio)'... Renoise LOG> ALSA: Open ALSA Output Device OK Renoise LOG> ALSA: Max channels is 4 for Playback... ... Renoise LOG> ALSA: Opening ALSA Capture Device 'hw:1,0 (USB Audio)'... Renoise LOG> ALSA: Open ALSA Capture Device OK Renoise LOG> ALSA: Max channels is 16 for Capture... ... Renoise LOG> ALSA: Setup Buffers... Renoise LOG> ALSA: Configuring the Player... Renoise LOG> ALSA: Creating the poll thread... Renoise LOG> ALSA: Successfully created an ALSA RT thread Renoise LOG> ALSA: Up and running... Renoise LOG> Threads: Timeout while waiting for Thread2 to finish its commands!! Info: Focusrite Scarlet 6i6 can be used in all cases mentioned above as full duplex on the same machine, same usb port Somehow I had unstable behavior in the tests with the fix with no reproducibility or clear error logs. I am a bit busy atm because of work but I will try to further investigate the problem. OK, could you give lsusb -v output of the corresponding device? I'll see what can be done in the driver, but would like to check the actual endpoints beforehand. Created attachment 284805 [details]
lsusb -v of Behringer UFX1604
lsusb -v of Behringer UFX1604
The driver currently doesn't properly support SYNC endpoints. So I believe implicit feedback quirk was enabled for several devices as a workaround. That way playback stream is synchronized to capture stream which in turn is synchronized to SOF by the device. %) (In reply to Alexander Tsoy from comment #24) > The driver currently doesn't properly support SYNC endpoints. So I believe > implicit feedback quirk was enabled for several devices as a workaround. > That way playback stream is synchronized to capture stream which in turn is > synchronized to SOF by the device. %) Sorry, I was wrong. It seems isochronous transfers are always synchronized to SOF. Created attachment 288337 [details]
max_packs_per_urb.patch
Please try attached patch. Decreasing the number of packs per URB fixes pops and clicks for my MOTU Microbook IIc which also has synchronous enpoints. This is especially noticeable on high sample rates like 96 kHz.
(In reply to Alexander Tsoy from comment #26) > Created attachment 288337 [details] > max_packs_per_urb.patch > > Please try attached patch. Decreasing the number of packs per URB fixes pops > and clicks for my MOTU Microbook IIc which also has synchronous enpoints. > This is especially noticeable on high sample rates like 96 kHz. With the use of the implicit_fb_quirk I did not notice pops and clicks with the Behringer UFX. However, the use of full duplex creates problems. If I'm not mistaken it's about the same issue as in bug #103751 and bug #207023?! (In reply to senderberlin from comment #27) Yes, I also can workaround pops and clicks issues by using implicit fb quirk. But it shouldn't be needed for devices that works is synchronous mode. What I noticed is that these glitches happens only when driver packs a lot of iso frames in URB, e.g. when pulseaudio with tsched or jack with very large buffer is used. So maybe this is a bug in scheduling code in hcd drivers or something like that. And of course it is possible that Behringer UFX* devices and Microbook II/IIc actually have different problems. We can say for sure if somebody with Behringer device would check if decreasing MAX_PACKS resolves pops and clicks issue. I reported this bug concerning the full duplex issue on a UFX1204. I am currently using the ful duplex work around proposed in comment #17 which brings back full duplex capabilities. I can also confirm that I see no issue with pops and clicks using this workaround. I am not using a microbook computer but rather an intel DZ77GA-70K motherboard. (In reply to Alexander Tsoy from comment #29) > And of course it is possible that Behringer UFX* devices and Microbook > II/IIc actually have different problems. We can say for sure if somebody > with Behringer device would check if decreasing MAX_PACKS resolves pops and > clicks issue. The max_packs_per_urb.patch from comment #26 did not resolves pops and clicks issue for me. I tested values 4, 2 and 1 for MAX_PACKS, all had the issue. So far, only the implicit_feedback_quirk has helped against pops and clicks. Created attachment 288509 [details]
packs_per_urb_sync.patch
That unfortunate. :(
OK, this is my last attempt. For me it gives even better result than with MAX_PACKS = 4. Could you give it a try?
(In reply to Alexander Tsoy from comment #32) > Created attachment 288509 [details] > packs_per_urb_sync.patch > > That unfortunate. :( > > OK, this is my last attempt. For me it gives even better result than with > MAX_PACKS = 4. Could you give it a try? I really appreciate your effort! Unfortunately, this patch does not improve either: constant pops and clicks every few 100 ms or so (not loud, sounds like some sort of scheduling / buffer size mismatch stuff). If implicit_feedback_quirk is used, the sound is completely clear. Created attachment 296313 [details]
Patch to set clock selector to default clock upon rate change
Could someone else test the attached patch and see if it resolves the pops and clicks on full-duplex mode for the UFX1604?
I own one of those and it took me quite some time to track down the problem. This patch resolves the problem for me.
I also turned off implicit feedback for my UFX1604 and I consider it unneeded.
Also, I'm interested in seeing an lsusb -v of the UFX1204.
Created attachment 296319 [details]
More complete patch disabling unneeded implicit feedback and setting clock selector to default clock on rate change for UFX1604
After re-reading the thread it is clear to me that implicit feedback for the UFX1604/UFX1204 needs to be disabled.
This is a more complete patch that disables that and for the UFX1604 only sets the clock selector to its pin 1 default clock synced to the USB SOF upon rate change. This is needed because apparently the endpoints are hardwired to the clock selector and after we change the rate on the main USB SOF synced clock the clock selector is left in a halfway state in regards to the sampling rate.
That's why the pops and clicks aren't evident at stock 48000Hz, become slightly audible at 44100Hz and detestable at 96000Hz. Seems the clock selector needs a nudge or it will screw up the sync.
Unfortunately I don't have access to the lsusb -v of the UFX1204 so please someone could share it here? This patch needs some more love from the community.
As the OP of this bug I would like to point out that it is regarding the inability to use duplex mode on an UFC1204 as a result of a regression. For me Pops and clicks were never an issue prior or after the regression. I still use comment #17 as a temporary fix. Please open a different bug for pops and clicks issues. Sylvain, I don't think you understand, the only reason implicit feedback was engaged for both the UFX1204 and the UFX1604 was because there were numerous complaints and Takashi Iwai thought it could solve the pops and clicks... see https://patchwork.kernel.org/project/alsa-devel/patch/20190820070110.4361-1-tiwai@suse.de/ Not only it doesn't solve pops and clicks it creates additional problems. I believe I tracked the crux of the problem with my patch and then we can solve your problem without the need for quick hacks. Could you provide the lsusb -v of your UFX1204? It would mean much to me! Created attachment 296321 [details]
lsusb -v of UFS1204
My point exactly, if others are complaining about other stuff let them open bugs relating to their own issues. This bug might be a consequence of changes made in attempt to fix other bugs but this regression bug is about loosing Duplex capabilities. I add the lsusb of the UFX1204 as it can help you as well as this bug.
Thanks for the feedback, Sylvain. Your lsusb -v confirms that the clocks of the UFX1204 are similar to the UFX1604. I submitted my preliminary patch to alsa-devel and Cc'ed Takashi Iwai. He chastised me a bit for hard-coding the clock number into the kernel and provided a more generic approach that should work for the UFX1204 as well as for the UFX1604. This all means that there's a good chance your regression can be soon fixed in mainline. The root cause of your regression is that the first bad commit you tracked down (thanks for that!) engages implicit feedback for a synchronous endpoint. The bad commit was done to try to fix complaints of noise / pops / clicks when these devices first for the UFX1204 then later for the UFX1604. The bad commit is unneeded now that a proper fix is available. Created attachment 296323 [details]
Patch to set the UFX1204 clock selector to default clock upon rate change. Removes hacky implicit feedback entries
Sylvain, could you please try the attached patch on top of a vanilla kernel and see if your PulseAudio issue is resolved without any nasty side effects?
I'm going to submit the clock selector setup patch and merge it for 5.13 kernel. The removal of implicit feedback quirks is done in a separate patch, once after confirmed that it doesn't do anything harmful. In principle, the implicit fb mode itself shouldn't do much bad. There have been a few bugs at the beginning of 5.11 kernel that broke PA on some devices, and this could be the case, too. Those should have been already fixed in the latest code, so please check it. Created attachment 296369 [details]
Patch to fix clock selector setup
All should have been merged in 5.13. Let's close the bug. It's working again, Thanks! |