Bug 216836

Summary: Arch linux kernel 6.1 Displayport HDMI no sound
Product: Drivers
Reporter: arthur (arthur.widetschek)
Component: Sound(ALSA)
Assignee: Jaroslav Kysela (perex)
Status: RESOLVED CODE_FIX
Severity: normal
CC: bevan, laser.eyess.trackers, perex, peter.bo, tiwai
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 6.1.1
Subsystem:
Regression: No
Bisected commit-id:
Attachments: alsa-info 6.0.12
alsa-info 6.1.1
Debug patch #1
kernel log with patch applied
kernel log 6.1 with patch applied
alsa-info.sh (Thinkpad T14s, HDMI device missing)
alsa-info.sh (Thinkpad T14s, ef6f reverted, HDMI device available)
pulseaudio output (Thinkpad T14s, HDMI device missing)
pulseaudio output (Thinkpad T14s, ef6f reverted, HDMI device available)
pulseaudio debug output (Thinkpad T14s, HDMI device missing)
Test fix patch
alsa-info.sh (Thinkpad T14s, comment#24 applied, HDMI device still missing)
Pulseaudio debug log (Thinkpad T14s, comment#24 applied, HDMI device still missing)
Test fix patch 2

Description arthur 2022-12-23 11:04:05 UTC
After upgrading from kernel 6.0.12 to linux 6.1.1.arch1-1, I had no audio output via either HDMI or DisplayPort.
Downgrading to kernel 6.0.12 helped. I only downgraded the kernel, so it is definitely a kernel issue.
My System:
Arch Linux
ThinkPad P14s Gen 2a
AMD Ryzen 7 PRO 5850U with Radeon Graphics
Comment 1 arthur 2022-12-23 11:14:06 UTC
With kernel 6.1.1.arch1-1, the HDMI / DisplayPort output is also not shown under System Preferences - Audio - Output. With 6.0.12 it is shown.
Comment 2 arthur 2022-12-23 12:07:13 UTC
There is also strange behaviour with the external display connected to the laptop: after unplugging it and plugging it in again, nothing is displayed. This also appeared after the upgrade to 6.1 and went away after the downgrade.
Comment 3 arthur 2022-12-26 16:36:35 UTC
I tested this bug on an Intel (12th gen) machine too. Worked. It seems to be only an AMD Issue.
Comment 4 laser.eyess.trackers 2022-12-26 17:05:45 UTC
This is a regression and someone has bisected it to ef6f5494faf6a37c74990689a3bb3cee76d2544c[0]. There is additional discussion on Arch Linux's bug tracker[1] and forums[2] about the issue. In particular, multiple users have confirmed that reverting ef6f5494faf6a37c74990689a3bb3cee76d2544c fixes it.

I can confirm that reverting the commit fixes the issue on my Ryzen 5 PRO 6650U CPU with Radeon 680M GPU.

[0] https://github.com/torvalds/linux/commit/ef6f5494faf6a37c74990689a3bb3cee76d2544c
[1] https://bugs.archlinux.org/task/76917
[2] https://bbs.archlinux.org/viewtopic.php?pid=2075553#p2075553
Comment 5 Jaroslav Kysela 2022-12-26 18:55:51 UTC
*** Bug 216853 has been marked as a duplicate of this bug. ***
Comment 6 Jaroslav Kysela 2022-12-26 18:57:02 UTC
Created attachment 303475 [details]
alsa-info 6.0.12
Comment 7 Jaroslav Kysela 2022-12-26 18:57:52 UTC
Created attachment 303476 [details]
alsa-info 6.1.1
Comment 8 Jaroslav Kysela 2022-12-26 19:14:30 UTC
Created attachment 303477 [details]
Debug patch #1

Debug patch #1. Please, apply this patch on top of the 6.1 kernel and report back atihdmi lines from dmesg (ksyslog).
Comment 9 Jaroslav Kysela 2022-12-26 19:25:18 UTC
Link to the alsa-devel ML discussion:

[3] https://lore.kernel.org/alsa-devel/87edsnxqmo.wl-tiwai@suse.de/
Comment 10 laser.eyess.trackers 2022-12-26 22:19:12 UTC
Created attachment 303478 [details]
kernel log with patch applied

Attached. This log is from 6.2-rc1, since that was easier for me; the issue still exists there. I will try directly on top of 6.1 next.
Comment 11 laser.eyess.trackers 2022-12-26 23:20:08 UTC
Created attachment 303479 [details]
kernel log 6.1 with patch applied

Same log, with 6.1 instead of 6.2-rc
Comment 12 Takashi Iwai 2022-12-27 08:52:56 UTC
From a quick glance, I couldn't find anything obvious.  I guess it's something about how pipewire recognizes the device.

When you run the command below, do you get sound from HDMI?
  % aplay -Dhdmi:0 something.wav

where something.wav is a stereo WAV file, 16 or 32 bit, at 44.1 or 48 kHz.
Comment 13 Michael Laß 2022-12-27 13:20:31 UTC
Created attachment 303480 [details]
alsa-info.sh (Thinkpad T14s, HDMI device missing)
Comment 14 Michael Laß 2022-12-27 13:21:04 UTC
Created attachment 303481 [details]
alsa-info.sh (Thinkpad T14s, ef6f reverted, HDMI device available)
Comment 15 Michael Laß 2022-12-27 13:22:05 UTC
Created attachment 303482 [details]
pulseaudio output (Thinkpad T14s, HDMI device missing)
Comment 16 Michael Laß 2022-12-27 13:22:33 UTC
Created attachment 303483 [details]
pulseaudio output (Thinkpad T14s, ef6f reverted, HDMI device available)
Comment 17 Michael Laß 2022-12-27 13:23:03 UTC
Created attachment 303484 [details]
pulseaudio debug output (Thinkpad T14s, HDMI device missing)
Comment 18 Michael Laß 2022-12-27 13:31:41 UTC
I also attached my alsa-info.sh outputs here, both for a relatively stock 6.1 kernel and the one where commit ef6f has been reverted.

I tested the "aplay -Dhdmi:0 something.wav" command when the device is missing on the stock kernel, and indeed this _does_ output audio. So the playback works - the device is just invisible.

I am using pulseaudio on my system and had a look into the system journal. Interestingly, pulseaudio outputs some errors on the stock kernel, which it does not output with ef6f reverted. I attached both outputs here as well.

To find out more about these errors, I started pulseaudio with a couple of -v flags on the affected kernel version. The verbose output is attached to this report as well. A couple of lines before the errors one can see output like the following:

I: [pulseaudio] alsa-util.c: Error opening PCM device _ucm0001.hw:Generic,7: Device or resource busy
I: [pulseaudio] alsa-util.c: Error opening PCM device _ucm0006.hw:Generic,7: Device or resource busy
I: [pulseaudio] alsa-util.c: Error opening PCM device _ucm0007.hw:Generic,7: Device or resource busy
I: [pulseaudio] alsa-util.c: Error opening PCM device _ucm0008.hw:Generic,7: Device or resource busy
I: [pulseaudio] alsa-util.c: Error opening PCM device _ucm0009.hw:Generic,7: Device or resource busy
I: [pulseaudio] alsa-util.c: Error opening PCM device _ucm000A.hw:Generic,7: Device or resource busy
Comment 19 pietinger 2022-12-27 13:46:13 UTC
I don't think it is an AMD problem. I have an i7-6700 on a Gigabyte Z270-K5 and I updated from 5.15.80 to 6.1.1
(when I did my "make oldconfig" I have accepted all defaults)
I have no pulseaudio or pipewire - just plain ALSA with HDMI output to monitor (via DP).
I am using no systemd; just OpenRC as init system. My alsa RC-script (from Gentoo) threw many errors about missing devices. (yes, devtmpfs is enabled in my .config; it is a Gentoo default)
Comment 20 Jaroslav Kysela 2022-12-27 16:49:51 UTC
(In reply to pietinger from comment #19)
> I don't think it is an AMD problem. I have an i7-6700 on a Gigabyte Z270-K5
> and I updated from 5.15.80 to 6.1.1 (when I did my "make oldconfig" I have
> accepted all defaults)
> I have no pulseaudio or pipewire - just plain ALSA with HDMI output to
> monitor (via DP).
> I am using no systemd; just OpenRC as init system. My alsa RC-script (from
> Gentoo) threw many errors about missing devices. (yes, devtmpfs is enabled
> in my .config; it is a Gentoo default)

The HDMI device list changed with my patch. Basically, it does not make sense to create a lot of devices when only a few can be used simultaneously (limited by the audio converters, usually 3 or 4 on Intel hardware). pipewire / pulseaudio get -EBUSY errors when those devices are probed, and this can end in a state where no HDMI devices are detected. Create another bug if you find any issues with the current code for the Intel chips, but it seems that the script code is just bad.

The first connected monitor should be available on the first HDMI device (audio converter) now.
Comment 21 Jaroslav Kysela 2022-12-27 16:57:18 UTC
Trying to analyze the last logs:

patch reverted:

  card 0: Generic [HD-Audio Generic], device 3: HDMI 0 [PL2792Q]
  card 0: Generic [HD-Audio Generic], device 7: HDMI 1 [HDMI 1]
  card 0: Generic [HD-Audio Generic], device 8: HDMI 2 [HDMI 2]

with the patch:

  card 0: Generic [HD-Audio Generic], device 3: HDMI 0 [PL2792Q]
  card 0: Generic [HD-Audio Generic], device 7: HDMI 1 [HDMI 1]
  card 0: Generic [HD-Audio Generic], device 8: HDMI 2 [HDMI 2]
  card 0: Generic [HD-Audio Generic], device 9: HDMI 3 [HDMI 3]
  card 0: Generic [HD-Audio Generic], device 10: HDMI 4 [HDMI 4]
  card 0: Generic [HD-Audio Generic], device 11: HDMI 5 [HDMI 5]

which corresponds to the Audio Output HDA widgets:

  Codec: ATI R6xx HDMI
  Node 0x02 [Audio Output] wcaps 0x221: Stereo Digital Stripe
  Node 0x04 [Audio Output] wcaps 0x221: Stereo Digital Stripe
  Node 0x06 [Audio Output] wcaps 0x221: Stereo Digital Stripe
  Node 0x08 [Audio Output] wcaps 0x221: Stereo Digital Stripe
  Node 0x0a [Audio Output] wcaps 0x221: Stereo Digital Stripe
  Node 0x0c [Audio Output] wcaps 0x221: Stereo Digital Stripe

So my code changed the device allocation scheme for this type of hw. Let me check.
Comment 22 Takashi Iwai 2022-12-27 17:04:27 UTC
Yeah, the commit had two possible side effects:
- It may expose more devices than before
- The device order may be changed

But both points shouldn't matter as long as UCM is used and that's the case for PA and pipewire.  So there must be some other things.

My gut feeling is that it's around the open for no-pin PCM.  This wasn't a problem for AMD with the static PCM mapping, but now it might conflict.
(On Intel, it's no problem because each pin can choose any converters.)

Can anyone check whether the below works better?
Comment 23 Takashi Iwai 2022-12-27 17:05:27 UTC
Created attachment 303485 [details]
Test fix patch
Comment 24 Jaroslav Kysela 2022-12-27 17:25:41 UTC
I think that this patch may work better (I'd like to keep fewer devices with the dynamic mapping):

  diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
  index 8015e4471267..18dc86c50f72 100644
  --- a/sound/pci/hda/patch_hdmi.c
  +++ b/sound/pci/hda/patch_hdmi.c
  @@ -2282,7 +2282,7 @@ static int generic_hdmi_build_pcms(struct hda_codec *codec)
          int idx, pcm_num;
 
          /* limit the PCM devices to the codec converters */
  -       pcm_num = spec->num_cvts;
  +       pcm_num = min(spec->num_nids, spec->num_cvts);
          codec_dbg(codec, "hdmi: pcm_num set to %d\n", pcm_num);
 
          for (idx = 0; idx < pcm_num; idx++) {


Looking at the old code, this is what changed:

  -       /*
  -        * for non-mst mode, pcm number is the same as before
  -        * for DP MST mode without extra PCM, pcm number is same
  -        * for DP MST mode with extra PCMs, pcm number is
  -        *  (nid number + dev_num - 1)
  -        * dev_num is the device entry number in a pin
  -        */
  -
  -       if (spec->dyn_pcm_no_legacy && codec->mst_no_extra_pcms)
  -               pcm_num = spec->num_cvts;
  -       else if (codec->mst_no_extra_pcms)
  -               pcm_num = spec->num_nids;
  -       else
  -               pcm_num = spec->num_nids + spec->dev_num - 1;
  -
  +       /* limit the PCM devices to the codec converters */
  +       pcm_num = spec->num_cvts;
          codec_dbg(codec, "hdmi: pcm_num set to %d\n", pcm_num);

So it appears that pcm_num == num_nids for the old code.
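
For readers following along, the three ways of computing pcm_num discussed in this comment can be compared side by side. This is a minimal standalone sketch, not kernel code: struct pcm_count_inputs and the three function names are hypothetical, the input values (3 usable pins, 6 converters as in the Audio Output widget list from comment #21) are taken from the thread, and dev_num = 1 together with dyn_pcm_no_legacy being unset are assumptions about this codec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hdmi_spec / hda_codec fields used here. */
struct pcm_count_inputs {
    int num_nids;            /* usable pins */
    int num_cvts;            /* audio converters */
    int dev_num;             /* device entries per pin; 1 assumed here */
    bool dyn_pcm_no_legacy;  /* assumed unset for this codec */
    bool mst_no_extra_pcms;
};

/* pcm_num as computed by the code removed in ef6f (see the hunk above). */
static int pcm_num_old(const struct pcm_count_inputs *in)
{
    assert(in != NULL);
    if (in->dyn_pcm_no_legacy && in->mst_no_extra_pcms)
        return in->num_cvts;
    if (in->mst_no_extra_pcms)
        return in->num_nids;
    return in->num_nids + in->dev_num - 1;
}

/* pcm_num as computed in 6.1: one PCM per converter. */
static int pcm_num_6_1(const struct pcm_count_inputs *in)
{
    assert(in != NULL);
    return in->num_cvts;
}

/* pcm_num with the one-line change proposed in this comment. */
static int pcm_num_proposed(const struct pcm_count_inputs *in)
{
    assert(in != NULL);
    return in->num_nids < in->num_cvts ? in->num_nids : in->num_cvts;
}

/* Values for the affected AMD codec, under the assumptions stated above. */
static const struct pcm_count_inputs amd_example = {
    .num_nids = 3, .num_cvts = 6, .dev_num = 1,
    .dyn_pcm_no_legacy = false, .mst_no_extra_pcms = false,
};
```

Under those assumptions the old code and the proposed change both yield 3 PCM devices, while the 6.1 code yields 6, which matches the device lists shown in comment #21.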
Comment 25 Jaroslav Kysela 2022-12-27 17:53:44 UTC
Further comments:

  Node 0x03 [Pin Complex] wcaps 0x400381: Stereo Digital
    Pin Default 0x185600f0: [Jack] Digital Out at Int HDMI
  Node 0x05 [Pin Complex] wcaps 0x400381: Stereo Digital
    Pin Default 0x185600f0: [Jack] Digital Out at Int HDMI
  Node 0x07 [Pin Complex] wcaps 0x400381: Stereo Digital
    Pin Default 0x185600f0: [Jack] Digital Out at Int HDMI
  Node 0x09 [Pin Complex] wcaps 0x400381: Stereo Digital
    Pin Default 0x585600f0: [N/A] Digital Out at Int HDMI
  Node 0x0b [Pin Complex] wcaps 0x400381: Stereo Digital
    Pin Default 0x585600f0: [N/A] Digital Out at Int HDMI
  Node 0x0d [Pin Complex] wcaps 0x400381: Stereo Digital
    Pin Default 0x585600f0: [N/A] Digital Out at Int HDMI

So PINs 0x09,0x0b,0x0d are N/A by default (and skipped in hdmi_add_pin()):

        /*
         * For DP MST audio, Configuration Default is the same for
         * all device entries on the same pin
         */
        config = snd_hda_codec_get_pincfg(codec, pin_nid);
        if (get_defcfg_connect(config) == AC_JACK_PORT_NONE &&
            !spec->force_connect)
                return 0;


So spec->num_nids should be 3 for this hw and the simple one line patch in comment#24 should work.
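
The count of 3 can be checked by hand from the Pin Default values above. The following is a minimal standalone sketch, not kernel code: port connectivity is stored in bits 31:30 of the HDA Configuration Default register, and a value of 1 means no physical connection ("[N/A]" in the dump). The helper names port_connectivity() and count_usable_pins() are hypothetical, and force_connect is assumed to be off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Port connectivity lives in bits 31:30 of the Configuration Default. */
#define DEFCFG_PORT_CONN_SHIFT 30
#define DEFCFG_PORT_CONN_MASK  0x3u
#define JACK_PORT_NONE         0x1u  /* shown as "[N/A]" in the codec dump */

/* Hypothetical helper: extracts the same field as get_defcfg_connect(). */
static unsigned int port_connectivity(uint32_t defcfg)
{
    return (defcfg >> DEFCFG_PORT_CONN_SHIFT) & DEFCFG_PORT_CONN_MASK;
}

/*
 * Hypothetical helper: count the pins that the check quoted above would
 * not skip (force_connect assumed to be off).
 */
static size_t count_usable_pins(const uint32_t *defcfgs, size_t n)
{
    size_t i, count = 0;

    assert(defcfgs != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (port_connectivity(defcfgs[i]) != JACK_PORT_NONE)
            count++;
    return count;
}

/* Pin Default values of nodes 0x03, 0x05, 0x07, 0x09, 0x0b, 0x0d above. */
static const uint32_t pin_defaults[] = {
    0x185600f0, 0x185600f0, 0x185600f0,
    0x585600f0, 0x585600f0, 0x585600f0,
};
```

Calling count_usable_pins(pin_defaults, 6) gives 3: the three 0x185600f0 pins have connectivity 0 (jack), and the three 0x585600f0 pins have connectivity 1 (none).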
Comment 26 Michael Laß 2022-12-27 18:09:31 UTC
The "Test fix patch" does not seem to help. I'm testing the one-line patch from comment#24 next. Compiling always takes a little while...
Comment 27 Michael Laß 2022-12-27 19:20:39 UTC
Unfortunately, the proposed change from comment#24 also does not help here. I see that the number of devices is reduced in that the alsa-info.sh output does not show those additional devices anymore. It's now closer to the output with ef6f reverted but not identical:

* For nodes 0x02, 0x04 and 0x06 "Digital" changed from "Enabled" to [blank]
* For nodes 0x03, 0x05 and 0x07 the three Controls vanished
* Changes down in the mixer section but that may just be different settings...

I'll attach the new output here as well.
Comment 28 Michael Laß 2022-12-27 19:21:45 UTC
Created attachment 303487 [details]
alsa-info.sh (Thinkpad T14s, comment#24 applied, HDMI device still missing)
Comment 29 Jaroslav Kysela 2022-12-27 19:26:35 UTC
Could you also attach the debug output from pulseaudio ?
Comment 30 Michael Laß 2022-12-27 19:36:31 UTC
Created attachment 303488 [details]
Pulseaudio debug log (Thinkpad T14s, comment#24 applied, HDMI device still missing)
Comment 31 Jaroslav Kysela 2022-12-27 19:48:11 UTC
It seems that Takashi's patch from comment#23 should be applied, too:

D: [pulseaudio] alsa-util.c: Managed to open _ucm0001.hw:Generic,8
D: [pulseaudio] alsa-util.c: Managed to open _ucm0001.hw:Generic,7
D: [pulseaudio] alsa-util.c: Trying _ucm0001.hw:Generic,3 with SND_PCM_NO_AUTO_FORMAT ...
I: [pulseaudio] alsa-util.c: Error opening PCM device _ucm0001.hw:Generic,3: Device or resource busy

Could you try the kernel with both patches?
Comment 32 Jaroslav Kysela 2022-12-27 19:50:59 UTC
Looking at the patch from comment#23 - it won't work (spec->num_pins == spec->num_cvts condition - 3 != 6). Give me a little time to further analyze this problem.
Comment 33 Jaroslav Kysela 2022-12-27 20:17:18 UTC
Created attachment 303490 [details]
Test fix patch 2

Please, try this patch.
Comment 34 Michael Laß 2022-12-27 21:46:27 UTC
Yes, the patch from comment#33 (attachment 303490 [details]) works for me.

I also tested whether the patches from comments #23 and #24 combined would work, but they don't. So whatever you added in your latest patch did it.
Comment 35 laser.eyess.trackers 2022-12-28 00:03:18 UTC
The patch from comment#33 also works for me. I tested the first patch on its own and it didn't work; I did not test the first patch and the one-liner together.
Comment 36 Takashi Iwai 2022-12-28 08:43:20 UTC
(In reply to Jaroslav Kysela from comment #32)
> Looking at the patch from comment#23 - it won't work (spec->num_pins ==
> spec->num_cvts condition - 3 != 6).

Ah right, the check should be

       if (!codec->dp_mst && spec->num_pins <= spec->num_cvts)
               spec->static_pcm_mapping = true;

instead.  We can set the flag like your 2nd patch, too; it seems that AMD is the only chip that suffers from this problem (others are with "simple" types).

AFAIU, the problem is that hdmi_pcm_open() calls hdmi_pcm_open_no_pin() if no pin has been assigned yet to the PCM stream to be opened.  And the latter function takes *any* free converter.  If it takes a converter that is tied to a pin whose PCM is opened next, while the first stream is still open, that next open results in -EBUSY because the converter is already occupied.

The problem doesn't happen on Intel chips because each pin can choose any converter there, while the AMD chip has a 1:1 mapping between pin and converter.

And, hdmi_pcm_open_no_pin() is a workaround for the case where an unassigned stream is opened (PA tries to open each PCM for probing at first).  We may find some better way there, too, instead of applying the static pcm mapping.
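
The conflict described in this comment can be reproduced with a small model. This is a minimal standalone sketch, not the driver code: struct cvt_pool and all function names are hypothetical, the "lowest free index first" choice for unassigned streams is an assumption, and the probe order (two PCMs without a pin, then the PCM of the connected monitor on pin 0) follows the pulseaudio log quoted in comment #31.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CVTS 6  /* six Audio Output widgets, as in comment #21 */

/* Hypothetical model: one busy flag per audio converter. */
struct cvt_pool {
    bool busy[NUM_CVTS];
};

/* Stream without a pin: take any free converter (lowest index assumed). */
static int open_no_pin(struct cvt_pool *pool)
{
    assert(pool != NULL);
    for (int i = 0; i < NUM_CVTS; i++) {
        if (!pool->busy[i]) {
            pool->busy[i] = true;
            return i;
        }
    }
    return -EBUSY;
}

/* Stream with a pin, 1:1 pin/converter mapping (the AMD case). */
static int open_with_pin_fixed(struct cvt_pool *pool, int pin)
{
    assert(pool != NULL && pin >= 0 && pin < NUM_CVTS);
    if (pool->busy[pin])
        return -EBUSY;
    pool->busy[pin] = true;
    return pin;
}

/* Stream with a pin that may use any converter (the Intel case). */
static int open_with_pin_any(struct cvt_pool *pool, int pin)
{
    (void)pin;  /* the pin does not restrict the converter choice */
    return open_no_pin(pool);
}

/*
 * Probe like the sound server does: open two PCMs that have no pin yet,
 * keep them open, then open the PCM of the monitor on pin 0.
 * Returns the converter index for the monitor's PCM, or -EBUSY.
 */
static int probe_sequence(bool fixed_mapping)
{
    struct cvt_pool pool = { { false } };

    open_no_pin(&pool);  /* takes converter 0 */
    open_no_pin(&pool);  /* takes converter 1 */
    return fixed_mapping ? open_with_pin_fixed(&pool, 0)
                         : open_with_pin_any(&pool, 0);
}
```

In this model probe_sequence(true) returns -EBUSY, because converter 0 was already taken by an unassigned stream, while probe_sequence(false) succeeds with converter 2. A static PCM mapping avoids the first case because each PCM only ever uses its own converter.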
Comment 37 Takashi Iwai 2022-12-28 09:42:29 UTC
(In reply to Takashi Iwai from comment #36)
> (In reply to Jaroslav Kysela from comment #32)
> instead.  We can set the flag like your 2nd patch, too; it seems that AMD is
> the only chip that suffers from this problem (others are with "simple"
> types).

That said, I'm fine with this 2nd patch, and if Jaroslav agrees, I'm going to submit a formal patch.
Comment 38 Jaroslav Kysela 2022-12-28 11:42:21 UTC
The nice feature of the dynamic pin/pcm assignment is that the active PCMs are assigned in the order (first device, second device ...). But we need a quick fix for now. I'm fine with the 2nd patch. Add my Signed-off-by or Reviewed-by line as you wish.
Comment 39 Takashi Iwai 2022-12-28 13:00:38 UTC
OK, submitted:
  https://lore.kernel.org/r/20221228125714.16329-1-tiwai@suse.de

Will merge and send a PR for 6.2-rc2 later.
Comment 40 Takashi Iwai 2022-12-28 13:04:21 UTC
Let's close.
Comment 41 arthur 2022-12-28 13:48:48 UTC
Could you please tell us, or link to, the kernel version(s) in which this bug will be fixed? Perhaps also in 6.1.x?
Comment 42 pietinger 2022-12-29 13:49:42 UTC
(In reply to Jaroslav Kysela from comment #20)
> 
> The HDMI device list changed with my patch. [...] Create
> another bug, if you find any issues with the current code for the Intel
> chips, but it seems that the script code is just bad.
> 
> The first connected monitor should be available on the first HDMI device
> (audio converter) now.

Thank you very much for your answer! You are right, it is not an Intel problem. I am reporting back because other users could run into the same problem I had:

I have to use an /etc/asound.conf because my left and right channels are swapped. Previously I had:

defaults.pcm.!card 0
defaults.pcm.!device 7
pcm.swapped {
    type         route
    slave.pcm    "cards.pcm.default"
    ttable.0.1   1
    ttable.1.0   1
}
pcm.default      pcm.swapped

I had to edit it and now I use:

...
defaults.pcm.!device 3
...


Thanks again for your friendly and fast answer!