Bug 200945 - No HDMI audio output with CONFIG_VGA_SWITCHEROO=y listed in config file
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Sound(ALSA)
Hardware: x86-64 Linux
Importance: P1 high
Assignee: Jaroslav Kysela
Reported: 2018-08-27 05:27 UTC by jian-hong
Modified: 2018-09-14 14:19 UTC

Kernel Version: 4.18.5
Regression: No

Attachments
kernel 4.18.5 build CONFIG_VGA_SWITCHEROO=y in config (130.31 KB, text/x-mpsub)
2018-08-27 05:27 UTC, jian-hong
dmesg of kernel 4.18.5 build CONFIG_VGA_SWITCHEROO=y in config (1.43 MB, text/plain)
2018-08-27 05:28 UTC, jian-hong
kernel 4.18.5 build without CONFIG_VGA_SWITCHEROO=y in config (130.32 KB, text/x-mpsub)
2018-08-27 05:30 UTC, jian-hong
dmesg of kernel 4.18.5 build without CONFIG_VGA_SWITCHEROO=y in config (1.40 MB, text/plain)
2018-08-27 05:31 UTC, jian-hong
kernel 4.16.18 build CONFIG_VGA_SWITCHEROO=y in config (131.17 KB, text/x-mpsub)
2018-08-27 05:32 UTC, jian-hong
dmesg of kernel 4.16.18 build CONFIG_VGA_SWITCHEROO=y in config (1.45 MB, text/plain)
2018-08-27 05:33 UTC, jian-hong
kernel 4.19rc1 build CONFIG_VGA_SWITCHEROO=y in config (131.45 KB, text/plain)
2018-08-27 06:44 UTC, jian-hong
dmesg of kernel 4.19rc1 build CONFIG_VGA_SWITCHEROO=y in config (63.63 KB, text/plain)
2018-08-27 06:45 UTC, jian-hong
kernel 4.19rc1 build without CONFIG_VGA_SWITCHEROO=y in config (131.46 KB, text/plain)
2018-08-27 06:46 UTC, jian-hong
dmesg of kernel 4.19rc1 build without CONFIG_VGA_SWITCHEROO=y in config (63.61 KB, text/plain)
2018-08-27 06:47 UTC, jian-hong
snd_hda_codec_hdmi: amdgpu: Radeon Vega HDMI audio experiment (9.74 KB, application/mbox)
2018-09-05 05:42 UTC, jian-hong
dmesg of "snd_hda_codec_hdmi: amdgpu: Radeon Vega HDMI audio experiment" patch (338.66 KB, text/plain)
2018-09-05 05:46 UTC, jian-hong
Possible fix patch (untested) (2.21 KB, patch)
2018-09-06 05:47 UTC, Takashi Iwai
dmesg of the attachment 278349 based on Linux kernel 4.19-rc2 (69.83 KB, text/plain)
2018-09-10 07:16 UTC, jian-hong
Test fix (revised) (3.75 KB, patch)
2018-09-10 14:28 UTC, Takashi Iwai
dmesg of resuming back based on Linux kernel 4.19-rc3 with patch attachment 278421. (70.24 KB, text/plain)
2018-09-12 06:45 UTC, jian-hong
dmesg2 of resuming back based on Linux kernel 4.19-rc3 with patch attachment 278421. (71.22 KB, text/plain)
2018-09-12 08:59 UTC, jian-hong
Test fix patch (v2) (9.05 KB, patch)
2018-09-12 18:26 UTC, Takashi Iwai
alsa-info of ASUS X505ZA (23.89 KB, text/plain)
2018-09-13 02:42 UTC, jian-hong

Description jian-hong 2018-08-27 05:27:13 UTC
Created attachment 278103
kernel 4.18.5 build CONFIG_VGA_SWITCHEROO=y in config

We have an ASUS X505ZA laptop equipped with an AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx.

With a kernel built with "CONFIG_VGA_SWITCHEROO=y", audio output over the external HDMI never works and cannot be selected.
Its status is reported as not connected:
dev@endless:~$ cat /proc/asound/card0/eld#0.0 
monitor_present		0
eld_valid		0

This happens even though I can already switch the display mode to mirror or join displays.

However, with a kernel built without "CONFIG_VGA_SWITCHEROO=y", the external HDMI audio output always works correctly.

By the way, this issue does not happen with kernel 4.16.18, which always works correctly.
Comment 1 jian-hong 2018-08-27 05:28:19 UTC
Created attachment 278105
dmesg of kernel 4.18.5 build CONFIG_VGA_SWITCHEROO=y in config
Comment 2 jian-hong 2018-08-27 05:30:36 UTC
Created attachment 278107
kernel 4.18.5 build without CONFIG_VGA_SWITCHEROO=y in config
Comment 3 jian-hong 2018-08-27 05:31:26 UTC
Created attachment 278109
dmesg of kernel 4.18.5 build without CONFIG_VGA_SWITCHEROO=y in config
Comment 4 jian-hong 2018-08-27 05:32:30 UTC
Created attachment 278111
kernel 4.16.18 build CONFIG_VGA_SWITCHEROO=y in config
Comment 5 jian-hong 2018-08-27 05:33:52 UTC
Created attachment 278113
dmesg of kernel 4.16.18 build CONFIG_VGA_SWITCHEROO=y in config
Comment 6 Takashi Iwai 2018-08-27 05:37:44 UTC
Could you try the freshly released 4.19-rc1?  There is a vga_switcheroo fix for such dual AMD GPUs.  If it fixes the issue, we can push it to the stable tree.
Comment 7 jian-hong 2018-08-27 05:40:19 UTC
We have another laptop, an ASUS X570ZD equipped with an AMD Ryzen 7 2700U with Radeon Vega Mobile Gfx and an NVIDIA GeForce GTX 1050 Mobile, which has the same issue.
Comment 8 jian-hong 2018-08-27 06:43:53 UTC
(In reply to Takashi Iwai from comment #6)
> Could you try the freshly released 4.19-rc1?  There is a vga_switcheroo fix
> about such dual AMD GPUs.  If it fixes the issue, we can push it to stable
> tree.

Tested with 4.19-rc1, but the same issue still happens.
Comment 9 jian-hong 2018-08-27 06:44:59 UTC
Created attachment 278115
kernel 4.19rc1 build CONFIG_VGA_SWITCHEROO=y in config
Comment 10 jian-hong 2018-08-27 06:45:38 UTC
Created attachment 278117
dmesg of kernel 4.19rc1 build CONFIG_VGA_SWITCHEROO=y in config
Comment 11 jian-hong 2018-08-27 06:46:57 UTC
Created attachment 278119
kernel 4.19rc1 build without CONFIG_VGA_SWITCHEROO=y in config
Comment 12 jian-hong 2018-08-27 06:47:57 UTC
Created attachment 278121
dmesg of kernel 4.19rc1 build without CONFIG_VGA_SWITCHEROO=y in config
Comment 13 Takashi Iwai 2018-08-27 06:51:37 UTC
OK, then it's a different bug.

Cc'ing Lukas for taking a look at it.
Comment 14 Lukas Wunner 2018-08-27 10:20:25 UTC
The first one, ASUS X505ZA, has only a single (integrated) AMD GPU. What does the following show on this machine when you output audio on the external display:

cat /sys/bus/pci/devices/0000:03:00.1/power/runtime_status

If you try the following, does the issue go away?

echo on > /sys/bus/pci/devices/0000:03:00.1/power/control

As a shot in the dark, if you revert 57cb54e53bdd ("ALSA: hda - Force to link down at runtime suspend on ATI/AMD HDMI"), does the issue go away?
Comment 15 jian-hong 2018-08-28 02:54:24 UTC
Tested with vanilla 4.19-rc1 and CONFIG_VGA_SWITCHEROO=y in the config on the ASUS X505ZA.

(In reply to Lukas Wunner from comment #14)
> The first one, ASUS X505ZA, has only a single (integrated) AMD GPU. What
> does the following show on this machine when you output audio on the
> external display:
> 
> cat /sys/bus/pci/devices/0000:03:00.1/power/runtime_status

dev@endless:~$ cat /sys/bus/pci/devices/0000:03:00.1/power/runtime_status
suspended

> If you try the following, does the issue go away?
> 
> echo on > /sys/bus/pci/devices/0000:03:00.1/power/control

After setting the device's power control to "on", I get

dev@endless:~$ cat /sys/bus/pci/devices/0000:03:00.1/power/runtime_status
active

And HDMI audio works correctly.

> As a shot in the dark, if you revert 57cb54e53bdd ("ALSA: hda - Force to
> link down at runtime suspend on ATI/AMD HDMI"), does the issue go away?

I reverted commit 57cb54e53bdd ("ALSA: hda - Force to link down at runtime suspend on ATI/AMD HDMI"), but that does not help on the X505ZA. The issue still reproduces.
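
The manual check in this comment (read runtime_status, then force power/control to "on") can be scripted for repeated testing. The following is a minimal Python sketch, assuming the usual sysfs layout with power/control and power/runtime_status under the device directory. The helper names read_pm_state and set_pm_control and the sysfs_root parameter are made up for illustration, and writing the control file needs root on a real system.

```python
from pathlib import Path

# Default sysfs location of PCI devices; overridable so the helpers can be
# exercised against a fake directory tree.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def read_pm_state(bdf, sysfs_root=SYSFS_PCI):
    """Return the runtime PM 'control' and 'runtime_status' of a PCI function."""
    power = Path(sysfs_root) / bdf / "power"
    return {
        "control": (power / "control").read_text().strip(),
        "runtime_status": (power / "runtime_status").read_text().strip(),
    }


def set_pm_control(bdf, mode, sysfs_root=SYSFS_PCI):
    """Write 'on' or 'auto' to power/control (needs root on a real system)."""
    if mode not in ("on", "auto"):
        raise ValueError("mode must be 'on' or 'auto'")
    (Path(sysfs_root) / bdf / "power" / "control").write_text(mode)
```

On the X505ZA from this report, read_pm_state("0000:03:00.1") would correspond to the two cat commands quoted above, and set_pm_control("0000:03:00.1", "on") to the echo command.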
Comment 16 jian-hong 2018-08-28 06:37:03 UTC
It seems like something forgets to wake up the HDMI audio device?
Comment 17 Lukas Wunner 2018-08-28 07:24:53 UTC
How exactly do you output audio to the external display? Maybe there's a pm_runtime_get_sync() on the codec missing in the code path you're using to enter the kernel.

Another possible explanation: ISTR that ELD is only available if the HDA controller is powered on, but I think Takashi's audio component rework sought to address this?
Comment 18 jian-hong 2018-08-28 08:52:58 UTC
(In reply to Lukas Wunner from comment #17)
> How exactly do you output audio to the external display? Maybe there's a
> pm_runtime_get_sync() on the codec missing in the code path you're using to
> enter the kernel.
> 
> Another possible explanation: ISTR that ELD is only available if the HDA
> controller is powered on, but I think Takashi's audio component rework
> sought to address this?

I notice there is an hdmi_present_sense() function in "sound/pci/hda/patch_hdmi.c" of the 4.19-rc1 kernel.
hdmi_present_sense() calls snd_hda_power_up_pm() -> snd_hdac_power_up_pm() in "sound/pci/hda/hda_codec.h".
snd_hdac_power_up_pm() calls snd_hdac_power_up(), which calls pm_runtime_get_sync() in "sound/hda/hdac_device.c".
Comment 19 jian-hong 2018-08-28 08:56:25 UTC
There are two audio devices on the X505ZA:

dev@endless:~/linux-master$ lspci -nn
...
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] (rev c4)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:15de]
...
03:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Device [1022:15e3]

dev@endless:~/linux-master$ cat /proc/asound/cards
 0 [Generic        ]: HDA-Intel - HD-Audio Generic
                      HD-Audio Generic at 0xfe788000 irq 54
 1 [Generic_1      ]: HDA-Intel - HD-Audio Generic
                      HD-Audio Generic at 0xfe780000 irq 55

One is the internal audio controller; the other is the HDMI audio controller.
Besides, the VGA controller and the HDMI audio controller are different devices on this laptop.

According to Comment 18, the code detects whether HDMI is present and then powers on the HDMI audio device.

However, after plugging in the HDMI cable, HDMI audio can only be detected when the power control is "on":

dev@endless:~/linux-master$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
on
active
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x0] no CEA EDID Timing Extension block present
manufacture_id		0x6904
product_id		0x24c7
port_id			0xd9894495558859e
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

In contrast, HDMI audio cannot be detected when the power control is "auto":

dev@endless:~/linux-master$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		0
eld_valid		0

This seems to match what Lukas said: "ELD is only available if the HDA controller is powered on".
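
The two ELD dumps in this comment differ in monitor_present and eld_valid, which makes them easy to compare programmatically. The following is a minimal Python sketch, assuming the file keeps the whitespace-separated "key value" layout shown in the output above. The function names parse_eld and monitor_detected are hypothetical helpers, not part of ALSA.

```python
def parse_eld(text):
    """Parse the 'key value' lines of /proc/asound/cardN/eld#X.Y into a dict."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) == 2:
            fields[parts[0]] = parts[1].strip()
    return fields


def monitor_detected(fields):
    """True when the codec reports both a monitor and a valid ELD."""
    return fields.get("monitor_present") == "1" and fields.get("eld_valid") == "1"
```

Values are kept as strings on purpose, because several fields (for example sad0_rates) mix a hex code with a human-readable list.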
Comment 20 Takashi Iwai 2018-08-28 09:05:35 UTC
The possible lack of the HDMI hotplug notification is a known problem when the runtime PM is enabled.  And yes, this should be covered by the direct notification via audio component.

You can try the branch topic/hda-acomp in sound.git tree
  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git

This will take three commits.  Note that the AMDGPU support there is limited and untested; it doesn't support the new DC code yet.

In any case, I don't think that the problem is about a missing pm_runtime_get_sync().  The symptom suggests that the ELD notification didn't arrive, so the HDMI audio codec isn't configured.
In that case, the only workaround with the current code would be to avoid runtime PM suspend so that the link stays alive.

Still, it's puzzling that reverting 57cb54e53bdd doesn't fix things.

Does passing the power_save_controller=0 option to the snd-hda-intel module change the behavior?  It suppresses the runtime PM of the HD-audio controller (not of the codec), so the link should stay alive.  Of course, this would break the whole power-save scenario with dual GPUs, but let's sort out the single GPU case first...
Comment 21 jian-hong 2018-08-29 03:42:26 UTC
(In reply to Takashi Iwai from comment #20)
> The possible lack of the HDMI hotplug notification is a known problem when
> the runtime PM is enabled.  And yes, this should be covered by the direct
> notification via audio component.
> 
> You can try the branch topic/hda-acomp in sound.git tree
>   git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
> 
> This will take three commits.  Note that the AMDGPU support there is limited
> and untested, it doesn't support the new DC code yet.

Tried with the 3 commits ("ALSA: hda/hdmi: Allow audio component for AMD/ATI HDMI", "drm/radeon: Add audio component support", "drm/amdgpu: Add audio component support"), but the issue can still be reproduced on the X505ZA.

> In anyway, I don't think that the problem is about the missing
> pm_runtime_get_sync().  The symptom suggests that the ELD notification
> didn't arrive, so HDMI audio codec isn't configured.
> In that case, the only workaround with the current code would be not to put
> runtime PM suspend for keeping the link alive.
> 
> Still it's puzzling that reverting 57cb54e53bdd doesn't fix the things,
> though.
> 
> Does passing power_save_controller=0 option to snd-hda-intel module change
> the behavior?  It suppresses the runtime PM of the HD-audio controller (not
> about codec), so the link should be alive.  Of course, this would make the
> whole power-save scenario broken with dual GPUs, but let's sort out the
> single GPU case at first...

I passed the option to the snd_hda_intel module via "options snd_hda_intel power_save_controller=0". This makes HDMI audio work correctly on the X505ZA.

Plug in HDMI:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x0] no CEA EDID Timing Extension block present
manufacture_id		0x6904
product_id		0x24c7
port_id			0xd9894495558859e
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

Unplug HDMI:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		0
eld_valid		0

The runtime_status is always active.
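
When testing with the module option, it helps to confirm that the option actually took effect before drawing conclusions. The following is a minimal Python sketch, assuming the parameter is exported at /sys/module/snd_hda_intel/parameters/power_save_controller and that boolean module parameters read back as Y or N. Both the path and the helper names are assumptions for illustration, not something confirmed in this report.

```python
from pathlib import Path

# Assumed sysfs path of the parameter; present only if the module exports it.
PARAM_PATH = Path("/sys/module/snd_hda_intel/parameters/power_save_controller")


def parse_bool_param(text):
    """Interpret a boolean module parameter ('Y'/'N', also '1'/'0')."""
    value = text.strip().upper()
    if value in ("Y", "1"):
        return True
    if value in ("N", "0"):
        return False
    raise ValueError(f"unexpected parameter value: {text!r}")


def controller_power_save_enabled(path=PARAM_PATH):
    """Return True if controller power saving is still enabled."""
    return parse_bool_param(Path(path).read_text())
```

With the "options snd_hda_intel power_save_controller=0" line in effect, controller_power_save_enabled() would be expected to return False.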
Comment 22 jian-hong 2018-08-29 07:32:06 UTC
I also tested vanilla kernel 4.19-rc1 on the ASUS X570ZD without passing "options snd_hda_intel power_save_controller=0".
The situation is better than on the ASUS X505ZA, but ...

According to dmesg, the HDMI audio device is at "0000:04:00.1":

dev@endless:~$ dmesg | grep -E "(snd_hda|switcheroo)"
[    9.083370] snd_hda_intel 0000:04:00.1: enabling device (0000 -> 0002)
[    9.083512] snd_hda_intel 0000:04:00.1: Handle vga_switcheroo audio client
[    9.085177] snd_hda_intel 0000:04:00.6: enabling device (0000 -> 0002)
[    9.489215] snd_hda_codec_generic hdaudioC1D0: autoconfig for Generic: line_outs=1 (0x17/0x0/0x0/0x0/0x0) type:speaker
[    9.489217] snd_hda_codec_generic hdaudioC1D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
[    9.489219] snd_hda_codec_generic hdaudioC1D0:    hp_outs=1 (0x16/0x0/0x0/0x0/0x0)
[    9.489220] snd_hda_codec_generic hdaudioC1D0:    mono: mono_out=0x0
[    9.489221] snd_hda_codec_generic hdaudioC1D0:    inputs:
[    9.489222] snd_hda_codec_generic hdaudioC1D0:      Internal Mic=0x1a
[    9.489224] snd_hda_codec_generic hdaudioC1D0:      Mic=0x19
[    9.689986] VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.GPP0.VGA_ handle
[   10.703547] vga_switcheroo: enabled

lspci shows two GPUs (AMD, NVIDIA) and two AMD audio devices:

dev@endless:~$ lspci -nn
...
01:00.0 3D controller [0302]: NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile] [10de:1c8d] (rev a1)
...
04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] (rev c3)
04:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:15de]
...
04:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Device [1022:15e3]
...

dev@endless:~$ cat /proc/asound/cards
 0 [Generic        ]: HDA-Intel - HD-Audio Generic
                      HD-Audio Generic at 0xf7588000 irq 55
 1 [Generic_1      ]: HDA-Intel - HD-Audio Generic
                      HD-Audio Generic at 0xf7580000 irq 56

The HDMI audio device is "04:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:15de]", which is a different device from "04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] (rev c3)".

Plug in HDMI (status is correct):

dev@endless:~$ cat /sys/bus/pci/devices/0000\:04\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x0] no CEA EDID Timing Extension block present
manufacture_id		0x6904
product_id		0x24c7
port_id			0xd9894495558859e
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

Unplug HDMI (status is wrong):

dev@endless:~$ cat /sys/bus/pci/devices/0000\:04\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x0] no CEA EDID Timing Extension block present
manufacture_id		0x6904
product_id		0x24c7
port_id			0xd9894495558859e
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

The ELD should be empty.

Unplug and refresh via "Sound settings" (status is correct):

dev@endless:~$ cat /sys/bus/pci/devices/0000\:04\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		0
eld_valid		0
Comment 23 jian-hong 2018-08-29 07:33:44 UTC
If I use vanilla kernel 4.19-rc1 on the ASUS X570ZD and pass "options snd_hda_intel power_save_controller=0", the HDMI audio status is always correct.
Comment 24 jian-hong 2018-09-05 05:42:01 UTC
Created attachment 278305
snd_hda_codec_hdmi: amdgpu: Radeon Vega HDMI audio experiment

To recap, the open issues on this kind of model are:

1. The timing of sending the HDMI hotplug notification to HDMI audio
2. The process of sending the notification in amdgpu and receiving it in HDMI audio, then calling the corresponding callback function
3. How to get the correct ELD and its information

Based on https://bugzilla.kernel.org/show_bug.cgi?id=200945#c21, I made an experimental patch (attached), based on Linux kernel 4.19-rc1 plus the commits "ALSA: hda/hdmi: Allow audio component for AMD/ATI HDMI", "drm/radeon: Add audio component support" and "drm/amdgpu: Add audio component support".

This experimental patch prints debugging messages for tracing, forces the notification to be sent from amdgpu and received by snd_hda_codec_hdmi, and then calls the callback registered by amdgpu.

* Before plug in HDMI:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		0
eld_valid		0

* After plug in HDMI:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x3] CEA-861-B, C or D
manufacture_id		0x6904
product_id		0x24c7
port_id			0x0
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

* After unplug HDMI:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x3] CEA-861-B, C or D
manufacture_id		0x6904
product_id		0x24c7
port_id			0x0
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

We get the correct ELD content, but the ELD status after unplugging HDMI is wrong: it retains the content from after plugging in. It should be the same as before plugging in.
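
The expectation stated here (the ELD after unplugging should match the ELD before plugging in) can be checked mechanically once both dumps have been parsed into dictionaries. The following is a small Python sketch under that assumption. The function name eld_differences is hypothetical, and the inputs are plain key/value dicts built from the dumps above.

```python
def eld_differences(expected, actual):
    """Return {key: (expected, actual)} for every ELD field that differs."""
    keys = set(expected) | set(actual)
    return {
        key: (expected.get(key), actual.get(key))
        for key in sorted(keys)
        if expected.get(key) != actual.get(key)
    }
```

An empty result means the state after unplugging matches the state before plugging in. Any entry, such as a leftover monitor_name, points at stale ELD data.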
Comment 25 jian-hong 2018-09-05 05:46:20 UTC
Created attachment 278307
dmesg of "snd_hda_codec_hdmi: amdgpu: Radeon Vega HDMI audio experiment" patch

"[drm] handle_hpd_irq: for checking code path hotplug event" is the HDMI plug/unplug message
Comment 26 Daniel Drake 2018-09-06 03:07:15 UTC
I helped Jian-Hong with the above experiment. The main challenge we faced is that in the audio component design, both the display driver and audio driver need to share an understanding of which pin (on the HDA side) corresponds to the HDMI audio. It's not clear how to do that for the devices managed by amdgpu_dm.c, which is why the patch above just loops over all available encoders until it finds one with an ELD (completely ignoring the pin info).

In addition to the Asus X505ZA mentioned above, we also see this issue on Asus X570ZD (AMD Ryzen 7 2700U with Radeon Vega Mobile Gfx). All AMD systems that we've seen have this design where the HDMI audio is a separate PCI device and separate ALSA card - I wonder if this affects all of them.

Given that this is a regression introduced in Linux 4.17 that is affecting multiple platforms, and the audio component vs amdgpu_dm stuff looks hairy, I wonder if we should be looking for some easier short term solutions such as patch reverts, forcing AMD HDMI audio to be always-on, or something like that?
Comment 27 Lukas Wunner 2018-09-06 04:28:58 UTC
Simply reverting 07f4f97 would require reverting several other patches and would be a setback for consolidation of switchable graphics code.

However as a short-term solution you could try adding the PCI IDs of HDMI audio controllers on AMD APUs to power_save_blacklist[] in sound/pci/hda/hda_intel.c.

We currently treat them as dGPUs because the HDMI controller is function 1 and the VGA controller is function 0 of the same PCI device and the vendor is AMD. That was our heuristic to detect AMD dGPUs in switchable graphics setups. The heuristic is now failing because AMD is shipping APUs with integrated graphics and they thought it would be a good idea to give the HDMI audio function 1, as on dGPUs.

The power_save_blacklist[] approach is ugly. Ideally we'd have a simple way to detect whether a GPU is an APU or dGPU. I'm feeling kind of left alone here by AMD, are they expecting the rest of the community to solve this on their own?
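
To make the heuristic described in this comment concrete, here is a userspace Python sketch that applies the same three conditions (AMD vendor, function 1, VGA controller at function 0 of the same device) to sysfs attributes. It is only a model of the description above, not the kernel code. The function names and the sysfs_root parameter are hypothetical, and the vendor IDs are the ones shown by lspci -nn earlier in this report.

```python
from pathlib import Path

AMD_VENDOR_IDS = {0x1002, 0x1022}  # [AMD/ATI] and [AMD] as shown by lspci -nn
PCI_CLASS_VGA = 0x0300             # class code of a "VGA compatible controller"


def read_hex(path):
    """Read a sysfs attribute such as 'vendor' or 'class' (e.g. '0x1002')."""
    return int(Path(path).read_text().strip(), 16)


def looks_like_gpu_hdmi_audio(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Model of the heuristic: AMD vendor, function 1, VGA at function 0."""
    root = Path(sysfs_root)
    slot, function = bdf.rsplit(".", 1)
    if function != "1":
        return False
    if read_hex(root / bdf / "vendor") not in AMD_VENDOR_IDS:
        return False
    sibling = root / f"{slot}.0"
    if not sibling.is_dir():
        return False
    # sysfs 'class' holds 24 bits: class, subclass, programming interface.
    return (read_hex(sibling / "class") >> 8) == PCI_CLASS_VGA
```

Nothing in these conditions separates an APU from a dGPU, so 03:00.1 on the X505ZA matches as well, which is the failure mode described above.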
Comment 28 Takashi Iwai 2018-09-06 05:46:53 UTC
OK, my AMDGPU patch was untested, so certainly it's buggy.  I need to check and respin the code, or toss to AMD guys.

Meanwhile, we may think of turning off the bus power-off as a quicker workaround.
It can be done via power_save_blacklist[], but I guess this would also block the runtime PM of the secondary GPU that we want to avoid?

Does the patch (again untested) below work instead?
Comment 29 Takashi Iwai 2018-09-06 05:47:48 UTC
Created attachment 278349
Possible fix patch (untested)
Comment 30 Lukas Wunner 2018-09-06 05:59:28 UTC
(In reply to Takashi Iwai from comment #29)
> Created attachment 278349 [details]
> Possible fix patch (untested)

I think this patch only works if hda_intel binds after amdgpu. Is there some way for hda_intel to determine from PCI config space or ACPI namespace whether a given HDMI controller is on an APU or dGPU?
Comment 31 Daniel Drake 2018-09-06 06:16:05 UTC
I'm curious why the APU vs dGPU thing matters, e.g. do we know for a fact that dGPU systems can read the HDMI ELD even when the corresponding PCI device is runtime suspended?
Comment 32 Takashi Iwai 2018-09-06 06:21:46 UTC
(In reply to Lukas Wunner from comment #30)
> (In reply to Takashi Iwai from comment #29)
> > Created attachment 278349 [details]
> > Possible fix patch (untested)
> 
> I think this patch only works if hda_intel binds after amdgpu.

The client->id is updated once when amdgpu gets bound, so it should work in most cases, I guess.  (And without amdgpu running, HDMI audio is nonsense.)

> Is there some
> way for hda_intel to determine from PCI config space or ACPI namespace
> whether a given HDMI controller is on an APU or dGPU?

I don't think so.  That's why the deferred audio client id determination was implemented in vga_switcheroo.
Comment 33 Takashi Iwai 2018-09-06 06:25:03 UTC
(In reply to Daniel Drake from comment #31)
> I'm curious why the APU vs dGPU thing matters, e.g. do we know for a fact
> that dGPU systems can read the HDMI ELD even when the corresponding PCI
> device is runtime suspended?

A dGPU usually does only rendering, and the connection to the actual output (including HDMI audio) is handled by the APU.  That is, ELD notification is solely the APU's job.
Comment 34 Takashi Iwai 2018-09-06 06:55:56 UTC
BTW, my last patch is only for 4.19, requiring the commit 4aaf448fa9754e2d5ee188d32327b24ffc15ca4d.
Comment 35 jian-hong 2018-09-10 07:14:40 UTC
(In reply to Takashi Iwai from comment #29)
> Created attachment 278349 [details]
> Possible fix patch (untested)

Tested with attachment 278349 on top of Linux 4.19-rc2, without any module options set.

1. After plugging in the HDMI cable:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		0
eld_valid		0

The HDMI audio cannot be detected.

2. Suspend & resume (HDMI cable is still plugged):

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x0] no CEA EDID Timing Extension block present
manufacture_id		0x6904
product_id		0x24c7
port_id			0xd9894495558859e
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

The HDMI audio can be detected and used

3. Unplug HDMI cable when the "/sys/bus/pci/devices/0000\:03\:00.1/power/runtime_status" is "suspended":

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x0] no CEA EDID Timing Extension block present
manufacture_id		0x6904
product_id		0x24c7
port_id			0xd9894495558859e
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

The ELD content is still reported, which is wrong after unplugging: it retains the content from when the cable was plugged in. The ELD should be empty.
When I play audio through HDMI, the Sound settings refresh and the HDMI output goes away.

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		0
eld_valid		0

4. Plug in HDMI cable again:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		0
eld_valid		0

The HDMI audio cannot be detected.

The dmesg is in the attachment below.
Comment 36 jian-hong 2018-09-10 07:16:06 UTC
Created attachment 278401
dmesg of the attachment 278349 based on Linux kernel 4.19-rc2
Comment 37 jian-hong 2018-09-10 07:23:10 UTC
By the way, if I unplug the HDMI cable in step 3 of Comment 35 while "/sys/bus/pci/devices/0000\:03\:00.1/power/runtime_status" is "active", the ELD content is cleared immediately, which is the correct ELD status.

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
suspended
monitor_present		0
eld_valid		0
Comment 38 Takashi Iwai 2018-09-10 14:19:56 UTC
Obviously the patch doesn't work as expected, and this is because vga switcheroo isn't enabled unless you really have switchable GPUs...

Basically we need to turn on the runtime PM only for the discrete GPU (at least for covering this regression), so the check in azx_runtime_idle() should be:

#ifdef SUPPORT_VGA_SWITCHEROO
	/* ELD notification gets broken on AMD GPUs when HD-audio bus is off */
	if ((chip->pci->vendor == PCI_VENDOR_ID_ATI ||
	     chip->pci->vendor == PCI_VENDOR_ID_AMD) &&
	    vga_switcheroo_get_client_id(chip->pci) != VGA_SWITCHEROO_DIS)
		return -EBUSY;
#endif
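
The condition proposed in this comment is easy to misread because of the negated client id test, so here is a small userspace Python model of its truth table. It is only an illustration of the snippet above, not kernel code. The ClientId values are arbitrary stand-ins for the vga_switcheroo client ids, and runtime_idle_allowed is a hypothetical name.

```python
from enum import Enum

PCI_VENDOR_ID_ATI = 0x1002
PCI_VENDOR_ID_AMD = 0x1022


class ClientId(Enum):
    """Stand-ins for the vga_switcheroo client ids; values are arbitrary here."""
    UNKNOWN = "unknown"
    IGD = "igd"
    DIS = "dis"


def runtime_idle_allowed(vendor, client_id):
    """Return False in the cases where the proposed check returns -EBUSY."""
    is_amd = vendor in (PCI_VENDOR_ID_ATI, PCI_VENDOR_ID_AMD)
    return not (is_amd and client_id is not ClientId.DIS)
```

In this model an AMD HDMI controller is allowed to runtime suspend only when it is registered as the discrete GPU's audio client. Integrated or unknown AMD clients stay powered, and other vendors are not affected by this check.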
Comment 39 Takashi Iwai 2018-09-10 14:28:34 UTC
Created attachment 278421
Test fix (revised)
Comment 40 jian-hong 2018-09-11 04:23:20 UTC
(In reply to Takashi Iwai from comment #39)
> Created attachment 278421 [details]
> Test fix (revised)

Tested with attachment 278421 on top of Linux kernel 4.19-rc2, without any module options set.

1. After plugging in the HDMI cable:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		1
eld_valid		1
monitor_name		ASUS VP247
connection_type		HDMI
eld_version		[0x2] CEA-861D or below
edid_version		[0x0] no CEA EDID Timing Extension block present
manufacture_id		0x6904
product_id		0x24c7
port_id			0xd9894495558859e
support_hdcp		0
support_ai		0
audio_sync_delay	0
speakers		[0x1] FL/FR
sad_count		1
sad0_coding_type	[0x1] LPCM
sad0_channels		2
sad0_rates		[0x4e0] 32000 44100 48000 96000
sad0_bits		[0xe0000] 16 20 24

2. Unplug HDMI cable:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		0
eld_valid		0

3. Plug in HDMI cable again and then suspend & resume:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		0
eld_valid		0

Recap:
HDMI audio works correctly after boot, but it cannot be detected after resume.

I get the same result on both the ASUS X505ZA and the ASUS X570ZD.
Comment 41 jian-hong 2018-09-11 04:27:15 UTC
Should I split the issue "AMD's HDMI audio cannot be detected after resume with CONFIG_VGA_SWITCHEROO=y listed in config file" out of this ticket?

By the way, I also checked that this issue cannot be reproduced on Linux kernel 4.16.18.
Comment 42 Takashi Iwai 2018-09-11 06:28:52 UTC
You plugged in the cable, *then* went through suspend and resume, and the ELD got cleared?  That doesn't make sense.

Or do you mean that you suspended while unplugged, then plugged in the cable during suspend, and resumed?

In any case, you need to check the hdmi_present_sense() call in patch_hdmi.c.  This must be called at the resume of the HDMI codec.  And it must go through hdmi_present_sense_via_verbs() and read all jack states.
(snd_hda_pin_sense() reads the actual state as the jack must have been dirtied by snd_hda_jack_set_dirty_all() call in hda_call_codec_resume().)
Comment 43 jian-hong 2018-09-12 06:45:54 UTC
Created attachment 278467
dmesg of resuming back based on Linux kernel 4.19-rc3 with patch attachment 278421.

I added some debug messages on top of Linux kernel 4.19-rc3 with patch attachment 278421:

diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
index 26d348b47867..633522266a23 100644
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -2886,6 +2886,8 @@ static unsigned int hda_call_codec_suspend(struct hda_codec *codec)
  */
 static void hda_call_codec_resume(struct hda_codec *codec)
 {
+       codec_warn(codec, "%s\n", __func__);
+
        snd_hdac_enter_pm(&codec->core);
        if (codec->core.regmap)
                regcache_mark_dirty(codec->core.regmap);
diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
index cb587dce67a9..cc2434e10a78 100644
--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -1632,6 +1632,7 @@ static bool hdmi_present_sense(struct hdmi_spec_per_pin *per_pin, int repoll)
        struct hda_codec *codec = per_pin->codec;
        int ret;
 
+       codec_warn(codec, "%s\n", __func__);
        /* no temporary power up/down needed for component notifier */
        if (!codec_has_acomp(codec)) {
                ret = snd_hda_power_up_pm(codec);

hda_call_codec_resume() is called after resume, but hdmi_present_sense() is not.
Comment 44 jian-hong 2018-09-12 07:06:04 UTC
I tested with the steps in Comment 43:

1. Boot system into desktop environment
2. Plug in HDMI cable
3. Unplug HDMI cable
4. Do suspend & resume system
5. Plug in HDMI cable

Then I found that the HDMI audio cannot be detected:

dev@endless:~$ cat /sys/bus/pci/devices/0000\:03\:00.1/power/{control,runtime_status} /proc/asound/card0/eld#0.0
auto
active
monitor_present		0
eld_valid		0
Comment 45 Takashi Iwai 2018-09-12 07:09:55 UTC
(In reply to jian-hong from comment #43)
> hda_call_codec_resume() is called after resume, but hdmi_present_sense() is
> not.

You need to check further, whether generic_hdmi_resume() is called or not, and from there hdmi_present_sense() is called or not.

If they aren't called, check the condition in hda_call_codec_resume() and identify why they aren't called.
Comment 46 Takashi Iwai 2018-09-12 07:11:09 UTC
And, note that hda_call_codec_resume() is called for each codec chip, not only HDMI codec.  Check the device name carefully at each call.
Comment 47 jian-hong 2018-09-12 08:59:11 UTC
Created attachment 278469 [details]
dmesg2 of resuming back based on Linux kernel 4.19-rc3 with patch attachment 278421 [details].

I added more debug messages based on Linux kernel 4.19-rc3 with patch attachment 278421 [details].

diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
index 26d348b47867..952667f89e4b 100644
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -2886,6 +2886,8 @@ static unsigned int hda_call_codec_suspend(struct hda_codec *codec)
  */
 static void hda_call_codec_resume(struct hda_codec *codec)
 {
+       dev_warn(&codec->core.dev, "%s\n", __func__);
+
        snd_hdac_enter_pm(&codec->core);
        if (codec->core.regmap)
                regcache_mark_dirty(codec->core.regmap);
@@ -2934,6 +2936,7 @@ static int hda_codec_runtime_resume(struct device *dev)
 {
        struct hda_codec *codec = dev_to_hda_codec(dev);
 
+       dev_warn(&codec->core.dev, "%s\n", __func__);
        snd_hdac_link_power(&codec->core, true);
        snd_hdac_codec_link_up(&codec->core);
        hda_call_codec_resume(codec);
diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
index cb587dce67a9..7cd295963a23 100644
--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -1632,6 +1632,7 @@ static bool hdmi_present_sense(struct hdmi_spec_per_pin *per_pin, int repoll)
        struct hda_codec *codec = per_pin->codec;
        int ret;
 
+       dev_warn(&codec->core.dev, "%s\n", __func__);
        /* no temporary power up/down needed for component notifier */
        if (!codec_has_acomp(codec)) {
                ret = snd_hda_power_up_pm(codec);
@@ -2314,6 +2315,7 @@ static int generic_hdmi_resume(struct hda_codec *codec)
        struct hdmi_spec *spec = codec->spec;
        int pin_idx;
 
+       dev_warn(&codec->core.dev, "%s\n", __func__);
        codec->patch_ops.init(codec);
        regcache_sync(codec->core.regmap);

After resume, I checked dmesg again:

dev@endless:~/linux-stable$ dmesg  | grep -E "(snd_hda|switcheroo)"
[    9.110430] snd_hda_intel 0000:03:00.1: enabling device (0000 -> 0002)
[    9.110495] snd_hda_intel 0000:03:00.1: Handle vga_switcheroo audio client
[    9.110528] snd_hda_intel 0000:03:00.6: enabling device (0000 -> 0002)
[    9.228524] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[    9.232891] snd_hda_codec_generic hdaudioC1D0: autoconfig for Generic: line_outs=1 (0x17/0x0/0x0/0x0/0x0) type:speaker
[    9.232893] snd_hda_codec_generic hdaudioC1D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
[    9.232894] snd_hda_codec_generic hdaudioC1D0:    hp_outs=1 (0x16/0x0/0x0/0x0/0x0)
[    9.232895] snd_hda_codec_generic hdaudioC1D0:    mono: mono_out=0x0
[    9.232896] snd_hda_codec_generic hdaudioC1D0:    inputs:
[    9.232897] snd_hda_codec_generic hdaudioC1D0:      Internal Mic=0x1a
[    9.232898] snd_hda_codec_generic hdaudioC1D0:      Mic=0x19
[    9.249638] snd_hda_codec_generic hdaudioC1D0: hda_codec_runtime_resume
[    9.249640] snd_hda_codec_generic hdaudioC1D0: hda_call_codec_resume
[   18.827076] snd_hda_codec_hdmi hdaudioC0D0: hda_codec_runtime_resume
[   18.827078] snd_hda_codec_hdmi hdaudioC0D0: hda_call_codec_resume
[   18.827157] snd_hda_codec_hdmi hdaudioC0D0: generic_hdmi_resume
[   18.827321] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[   26.586408] snd_hda_codec_hdmi hdaudioC0D0: hda_codec_runtime_resume
[   26.586409] snd_hda_codec_hdmi hdaudioC0D0: hda_call_codec_resume
[   26.586466] snd_hda_codec_hdmi hdaudioC0D0: generic_hdmi_resume
[   26.586584] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  108.866821] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  108.866825] snd_hda_codec_hdmi hdaudioC0D0: hda_codec_runtime_resume
[  108.866832] snd_hda_codec_hdmi hdaudioC0D0: hda_call_codec_resume
[  108.866866] snd_hda_codec_hdmi hdaudioC0D0: generic_hdmi_resume
[  108.866980] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  108.868848] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  118.337819] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  118.337827] snd_hda_codec_hdmi hdaudioC0D0: hda_codec_runtime_resume
[  118.337829] snd_hda_codec_hdmi hdaudioC0D0: hda_call_codec_resume
[  118.337864] snd_hda_codec_hdmi hdaudioC0D0: generic_hdmi_resume
[  118.337980] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  118.338005] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  139.170569] snd_hda_codec_generic hdaudioC1D0: hda_codec_runtime_resume
[  139.170571] snd_hda_codec_generic hdaudioC1D0: hda_call_codec_resume

After resume, hda_call_codec_resume() is called only for the internal audio card (hdaudioC1D0); it is not called for the HDMI audio card (hdaudioC0D0).
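
To make the per-codec comparison above less error-prone, the debug lines can be grouped by codec device. The following Python sketch assumes the message format produced by the dev_warn() calls in the diff above. The names calls_by_codec and resumed_after are hypothetical, and the cutoff of 130.0 seconds is an arbitrary value chosen between the last HDMI codec line and the final internal codec lines, not a timestamp taken from the report.

```python
import re
from collections import defaultdict

# "[  118.337864] snd_hda_codec_hdmi hdaudioC0D0: generic_hdmi_resume"
LINE_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+(\S+)\s+(hdaudioC\d+D\d+):\s+(\w+)$"
)


def calls_by_codec(dmesg_text):
    """Map each codec device name to a list of (timestamp, function) pairs.

    Only lines whose message is a single identifier are kept, which is the
    shape of the "%s\\n", __func__ debug messages; other lines are ignored.
    """
    calls = defaultdict(list)
    for line in dmesg_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            ts, _driver, codec, func = m.groups()
            calls[codec].append((float(ts), func))
    return dict(calls)


def resumed_after(calls, codec, since):
    """True if hda_call_codec_resume was logged for codec at or after since."""
    return any(func == "hda_call_codec_resume" and ts >= since
               for ts, func in calls.get(codec, []))


# A few lines copied from the dmesg output above.
SAMPLE = """\
[  118.337864] snd_hda_codec_hdmi hdaudioC0D0: generic_hdmi_resume
[  118.337980] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  118.338005] snd_hda_codec_hdmi hdaudioC0D0: hdmi_present_sense
[  139.170569] snd_hda_codec_generic hdaudioC1D0: hda_codec_runtime_resume
[  139.170571] snd_hda_codec_generic hdaudioC1D0: hda_call_codec_resume
"""

calls = calls_by_codec(SAMPLE)
for codec in ("hdaudioC0D0", "hdaudioC1D0"):
    print(codec, resumed_after(calls, codec, since=130.0))
```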
Comment 48 jian-hong 2018-09-12 09:00:59 UTC
Trace the code https://elixir.bootlin.com/linux/v4.19-rc3/source/sound/pci/hda/hda_codec.c#L2933

static int hda_codec_runtime_resume(struct device *dev)
{
	struct hda_codec *codec = dev_to_hda_codec(dev);

	snd_hdac_link_power(&codec->core, true);
	snd_hdac_codec_link_up(&codec->core);
	hda_call_codec_resume(codec);
	pm_runtime_mark_last_busy(dev);
	return 0;
}
#endif /* CONFIG_PM */

/* referred in hda_bind.c */
const struct dev_pm_ops hda_codec_driver_pm = {
	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
				pm_runtime_force_resume)
	SET_RUNTIME_PM_OPS(hda_codec_runtime_suspend, hda_codec_runtime_resume,
			   NULL)
};

hda_codec_runtime_resume() is registered as the "runtime_resume" callback function of "hda_codec_driver_pm".

"hda_codec_driver_pm" is bound to every HDA codec driver when __hda_codec_driver_register() is called. https://elixir.bootlin.com/linux/v4.19-rc3/source/sound/pci/hda/hda_bind.c#L152
Comment 49 Takashi Iwai 2018-09-12 09:10:19 UTC
Then it indicates a normal pattern: the device isn't used, hence it delays the resume until it gets really accessed.  The same would happen for the other analog codecs.

That is, this is the expected behavior.  Maybe it didn't hit in the past just because the power saving didn't work properly.  For avoiding this, you'd need to disable the power-saving.

Reading eld# proc file won't trigger the runtime resume, hence the value isn't activated.  When you read a codec#* proc file, it'll do runtime resume and ELD will be updated.
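
Based on the explanation above, a user-space check could read the codec#* proc file first and only then the eld# file. This is a rough Python sketch under stated assumptions: the function name read_eld_after_codec_wakeup and its parameters are hypothetical, the file names codec#0 and eld#0.0 assume the card0 layout seen earlier in this report, and whether the ELD is already refreshed by the time the second read happens is not verified here.

```python
from pathlib import Path


def read_eld_after_codec_wakeup(card_dir="/proc/asound/card0",
                                codec="codec#0", eld="eld#0.0"):
    """Read the codec proc file, then return the ELD proc file contents.

    Per the comment above, reading codec#* is what triggers the runtime
    resume; the codec dump itself is discarded.
    """
    card = Path(card_dir)
    (card / codec).read_text()       # expected to wake the codec
    return (card / eld).read_text()  # ELD as reported after the wakeup


if __name__ == "__main__":
    print(read_eld_after_codec_wakeup())
```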
Comment 50 jian-hong 2018-09-12 10:28:37 UTC
(In reply to Takashi Iwai from comment #49)
> Then it indicates a normal pattern: the device isn't used, hence it delays
> the resume until it gets really accessed.  The same would happen for the
> other analog codecs.
> 
> That is, this is the expected behavior.  Maybe it didn't hit in the past
> just because the power saving didn't work properly.  For avoiding this,
> you'd need to disable the power-saving.

Does this mean setting "options snd_hda_intel power_save=0" to disable the power-saving?

> Reading eld# proc file won't trigger the runtime resume, hence the value
> isn't activated.  When you read a codec#* proc file, it'll do runtime resume
> and ELD will be updated.
Comment 51 Takashi Iwai 2018-09-12 10:54:37 UTC
It's one option, which will govern the whole system in one shot.
Another option would be to adjust each item in /sys/bus/hdaudio/devices/*/power/*.
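
Before adjusting anything, it may help to see the current per-device settings. The Python sketch below only reads the control and runtime_status attributes under the path mentioned above. The function name hdaudio_power_state and its root parameter are hypothetical, and nothing is written, since changing these files needs root.

```python
from pathlib import Path


def hdaudio_power_state(root="/sys/bus/hdaudio/devices"):
    """Return {device: {"control": ..., "runtime_status": ...}} per device.

    Attributes that do not exist for a device are left out of its entry,
    and devices without any of the two attributes are skipped.
    """
    states = {}
    for dev in sorted(Path(root).glob("*")):
        entry = {}
        for name in ("control", "runtime_status"):
            attr = dev / "power" / name
            if attr.is_file():
                entry[name] = attr.read_text().strip()
        if entry:
            states[dev.name] = entry
    return states


if __name__ == "__main__":
    for dev, state in hdaudio_power_state().items():
        print(dev, state)
```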
Comment 52 Takashi Iwai 2018-09-12 11:15:40 UTC
But looking at the code again, the commit 07f4f97d7b4b marked codec->auto_runtime_pm forcibly.  This is one of the problems, and it will prevent the power_save=0 option from working.

We need another fix for covering this, supposedly.
Comment 53 Takashi Iwai 2018-09-12 11:30:19 UTC
BTW, could you give alsa-info.sh output on your machine?  Run the script with --no-upload option and attach to bugzilla.
Comment 54 Takashi Iwai 2018-09-12 18:25:48 UTC
OK, now I redesigned and rewrote the whole patch.  Below is again an untested patch.

In v2 patch, a new vga switcheroo client ops, gpu_bound, is added to get the notification of the bound client type.  Then the HD-audio driver sets up the runtime PM availability depending on it (and on the fly).
Comment 55 Takashi Iwai 2018-09-12 18:26:23 UTC
Created attachment 278477 [details]
Test fix patch (v2)
Comment 56 jian-hong 2018-09-13 02:42:11 UTC
Created attachment 278483 [details]
alsa-info of ASUS X505ZA

(In reply to Takashi Iwai from comment #53)
> BTW, could you give alsa-info.sh output on your machine?  Run the script
> with --no-upload option and attach to bugzilla.
Comment 57 jian-hong 2018-09-13 03:30:43 UTC
(In reply to Takashi Iwai from comment #55)
> Created attachment 278477 [details]
> Test fix patch (v2)

Tested the patch attachment 278477 [details] based on Linux 4.19-rc3.

The HDMI's audio works correctly before suspend and after resume with CONFIG_VGA_SWITCHEROO=y listed in config file on ASUS X505ZA and X570ZD.

Thank you!
Comment 58 Alex Deucher 2018-09-13 15:39:27 UTC
(In reply to Lukas Wunner from comment #27)
> 
> The power_save_blacklist[] approach is ugly. Ideally we'd have a simple way
> to detect whether a GPU is an APU or dGPU. I'm feeling kind of left alone
> here by AMD, are they expecting the rest of the community to solve this on
> their own?

We can only answer questions if we know about them :)  If you have questions, please ask.  With hybrid graphics (HG), there are several variants AMD+AMD, AMD+Intel, AMD+NVIDIA.  For those systems, displays can be wired to the dGPU or the APU.  E.g., the panel and VGA might be wired to the APU, but the HDMI and DP ports might be wired to the dGPU.  For AMD GPUs, both dGPUs and APUs, the display audio is always a sub-function of the GPU device.  There is also usually a platform audio device (e.g., for stuff like headphone jacks).  So on an A+A laptop, you may have 3 audio devices (iGPU audio, dGPU audio, platform audio).  With respect to ACPI, the ATPX and ATIF methods are in the iGPU's namespace on HG/PX laptops and on dGPU only systems the ATIF is in the dGPU's namespace (since there is no iGPU).  ATPX is always present on HG systems even though dGPU power is controlled via _PR3 rather than ATPX because there are still mappings and other info provided by the interface.  See drivers/gpu/drm/amd/include/amd_acpi.h for more info.  As for determining whether a device is a dGPU or APU, you can look at the pci ids (see which devices are flagged with AMD_IS_APU in amdgpu_drv.c), or you can determine it by looking at the pci topology.  The iGPU is always at the root of the bus, the dGPU is behind a bridge.
Comment 59 Daniel Drake 2018-09-14 07:19:09 UTC
I hope Takashi's workaround above goes upstream to fix the immediate regression (HDMI audio not usable on many AMD platforms since Linux 4.17) but additionally
I guess AMD's help is still needed in finishing and landing the audio component work which would make this work properly:
https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/log/?h=topic/hda-acomp

The work done there covers dce v6/v8/v10/v11 and looks like it might work, but none of us have that hardware on-hand to test it.

We tried to adapt it for our vega10 platforms in hand (see above) but the vega10 code does not have the audio/pin knowledge that appears readily available in the dce_v* platform files.

Skipping over the challenge of how to identify the HDA pin number for the HDMI connector in the drm driver, we then hit challenges because it seems like there are (at least) two "driver models" within amdgpu, only one was considered in the above branch but it doesn't apply for vega10. For example the amdgpu_get_connector_for_encoder() function used in amdgpu_audio_component_get_eld() doesn't work for devices like ours that use amdgpu_dm.c. Similarly we weren't quite sure where to put the amdgpu_audio_eld_notify() call (done upon modeset) in that file.
Comment 60 Alex Deucher 2018-09-14 14:13:31 UTC
There is the old modesetting code (amd/amdgpu/dce*.c) (which doesn't support audio for most asics anyway) and the new modesetting code (amd/display) (which supports pretty much the entire functionality of the display hardware).  The old modesetting code will eventually be retired for asics supported by both.  The new modesetting code is the default for most asics.

Harry, can you take a look at how to integrate Takashi's audio component work into DC?
Comment 61 Alex Deucher 2018-09-14 14:19:33 UTC
Harry, do you know how the pins are assigned or do we need to ask Vijendar or Akshu?

For reference, AMD hw does not use the ELD.  We use a backdoor register interface for the GPU and Audio drivers to communicate.  The relevant audio registers are exposed in the register headers in amdgpu, and here is the documentation on the audio side (vendor specific verbs):
http://developer.amd.com/wordpress/media/2013/10/AMD_HDA_verbs_v2.pdf
