Bug 204199

Summary: HDMI audio changes the ELD pin node after resume from suspend mode
Product: Drivers
Reporter: jian-hong
Component: Sound(ALSA)
Assignee: Jaroslav Kysela (perex)
Status: NEW
Severity: normal
CC: andrey+kernel, drake, tiwai
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.2
Subsystem:
Regression: No
Bisected commit-id:
Bug Depends on:
Bug Blocks: 208829
Attachments:
  dmesg log
  alsa-info before suspend
  alsa-info after resume
  pactl before suspend
  pactl after resume
  Test patch
  dmesg with attachment 283813
  Temp alsa-info mentioned in Comment 10

Description jian-hong 2019-07-17 05:54:27 UTC
Created attachment 283763 [details]
dmesg log

We have an Acer Aspire TC-865 desktop equipped with an Intel i7-8700 and an NVIDIA GeForce GTX 1050.  The HDMI cable connects the NVIDIA card directly to a display with built-in speakers.

Tested with kernel 5.2.  The HDMI audio works correctly before suspend.  Pin#5 holds the ELD.

[   26.504190] snd_hda_codec_hdmi hdaudioC1D0: HDMI hot plug event: Codec=0 Pin=5 Device=0 Inactive=0 Presence_Detect=1 ELD_Valid=1
[   26.507811] snd_hda_codec_hdmi hdaudioC1D0: HDMI status: Codec=0 Pin=5 Presence_Detect=1 ELD_Valid=1
[   26.892083] snd_hda_codec_hdmi hdaudioC1D0: HDMI: detected monitor ASUS VP247
                  at connection type HDMI

HDMI audio may fail after resume.  Pin#5 no longer holds the ELD; it moves to Pin#4.

[  369.300562] snd_hda_intel 0000:01:00.1: azx_get_response timeout, polling the codec once: last cmd=0x005f0900
[  370.304539] snd_hda_intel 0000:01:00.1: azx_get_response timeout, switching to polling mode: last cmd=0x005f0900
[  371.308512] snd_hda_intel 0000:01:00.1: azx_get_response timeout, switching to single_cmd mode: last cmd=0x005f0900
[  371.308526] snd_hda_codec_hdmi hdaudioC1D0: HDMI status: Codec=0 Pin=5 Presence_Detect=1 ELD_Valid=1
[  371.308670] snd_hda_intel 0000:01:00.1: get_response timeout: IRS=0x0
[  371.308673] snd_hda_codec_hdmi hdaudioC1D0: HDMI: invalid ELD buf size -1
[  371.404592] snd_hda_intel 0000:01:00.1: Clearing TCSEL
[  371.404610] snd_hda_intel 0000:01:00.1: Setting Nvidia snoop: 1
[  371.407882] snd_hda_codec_hdmi hdaudioC1D0: generic_hdmi_resume
[  371.408380] snd_hda_codec_hdmi hdaudioC1D0: HDMI status: Codec=0 Pin=4 Presence_Detect=1 ELD_Valid=1
[  371.412402] snd_hda_codec_hdmi hdaudioC1D0: HDMI: detected monitor ASUS VP247
                  at connection type HDMI
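
The pin move can be picked out of a longer dmesg log mechanically. Below is a minimal sketch, not part of the original report, that scans for the "HDMI status" lines quoted above and reports when the pin carrying a valid ELD changes. The function name `eld_pin_changes` and the regular expression are illustrative assumptions based only on the log format shown in this report.

```python
import re

# Matches the "HDMI status" lines printed by snd_hda_codec_hdmi, in the
# format quoted in this report. The pattern is an assumption derived from
# those sample lines only.
STATUS_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+snd_hda_codec_hdmi\s+(?P<dev>\S+):\s+"
    r"HDMI status:\s+Codec=(?P<codec>\d+)\s+Pin=(?P<pin>\d+)\s+"
    r"Presence_Detect=(?P<pd>\d)\s+ELD_Valid=(?P<eld>\d)"
)


def eld_pin_changes(log_text):
    """Return (timestamp, old_pin, new_pin) for each change of the pin
    that reports ELD_Valid=1, tracked per (device, codec)."""
    changes = []
    last_pin = {}
    for line in log_text.splitlines():
        m = STATUS_RE.search(line)
        if not m or m["eld"] != "1":
            continue  # not a status line, or no valid ELD on this pin
        key = (m["dev"], m["codec"])
        pin = int(m["pin"])
        if key in last_pin and last_pin[key] != pin:
            changes.append((float(m["ts"]), last_pin[key], pin))
        last_pin[key] = pin
    return changes


if __name__ == "__main__":
    # Three status lines copied from the dmesg excerpts in this report.
    SAMPLE = "\n".join([
        "[   26.507811] snd_hda_codec_hdmi hdaudioC1D0: HDMI status: Codec=0 Pin=5 Presence_Detect=1 ELD_Valid=1",
        "[  371.308526] snd_hda_codec_hdmi hdaudioC1D0: HDMI status: Codec=0 Pin=5 Presence_Detect=1 ELD_Valid=1",
        "[  371.408380] snd_hda_codec_hdmi hdaudioC1D0: HDMI status: Codec=0 Pin=4 Presence_Detect=1 ELD_Valid=1",
    ])
    for ts, old, new in eld_pin_changes(SAMPLE):
        print(f"ELD moved from Pin {old} to Pin {new} at {ts}s")
```

Feeding it the full attached dmesg log should flag the same Pin 5 to Pin 4 transition after `generic_hdmi_resume`.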
Comment 1 jian-hong 2019-07-17 05:55:21 UTC
Created attachment 283765 [details]
alsa-info before suspend
Comment 2 jian-hong 2019-07-17 05:55:47 UTC
Created attachment 283767 [details]
alsa-info after resume
Comment 3 jian-hong 2019-07-17 06:16:15 UTC
Created attachment 283769 [details]
pactl before suspend
Comment 4 jian-hong 2019-07-17 06:16:40 UTC
Created attachment 283771 [details]
pactl after resume
Comment 5 jian-hong 2019-07-17 06:19:59 UTC
This issue makes PulseAudio choose the wrong profile and fall back to Dummy Output, so no sound can be output.

before suspend: alsa_output.pci-0000_01_00.1.hdmi-stereo-extra1
after resume: Dummy Output
Comment 6 Takashi Iwai 2019-07-17 16:18:34 UTC
Does this happen with the nouveau driver?  I suspect it's specific to the Nvidia binary driver.
Comment 7 jian-hong 2019-07-18 02:22:28 UTC
I tried nouveau, but the system cannot resume after suspend. I also cannot ssh into the system.

I also tried these nouveau parameters:
* runpm=0: System cannot resume after suspend
* noaccel=1: System logs the user out automatically
* runpm=0 and noaccel=1: Same as noaccel=1
* modeset=0: System cannot load X

So, nouveau cannot be used on this desktop.

I also filed this bug on NVIDIA's devtalk: https://devtalk.nvidia.com/default/topic/1057448/linux/hdmi-audio-changes-the-eld-pin-node-after-resume-from-suspend-mode/
Comment 8 Takashi Iwai 2019-07-18 09:35:41 UTC
A possible workaround would be to forcibly delay the jack detection at resume, with a patch like the one below (totally untested).  But this can't be merged upstream as-is, of course, because it's merely an ultra-ugly hack.
Comment 9 Takashi Iwai 2019-07-18 09:36:30 UTC
Created attachment 283813 [details]
Test patch
Comment 10 jian-hong 2019-07-19 07:00:04 UTC
Created attachment 283831 [details]
dmesg with attachment 283813

I tried the patch in attachment 283813, but it leaves the system with no HDMI audio.

The alsa-info shows "cat: /proc/asound/modules: No such file or directory".  However, I notice there are some temp files in /tmp/alsa-info.XXXXX.  I will upload them as a tarball later.
Comment 11 jian-hong 2019-07-19 07:02:04 UTC
Another desktop equipped with an i9-9900K and an NVIDIA GeForce RTX 2080 also hits this issue.
Comment 12 jian-hong 2019-07-19 07:03:43 UTC
Created attachment 283833 [details]
Temp alsa-info mentioned in Comment 10
Comment 13 Daniel Drake 2019-07-29 06:25:03 UTC
I think this issue is likely specific to the nvidia binary driver.
As described at
https://download.nvidia.com/XFree86/gpu-hdmi-audio-document/#_driver_architecture
the nvidia driver will do some magic in order to make the ELD appear.

It's really weird how it moves between pins after suspend/resume, but actually ignoring that detail, ALSA's view of the resulting setup is accurate and working fine.

This only causes a user-visible problem because pulseaudio exposes unavailable profiles, which we've now reported at https://gitlab.freedesktop.org/pulseaudio/pulseaudio/issues/708

so this issue can probably be closed.
Comment 14 Takashi Iwai 2019-07-29 06:34:28 UTC
The biggest problem here is that the HD-audio driver goes into single_cmd mode as a fallback for the communication error.  And the communication error possibly happens because of the missing device resume dependency -- e.g. the HD-audio is resumed while or before the Nvidia driver gets resumed.  We have a device-link chain for Intel codecs via the audio component (and will likely have one for radeon/amdgpu/nouveau, too), but it is missing for the Nvidia binary driver.

To avoid switching to single_cmd mode, you can pass the single_cmd=0 option to the snd-hda-intel module.  Then, even if the communication error happens, it won't change to single_cmd mode, so once the Nvidia driver resumes, it'd start working again.
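
One common way to make a module option like single_cmd=0 persistent is a modprobe configuration fragment. This is a sketch rather than something taken from the comment above; the file name is hypothetical, and any file ending in .conf under /etc/modprobe.d/ should work on typical distributions.

```
# /etc/modprobe.d/hda-single-cmd.conf  (hypothetical file name)
# Prevent snd-hda-intel from falling back to single_cmd mode after a
# codec communication error.
options snd-hda-intel single_cmd=0
```

The option takes effect the next time the module is loaded, typically after a reboot.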
Comment 15 Daniel Drake 2019-07-29 06:43:48 UTC
I see. And sequencing that resume in the nvidia binary case is going to be really painful:

https://download.nvidia.com/XFree86/gpu-hdmi-audio-document/#_driver_architecture
> Note that the NVIDIA binary X driver is specifically an X driver. For this
> reason, steps 2 and 3 (and indeed 1) are only activated when the X server is
> running, and actively controls the VT. When X is not active, or when the
> console is VT-switched to a text terminal, HDMI audio will not work.

nvidia does a chvt to the text console on suspend, and then a chvt back to X on resume. After the chvt back to X, nvidia will then resume and only at that late point restore the availability of gfx and audio stuff...