Bug 199007
Summary: | hda driver initialization takes too much time on the machine with coffeelake audio controller [8086:a348] | ||
---|---|---|---|
Product: | Drivers | Reporter: | Hui Wang (hui.wang) |
Component: | Sound(ALSA) | Assignee: | Jaroslav Kysela (perex) |
Status: | RESOLVED CODE_FIX | ||
Severity: | blocking | CC: | hui.wang, tiwai |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 4.16-rc4 and all old version kernel | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
trace of snd_hda_codec_build_controls
dmesg about snd_hdac_bus_exec_verb_unlocked() entry and exit trace event for test 1 trace event for test 2 test on another CFL machine the trace events log of 1st test on the laptop with alc289 the trace events log of 2nd test on the laptop with alc289 Untested fix patch |
Description
Hui Wang
2018-03-05 10:16:25 UTC
The driver initialization takes too much time, the dmesg like below shown (from 6.27 to 12.99): [ 6.272797] snd_hda_codec_realtek hdaudioC0D0: autoconfig for ALC3246: line_outs=1 (0x14/0x0/0x0/0x0/0x0) type:speaker [ 6.272803] snd_hda_codec_realtek hdaudioC0D0: speaker_outs=0 (0x0/0x0/0x0/0x0/0x0) [ 6.272807] snd_hda_codec_realtek hdaudioC0D0: hp_outs=1 (0x21/0x0/0x0/0x0/0x0) [ 6.272809] snd_hda_codec_realtek hdaudioC0D0: mono: mono_out=0x0 [ 6.272811] snd_hda_codec_realtek hdaudioC0D0: inputs: [ 6.272816] snd_hda_codec_realtek hdaudioC0D0: Headset Mic=0x19 [ 6.272819] snd_hda_codec_realtek hdaudioC0D0: Headphone Mic=0x1a [ 6.272822] snd_hda_codec_realtek hdaudioC0D0: Internal Mic=0x12 [ 6.281028] iwlwifi 0000:00:14.3: base HW address: 34:13:e8:6f:61:4d [ 6.285470] Adding 32894972k swap on /dev/nvme0n1p4. Priority:-1 extents:1 across:32894972k SSFS ... [ 12.998515] input: HDA Intel PCH Headphone Mic as /devices/pci0000:00/0000:00:1f.3/sound/card0/input18 [ 12.998695] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input19 [ 12.998861] input: HDA Intel PCH HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input20 [ 12.998969] input: HDA Intel PCH HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input21 [ 12.999051] input: HDA Intel PCH HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input22 [ 12.999133] input: HDA Intel PCH HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:1f.3/sound/card0/i And the audio device looks like below show: 00:1f.3 Audio device [0403]: Intel Corporation Device [8086:a348] (rev 01) (prog-if 80) Subsystem: Dell Device [1028:0826] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 32, Cache Line Size: 64 bytes Interrupt: pin B routed to IRQ 449 Region 0: Memory at ed418000 (64-bit, non-prefetchable) [size=16K] Region 4: Memory at ed100000 (64-bit, non-prefetchable) [size=1M] Capabilities: <access denied> Kernel driver in use: snd_hda_intel Kernel modules: snd_hda_intel Sometimes this can make pulseaudio can't work well. pulseaudio starts, but the driver (as kernel modules) initialization has not finished, then there is no valid sink in the pulseaudio. At this moment, some UI Apps already start using pulseaudio to play sound, but users can't hear the sound as expected. This problem never happened before, it is exposed on cfl platforms, and all cfl platforms (with audio controller 8086:a348) have this problem. Could you identify which code actually takes so long? You can check via perf or such. It is not always same for every time. But most of time, the hdac_regmap.c:hda_reg_read()--->hda_codec.c:codec_exec_verb() will takes more than 1 second to finish randomly. I don't know which time of calling hda_reg_read() takes more than 1 second, but there are several times that takes more than 1 second within 100 times of calling this function. For going further from that point, you need to identify which codec verb triggers such an issue. You can turn on tracing HD-audio verbs to catch it. In most cases, it's some unexpected verb to perform, hence either codec or controller chip stalls. Ok, I will try to trace the verbs. Created attachment 274629 [details]
trace of snd_hda_codec_build_controls
set max_depth to be 10.
We can find $ in this txt, all functions take long time to run are snd_hdac_exec_verb().
Created attachment 274723 [details]
dmesg about snd_hdac_bus_exec_verb_unlocked() entry and exit
I added printk at entry and exit of snd_hdac_bus_exec_verb_unlocked(), this function will randomly take 1 second to finish like this:
[ 16.171904] wwwwwwwwwww enter snd_hdac_bus_exec_verb_unlocked, addr = 0x0, cmd = 0x1ef0500
[ 17.176324] ============ exit snd_hdac_bus_exec_verb_unlocked
After reading the hda_verbs.h, I found that the verb AC_VERB_GET_AMP_GAIN_MUTE and AC_VERB_GET_POWER_STATE will take 1 second to finish sometimes (snd_hdac_bus_exec_verb_unlocked), but not all these two verbs take 1 second to finish, most of time, these two verbs work normally.
This problem only happens on CFL platforms (with audio controller 8086:a348).
Check tracing of verbs to see exactly which verb (and to which NID) stalls. It would work without adding debug prints. See "Tracepoints" section in Documentation/sound/hd-audio/notes.rst. And such a problem isn't necessarily depending on the controller. Unless it's exactly the same codec chip, it might be a slight difference of codec chip models. For example, some chips don't like accesses to a NID that is disabled only for some variants of chip models. Created attachment 274735 [details] trace event for test 1 This is codec info: Vendor Id: 0x10ec0255 Revision Id: 0x100002 I did the test of comment#10 twice, and the verbs which took 1 second to finsh are different between 2 tests (does not look like a problem of codec side). In the 1st time, we can see the verb took 1 second to finish are shown as below: raw value = 0x00bb0000 cid = 0, nid = 0x0b, verb = 0xb00, parm = 0x00 raw value: verb = 0xb00, parm = 0x0 verbname = get_amp_gain_mute amp raw val = 0x0 input, right, idx=0 raw value = 0x020c0000 cid = 0, nid = 0x20, verb = 0xc00, parm = 0x00 raw value: verb = 0xc00, parm = 0x0 verbname = get_proc_coef raw value = 0x017f000c cid = 0, nid = 0x17, verb = 0xf00, parm = 0x0c raw value: verb = 0xf00, parm = 0xc verbname = get_parameters parameter = pin_cap raw value = 0x00db0000 cid = 0, nid = 0x0d, verb = 0xb00, parm = 0x00 raw value: verb = 0xb00, parm = 0x0 verbname = get_amp_gain_mute amp raw val = 0x0 input, right, idx=0 raw value = 0x008b0000 cid = 0, nid = 0x08, verb = 0xb00, parm = 0x00 raw value: verb = 0xb00, parm = 0x0 verbname = get_amp_gain_mute amp raw val = 0x0 input, right, idx=0 ... Created attachment 274737 [details]
trace event for test 2
The verbs took 1 sec to finish are shown as below:
raw value = 0x00bb2003
cid = 0, nid = 0x0b, verb = 0xb20, parm = 0x03
raw value: verb = 0xb20, parm = 0x3
verbname = get_amp_gain_mute
amp raw val = 0x2003
input, left, idx=3
raw value = 0x00bb2004
cid = 0, nid = 0x0b, verb = 0xb20, parm = 0x04
raw value: verb = 0xb20, parm = 0x4
verbname = get_amp_gain_mute
amp raw val = 0x2004
input, left, idx=4
raw value = 0x020c0000
cid = 0, nid = 0x20, verb = 0xc00, parm = 0x00
raw value: verb = 0xc00, parm = 0x0
verbname = get_proc_coef
raw value = 0x00cb2001
cid = 0, nid = 0x0c, verb = 0xb20, parm = 0x01
raw value: verb = 0xb20, parm = 0x1
verbname = get_amp_gain_mute
amp raw val = 0x2001
input, left, idx=1
raw value = 0x009b0000
cid = 0, nid = 0x09, verb = 0xb00, parm = 0x00
raw value: verb = 0xb00, parm = 0x0
verbname = get_amp_gain_mute
amp raw val = 0x0
input, right, idx=0
raw value = 0x002f0500
cid = 0, nid = 0x02, verb = 0xf05, parm = 0x00
raw value: verb = 0xf05, parm = 0x0
verbname = get_power_state
...
Hm, at least, a common pattern is that NID 0x0b triggers the stall at first. And NID 0x20 is another one commonly seen, which is a COEF, so it's no wonder. Could you give alsa-info.sh output? ... and another question is why it still "works" after one second stall. Doesn't the controller driver go to the polling mode after timeout? If yes, the kernel should have given the warning message. (But also it doesn't explain why multiple stalls happen, and each of them works after 1 second.) Created attachment 274775 [details]
test on another CFL machine
This is the alsa-info.txt generated on another CFL laptop, this laptop also has this issue. the codec on this machine is alc289, is different from the codec (alc255) on the old machine (desktop)。
Created attachment 274777 [details]
the trace events log of 1st test on the laptop with alc289
also has 1 sec stall in some verbs
Created attachment 274779 [details]
the trace events log of 2nd test on the laptop with alc289
reply comment#14: looks like the hda driver did not complain any timeout, so it did not enter polling mode. Then try to force polling mode from the beginning. If this works, it's either the missing interrupt in the CFL controller, or the memory cache issue in CORB/RIRB. OK, got it. will test polling mode. Correct the comment #18, after enabling dev_dbg(), I saw "azx_get_response timeout, polling the codec once: last cmd=" in the dmesg, looks like all snd_hdac_bus_exec_verb_unlocked() which take 1 second to finish will print out this warning message. And after making the driver work in polling mode, the problem disappears, the driver initialization will finish within 0.1s, just like other platforms. So how do we fix this problem? Well, a quick-and-easy fix would be to set polling_mode when CFL chip is probed, an (untested) patch like below. But I'd rather like to check more details. For example, whether the problem still appears when you probe with snoop=0 option to snd-hda-intel module. Created attachment 274805 [details]
Untested fix patch
Ok, will test snoop=0. Reply comment #22, set snoop=0 when loading snd-hda-intel doesn't help fix this problem. Reply comment#23, tested the untested patch, it worked very well on my CFL machines. OK. The impact of polling mode is much smaller than disabling the power-save, so we can take it for now as a tentative solution. I'll submit the patch to ML later. |