Bug 219577
Summary: | NULL pointer dereference inside snd_soc_avs, regression from 6.6 -> 6.12 | | |
---|---|---|---|
Product: | Drivers | Reporter: | Jade (jade) |
Component: | Sound(ALSA) | Assignee: | Jaroslav Kysela (perex) |
Status: | NEW --- | | |
Severity: | normal | CC: | amadeuszx.slawinski, jason, kai.vehmanen, liam.r.girdwood, peter.ujfalusi, pierre-louis.bossart |
Priority: | P3 | | |
Hardware: | Intel | | |
OS: | Linux | | |
Kernel Version: | | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: | Full dmesg of affected machine | | |
Description
Jade
2024-12-08 21:03:37 UTC
Created attachment 307333 [details]
Full dmesg of affected machine
Okay so I have tried a bunch more kernel releases. The NULL dereference only happens on 6.12. However, I have tried kernels 6.7, 6.8, 6.9, 6.10 and none of them have working audio since they fail to load the firmware (some of which might be my fault; it may be the case that linux-firmware is not the right shape, since I also had WiFi issues on the particularly old ones).

This is what happens on 6.11.10:

Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
Dec 08 13:56:49 snowflake kernel: kvm_intel: L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.
Dec 08 13:56:49 snowflake kernel: Console: switching to colour frame buffer device 240x67
Dec 08 13:56:49 snowflake kernel: i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
Dec 08 13:56:49 snowflake kernel: vga_switcheroo: enabled
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: autoconfig for ALC3266: line_outs=1 (0x17/0x0/0x0/0x0/0x0) type:speaker
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: hp_outs=1 (0x21/0x0/0x0/0x0/0x0)
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: mono: mono_out=0x0
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: inputs:
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: Headset Mic=0x18
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: Headphone Mic=0x1a
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: Internal Mic=0x12
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: creating for ALC3266 Analog 0
Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: Direct firmware load for intel/avs/hda-10ec0298-tplg.bin failed with error -2
Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: request topology "intel/avs/hda-10ec0298-tplg.bin" failed: -2
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.0: trying to load fallback topology hda-generic-tplg.bin
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.0: ASoC: Parent card not yet available, widget card binding deferred
Dec 08 13:56:49 snowflake kernel: input: hdaudioB0D0 Headphone Mic as /devices/platform/avs_hdaudio.0/sound/card1/input46
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: creating for HDMI 0 0
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: skipping capture dai for HDMI 0
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: creating for HDMI 1 1
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: skipping capture dai for HDMI 1
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: creating for HDMI 2 2
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: skipping capture dai for HDMI 2
Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: Direct firmware load for intel/avs/hda-8086280b-tplg.bin failed with error -2
Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: request topology "intel/avs/hda-8086280b-tplg.bin" failed: -2
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: trying to load fallback topology hda-8086-generic-tplg.bin
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: ASoC: Parent card not yet available, widget card binding deferred
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: avs_card_late_probe: mapping HDMI converter 1 to PCM 1 (00000000ef42ffd5)
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: avs_card_late_probe: mapping HDMI converter 2 to PCM 2 (000000006e5a21c2)
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: avs_card_late_probe: mapping HDMI converter 3 to PCM 3 (000000001414d015)
Dec 08 13:56:49 snowflake kernel: input: hdaudioB0D2 HDMI/DP,pcm=1 as /devices/platform/avs_hdaudio.2/sound/card2/input47
Dec 08 13:56:49 snowflake kernel: input: hdaudioB0D2 HDMI/DP,pcm=2 as /devices/platform/avs_hdaudio.2/sound/card2/input48
Dec 08 13:56:49 snowflake kernel: input: hdaudioB0D2 HDMI/DP,pcm=3 as /devices/platform/avs_hdaudio.2/sound/card2/input49

n.b. it looks like there is a "snd_soc_avs hardware grabbing race" which is at least somewhat known to the Arch forum?
https://bbs.archlinux.org/viewtopic.php?id=298583

More sadness of the same variety:
https://discourse.nixos.org/t/no-microphone-how-to-get-firmware-dsp-basefw-bin/38198/2

That is probably why it is broken on 6.11. But the NULL deref is also a bug for sure IMO.

In summary:
- There was a regression in 6.7 and up, which results in this particular Intel HD Audio being broken with common configurations. I am unsure if it was reported to the kernel devs in a way that it was seen (which is okay! here is a report :))
- There is an additional regression between 6.11 and 6.12 where, in addition to the device being broken, a NULL pointer is dereferenced while it is in the process of being broken. That is the bug I was originally attempting to report here, but it appears that it is stacked on top of the other bug.

That's strange. Let's collect some information first:
What is the content of your /lib/firmware/intel/avs directory?
Can you share the md5sum of the skl/dsp_basefw.bin file?
It seems the code correctly falls back to the generic topology, which is hda-generic-tplg.bin for the HDA codec and hda-8086-generic-tplg.bin for HDMI. Can you also share the md5sum of those files?

In the case of the "hardware grabbing race", it seems those users are just missing topology files, as indicated by the '-2' (ENOENT) error number. Let's just concentrate on your problem, which seems to be the NULL pointer dereference on 6.12.
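As a side note on the '-2' values in the log above: they are negated errno codes, and 2 is ENOENT. The following is a minimal Python sketch, not anything from the driver, that pulls failed firmware loads out of a saved dmesg and names the error. The function name `firmware_load_failures` and the regular expression are assumptions made for illustration; only the message format is taken from the log excerpt in this report, and the number-to-name mapping assumes the script runs on Linux.

```python
import errno
import os
import re

# Matches lines such as:
#   "... Direct firmware load for intel/avs/hda-10ec0298-tplg.bin failed with error -2"
# The pattern is inferred from the log excerpt in this report, not from kernel sources.
FW_FAIL = re.compile(
    r"Direct firmware load for (?P<path>\S+) failed with error -(?P<code>\d+)"
)


def firmware_load_failures(log_text):
    """Return (path, errno_name, description) for each failed firmware load."""
    failures = []
    for match in FW_FAIL.finditer(log_text):
        code = int(match.group("code"))
        # errno.errorcode maps numbers to symbolic names, e.g. 2 -> "ENOENT".
        name = errno.errorcode.get(code, "UNKNOWN")
        failures.append((match.group("path"), name, os.strerror(code)))
    return failures


# Two lines copied from the 6.11.10 log above; only the first is a failure.
sample = (
    "Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: Direct firmware load "
    "for intel/avs/hda-10ec0298-tplg.bin failed with error -2\n"
    "Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.0: trying to load "
    "fallback topology hda-generic-tplg.bin\n"
)

for path, name, text in firmware_load_failures(sample):
    print(f"{path}: {name} ({text})")
```

A missing codec-specific topology is not necessarily fatal here, since the log shows the driver trying a generic fallback topology right after each failure.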
$ for f in /run/booted-system/firmware/intel/avs/skl/dsp_basefw.bin.zst /run/booted-system/firmware/intel/avs/hda-generic-tplg.bin.zst /run/booted-system/firmware/intel/avs/hda-8086-generic-tplg.bin.zst; do echo "$f $(< $f zstd -d | md5sum -)"; done
/run/booted-system/firmware/intel/avs/skl/dsp_basefw.bin.zst 75618f40a3c45237acac9cc70f26576a -
/run/booted-system/firmware/intel/avs/hda-generic-tplg.bin.zst a419dc34b378a3cc659ca84fb9db6bff -
/run/booted-system/firmware/intel/avs/hda-8086-generic-tplg.bin.zst 2200c9a22ed3a18094293e9fe10b58e1 -

$ ls -l /run/booted-system/firmware/intel/avs/
total 68
dr-xr-xr-x 1 root root 36 Dec 31 1969 apl
dr-xr-xr-x 1 root root 36 Dec 31 1969 cnl
dr-xr-xr-x 1 root root 140 Dec 31 1969 skl
-r--r--r-- 1 root root 588 Dec 31 1969 da7219-tplg.bin.zst
-r--r--r-- 1 root root 907 Dec 31 1969 dmic-tplg.bin.zst
-r--r--r-- 1 root root 603 Dec 31 1969 hda-808628xx-3ep-tplg.bin.zst
lrwxrwxrwx 1 root root 29 Dec 31 1969 hda-8086-generic-tplg.bin.zst -> hda-808628xx-3ep-tplg.bin.zst
-r--r--r-- 1 root root 738 Dec 31 1969 hda-generic-1ep-tplg.bin.zst
lrwxrwxrwx 1 root root 28 Dec 31 1969 hda-generic-tplg.bin.zst -> hda-generic-1ep-tplg.bin.zst
-r--r--r-- 1 root root 537 Dec 31 1969 max98357a-tplg.bin.zst
-r--r--r-- 1 root root 612 Dec 31 1969 max98373-tplg.bin.zst
-r--r--r-- 1 root root 612 Dec 31 1969 max98927-tplg.bin.zst
-r--r--r-- 1 root root 592 Dec 31 1969 nau8825-tplg.bin.zst
-r--r--r-- 1 root root 599 Dec 31 1969 rt274-tplg.bin.zst
-r--r--r-- 1 root root 599 Dec 31 1969 rt286-tplg.bin.zst
-r--r--r-- 1 root root 600 Dec 31 1969 rt298-tplg.bin.zst
-r--r--r-- 1 root root 529 Dec 31 1969 rt5514-tplg.bin.zst
-r--r--r-- 1 root root 661 Dec 31 1969 rt5640-tplg.bin.zst
-r--r--r-- 1 root root 591 Dec 31 1969 rt5663-tplg.bin.zst
-r--r--r-- 1 root root 617 Dec 31 1969 ssm4567-tplg.bin.zst

Ok, FW files are fine.
Although one thing you could try, just in case, is using raw bin files instead of packed ones, as we've seen bugs with packed ones in the past (already fixed in v6.12).

Also, are you using a distribution-provided kernel or are you building your own from sources? If you are using a distribution-provided one, can you try building one from upstream code? Can you share the kernel config and the package/git commit hash from which you are building?

Scratch all of the above, I've reproduced it locally, will keep investigating.

The issue seemed vaguely familiar, and that is because it was already fixed; the fix just didn't make it into v6.12 or any stable v6.12 release after it. Here is the patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a0aae96be5ffc5b456ca07bfe1385b721c20e184

I've sent a backport request:
https://lore.kernel.org/linux-sound/20241211111011.3560836-1-amadeuszx.slawinski@linux.intel.com/T/#t

I can confirm that this issue breaks the 2017 Pixelbook using kernel 6.12.4 and that the patch fixes it. Thank you!

Linux 6.12.5 was released during the weekend and should contain the fix. Thanks for reporting!
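For anyone checking whether a given machine falls in the range discussed in this thread, here is a small Python sketch. It assumes, based only on the comments above, that the NULL dereference affects v6.12 through 6.12.4 and that 6.12.5 carries the fix; the helper names `parse_release` and `in_reported_null_deref_range` are made up for illustration. Distribution kernels may backport or omit patches, so treat the result as a hint rather than a verdict.

```python
import re


def parse_release(release):
    """Extract (major, minor, patch) from a release string such as '6.12.4-arch1-1'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    # A release like '6.12' has no patch component; treat it as patch level 0.
    return int(major), int(minor), int(patch or 0)


def in_reported_null_deref_range(release):
    """True for 6.12 through 6.12.4, the range this thread reports as affected.

    Assumption: 6.12.5 and later stable releases contain the backported fix,
    as the last comment in the thread suggests.
    """
    major, minor, patch = parse_release(release)
    return (major, minor) == (6, 12) and patch < 5


for example in ("6.11.10", "6.12.4", "6.12.5"):
    print(example, in_reported_null_deref_range(example))
```

On a Linux system the running kernel's release string can be obtained with `platform.release()` from the standard library and passed to the same helper.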