After updating my machine (Dell XPS 15 9560; Intel i7-7700HQ) from kernel 6.6.63 to 6.12.1 while trying to diagnose a regression on stable, I found that audio no longer works on kernel 6.12. The KDE audio panel reports that the audio server is not available, and dmesg contains a NULL pointer dereference.

Userspace is identical between the two boots; the only configuration change is that I switched boot.kernelPackages from pkgs.linuxPackages to pkgs.linuxPackages_latest in my NixOS configuration.

[ 15.615623] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 15.615631] #PF: supervisor read access in kernel mode
[ 15.615634] #PF: error_code(0x0000) - not-present page
[ 15.615636] PGD 800000011e89a067 P4D 800000011e89a067 PUD 11e899067 PMD 0
[ 15.615643] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
[ 15.615650] CPU: 4 UID: 1000 PID: 1312 Comm: wireplumber Tainted: G O 6.12.1 #1-NixOS
[ 15.615654] Tainted: [O]=OOT_MODULE
[ 15.615656] Hardware name: Dell Inc. XPS 15 9560/05FFDN, BIOS 1.31.0 11/10/2022
[ 15.615659] RIP: 0010:avs_dai_fe_hw_params+0x1f/0x130 [snd_soc_avs]
[ 15.615685] Code: 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 55 49 89 f1 41 54 55 53 48 63 47 3c 48 8d 04 40 4c 8b 6c c2 30 <49> 8b 75 08 48 85 f6 74 1f 45 31 e4 5b 5d 44 89 e0 41 5c 41 5d 31
[ 15.615689] RSP: 0018:ffff97e581877a98 EFLAGS: 00010286
[ 15.615693] RAX: 0000000000000000 RBX: ffff94a403888028 RCX: ffffffffc2240340
[ 15.615695] RDX: ffff94a403888028 RSI: ffff97e581877af8 RDI: ffff94a403e12400
[ 15.615697] RBP: ffff94a403e12400 R08: 0000000000000000 R09: ffff97e581877af8
[ 15.615700] R10: 0000000000000002 R11: 0000000000000002 R12: ffff97e581877af8
[ 15.615702] R13: 0000000000000000 R14: 0000000000000000 R15: ffff94a403888028
[ 15.615704] FS: 00007f86abc07c40(0000) GS:ffff94ab5e400000(0000) knlGS:0000000000000000
[ 15.615707] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.615710] CR2: 0000000000000008 CR3: 000000010715e002 CR4: 00000000003726f0
[ 15.615712] Call Trace:
[ 15.615716] <TASK>
[ 15.615719] ? __die+0x23/0x80
[ 15.615724] ? page_fault_oops+0x173/0x5b0
[ 15.615730] ? exc_page_fault+0x71/0x160
[ 15.615735] ? asm_exc_page_fault+0x26/0x30
[ 15.615744] ? avs_dai_fe_hw_params+0x1f/0x130 [snd_soc_avs]
[ 15.615765] snd_soc_dai_hw_params+0x39/0xb0 [snd_soc_core]
[ 15.615809] __soc_pcm_hw_params+0x51e/0x700 [snd_soc_core]
[ 15.615858] dpcm_fe_dai_hw_params+0xdb/0x2e0 [snd_soc_core]
[ 15.615899] snd_pcm_hw_params+0x1e1/0x4d0 [snd_pcm]
[ 15.615922] snd_pcm_common_ioctl+0xbdf/0x15e0 [snd_pcm]
[ 15.615938] ? ioctl_has_perm.constprop.0.isra.0+0xd2/0x140
[ 15.615947] snd_pcm_ioctl+0x2b/0x50 [snd_pcm]
[ 15.615962] __x64_sys_ioctl+0x99/0xe0
[ 15.615968] do_syscall_64+0xb7/0x210
[ 15.615973] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 15.615979] RIP: 0033:0x7f86abe69aef
[ 15.616003] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 28 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[ 15.616007] RSP: 002b:00007ffe7b0e4ca0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 15.616010] RAX: ffffffffffffffda RBX: 00007ffe7b0e4e00 RCX: 00007f86abe69aef
[ 15.616013] RDX: 00007ffe7b0e4e00 RSI: 00000000c2604111 RDI: 0000000000000015
[ 15.616015] RBP: 0000000010c279f0 R08: 0000000000000000 R09: 0000000000000000
[ 15.616018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000010b64150
[ 15.616020] R13: 00007ffe7b0e4d24 R14: 00007ffe7b0e4e00 R15: 00007ffe7b0e5070
[ 15.616025] </TASK>
[ 15.616027] Modules linked in: snd_ctl_led snd_soc_avs_probe snd_soc_avs_hdaudio snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat ccm algif_aead crypto_null des3_ede_x86_64 des_generic libdes uhid cmac md4 algif_skcipher algif_hash af_alg bnep nf_log_syslog nft_log msr nft_ct nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nf_tables snd_soc_avs intel_uncore_frequency intel_uncore_frequency_common snd_soc_hda_codec joydev intel_tcc_cooling snd_hda_ext_core ath10k_pci mousedev snd_soc_core ath10k_core snd_compress ac97_bus snd_pcm_dmaengine nls_iso8859_1 x86_pkg_temp_thermal nls_cp437 intel_powerclamp ath snd_hda_intel btusb snd_intel_dspcfg vfat dell_pc snd_intel_sdw_acpi platform_profile coretemp fat hid_multitouch btrtl snd_hda_codec dell_laptop dell_wmi btintel uvcvideo crct10dif_pclmul hid_generic nouveau mac80211 snd_hda_core btbcm videobuf2_vmalloc dell_smbios crc32_pclmul iTCO_wdt polyval_clmulni ee1004
[ 15.616104] dcdbas snd_hwdep btmtk intel_pmc_bxt polyval_generic uvc mei_wdt mei_pxp mei_hdcp watchdog i915 intel_rapl_msr wmi_bmof dell_wmi_descriptor intel_wmi_thunderbolt cfg80211 dell_smm_hwmon ghash_clmulni_intel snd_pcm mxm_wmi videobuf2_memops drm_gpuvm bluetooth videobuf2_v4l2 snd_timer drm_exec rapl intel_cstate videobuf2_common intel_uncore psmouse crc16 snd i2c_i801 gpu_sched rfkill soundcore mei_me i2c_mux drm_buddy drm_ttm_helper libarc4 i2c_smbus intel_lpss_pci mei intel_lpss ttm idma64 processor_thermal_device_pci_legacy i2c_hid_acpi virt_dma processor_thermal_device i2c_hid intel_pch_thermal processor_thermal_wt_hint hid drm_display_helper processor_thermal_rfim processor_thermal_rapl cec intel_gtt intel_rapl_common i2c_algo_bit intel_pmc_core processor_thermal_wt_req rtc_cmos processor_thermal_power_floor intel_vsec pmt_telemetry int3403_thermal processor_thermal_mbox int3400_thermal video intel_hid tiny_power_button int340x_thermal_zone acpi_thermal_rel pmt_class wmi sparse_keymap backlight acpi_pad
[ 15.616182] intel_soc_dts_iosf battery thermal ac button evdev mac_hid sch_fq_codel serio_raw uinput loop cpufreq_powersave xt_nat x_tables nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter veth tun tap macvlan bridge stp llc kvm_intel kvm v4l2loopback(O) videodev mc fuse efi_pstore configfs nfnetlink dmi_sysfs dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm_crb ahci input_leds led_class libahci atkbd rtsx_pci_sdmmc libps2 mmc_core vivaldi_fmap libata nvme dm_mod sha512_ssse3 sha256_ssse3 sha1_ssse3 xhci_pci aesni_intel dax nvme_core scsi_mod gf128mul crypto_simd i8042 tpm_tis xhci_hcd cryptd rtsx_pci nvme_auth scsi_common serio tpm_tis_core btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq efivarfs tpm rng_core libaescfb ecdh_generic ecc autofs4
[ 15.616267] CR2: 0000000000000008
[ 15.616270] ---[ end trace 0000000000000000 ]---
[ 16.675223] RIP: 0010:avs_dai_fe_hw_params+0x1f/0x130 [snd_soc_avs]
[ 16.675304] Code: 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 55 49 89 f1 41 54 55 53 48 63 47 3c 48 8d 04 40 4c 8b 6c c2 30 <49> 8b 75 08 48 85 f6 74 1f 45 31 e4 5b 5d 44 89 e0 41 5c 41 5d 31
[ 16.675308] RSP: 0018:ffff97e581877a98 EFLAGS: 00010286
[ 16.675313] RAX: 0000000000000000 RBX: ffff94a403888028 RCX: ffffffffc2240340
[ 16.675316] RDX: ffff94a403888028 RSI: ffff97e581877af8 RDI: ffff94a403e12400
[ 16.675318] RBP: ffff94a403e12400 R08: 0000000000000000 R09: ffff97e581877af8
[ 16.675320] R10: 0000000000000002 R11: 0000000000000002 R12: ffff97e581877af8
[ 16.675323] R13: 0000000000000000 R14: 0000000000000000 R15: ffff94a403888028
[ 16.675325] FS: 00007f86abc07c40(0000) GS:ffff94ab5e400000(0000) knlGS:0000000000000000
[ 16.675328] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 16.675331] CR2: 0000000000000008 CR3: 000000010715e003 CR4: 00000000003726f0
[ 16.675334] note: wireplumber[1312] exited with irqs disabled
Created attachment 307333: Full dmesg of affected machine
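For anyone comparing their own logs against this report, here is a minimal sketch of pulling the faulting kernel symbol out of a saved dmesg. The function name faulting_symbols is made up for illustration. It relies on the fact that kernel-mode RIP lines carry the code segment 0010, while the userspace RIP line in the same oops shows 0033.

```shell
#!/bin/sh
# faulting_symbols (hypothetical helper): read dmesg text on stdin and print
# each distinct kernel-mode RIP location once. Userspace RIP lines
# (segment 0033) do not match the pattern and are skipped.
faulting_symbols() {
    sed -n 's/.*RIP: 0010://p' | sort -u
}

# Example usage (reading the kernel log may require root):
#   dmesg | faulting_symbols
#   faulting_symbols < saved-dmesg.txt
```

On the oops in this report, this would be expected to print avs_dai_fe_hw_params+0x1f/0x130 [snd_soc_avs] a single time, even though the RIP line appears twice in the log.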
Okay, so I have tried a bunch more kernel releases. The NULL dereference only happens on 6.12. However, I have also tried kernels 6.7, 6.8, 6.9 and 6.10, and none of them have working audio, since they fail to load the firmware. Some of that might be my fault: linux-firmware may not be in the right shape for those kernels, since I also had WiFi issues on the particularly old ones.

This is what happens on 6.11.10:

Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
Dec 08 13:56:49 snowflake kernel: kvm_intel: L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.
Dec 08 13:56:49 snowflake kernel: Console: switching to colour frame buffer device 240x67
Dec 08 13:56:49 snowflake kernel: i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
Dec 08 13:56:49 snowflake kernel: vga_switcheroo: enabled
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: autoconfig for ALC3266: line_outs=1 (0x17/0x0/0x0/0x0/0x0) type:speaker
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: hp_outs=1 (0x21/0x0/0x0/0x0/0x0)
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: mono: mono_out=0x0
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: inputs:
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: Headset Mic=0x18
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: Headphone Mic=0x1a
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: Internal Mic=0x12
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_realtek hdaudioB0D0: creating for ALC3266 Analog 0
Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: Direct firmware load for intel/avs/hda-10ec0298-tplg.bin failed with error -2
Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: request topology "intel/avs/hda-10ec0298-tplg.bin" failed: -2
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.0: trying to load fallback topology hda-generic-tplg.bin
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.0: ASoC: Parent card not yet available, widget card binding deferred
Dec 08 13:56:49 snowflake kernel: input: hdaudioB0D0 Headphone Mic as /devices/platform/avs_hdaudio.0/sound/card1/input46
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: creating for HDMI 0 0
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: skipping capture dai for HDMI 0
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: creating for HDMI 1 1
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: skipping capture dai for HDMI 1
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: creating for HDMI 2 2
Dec 08 13:56:49 snowflake kernel: snd_hda_codec_hdmi hdaudioB0D2: skipping capture dai for HDMI 2
Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: Direct firmware load for intel/avs/hda-8086280b-tplg.bin failed with error -2
Dec 08 13:56:49 snowflake kernel: snd_soc_avs 0000:00:1f.3: request topology "intel/avs/hda-8086280b-tplg.bin" failed: -2
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: trying to load fallback topology hda-8086-generic-tplg.bin
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: ASoC: Parent card not yet available, widget card binding deferred
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: avs_card_late_probe: mapping HDMI converter 1 to PCM 1 (00000000ef42ffd5)
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: avs_card_late_probe: mapping HDMI converter 2 to PCM 2 (000000006e5a21c2)
Dec 08 13:56:49 snowflake kernel: avs_hdaudio avs_hdaudio.2: avs_card_late_probe: mapping HDMI converter 3 to PCM 3 (000000001414d015)
Dec 08 13:56:49 snowflake kernel: input: hdaudioB0D2 HDMI/DP,pcm=1 as /devices/platform/avs_hdaudio.2/sound/card2/input47
Dec 08 13:56:49 snowflake kernel: input: hdaudioB0D2 HDMI/DP,pcm=2 as /devices/platform/avs_hdaudio.2/sound/card2/input48
Dec 08 13:56:49 snowflake kernel: input: hdaudioB0D2 HDMI/DP,pcm=3 as /devices/platform/avs_hdaudio.2/sound/card2/input49
N.B. it looks like there is a "snd_soc_avs hardware grabbing race" which is at least somewhat known on the Arch forum:
https://bbs.archlinux.org/viewtopic.php?id=298583

More sadness of the same variety:
https://discourse.nixos.org/t/no-microphone-how-to-get-firmware-dsp-basefw-bin/38198/2

That is probably why it is broken on 6.11. But the NULL deref is also a bug for sure, IMO.

In summary:
- There was a regression in 6.7 and up, which results in this particular Intel HD Audio device being broken with common configurations. I am unsure whether it was ever reported to the kernel devs in a way that was seen (which is okay! here is a report :))
- There is an additional regression between 6.11 and 6.12 where, in addition to the device being broken, a NULL pointer is dereferenced while it is in the process of being broken. That is the bug I was originally attempting to report here, but it appears to be stacked on top of the other bug.
That's strange. Let's collect some information first:

What is the content of your /lib/firmware/intel/avs directory?
Can you share the md5sum of the skl/dsp_basefw.bin file?

It seems the code correctly falls back to the generic topology, which is hda-generic-tplg.bin for the HDA codec and hda-8086-generic-tplg.bin for HDMI. Can you also share the md5sum of those files?

As for the "hardware grabbing race", it seems those users are just missing topology files, as indicated by the '-2' (ENOENT) error number. Let's concentrate on your problem, which seems to be the NULL pointer dereference on 6.12.
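A minimal sketch of the check behind that question: given a firmware root and a path like the ones in the log, see whether the file exists either raw or compressed. The helper name find_fw is hypothetical. The suffix list assumes a kernel built with compressed firmware loading (.zst and .xz variants), and the firmware root differs between distributions.

```shell
#!/bin/sh
# find_fw ROOT NAME (hypothetical helper): print the first existing variant
# of a firmware file below ROOT, trying the raw name and then the compressed
# suffixes. ASSUMPTION: the kernel supports .zst/.xz firmware.
find_fw() {
    root=$1
    name=$2
    for suffix in "" .zst .xz; do
        if [ -e "$root/$name$suffix" ]; then
            printf '%s\n' "$root/$name$suffix"
            return 0
        fi
    done
    # A file missing in every variant is what shows up as -2 (ENOENT) in the log.
    printf 'missing: %s\n' "$name" >&2
    return 1
}

# Example usage, with /lib/firmware as the assumed firmware root:
#   find_fw /lib/firmware intel/avs/hda-10ec0298-tplg.bin
#   find_fw /lib/firmware intel/avs/hda-generic-tplg.bin
```

A miss on the codec-specific topology is not fatal by itself, since the driver then tries the generic fallback; a miss on the fallback as well would leave the card without a topology.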
$ for f in /run/booted-system/firmware/intel/avs/skl/dsp_basefw.bin.zst /run/booted-system/firmware/intel/avs/hda-generic-tplg.bin.zst /run/booted-system/firmware/intel/avs/hda-8086-generic-tplg.bin.zst; do echo "$f $(< $f zstd -d | md5sum -)"; done
/run/booted-system/firmware/intel/avs/skl/dsp_basefw.bin.zst 75618f40a3c45237acac9cc70f26576a -
/run/booted-system/firmware/intel/avs/hda-generic-tplg.bin.zst a419dc34b378a3cc659ca84fb9db6bff -
/run/booted-system/firmware/intel/avs/hda-8086-generic-tplg.bin.zst 2200c9a22ed3a18094293e9fe10b58e1 -

$ ls -l /run/booted-system/firmware/intel/avs/
total 68
dr-xr-xr-x 1 root root 36 Dec 31 1969 apl
dr-xr-xr-x 1 root root 36 Dec 31 1969 cnl
dr-xr-xr-x 1 root root 140 Dec 31 1969 skl
-r--r--r-- 1 root root 588 Dec 31 1969 da7219-tplg.bin.zst
-r--r--r-- 1 root root 907 Dec 31 1969 dmic-tplg.bin.zst
-r--r--r-- 1 root root 603 Dec 31 1969 hda-808628xx-3ep-tplg.bin.zst
lrwxrwxrwx 1 root root 29 Dec 31 1969 hda-8086-generic-tplg.bin.zst -> hda-808628xx-3ep-tplg.bin.zst
-r--r--r-- 1 root root 738 Dec 31 1969 hda-generic-1ep-tplg.bin.zst
lrwxrwxrwx 1 root root 28 Dec 31 1969 hda-generic-tplg.bin.zst -> hda-generic-1ep-tplg.bin.zst
-r--r--r-- 1 root root 537 Dec 31 1969 max98357a-tplg.bin.zst
-r--r--r-- 1 root root 612 Dec 31 1969 max98373-tplg.bin.zst
-r--r--r-- 1 root root 612 Dec 31 1969 max98927-tplg.bin.zst
-r--r--r-- 1 root root 592 Dec 31 1969 nau8825-tplg.bin.zst
-r--r--r-- 1 root root 599 Dec 31 1969 rt274-tplg.bin.zst
-r--r--r-- 1 root root 599 Dec 31 1969 rt286-tplg.bin.zst
-r--r--r-- 1 root root 600 Dec 31 1969 rt298-tplg.bin.zst
-r--r--r-- 1 root root 529 Dec 31 1969 rt5514-tplg.bin.zst
-r--r--r-- 1 root root 661 Dec 31 1969 rt5640-tplg.bin.zst
-r--r--r-- 1 root root 591 Dec 31 1969 rt5663-tplg.bin.zst
-r--r--r-- 1 root root 617 Dec 31 1969 ssm4567-tplg.bin.zst
OK, the FW files are fine. One thing you could still try, just in case, is using raw .bin files instead of packed ones, as we've seen bugs with packed files in the past (already fixed in v6.12).

Also, are you using a distribution-provided kernel, or are you building your own from source? If it is the distribution's kernel, can you try building one from upstream code? Can you also share the kernel config and the package or git commit hash you are building from?
Scratch all of the above, I've reproduced it locally, will keep investigating.
The issue seemed vaguely familiar, and that is because it was already fixed; the fix just didn't make it into v6.12 or any of the v6.12 stable releases since. Here is the patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a0aae96be5ffc5b456ca07bfe1385b721c20e184

I've sent a backport request:
https://lore.kernel.org/linux-sound/20241211111011.3560836-1-amadeuszx.slawinski@linux.intel.com/T/#t
I can confirm that this issue breaks the 2017 Pixelbook on kernel 6.12.4 and that the patch fixes it. Thank you!
Linux 6.12.5 was released over the weekend and should contain the fix. Thanks for reporting!
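A small sketch for checking whether a 6.12.y kernel is at or past the release named above. The helper name version_at_least is hypothetical, and the comparison relies on sort -V, which is assumed to be available (GNU coreutils provides it). It only compares version strings; it says nothing about whether other kernel series are affected.

```shell
#!/bin/sh
# version_at_least HAVE WANT (hypothetical helper): succeed if version HAVE
# is equal to or newer than WANT.
# ASSUMPTION: sort supports -V (version sort), as GNU coreutils does.
version_at_least() {
    # The older of the two versions sorts first; HAVE is new enough
    # exactly when that first entry is WANT.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage: cut drops any distribution suffix such as "-arch1-1".
#   if version_at_least "$(uname -r | cut -d- -f1)" 6.12.5; then
#       echo "running kernel is 6.12.5 or newer"
#   fi
```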