Bug 218304

Summary: Regression in a0575b4add21a243cc3257e75ad913cd5377d5f2
Product: Drivers
Reporter: Junyi Wang (junyi.wang)
Component: Sound(ALSA)
Assignee: Jaroslav Kysela (perex)
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: junyi.wang, peter.ujfalusi
Priority: P3
Hardware: Intel
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Junyi Wang 2023-12-23 17:16:51 UTC
This is the relevant part of the diff:

```
 static int hdac_hda_dev_probe(struct hdac_device *hdev)
 {
+	struct hdac_hda_priv *hda_pvt = dev_get_drvdata(&hdev->dev);
 	struct hdac_ext_link *hlink;
 	int ret;

@@ -621,9 +632,15 @@ static int hdac_hda_dev_probe(struct hdac_device *hdev)
 	snd_hdac_ext_bus_link_get(hdev->bus, hlink);

 	/* ASoC specific initialization */
-	ret = devm_snd_soc_register_component(&hdev->dev,
-					 &hdac_hda_codec, hdac_hda_dais,
-					 ARRAY_SIZE(hdac_hda_dais));
+	if (hda_pvt->need_display_power)
+		ret = devm_snd_soc_register_component(&hdev->dev,
+						&hdac_hda_hdmi_codec, hdac_hda_hdmi_dais,
+						ARRAY_SIZE(hdac_hda_hdmi_dais));
+	else
+		ret = devm_snd_soc_register_component(&hdev->dev,
+						&hdac_hda_codec, hdac_hda_dais,
+						ARRAY_SIZE(hdac_hda_dais));
+
 	if (ret < 0) {
 		dev_err(&hdev->dev, "failed to register HDA codec %d\n", ret);
```

`hda_pvt` is NULL at this point, so the `hda_pvt->need_display_power` check dereferences a NULL pointer. I suspect this is related to the initialization order of some drivers, but I am not sure.
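
The failure mode can be modelled outside the kernel. The following is a minimal sketch, assuming hypothetical `model_*` types and helpers that stand in for the kernel structures and for `dev_get_drvdata()`. The NULL check is illustrative only and is not claimed to be the upstream fix.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct model_priv {
	bool need_display_power;
};

struct model_device {
	void *driver_data;	/* what dev_get_drvdata() would hand back */
};

enum model_component { MODEL_ANALOG, MODEL_HDMI };

/* Plays the role of dev_get_drvdata(): NULL if nothing was attached yet. */
static void *model_get_drvdata(const struct model_device *dev)
{
	return dev->driver_data;
}

/*
 * Choose which component to register. Returns 0 and stores the choice in
 * *out, or -ENODEV when the private data has not been attached.
 */
static int model_pick_component(const struct model_device *dev,
				enum model_component *out)
{
	const struct model_priv *pvt = model_get_drvdata(dev);

	if (!pvt)	/* the case hit in this report: private data is NULL */
		return -ENODEV;

	*out = pvt->need_display_power ? MODEL_HDMI : MODEL_ANALOG;
	return 0;
}
```

Calling `model_pick_component()` on a device whose `driver_data` was never set returns `-ENODEV` instead of faulting. Once a `struct model_priv` is attached, the HDMI or analog branch is selected as in the diff.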

This is the hardware this bug is reproduced on:
```
00:1f.3 Multimedia audio controller: Intel Corporation Tiger Lake-LP Smart Sound Technology Audio Controller (rev 20)
	Subsystem: LG Electronics, Inc. Tiger Lake-LP Smart Sound Technology Audio Controller
	Flags: bus master, fast devsel, latency 64, IRQ 142, IOMMU group 13
	Memory at 6079188000 (64-bit, non-prefetchable) [size=16K]
	Memory at 6079000000 (64-bit, non-prefetchable) [size=1M]
	Capabilities: <access denied>
	Kernel driver in use: sof-audio-pci-intel-tgl

00:1f.4 SMBus: Intel Corporation Tiger Lake-LP SMBus Controller (rev 20)
	Subsystem: LG Electronics, Inc. Tiger Lake-LP SMBus Controller
	Flags: medium devsel, IRQ 16, IOMMU group 13
	Memory at 6079194000 (64-bit, non-prefetchable) [size=256]
	I/O ports at efa0 [size=32]
	Kernel driver in use: i801_smbus

00:1f.5 Serial bus controller: Intel Corporation Tiger Lake-LP SPI Controller (rev 20)
	Subsystem: LG Electronics, Inc. Tiger Lake-LP SPI Controller
	Flags: fast devsel, IOMMU group 13
	Memory at 37400000 (32-bit, non-prefetchable) [size=4K]
	Kernel driver in use: intel-spi
```

This is the stack trace:

```
[    1.530468] BUG: kernel NULL pointer dereference, address: 0000000000000078
[    1.530509] #PF: supervisor read access in kernel mode
[    1.530539] #PF: error_code(0x0000) - not-present page
[    1.530569] PGD 0 P4D 0 
[    1.530586] Oops: 0000 [#1] SMP
[    1.530609] CPU: 6 PID: 89 Comm: kworker/6:1 Not tainted 6.7.0-rc6-g5414aea7b750 #1 c1fa866695fe2bce0977c260b19e05c4e4003b23
[    1.530673] Workqueue: events sof_probe_work
[    1.530700] RIP: 0010:hdac_hda_dev_probe+0x42/0xe0
[    1.530728] Code: 48 8b 37 48 8b bb f0 02 00 00 e8 49 30 03 00 48 85 c0 48 89 c5 0f 84 8e 00 00 00 48 8b bb f0 02 00 00 48 89 c6 e8 5e 2f 03 00 <41> 80 7c 24 78 00 75 3a b9 03 00 00 00 48 c7 c2 80 cd b0 82 48 c7
[    1.530837] RSP: 0000:ffffc9000054fbc0 EFLAGS: 00010246
[    1.530870] RAX: 0000000000000000 RBX: ffff888101383000 RCX: 0000000000000001
[    1.530913] RDX: 0000000000000000 RSI: ffff8881017faf00 RDI: ffff888101381d00
[    1.530961] RBP: ffff8881017faf00 R08: 00000000000000ff R09: 000000000000000a
[    1.531008] R10: 000000000000000a R11: 0fffffffffffffff R12: 0000000000000000
[    1.531058] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888101381028
[    1.531104] FS:  0000000000000000(0000) GS:ffff8884b4980000(0000) knlGS:0000000000000000
[    1.531156] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.531195] CR2: 0000000000000078 CR3: 00000004c642e001 CR4: 0000000000f70ef0
[    1.531242] PKRU: 55555554
[    1.531262] Call Trace:
[    1.531281]  <TASK>
[    1.531298]  ? __die+0x1e/0x60
[    1.531323]  ? page_fault_oops+0x148/0x410
[    1.531351]  ? exc_page_fault+0x2a4/0x4e0
[    1.531382]  ? vsscanf+0x1ab/0x8b0
[    1.531408]  ? asm_exc_page_fault+0x26/0x30
[    1.531438]  ? hdac_hda_dev_probe+0x42/0xe0
[    1.531468]  really_probe+0x189/0x3d0
[    1.531495]  ? driver_probe_device+0x90/0x90
[    1.531525]  __driver_probe_device+0x73/0x150
[    1.531557]  driver_probe_device+0x1a/0x90
[    1.531587]  __device_attach_driver+0x75/0xf0
[    1.531617]  bus_for_each_drv+0x68/0xa0
[    1.531645]  __device_attach+0xa5/0x1a0
[    1.531675]  bus_probe_device+0x85/0x90
[    1.531703]  device_add+0x631/0x830
[    1.531730]  snd_hdac_device_register+0x10/0x50
[    1.531761]  hda_codec_probe_bus+0x15b/0x2b0
[    1.531791]  hda_dsp_probe+0x20e/0x4e0
[    1.531821]  sof_probe_work+0x29/0x430
[    1.531847]  process_one_work+0x127/0x230
[    1.533433]  worker_thread+0x2e4/0x400
[    1.534998]  ? flush_delayed_work+0x40/0x40
[    1.536523]  kthread+0xc8/0xf0
[    1.538022]  ? kthread_complete_and_exit+0x20/0x20
[    1.539484]  ret_from_fork+0x2c/0x40
[    1.540916]  ? kthread_complete_and_exit+0x20/0x20
[    1.542345]  ret_from_fork_asm+0x11/0x20
[    1.543742]  </TASK>
[    1.545121] Modules linked in:
[    1.546485] CR2: 0000000000000078
[    1.547870] ---[ end trace 0000000000000000 ]---
[    1.549234] RIP: 0010:hdac_hda_dev_probe+0x42/0xe0
[    1.550621] Code: 48 8b 37 48 8b bb f0 02 00 00 e8 49 30 03 00 48 85 c0 48 89 c5 0f 84 8e 00 00 00 48 8b bb f0 02 00 00 48 89 c6 e8 5e 2f 03 00 <41> 80 7c 24 78 00 75 3a b9 03 00 00 00 48 c7 c2 80 cd b0 82 48 c7
[    1.553433] RSP: 0000:ffffc9000054fbc0 EFLAGS: 00010246
[    1.554832] RAX: 0000000000000000 RBX: ffff888101383000 RCX: 0000000000000001
[    1.556203] RDX: 0000000000000000 RSI: ffff8881017faf00 RDI: ffff888101381d00
[    1.557567] RBP: ffff8881017faf00 R08: 00000000000000ff R09: 000000000000000a
[    1.558916] R10: 000000000000000a R11: 0fffffffffffffff R12: 0000000000000000
[    1.560248] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888101381028
[    1.561595] FS:  0000000000000000(0000) GS:ffff8884b4980000(0000) knlGS:0000000000000000
[    1.562956] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.564300] CR2: 0000000000000078 CR3: 00000004c642e001 CR4: 0000000000f70ef0
[    1.565668] PKRU: 55555554
```

Note: the stack trace is from a different commit, but the behavior is exactly the same.
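
The faulting bytes in the trace, `<41> 80 7c 24 78 00`, decode to `cmpb $0x0,0x78(%r12)`, and R12 is 0. That fits a one-byte flag being read at offset 0x78 from a NULL base, which matches CR2 = 0x78. The sketch below shows the arithmetic. It assumes a hypothetical structure layout; the real layout of `struct hdac_hda_priv` in that build is inferred from the trace, not confirmed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 0x78 bytes of other members, then the flag. */
struct layout_priv {
	unsigned char other_members[0x78];
	bool need_display_power;
};

/* Address the CPU would read when evaluating base->need_display_power. */
static uintptr_t model_fault_address(uintptr_t base)
{
	return base + offsetof(struct layout_priv, need_display_power);
}
```

With a NULL base, `model_fault_address(0)` evaluates to 0x78, the address reported in the oops.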

From a user's perspective, the symptoms are: 1) sound does not work and the sound card is not detected; 2) rebooting hangs permanently (or at least for a long time); 3) sometimes the system fails to boot: the screen freezes and booting does not progress.

Comment 1 Peter Ujfalusi 2024-01-02 09:16:22 UTC
The fix was submitted on 7th of Dec [1], but it did not make it into 6.7-rc (it is in linux-next).

[1] https://lore.kernel.org/linux-sound/20231207095425.19597-1-peter.ujfalusi@linux.intel.com/