Bug 216844 - Mediatek MT7921 Driver Crashing Upon Modprobe
Summary: Mediatek MT7921 Driver Crashing Upon Modprobe
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: Wireless
Hardware: AMD Linux
Importance: P1 normal
Assignee: networking_wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-12-24 18:11 UTC by Ralph
Modified: 2023-01-04 22:57 UTC
CC List: 3 users

See Also:
Kernel Version: 6.1.1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Basic System Information (53.17 KB, image/png)
2022-12-24 18:11 UTC, Ralph
potential patch (v1) (1.54 KB, application/mbox)
2023-01-04 22:57 UTC, Mario Limonciello (AMD)

Description Ralph 2022-12-24 18:11:04 UTC
Created attachment 303468
Basic System Information

Some preliminary information:

```
$ uname -a
Linux *hostname* 6.1.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 21 Dec 2022 22:27:55 +0000 x86_64 GNU/Linux

$ lspci -vvnn
...
02:00.0 Network controller [0280]: MEDIATEK Corp. MT7921 802.11ax PCI Express Wireless Network Adapter [14c3:7961]
    Subsystem: Foxconn International, Inc. Device [105b:e0b7]
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 88
    IOMMU group: 10
    Region 0: Memory at fcf0200000 (64-bit, prefetchable) [size=1M]
    Region 2: Memory at fcf0300000 (64-bit, prefetchable) [size=16K]
    Region 4: Memory at fcf0304000 (64-bit, prefetchable) [size=4K]
    Capabilities: <access denied>
    Kernel driver in use: mt7921e
    Kernel modules: mt7921e
...
```

After blacklisting the module and rebooting, the following error appears in dmesg when the module is loaded with modprobe:

```
[   21.640345] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[   21.640452] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[   21.700904] mt7921e 0000:02:00.0: enabling device (0000 -> 0002)
[   21.721280] mt7921e 0000:02:00.0: ASIC revision: 79610010
[   21.799909] mt7921e 0000:02:00.0: sar cnt = 0
[   21.799916] BUG: kernel NULL pointer dereference, address: 0000000000000004
[   21.799951] #PF: supervisor read access in kernel mode
[   21.799970] #PF: error_code(0x0000) - not-present page
[   21.799988] PGD 0 P4D 0 
[   21.800000] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   21.800017] CPU: 9 PID: 559 Comm: modprobe Not tainted 6.1.1-arch1-1 #1 9bd09188b430be630e611f984454e4f3c489be77
[   21.800048] Hardware name: Dell Inc. Inspiron 15 3525/0PX9H7, BIOS 1.3.0 04/02/2022
[   21.800071] RIP: 0010:mt7921_init_acpi_sar+0x1c1/0x220 [mt7921_common]
[   21.800104] Code: ff 88 05 c1 c7 44 24 04 00 00 00 00 e8 b8 fc ff ff 41 89 c4 85 c0 0f 85 e4 fe ff ff 48 8b 43 08 ba 06 00 00 00 b9 06 00 00 00 <80> 78 04 00 40 0f 95 c6 e9 36 ff ff ff 48 8b 73 10 48 8b bd d0 03
[   21.800153] RSP: 0018:ffffb5aa40f0bae8 EFLAGS: 00010246
[   21.800172] RAX: 0000000000000000 RBX: ffff90adc9d11988 RCX: 0000000000000006
[   21.800194] RDX: 0000000000000006 RSI: 941c8c56d44496cf RDI: 0000000000038080
[   21.800216] RBP: ffff90adc5822080 R08: 0000000000000000 R09: ffffb5aa40f0b818
[   21.800240] R10: 0000000000000003 R11: ffffffffa62cb768 R12: 0000000000000000
[   21.800265] R13: 0000000000000000 R14: ffffb5aa40f0baec R15: ffffb5aa40f0bdc8
[   21.800290] FS:  00007f533ec9f740(0000) GS:ffff90aec6840000(0000) knlGS:0000000000000000
[   21.800320] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   21.800342] CR2: 0000000000000004 CR3: 00000001045f6000 CR4: 0000000000750ee0
[   21.800368] PKRU: 55555554
[   21.800381] Call Trace:
[   21.800395]  <TASK>
[   21.800406]  mt7921_register_device+0x323/0x540 [mt7921_common eab0bdebbd12dfe392417c96bcae99c380288bb6]
[   21.800444]  mt7921_pci_probe+0x290/0x2c0 [mt7921e edb9e69dab4307dba494c96e4b7e5de618b3cd2a]
[   21.800479]  ? __pm_runtime_resume+0x58/0x80
[   21.800503]  local_pci_probe+0x45/0x80
[   21.800524]  pci_device_probe+0xc1/0x250
[   21.800542]  ? sysfs_do_create_link_sd+0x6e/0xe0
[   21.800563]  really_probe+0xde/0x380
[   21.800580]  ? pm_runtime_barrier+0x54/0x90
[   21.800598]  __driver_probe_device+0x78/0x170
[   21.800616]  driver_probe_device+0x1f/0x90
[   21.800634]  __driver_attach+0xd5/0x1d0
[   21.800651]  ? __device_attach_driver+0x110/0x110
[   21.800670]  bus_for_each_dev+0x8b/0xd0
[   21.800686]  bus_add_driver+0x1b2/0x200
[   21.800703]  driver_register+0x8d/0xe0
[   21.800721]  ? 0xffffffffc1064000
[   21.800759]  do_one_initcall+0x5d/0x220
[   21.800781]  do_init_module+0x4a/0x1e0
[   21.800801]  __do_sys_init_module+0x17f/0x1b0
[   21.800822]  do_syscall_64+0x5f/0x90
[   21.800843]  ? syscall_exit_to_user_mode+0x1b/0x40
[   21.800863]  ? do_syscall_64+0x6b/0x90
[   21.800880]  ? exc_page_fault+0x74/0x170
[   21.800897]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   21.800920] RIP: 0033:0x7f533e721eae
[   21.800937] Code: 48 8b 0d dd ee 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa ee 0c 00 f7 d8 64 89 01 48
[   21.800991] RSP: 002b:00007fff32d77568 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[   21.801021] RAX: ffffffffffffffda RBX: 000055abccd61030 RCX: 00007f533e721eae
[   21.801039] RDX: 000055abcae64cb2 RSI: 000000000002dd2f RDI: 000055abccf39e10
[   21.801057] RBP: 000055abcae64cb2 R08: 27d4eb2f165667c5 R09: 85ebca77c2b2ae63
[   21.801075] R10: 000000000009a251 R11: 0000000000000246 R12: 0000000000040000
[   21.801094] R13: 000055abccd610b0 R14: 0000000000000000 R15: 000055abccd649d0
[   21.801114]  </TASK>
[   21.801123] Modules linked in: mt7921e(+) mt7921_common mt76_connac_lib mt76 mac80211 libarc4 cfg80211 intel_rapl_msr intel_rapl_common snd_acp3x_rn snd_soc_dmic snd_acp3x_pdm_dma snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp edac_mce_amd joydev snd_sof_pci mousedev snd_ctl_led snd_sof kvm_amd snd_sof_utils snd_soc_core hid_multitouch snd_hda_codec_realtek snd_compress kvm snd_hda_codec_generic snd_hda_codec_hdmi ac97_bus irqbypass snd_hda_intel snd_pcm_dmaengine crct10dif_pclmul crc32_pclmul snd_intel_dspcfg polyval_clmulni snd_pci_ps polyval_generic snd_intel_sdw_acpi gf128mul snd_rpl_pci_acp6x ghash_clmulni_intel snd_acp_pci snd_hda_codec sha512_ssse3 snd_pci_acp6x btusb dell_laptop aesni_intel snd_hda_core dell_smm_hwmon snd_pci_acp5x btrtl crypto_simd snd_hwdep cryptd btbcm snd_rn_pci_acp3x snd_pcm snd_acp_config dell_wmi rapl btintel snd_soc_acpi snd_timer sp5100_tco ledtrig_audio btmtk dell_smbios pcspkr psmouse sparse_keymap dcdbas bluetooth ecdh_generic
[   21.801168]  dell_wmi_descriptor wmi_bmof snd ccp soundcore snd_pci_acp3x i2c_piix4 k10temp dell_rbtn i2c_hid_acpi amd_pmc i2c_hid rfkill acpi_cpufreq mac_hid pkcs8_key_parser dm_multipath dm_mod sg crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 amdgpu drm_ttm_helper ttm video serio_raw atkbd libps2 gpu_sched vivaldi_fmap nvme drm_buddy crc32c_intel nvme_core drm_display_helper i8042 xhci_pci nvme_common cec xhci_pci_renesas serio wmi
[   21.804966] CR2: 0000000000000004
[   21.805835] ---[ end trace 0000000000000000 ]---
[   21.806367] RIP: 0010:mt7921_init_acpi_sar+0x1c1/0x220 [mt7921_common]
[   21.806796] Code: ff 88 05 c1 c7 44 24 04 00 00 00 00 e8 b8 fc ff ff 41 89 c4 85 c0 0f 85 e4 fe ff ff 48 8b 43 08 ba 06 00 00 00 b9 06 00 00 00 <80> 78 04 00 40 0f 95 c6 e9 36 ff ff ff 48 8b 73 10 48 8b bd d0 03
[   21.807242] RSP: 0018:ffffb5aa40f0bae8 EFLAGS: 00010246
[   21.807687] RAX: 0000000000000000 RBX: ffff90adc9d11988 RCX: 0000000000000006
[   21.808352] RDX: 0000000000000006 RSI: 941c8c56d44496cf RDI: 0000000000038080
[   21.809158] RBP: ffff90adc5822080 R08: 0000000000000000 R09: ffffb5aa40f0b818
[   21.809984] R10: 0000000000000003 R11: ffffffffa62cb768 R12: 0000000000000000
[   21.810968] R13: 0000000000000000 R14: ffffb5aa40f0baec R15: ffffb5aa40f0bdc8
[   21.811957] FS:  00007f533ec9f740(0000) GS:ffff90aec6840000(0000) knlGS:0000000000000000
[   21.812935] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   21.813471] CR2: 0000000000000004 CR3: 00000001045f6000 CR4: 0000000000750ee0
[   21.813903] PKRU: 55555554
```
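
For anyone reading the trace, here is a rough sketch of how the faulting address follows from the register dump. It is only an illustration: the helper names (`parse_registers`, `decode_faulting_access`) and the `OOPS` excerpt are made up for this note, and the decoder handles just the single instruction form marked in the `Code:` line (`80 /7 ib` with an 8-bit displacement). The kernel tree's `scripts/decodecode` is the proper tool for general decoding.

```python
import re

# Lines copied from the oops above, with timestamps dropped.
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000004
RIP: 0010:mt7921_init_acpi_sar+0x1c1/0x220 [mt7921_common]
Code: ff 88 05 c1 c7 44 24 04 00 00 00 00 e8 b8 fc ff ff 41 89 c4 85 c0 0f 85 e4 fe ff ff 48 8b 43 08 ba 06 00 00 00 b9 06 00 00 00 <80> 78 04 00 40 0f 95 c6 e9 36 ff ff ff 48 8b 73 10 48 8b bd d0 03
RAX: 0000000000000000 RBX: ffff90adc9d11988 RCX: 0000000000000006
CR2: 0000000000000004 CR3: 00000001045f6000 CR4: 0000000000750ee0
"""

# Register order used by the 3-bit ModRM r/m field (no REX.B prefix).
REG64 = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"]


def parse_registers(text):
    """Collect 'NAME: 16-hex-digit value' pairs from a register dump."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,3}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def decode_faulting_access(code_line):
    """Return (base register, displacement) for the instruction marked <..>.

    Only handles opcode 0x80 with ModRM reg field 7 (CMP r/m8, imm8),
    mod == 1 (8-bit displacement) and no SIB byte.
    """
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    opcode = int(tokens[start].strip("<>"), 16)
    modrm = int(tokens[start + 1], 16)
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x80 or reg != 7 or mod != 1 or rm == 4:
        raise ValueError("instruction form not handled by this sketch")
    disp = int(tokens[start + 2], 16)
    if disp >= 0x80:  # the displacement byte is signed
        disp -= 0x100
    return REG64[rm], disp


code_line = next(l for l in OOPS.splitlines() if l.startswith("Code:"))
regs = parse_registers(OOPS)
base, disp = decode_faulting_access(code_line)
fault_addr = regs[base] + disp

print(f"byte compare at {disp:#x}(%{base.lower()}), {base}={regs[base]:#x}")
print(f"computed address {fault_addr:#x}, reported CR2 {regs['CR2']:#x}")
```

The marked instruction reads one byte at offset 4 from RAX, and RAX is 0, which matches the reported address of 0x4. The bytes `48 8b 43 08` a little earlier in the `Code:` line appear to load RAX from offset 8 of the structure RBX points to, so the pointer stored there was NULL by the time of the read. The dump alone does not say why.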

Temporarily worked around by reverting to Linux 5.19.11, which is obviously not ideal.

Basic system information added as a picture attachment.
Comment 1 The Linux kernel's regression tracker (Thorsten Leemhuis) 2022-12-26 11:04:32 UTC
Bug 216839 is another recent report about problems with mediatek devices. It looks different, but maybe that's due to your blacklisting. It's hence a wild guess (I'm not one of the wifi developers), but maybe this patch will help:

https://patchwork.kernel.org/project/linux-wireless/patch/20221217085624.52077-1-nbd@nbd.name/
Comment 2 Ralph 2022-12-28 16:28:23 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #1)
> Bug 216839 is another recent report about problems with mediatek devices. It
> looks different, but maybe that's due to your blacklisting. It's hence a
> wild guess (I'm not one of the wifi developers), but maybe this patch will
> help:
> 
> https://patchwork.kernel.org/project/linux-wireless/patch/20221217085624.52077-1-nbd@nbd.name/

I'll definitely look into this. It sounds similar to what I'm experiencing, though there are some pretty minute differences. Thank you!
Comment 3 Mario Limonciello (AMD) 2023-01-04 22:57:51 UTC
Created attachment 303527
potential patch (v1)

It looks to me like the problem is that the driver accesses data fetched from ACPI after it has been freed.

Can you see if this patch helps?
