Bug 219084

Summary: RIP: 0010:mt792x_mac_link_bss_remove+0x2b/0x100
Product: Drivers
Component: network-wireless
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Severity: normal
Priority: P3
Hardware: All
OS: Linux
Kernel Version: 6.11-rc0
Subsystem:
Regression: Yes
Bisected commit-id: 1541d63c5fe2cebce85b2af84a2850a302ffda9c
Reporter: Mike Lothian (mike)
Assignee: drivers_network-wireless (drivers_network-wireless)
CC: regressions

Description Mike Lothian 2024-07-22 12:37:56 UTC
Hi, I've been seeing this problem since the merge window opened:

BUG: unable to handle page fault for address: ffffffffffffffa0
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 480c067 P4D 480c067 PUD 480e067 PMD 0 
Oops: Oops: 0000 [#1] PREEMPT SMP
CPU: 2 UID: 0 PID: 476 Comm: NetworkManager Tainted: G        W          6.10.0-tip+ #4105
Tainted: [W]=WARN
Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.331 02/24/2023
RIP: 0010:mt792x_mac_link_bss_remove+0x2b/0x100
Code: 0f 1e fa 41 57 41 56 41 55 41 54 53 4c 8b 66 18 44 0f b7 aa b8 00 00 00 8b 46 60 48 89 d3 49 89 f7 49 89 fe 48 83 f8 0e 77 13 <66> 41 83 7c 24 a0 00 74 0a 4d 8b a4 c4 28 ff ff ff eb 07 49 81 c4
RSP: 0018:ffff88811d3ab350 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888105b09f20 RCX: ffff888105b099b0
RDX: ffff888105b09f20 RSI: ffff888105b09e40 RDI: ffff888114962020
RBP: ffff888114961218 R08: 0000000000000000 R09: ffffffff839123a0
R10: 0000000000000117 R11: 0000000000000400 R12: 0000000000000000
R13: 0000000000000013 R14: ffff888114962020 R15: ffff888105b09e40
FS:  00007fcffe178400(0000) GS:ffff888fde480000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffa0 CR3: 00000001218eb000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 ? __die_body+0x61/0xb0
 ? page_fault_oops+0x2eb/0x340
 ? exc_page_fault+0x66/0xa0
 ? asm_exc_page_fault+0x26/0x30
 ? mt792x_mac_link_bss_remove+0x2b/0x100
 ? mt792x_remove_interface+0x70/0x90
 ? drv_remove_interface+0x5a/0x140
 ? ieee80211_do_stop+0x6f4/0x800
 ? ieee80211_stop+0x13d/0x160
 ? __dev_close_many+0x119/0x150
 ? __dev_change_flags+0xc9/0x1b0
 ? dev_change_flags+0x1d/0x50
 ? do_setlink+0x496/0x1300
 ? schedule_timeout+0x1f/0x180
 ? _raw_spin_lock_irqsave+0x26/0x50
 ? __nla_validate_parse
 ? _raw_spin_lock_irqsave+0x26/0x50
 ? rtnl_newlink+0xb5a/0xdd0
 ? ctrl_getfamily+0x153/0x230
 ? sk_filter_trim_cap+0x12f/0x250
 ? kmalloc_reserve+0x43/0xf0
 ? __alloc_skb+0x10f/0x1a0
 ? skb_queue_tail+0x1b/0x50
 ? rtnetlink_rcv_msg+0x2dc/0x330
 ? rtnetlink_bind+0x30/0x30
 ? netlink_rcv_skb+0xb5/0xf0
 ? netlink_unicast+0x230/0x330
 ? netlink_sendmsg+0x305/0x3d0
 ? ____sys_sendmsg
 ? __sys_sendmsg+0x295/0x2d0
 ? do_syscall_64+0x72/0xf0
 ? crng_make_state+0x60/0x150
 ? get_random_bytes_user+0xdf/0x220
 ? syscall_exit_to_user_mode+0x97/0xb0
 ? do_syscall_64+0x7e/0xf0
 ? syscall_exit_to_user_mode+0x97/0xb0
 ? do_syscall_64+0x7e/0xf0
 ? do_syscall_64+0x7e/0xf0
 ? syscall_exit_to_user_mode+0x97/0xb0
 ? do_syscall_64+0x7e/0xf0
 ? irqentry_exit_to_user_mode+0x8d/0xa0
 ? entry_SYSCALL_64_after_hwframe+0x4b/0x53
 </TASK>
Modules linked in:
CR2: ffffffffffffffa0
---[ end trace 0000000000000000 ]---
pstore: backend (efi_pstore) writing error (-28)
RIP: 0010:mt792x_mac_link_bss_remove+0x2b/0x100
Code: 0f 1e fa 41 57 41 56 41 55 41 54 53 4c 8b 66 18 44 0f b7 aa b8 00 00 00 8b 46 60 48 89 d3 49 89 f7 49 89 fe 48 83 f8 0e 77 13 <66> 41 83 7c 24 a0 00 74 0a 4d 8b a4 c4 28 ff ff ff eb 07 49 81 c4
RSP: 0018:ffff88811d3ab350 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888105b09f20 RCX: ffff888105b099b0
RDX: ffff888105b09f20 RSI: ffff888105b09e40 RDI: ffff888114962020
RBP: ffff888114961218 R08: 0000000000000000 R09: ffffffff839123a0
R10: 0000000000000117 R11: 0000000000000400 R12: 0000000000000000
R13: 0000000000000013 R14: ffff888114962020 R15: ffff888105b09e40
FS:  00007fcffe178400(0000) GS:ffff888fde480000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffa0 CR3: 00000001218eb000 CR4: 0000000000350ef0
note: NetworkManager[476] exited with irqs disabled
Bluetooth: hci0: Device setup in 3100848 usecs
Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.
Bluetooth: hci0: AOSP extensions version v1.00
Bluetooth: hci0: AOSP quality report is supported
Bluetooth: MGMT ver 1.23
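
The faulting address in the oops above can be cross-checked against the register dump. Reading the marked bytes in the Code: line (<66> 41 83 7c 24 a0 00) as a 16-bit compare against memory at an 8-bit displacement from R12 is my own decoding, not something stated in the log, so treat it as an assumption. Under that reading, R12 = 0 plus the sign-extended displacement 0xa0 should reproduce CR2. A minimal sketch of that arithmetic (the helper names are hypothetical):

```python
MASK64 = (1 << 64) - 1  # addresses wrap at 64 bits on x86-64


def sign_extend8(byte):
    """Interpret an 8-bit displacement as a signed value."""
    return byte - 0x100 if byte & 0x80 else byte


def effective_address(base, disp8):
    """Base register plus signed 8-bit displacement, wrapped to 64 bits."""
    return (base + sign_extend8(disp8)) & MASK64


r12 = 0x0     # R12 from the register dump
disp8 = 0xA0  # displacement byte from the Code: line (assumed decoding)

addr = effective_address(r12, disp8)
print(hex(addr))  # 0xffffffffffffffa0, the same value as CR2
```

This only shows that the fault is consistent with a read at offset -0x60 from a NULL pointer held in R12. It does not identify which structure or field the driver was accessing.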


Kernel 6.11-rc0
Compiled with Clang 18.1.8

The driver is compiled into the kernel, not as a module

CONFIG_MT76_CORE=y
CONFIG_MT76_LEDS=y
CONFIG_MT76_CONNAC_LIB=y
CONFIG_MT792x_LIB=y
CONFIG_MT7921_COMMON=y
CONFIG_MT7921E=y
Comment 1 Mike Lothian 2024-07-22 16:04:11 UTC
1541d63c5fe2cebce85b2af84a2850a302ffda9c is the first bad commit
commit 1541d63c5fe2cebce85b2af84a2850a302ffda9c
Author: Sean Wang <sean.wang@mediatek.com>
Date:   Wed Jun 12 20:02:40 2024 -0700

    wifi: mt76: mt7925: add mt7925_mac_link_bss_remove to remove per-link BSS
    
    The mt7925_mac_link_bss_remove function currently removes the per-link BSS.
    We will extend this function when we implement the MLO functionality.
    
    This patch only includes structural changes and does not involve any
    logic changes.
    
    Signed-off-by: Sean Wang <sean.wang@mediatek.com>
    Link: https://patch.msgid.link/20240613030241.5771-47-sean.wang@kernel.org
    Signed-off-by: Felix Fietkau <nbd@nbd.name>

 drivers/net/wireless/mediatek/mt76/mt792x_core.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)
Comment 2 Mike Lothian 2024-07-22 16:51:00 UTC
Reverting the following 4 patches was enough to get things working again:

d53ab629cff57e450fe69fc90eb1ddc372e8da2d
ebb1406813c6ea60d2d95efc69ce6d53bbe43b31
69acd6d910b0c83842bd45c36224d4f8fe59d1d4
1541d63c5fe2cebce85b2af84a2850a302ffda9c
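
For anyone wanting to repeat this test, the four reverts could be done with a single git revert invocation. The sketch below only builds and prints the command, it does not run it. It assumes the commits are listed newest first, so that reverting them in the listed order applies cleanly; that ordering is a guess and is not confirmed in this report. The function name is hypothetical.

```python
import shlex

# Commit IDs copied from the comment above, in the order they were listed.
COMMITS = [
    "d53ab629cff57e450fe69fc90eb1ddc372e8da2d",
    "ebb1406813c6ea60d2d95efc69ce6d53bbe43b31",
    "69acd6d910b0c83842bd45c36224d4f8fe59d1d4",
    "1541d63c5fe2cebce85b2af84a2850a302ffda9c",
]


def build_revert_command(commits):
    """Return the argv for one `git revert` call covering all commits."""
    # --no-edit keeps the default revert commit messages.
    return ["git", "revert", "--no-edit", *commits]


command = build_revert_command(COMMITS)
print(shlex.join(command))  # paste the printed line into a kernel checkout
```

git revert processes the commits in the order given and creates one revert commit per entry, so a conflict partway through would point to the ordering assumption being wrong.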
Comment 3 Artem S. Tashkinov 2024-07-23 08:33:20 UTC
Sean Wang is not on this bug tracker, so it would be nice if you sent your findings to linux-wireless@vger.kernel.org and sean.wang@mediatek.com.

Thanks.
Comment 4 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-07-24 08:54:03 UTC
TWIMC, I suspect (but might be wrong there!) this might be fixed by https://lore.kernel.org/all/20240718234633.12737-1-sean.wang@kernel.org/
Comment 5 Mike Lothian 2024-07-24 09:29:39 UTC
It is indeed :D

Thanks