Bug 218247

Summary: brcmfmac wifi stopped working with kernel 6.6.5
Product: Networking
Reporter: Jan Palus (jpalus)
Component: Wireless
Assignee: networking_wireless (networking_wireless)
Status: RESOLVED CODE_FIX
Severity: normal
CC: debohman, mario.limonciello, mpagano, regressions
Priority: P3
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Jan Palus 2023-12-09 22:29:26 UTC
Switching from 6.6.4 to 6.6.5 on my Pinebook Pro (aarch64/rk3399) resulted in non-working WiFi (brcmfmac/BCM43456). I narrowed it down to this commit:

4a7e92551618 ("wifi: cfg80211: fix CQM for non-range use")

Reverting it on top of 6.6.5 makes WiFi work again.

Once booted into the broken kernel, it is not possible to run `dmesg`, `ip link` or even `reboot`; they all hang. The only thing I have is the hung task log from the serial console:

[  245.389368][   T56] INFO: task kworker/0:2:129 blocked for more than 122 seconds.
[  245.390106][   T56]       Tainted: G         C  E   T  6.6.5-2 #1
[  245.390676][   T56] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.391448][   T56] task:kworker/0:2     state:D stack:0     pid:129   ppid:2      flags:0x00000008
[  245.392378][   T56] Workqueue: events_power_efficient reg_check_chans_work [cfg80211]
[  245.393956][   T56] Call trace:
[  245.394267][   T56]  __switch_to+0xe0/0x178
[  245.394697][   T56]  __schedule+0x35c/0x1370
[  245.395122][   T56]  schedule+0x68/0x100
[  245.395620][   T56]  schedule_preempt_disabled+0x30/0x68
[  245.396149][   T56]  __mutex_lock.constprop.0+0x304/0x5a8
[  245.396676][   T56]  __mutex_lock_slowpath+0x2c/0x58
[  245.397164][   T56]  mutex_lock+0x8c/0xc0
[  245.397568][   T56]  reg_check_chans_work+0x78/0x4d0 [cfg80211 b659b03a6404d33720c7ee1f9845a62e749d3f51]
[  245.399332][   T56]  process_one_work+0x180/0x418
[  245.399806][   T56]  worker_thread+0x38c/0x4b0
[  245.400241][   T56]  kthread+0x11c/0x128
[  245.400637][   T56]  ret_from_fork+0x10/0x20
[  245.401164][   T56] INFO: task iwd:2840 blocked for more than 122 seconds.
[  245.401857][   T56]       Tainted: G         C  E   T  6.6.5-2 #1
[  245.402513][   T56] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.403291][   T56] task:iwd             state:D stack:0     pid:2840  ppid:1      flags:0x0000080c
[  245.404138][   T56] Call trace:
[  245.404447][   T56]  __switch_to+0xe0/0x178
[  245.404872][   T56]  __schedule+0x35c/0x1370
[  245.405357][   T56]  schedule+0x68/0x100
[  245.405760][   T56]  schedule_preempt_disabled+0x30/0x68
[  245.406279][   T56]  __mutex_lock.constprop.0+0x304/0x5a8
[  245.406805][   T56]  __mutex_lock_slowpath+0x2c/0x58
[  245.407292][   T56]  mutex_lock+0x8c/0xc0
[  245.407697][   T56]  nl80211_connect+0x43c/0x650 [cfg80211 b659b03a6404d33720c7ee1f9845a62e749d3f51]
[  245.409419][   T56]  genl_family_rcv_msg_doit+0xdc/0x178
[  245.409952][   T56]  genl_rcv_msg+0x238/0x2b8
[  245.410381][   T56]  netlink_rcv_skb+0x68/0x158
[  245.410824][   T56]  genl_rcv+0x44/0x78
[  245.411205][   T56]  netlink_unicast+0x1f4/0x300
[  245.411652][   T56]  netlink_sendmsg+0x1dc/0x480
[  245.412170][   T56]  __sock_sendmsg+0x88/0x110
[  245.412624][   T56]  __sys_sendto+0x11c/0x1a8
[  245.413052][   T56]  __arm64_sys_sendto+0x38/0x78
[  245.413507][   T56]  invoke_syscall+0x90/0x140
[  245.413952][   T56]  el0_svc_common.constprop.0+0x11c/0x148
[  245.414492][   T56]  do_el0_svc+0x34/0x68
[  245.414897][   T56]  el0_svc+0x44/0x110
[  245.415376][   T56]  el0t_64_sync_handler+0x148/0x158
[  245.415875][   T56]  el0t_64_sync+0x198/0x1a0
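
For anyone triaging a similar trace, a minimal sketch of how the hung task log above could be summarized: for each blocked task it picks the first stack frame that is not part of the scheduler or the mutex slow path. The function name `blocked_callers`, the `GENERIC` frame list, and the abridged `SAMPLE` are illustrative assumptions and not part of any kernel tooling; the regular expressions are only tuned to the log format shown in this report.

```python
import re

# Header printed by the hung-task detector for each blocked task, e.g.
# "INFO: task iwd:2840 blocked for more than 122 seconds."
TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than \d+ seconds"
)
# Stack frame such as "nl80211_connect+0x43c/0x650 [cfg80211 <build-id>]".
FRAME_RE = re.compile(
    r"\]\s+(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+))?"
)
# Assumed list of frames that only show *how* the task sleeps, not *who* asked.
GENERIC = {
    "__switch_to", "__schedule", "schedule", "schedule_preempt_disabled",
    "__mutex_lock", "__mutex_lock_slowpath", "mutex_lock",
}

def blocked_callers(log):
    """Return one dict per blocked task with its first non-generic frame."""
    results = []
    current = None
    for line in log.splitlines():
        task = TASK_RE.search(line)
        if task:
            current = {"comm": task["comm"], "pid": int(task["pid"]),
                       "caller": None, "module": None}
            results.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and current["caller"] is None:
            # Drop compiler suffixes like ".constprop.0" before comparing.
            if frame["sym"].split(".")[0] not in GENERIC:
                current["caller"] = frame["sym"]
                current["module"] = frame["mod"]
    return results

# Abridged copy of the serial console log from this report.
SAMPLE = """\
[  245.389368][   T56] INFO: task kworker/0:2:129 blocked for more than 122 seconds.
[  245.392378][   T56] Workqueue: events_power_efficient reg_check_chans_work [cfg80211]
[  245.396149][   T56]  __mutex_lock.constprop.0+0x304/0x5a8
[  245.396676][   T56]  __mutex_lock_slowpath+0x2c/0x58
[  245.397164][   T56]  mutex_lock+0x8c/0xc0
[  245.397568][   T56]  reg_check_chans_work+0x78/0x4d0 [cfg80211 b659b03a6404d33720c7ee1f9845a62e749d3f51]
[  245.401164][   T56] INFO: task iwd:2840 blocked for more than 122 seconds.
[  245.407292][   T56]  mutex_lock+0x8c/0xc0
[  245.407697][   T56]  nl80211_connect+0x43c/0x650 [cfg80211 b659b03a6404d33720c7ee1f9845a62e749d3f51]
[  245.409419][   T56]  genl_family_rcv_msg_doit+0xdc/0x178
"""

if __name__ == "__main__":
    for entry in blocked_callers(SAMPLE):
        print(entry)
```

Run against the trace in this report, the sketch reports `reg_check_chans_work` for the kworker and `nl80211_connect` for iwd, both in the cfg80211 module, which is where both tasks are waiting on a mutex.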

Comment 1 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-12-10 10:27:31 UTC
FWIW, this issue is discussed here:
https://lore.kernel.org/all/87sf4belmm.fsf@turtle.gmx.de/

Replied there pointing to this ticket.

Comment 2 Mario Limonciello (AMD) 2023-12-11 16:34:08 UTC
This is fixed in kernel 6.6.6 (by reverting that commit).