Bug 218239

Summary: Kernel oops in iwlwifi 6.6.3 with AX210 while scanning (looks like firmware bug)
Product: Networking
Reporter: Paul Grandperrin (paul.grandperrin)
Component: Wireless
Assignee: networking_wireless (networking_wireless)
Status: NEW
Severity: normal
CC: entaahlaah, paul.grandperrin
Priority: P3
Hardware: Intel
OS: Linux
Kernel Version: 6.6.3
Subsystem:
Regression: No
Bisected commit-id:
Attachments: ioports
ver_linux
cpu_info
lspci
modules
iomem
dmesg
dmesg2
trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e iwlwifi_msg

Description Paul Grandperrin 2023-12-07 11:10:09 UTC
Created attachment 305558 [details]
ioports

Hi!

My system was running fine, with no errors in the logs, and then an
endless stream of oopses started to appear in the kernel logs.

The oops seems to happen in "__ieee80211_scan_completed", and the
firmware seems to be involved, judging by these messages:
"Microcode SW error detected. Restarting",
"ieee80211 phy0: Hardware restart was requested" and "iwlwifi
0000:02:00.0: FW error in SYNC CMD SCAN_REQ_UMAC".
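
For anyone triaging similar logs, here is a rough sketch of how the three messages quoted above could be counted in a saved dmesg dump. The signature strings come from this report; the function name and the dictionary keys are only illustrative, and other firmware failures may well log different text.

```python
import re

# Message fragments quoted in this report. The key names are illustrative only.
SIGNATURES = {
    "microcode_error": re.compile(r"Microcode SW error detected"),
    "hw_restart": re.compile(r"Hardware restart was requested"),
    "scan_cmd_error": re.compile(r"FW error in SYNC CMD SCAN_REQ_UMAC"),
}


def count_signatures(lines):
    """Return a dict mapping each signature name to its number of matching lines."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing it the lines of a saved log (for example `count_signatures(open("dmesg.txt"))`, with a hypothetical file name) gives a quick view of how often each message repeats.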

Since then, the iwlwifi module oopses constantly, even after unloading
and reloading it.

My system:
NixOS 23.11
Linux 6.6.3
Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
firmware version: 83.e8f84e98.0 ty-a0-gf-a0-83.ucode

The logs and more information about my system are attached.

I tried very hard to get decode_stacktrace.sh to work with module
symbols, but the NixOS file hierarchy makes this quite difficult.
I can try again if needed.


Sincerely,
Paul Grandperrin
Comment 1 Paul Grandperrin 2023-12-07 11:13:36 UTC
Created attachment 305559 [details]
ver_linux
Comment 2 Paul Grandperrin 2023-12-07 11:13:57 UTC
Created attachment 305560 [details]
cpu_info
Comment 3 Paul Grandperrin 2023-12-07 11:14:17 UTC
Created attachment 305561 [details]
lspci
Comment 4 Paul Grandperrin 2023-12-07 11:14:30 UTC
Created attachment 305562 [details]
modules
Comment 5 Paul Grandperrin 2023-12-07 11:14:42 UTC
Created attachment 305563 [details]
iomem
Comment 6 Paul Grandperrin 2023-12-07 11:15:01 UTC
Created attachment 305564 [details]
dmesg
Comment 7 Paul Grandperrin 2023-12-07 11:15:14 UTC
Created attachment 305565 [details]
dmesg2
Comment 8 Paul Grandperrin 2023-12-07 11:20:03 UTC
I've just been told about the commands for collecting more information about wireless bugs at https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging, but unfortunately I rebooted my computer and the issue disappeared.

I think I might be able to trigger it again, so I'll collect that information when it happens.
Comment 9 Paul Grandperrin 2023-12-07 16:05:01 UTC
The bug happened again, and I really don't know why; I did nothing in particular.

I'll collect more info now.
Comment 10 Paul Grandperrin 2023-12-07 16:06:02 UTC
Created attachment 305566 [details]
trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e iwlwifi_msg
Comment 11 Paul Grandperrin 2023-12-07 16:11:58 UTC
I've been able to use trace-cmd but I can't extract debug information about the firmware.

My kernel has the correct config:
zcat /proc/config.gz |rg -i CONFIG_ALLOW_DEV_COREDUMP
CONFIG_ALLOW_DEV_COREDUMP=y

But the folder /sys/kernel/debug/iwlwifi doesn't exist.
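
A possible next check, sketched below under some assumptions: as far as I understand, the iwlwifi debugfs directory depends on more than CONFIG_ALLOW_DEV_COREDUMP, namely on CONFIG_DEBUG_FS and CONFIG_IWLWIFI_DEBUGFS being enabled and on debugfs being mounted at /sys/kernel/debug. That dependency list is an assumption and has not been verified against this kernel. The helper names are illustrative.

```python
import gzip

# CONFIG_ALLOW_DEV_COREDUMP is taken from this comment; the other two options
# are an assumption about what the iwlwifi debugfs directory depends on.
WANTED = ("CONFIG_ALLOW_DEV_COREDUMP", "CONFIG_DEBUG_FS", "CONFIG_IWLWIFI_DEBUGFS")


def option_states(config_text, wanted=WANTED):
    """Map each wanted option to its value ("y", "m", ...) or "not set"."""
    states = {name: "not set" for name in wanted}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments ("# CONFIG_X is not set").
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key in states:
            states[key] = value
    return states


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config; requires /proc/config.gz to exist."""
    with gzip.open(path, "rt") as fh:
        return fh.read()
```

On the affected machine, `option_states(read_running_config())` would show at a glance whether any of the three options is missing.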
Comment 12 Paul Grandperrin 2023-12-07 16:19:24 UTC
Finally, I started setting up sniffing, but since the issue happens while the device is scanning, I have no idea which channel is responsible.

There are quite a few APs around me on 2.4 GHz and 5 GHz, so I can't just sniff a few selected channels.
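
One way to at least get an overview of what is on the air is to tally the visible APs per frequency from saved scan output. The sketch below assumes text in the style printed by `iw dev <iface> scan`, where each BSS entry has a `freq:` line. That output format is not a stable interface, and a busy channel is not necessarily the one that triggers the firmware error, so this only helps decide where to look first.

```python
import re
from collections import Counter

# Matches lines such as "freq: 2412" or "freq: 5180.0" (assumed iw-style output).
FREQ_RE = re.compile(r"^\s*freq:\s*(\d+)")


def aps_per_frequency(scan_output):
    """Return (frequency in MHz, AP count) pairs, busiest frequency first."""
    counts = Counter()
    for line in scan_output.splitlines():
        match = FREQ_RE.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts.most_common()
```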