Bug 218239 - Kernel oops in iwlwifi 6.6.3 with AX210 while scanning (looks like firmware bug)
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: Wireless
Hardware: Intel Linux
Importance: P3 normal
Assignee: networking_wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-07 11:10 UTC by Paul Grandperrin
Modified: 2023-12-24 14:35 UTC

See Also:
Kernel Version: 6.6.3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
ioports (1.24 KB, text/plain), 2023-12-07 11:10 UTC, Paul Grandperrin
ver_linux (3.63 KB, text/plain), 2023-12-07 11:13 UTC, Paul Grandperrin
cpu_info (11.69 KB, text/plain), 2023-12-07 11:13 UTC, Paul Grandperrin
lspci (20.01 KB, text/plain), 2023-12-07 11:14 UTC, Paul Grandperrin
modules (15.74 KB, text/plain), 2023-12-07 11:14 UTC, Paul Grandperrin
iomem (4.21 KB, text/plain), 2023-12-07 11:14 UTC, Paul Grandperrin
dmesg (32.07 KB, text/plain), 2023-12-07 11:15 UTC, Paul Grandperrin
dmesg2 (15.33 KB, text/plain), 2023-12-07 11:15 UTC, Paul Grandperrin
trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e iwlwifi_msg (2.82 MB, application/x-xz), 2023-12-07 16:06 UTC, Paul Grandperrin

Description Paul Grandperrin 2023-12-07 11:10:09 UTC
Created attachment 305558
ioports

Hi!

My system was running fine, with no errors in the logs, and then the
kernel logs suddenly filled with an endless stream of oopses.

The bug seems to happen in "__ieee80211_scan_completed" and it seems to
involve the firmware: "Microcode SW error detected. Restarting",
"ieee80211 phy0: Hardware restart was requested" and "iwlwifi
0000:02:00.0: FW error in SYNC CMD SCAN_REQ_UMAC".

Since then, the iwlwifi module oopses constantly, even after unloading
and reloading it.
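
The three messages quoted above can be counted in a saved kernel log to gauge how often the firmware error recurs. This is only a minimal sketch: the function name is hypothetical, and the patterns are simply the strings quoted in this report, not an exhaustive list of iwlwifi error messages.

```shell
#!/bin/sh
# count_iwlwifi_fw_errors: hypothetical helper, not part of any tool.
# Reads a kernel log on stdin and prints the number of lines that contain
# one of the firmware-related messages quoted in this report.
count_iwlwifi_fw_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c -E 'Microcode SW error detected|Hardware restart was requested|FW error in SYNC CMD' || true
}

# Example (not run here):
#   dmesg | count_iwlwifi_fw_errors
```

Running it against each of the attached dmesg files would show whether the error count grows between captures.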

My system:
NixOS 23.11
Linux 6.6.3
Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
firmware version: 83.e8f84e98.0 ty-a0-gf-a0-83.ucode

The logs and more information about my system are attached.

I tried very hard to get decode_stacktrace.sh to work with module
symbols, but the NixOS file hierarchy makes this quite difficult.
I can try again if needed.


Sincerely,
Paul Grandperrin
Comment 1 Paul Grandperrin 2023-12-07 11:13:36 UTC
Created attachment 305559
ver_linux
Comment 2 Paul Grandperrin 2023-12-07 11:13:57 UTC
Created attachment 305560
cpu_info
Comment 3 Paul Grandperrin 2023-12-07 11:14:17 UTC
Created attachment 305561
lspci
Comment 4 Paul Grandperrin 2023-12-07 11:14:30 UTC
Created attachment 305562
modules
Comment 5 Paul Grandperrin 2023-12-07 11:14:42 UTC
Created attachment 305563
iomem
Comment 6 Paul Grandperrin 2023-12-07 11:15:01 UTC
Created attachment 305564
dmesg
Comment 7 Paul Grandperrin 2023-12-07 11:15:14 UTC
Created attachment 305565
dmesg2
Comment 8 Paul Grandperrin 2023-12-07 11:20:03 UTC
I've just been told about the commands for collecting more information about wireless bugs at https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging. Unfortunately, I rebooted my computer and the issue disappeared.

I think I may be able to trigger it again, so I'll collect that information when it happens.
Comment 9 Paul Grandperrin 2023-12-07 16:05:01 UTC
The bug happened again, and I really don't know why; I did nothing in particular.

I'll collect more info now.
Comment 10 Paul Grandperrin 2023-12-07 16:06:02 UTC
Created attachment 305566
trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e iwlwifi_msg
Comment 11 Paul Grandperrin 2023-12-07 16:11:58 UTC
I've been able to use trace-cmd but I can't extract debug information about the firmware.

My kernel has the correct config:
zcat /proc/config.gz |rg -i CONFIG_ALLOW_DEV_COREDUMP
CONFIG_ALLOW_DEV_COREDUMP=y

However, the directory /sys/kernel/debug/iwlwifi doesn't exist.
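
One thing worth checking here, as an assumption rather than something confirmed in this report: the iwlwifi debugfs directory is only created when the driver is built with CONFIG_IWLWIFI_DEBUGFS, and debugfs itself must be enabled and mounted at /sys/kernel/debug. The sketch below reports which of the named options are missing from a kernel config; the helper name is hypothetical.

```shell
#!/bin/sh
# missing_kconfig_options: hypothetical helper, not part of any tool.
# Reads kernel config text on stdin and prints every option named in the
# arguments that is NOT set to y or m.
missing_kconfig_options() {
    config=$(cat)
    for opt in "$@"; do
        printf '%s\n' "$config" | grep -q -E "^${opt}=[ym]\$" || printf '%s\n' "$opt"
    done
}

# Example (not run here):
#   zcat /proc/config.gz | missing_kconfig_options \
#       CONFIG_DEBUG_FS CONFIG_IWLWIFI_DEBUGFS CONFIG_ALLOW_DEV_COREDUMP
```

If nothing is printed, all listed options are enabled, and the next thing to check would be whether debugfs is actually mounted.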
Comment 12 Paul Grandperrin 2023-12-07 16:19:24 UTC
Finally, I started setting up sniffing, but since the issue happens while the device is scanning, I have no idea which channel is responsible.

There are quite a few APs around me on 2.4 GHz and 5 GHz, so I can't just sniff a few selected channels.
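
To get an overview of where the nearby APs sit, the cached scan results can be tallied per frequency. This does not identify the responsible channel; it only shows which frequencies are busiest. It is a rough sketch that assumes the scan output contains one "freq: <MHz>" line per BSS, as printed by `iw dev <interface> scan dump`; the function name is hypothetical.

```shell
#!/bin/sh
# bss_per_freq: hypothetical helper, not part of iw.
# Reads scan output on stdin and prints "<count> <frequency in MHz>",
# busiest frequency first.
bss_per_freq() {
    awk '$1 == "freq:" {
             f = $2
             sub(/\..*/, "", f)   # tolerate values such as "5180.0"
             count[f]++
         }
         END { for (f in count) print count[f], f }' |
    sort -k1,1nr -k2,2n
}

# Example (not run here; wlan0 is a placeholder interface name):
#   iw dev wlan0 scan dump | bss_per_freq
```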
