Created attachment 306724
Output of sudo dmesg -T | grep -i -E 'a6:00.0|wlp|iwl|80211'

I am regularly experiencing crashes with the iwlwifi driver when connected to a Wi-Fi network at my workplace. Looking at the kernel and system logs, it always seems to happen after the following line:

AP xx:xx:xx:xx:xx:xx changed bandwidth, new used config is 6375.000 MHz, width 3 (6345.000/0 MHz)

SYSTEM INFORMATION
Device: Framework Laptop 13 (Intel 12th Gen)
Wi-Fi card: Intel AX210
Distribution: Fedora 40 KDE
Kernel: 6.10.3-200.fc40.x86_64 (I experienced the problem with 6.9.13 too)
Firmware Version: 89.202a2f7b.0 ty-a0-gf-a0-89.ucode

WI-FI NETWORK INFORMATION
AP Make: HPE Aruba Networking
AP Model: AP-635 (Wi-Fi 6E)
Authentication: WPA2-PEAP with MS-CHAPv2

STEPS TO REPRODUCE
1. Connect to Wi-Fi network described above
2. Wait a couple of minutes

RESULT
A crash will occur at some point, sometimes on the initial connection and sometimes after around 30 minutes. The Wi-Fi network will disconnect.

EXPECTED RESULT
The firmware doesn't crash and my laptop remains connected to the Wi-Fi network.

THINGS I HAVE TRIED
Set options iwlmvm power_scheme=1 (doesn't help)
Set options iwlwifi amsdu_size=3 (doesn't help)
Set options iwlwifi disable_11be=1 (doesn't help)
Set options iwlwifi disable_11ax=1 (resolves problem)

Of course, I don't want to disable Wi-Fi 6/6E with the last option if I can avoid it, hence I am reporting this bug.

ATTACHMENTS
I have attached the output of dmesg encrypted using the GPG keys listed in https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging. This output contains numerous crashes.
Created attachment 306725
iwlwifi fw crash dump

Here is a firmware dump from one of the crashes, which I got by following the guide in https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging. Again, it is encrypted using the GPG keys therein.
[Wed Aug 14 18:32:53 2024] wlp166s0: bad HE/EHT 6 GHz operation
[Wed Aug 14 18:32:53 2024] wlp166s0: AP appears to change mode (expected HE, found legacy), disconnect

Can you try to change the channel on which your AP operates in the 6 GHz band? Apparently it is picking an illegal channel.
Unfortunately I don't manage this wireless network so can't change the channel. However, I can speak to the wireless infrastructure team and ask them about it. What makes a channel illegal and how do we know this one is?
Created attachment 306730
iw reg get output

In case this is helpful, here is the output of iw reg get.
To determine what goes wrong we need to see the beacon and the assoc response of the AP. You can record this by using a monitor interface while attempting to connect.

sudo iw wlan0 interface add mon0 type monitor
sudo ifconfig mon0 up
sudo tcpdump -i mon0 -w mon0.pcap
Does this monitoring need to happen during a firmware crash (i.e. do I need to start it and wait for a crash)? Sometimes the connection will be stable for 30 mins or so before it crashes.
Created attachment 306737
Requested tcpdump during firmware crash

OK, I think I was able to get the tcpdump during a firmware crash. Please find it attached (GPG encrypted as before). Looking at the output of journalctl (not sure how to get dmesg to print the actual time), I can see a crash at 10:41:52, which is in the time range covered by the tcpdump:

Aug 15 10:41:52 framira kernel: wlp166s0: AP xx:xx:xx:xx:xx:xx changed bandwidth, new used config is 6055.000 MHz, width 3 (6025.000/0 MHz)
Aug 15 10:41:52 framira kernel: iwlwifi 0000:a6:00.0: Microcode SW error detected. Restarting 0x0.
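For anyone reading along, the "changed bandwidth" line can be decoded into channel numbers with a few lines of Python. This is only a sketch under two assumptions that are not stated in this thread: that 6 GHz channel numbers map to frequencies as f = 5950 + 5 * channel (MHz), and that the numeric "width" is the nl80211 channel-width enum value (3 meaning 80 MHz). The names decode_bandwidth_change, WIDTH_NAMES and LOG_RE are made up for this example.

```python
import re

# Assumption: "width N" is the nl80211 channel-width enum value.
WIDTH_NAMES = {
    0: "20 MHz (no HT)",
    1: "20 MHz",
    2: "40 MHz",
    3: "80 MHz",
    4: "80+80 MHz",
    5: "160 MHz",
}

# Matches e.g. "new used config is 6055.000 MHz, width 3 (6025.000/0 MHz)".
LOG_RE = re.compile(
    r"new used config is (?P<control>\d+)\.\d+ MHz, "
    r"width (?P<width>\d+) \((?P<center1>\d+)\.\d+/(?P<center2>\d+) MHz\)"
)


def freq_to_6ghz_channel(freq_mhz):
    """Convert a 6 GHz frequency in MHz to a channel number (f = 5950 + 5 * ch)."""
    offset = freq_mhz - 5950
    if offset <= 0 or offset % 5 != 0:
        raise ValueError(f"{freq_mhz} MHz is not on the assumed 6 GHz channel grid")
    return offset // 5


def decode_bandwidth_change(line):
    """Return primary channel, width and center channel, or None if no match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    return {
        "primary_channel": freq_to_6ghz_channel(int(match["control"])),
        "width": WIDTH_NAMES.get(int(match["width"]), "unknown"),
        "center_channel": freq_to_6ghz_channel(int(match["center1"])),
    }


line = ("wlp166s0: AP xx:xx:xx:xx:xx:xx changed bandwidth, new used config is "
        "6055.000 MHz, width 3 (6025.000/0 MHz)")
print(decode_bandwidth_change(line))
```

Under those assumptions, the line from the crash at 10:41:52 decodes to primary channel 21 with an 80 MHz width centered on channel 15.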
I can see from the association that your AP advertises this:

6 GHz Operation Information
    Primary Channel: 53
    Control: 0x03
        .... ..11 = Channel Width: 160MHz or 80MHz+80MHz (3)
        .... .0.. = Duplicate Beacon: False
        ..00 0... = Regulatory Info: 0
        00.. .... = Reserved: 0x0
    Channel Center Frequency Segment 0: 47
    Channel Center Frequency Segment 1: 0
    Minimum Rate: 1

Since Segment 1 is 0, we treat this as an 80 MHz channel, and 47 does not look like a valid center for an 80 MHz channel. I still need to check this with someone who knows it better than I do.
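To illustrate why center channel 47 looks wrong for 80 MHz, here is a rough Python sketch of the 6 GHz channel grid. It assumes the usual layout of channels 1 to 233, where 20 MHz channels sit at 1, 5, 9, ..., 40 MHz centers at 3, 11, ..., 80 MHz centers at 7, 23, ... and 160 MHz centers at 15, 47, .... The names CENTER_GRID, is_valid_center and widths_centered_on are invented for this example and are not kernel APIs.

```python
# Assumed grid: width in MHz -> (first center channel, spacing between centers).
CENTER_GRID = {20: (1, 4), 40: (3, 8), 80: (7, 16), 160: (15, 32)}
LAST_20MHZ_CHANNEL = 233  # assumed highest 20 MHz channel in the band


def is_valid_center(center, width_mhz):
    """True if `center` is a center channel for `width_mhz` on the assumed grid."""
    first, step = CENTER_GRID[width_mhz]
    # Distance (in channel numbers) from the center to the outermost 20 MHz channel.
    half_span = (width_mhz // 20 - 1) * 2
    return (
        center >= first
        and (center - first) % step == 0
        and center + half_span <= LAST_20MHZ_CHANNEL
    )


def widths_centered_on(center):
    """List every width for which `center` is a valid center channel."""
    return [width for width in CENTER_GRID if is_valid_center(center, width)]


print(widths_centered_on(47))  # widths that can be centered on channel 47
print(widths_centered_on(55))  # widths that can be centered on channel 55
```

On this grid, channel 47 is only a center for 160 MHz, while the 80 MHz block containing primary channel 53 would be centered on 55. That fits the AP putting the 160 MHz center into Segment 0 and leaving Segment 1 at 0.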
Even if the channel is invalid, surely in that case the wireless driver should ignore it and definitely not crash due to it?
I switched from wpa_supplicant to iwd and am now getting lots of lines like

Aug 16 14:41:51 framira iwd[100110]: invalid HE capabilities for xx:xx:xx:xx:xx:xx

I guess this supports your suggestion that the AP is broadcasting on an invalid channel. My previous point still stands, though: I don't think it should crash the whole driver.
We agree that we should be more robust and not crash when facing those buggy APs. This is why this bug is still open. We'll try to see how we can cope better with those APs.
Created attachment 306761
Output of iw phy0 channels

I've just noticed that, according to the output of iw phy0 channels, only 20 MHz widths seem to be supported for 6 GHz. Is this behaviour correct?
Can you give this a try?

diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index aed72794d9fe..ef1748771157 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -3129,9 +3129,9 @@ bool ieee80211_chandef_he_6ghz_oper(struct ieee80211_local *local,
 			he_chandef.width = NL80211_CHAN_WIDTH_80;
 			break;
 		case IEEE80211_HE_6GHZ_OPER_CTRL_CHANWIDTH_160MHZ:
-			he_chandef.width = NL80211_CHAN_WIDTH_80;
 			if (!he_6ghz_oper->ccfs1)
-				break;
+				return false;
+			he_chandef.width = NL80211_CHAN_WIDTH_80;
 			if (abs(he_6ghz_oper->ccfs1 - he_6ghz_oper->ccfs0) == 8)
 				he_chandef.width = NL80211_CHAN_WIDTH_160;
 			else
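As a reading aid, the effect of this diff on the 160 MHz case can be modelled in a few lines of Python. This is a simplified model and not the kernel code: he_6ghz_160mhz_width is a made-up name, the `patched` flag just switches between the old and new behaviour, and the result of the final `else` branch is cut off in the diff, so the "80+80" value below is an assumption.

```python
def he_6ghz_160mhz_width(ccfs0, ccfs1, patched):
    """Model the 160 MHz control-width case from the diff.

    Returns "80", "160" or "80+80", or None when the element is rejected
    (the C code returning false).
    """
    if ccfs1 == 0:
        # Old behaviour: keep an 80 MHz channel centered on ccfs0.
        # New behaviour: treat the 6 GHz operation element as invalid.
        return None if patched else "80"
    if abs(ccfs1 - ccfs0) == 8:
        return "160"
    # Assumption: the diff ends at `else`; 80+80 is assumed for this branch.
    return "80+80"


# Values advertised by the AP in this report: segment 0 = 47, segment 1 = 0.
print(he_6ghz_160mhz_width(47, 0, patched=False))
print(he_6ghz_160mhz_width(47, 0, patched=True))
```

With the AP's values, the old logic ends up with an 80 MHz channel centered on 47, while the patched logic rejects the element outright.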
There is a patch in 6.11 that does more validation on the channel, but I don't think it'll be enough to fix the problem you're facing.
(In reply to Emmanuel Grumbach from comment #14)
> There is a patch in 6.11 that does more validation on the channel, but I
> don't think it'll be enough to fix the problem you're facing.

Strike that.

It should help.
Please try 6.11 or, if you can, apply [1].

[1] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=91b193d546683558a8799ffb2e2f935d3800633e
Did you have a chance to test?
Unfortunately not yet. I have been away from the office with these APs. I will try to find time to test this week.
(In reply to Emmanuel Grumbach from comment #15)
> (In reply to Emmanuel Grumbach from comment #14)
> > There is a patch in 6.11 that does more validation on the channel, but I
> > don't think it'll be enough to fix the problem you're facing.
>
> Strike that.
>
> It should help.
> Please try 6.11 or if you can, you can apply [1].
>
> [1] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=91b193d546683558a8799ffb2e2f935d3800633e

I applied this patch and rebuilt 6.10.8 and I no longer observe firmware crashes. However, I still occasionally get disconnected from the AP when it tries to shift me onto the invalid 6 GHz channel. I think that's a separate issue though and I think this issue has been resolved.
Thanks.
I think I've stumbled upon the exact same thing, but now with the guards forbidding valid operation.

> wlp4s0: bad HE/EHT 6 GHz operation
> wlp4s0: AP appears to change mode (expected HE, found legacy), disconnect

The AP is configured to operate on channel 69, with CCFS0 being 71 and CCFS1 being 79, which seems completely valid? It's also accepted by other client devices.

It would be very useful if logging were improved so that the aspect that doesn't satisfy Linux is logged. This would also help determine which component here is buggy.
Oh, and I'm currently running 6.11.0-21-generic (#21-Ubuntu) with linux-firmware 20240913 (gita34e7a5f-0ubuntu2.6).
That is absolutely not the same problem. Your AP seems to be sending broken information.
Okay, that's good to know. I just couldn't tell if these fixes were just to avoid a firmware bug and broke actually valid setups, or if they were added to achieve actually (more) standards-compliant operation. (I mean, 6GHz is still broken on Windows, so is LAR...) Though again, the error messages printed as a result of these patches are not sufficient for determining which part is "broken information". So far I've got at least two vendors that seem to find it acceptable. In any case, if you could refer to any section of any relevant standard that specifies how it's wrong it would be very appreciated. If I had that information I could try and go pass it on to HP. As it stands right now, two major vendors seem incompatible with each other.
You can refer to https://en.wikipedia.org/wiki/List_of_WLAN_channels to see what's valid and what is not.
I would need more than a Wikipedia article to say that though. I would guess the first response would anyways be that it's "central frequency" not "valid channel for primary operation". But I'll try.
Okay, so 802.11ax-2021 page 453 (table 26-15) says that for a 6GHz AP with 160MHz BSS channel width, CCFS1 must be greater than zero and the absolute difference between CCFS1 and CCFS0 must be exactly eight. So if CCFS0 is 71 (also a valid channel) then 79 is the required value for CCFS1. What's actually incorrect here?
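For reference, the rule quoted above can be written down as a small Python check. This is a sketch, not what cfg80211 or mac80211 actually do: the first two checks restate the rule as quoted from the standard, while the two containment checks (that the primary channel lies inside the 80 MHz and 160 MHz segments) are extra assumptions added for this example, and check_160mhz_operation is a made-up name.

```python
def check_160mhz_operation(primary, ccfs0, ccfs1):
    """Return a list of problems found for a 160 MHz 6 GHz operation element."""
    problems = []
    if ccfs1 <= 0:
        problems.append("CCFS1 must be greater than zero")
    if abs(ccfs1 - ccfs0) != 8:
        problems.append("|CCFS1 - CCFS0| must be exactly 8")
    # Assumption: an 80 MHz segment spans center-6 .. center+6 in channel numbers.
    if abs(primary - ccfs0) > 6:
        problems.append("primary channel is outside the 80 MHz segment at CCFS0")
    # Assumption: a 160 MHz channel spans center-14 .. center+14.
    if ccfs1 > 0 and abs(primary - ccfs1) > 14:
        problems.append("primary channel is outside the 160 MHz channel at CCFS1")
    return problems


# Configuration described above: primary 69, CCFS0 71, CCFS1 79.
print(check_160mhz_operation(69, 71, 79))
# Element from the original report: primary 53, CCFS0 47, CCFS1 0.
print(check_160mhz_operation(53, 47, 0))
```

Under these checks the 69/71/79 configuration produces no complaints, whereas the element from the original report fails both of the quoted conditions.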
Created attachment 307947
attachment-22265-0.html

Sorry then, I was confused. I suggest you report the problem on the wireless mailing list. But clearly cfg80211 is complaining about a problem in the AP.

________________________________
From: bugzilla-daemon@kernel.org
Sent: Thursday, April 10, 2025 5:50:58 PM
Subject: [Bug 219159] iwlwifi firmware crash with Intel AX210 on AP changing to 6 GHz channel

https://bugzilla.kernel.org/show_bug.cgi?id=219159