Bug 218884
| Summary: | mac80211: simplify non-chanctx drivers (kernel 6.9) breaks monitor mode | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Michael (ZeroBeat) |
| Component: | network-wireless | Assignee: | drivers_network-wireless (drivers_network-wireless) |
| Status: | RESOLVED PATCH_ALREADY_AVAILABLE | | |
| Severity: | normal | CC: | regressions |
| Priority: | P3 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | >= 6.9 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | 0a44dfc070749514b804ccac0b1fd38718f7daa1 |
| Attachments: | bisect output of discovered commit | | |
Description
Michael
2024-05-24 17:04:51 UTC
BTW: Testing monitor mode these days is no easy task, because most of the device drivers are not working as expected:
https://bugzilla.kernel.org/show_bug.cgi?id=218528
https://bugzilla.kernel.org/show_bug.cgi?id=217465
https://github.com/openwrt/mt76/issues/839
And now, since kernel 6.9.1, the mac stack is broken, too.

With a bit of luck this ticket might lead to some result, but I'd say it's unlikely, as it does a few things wrong that are listed on https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/
Most importantly:
* having two bugs reported in one ticket
* reporting it here in bugzilla, as the responsible developers are unlikely to see this, as most of them afaics are not following bugzilla (and don't even get a copy of bugs filed here).
If you report the 6.9 regression (e.g. your second bug) in a new ticket (CC me!) I'll forward it to the developers; ideally check 6.10-rc and use a git bisection to find the culprit, as developers then will be obliged to fix it (I'll help with that). See also: https://docs.kernel.org/admin-guide/reporting-issues.html

Thanks for the information. Maybe I put it the wrong way. These are not two different bugs, because both problems are related to each other. In monitor mode, commands are not forwarded from nl80211 to the device drivers. I've only checked two of them: switching a channel and transmitting a frame. I'm sure there are more.

I've already checked 6.10-rc: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/net/wireless/nl80211.c?id=v6.10-rc1&id2=v6.9
There are no changes that solve this problem.

Instead of:
First bug: Switching a channel via "NL80211_ATTR_WIPHY_FREQ" does not switch the channel/frequency.
Second bug: frame injection is broken
Let's call it:
In monitor mode, commands via NL80211 are not forwarded to the device drivers.
Thanks for your offer. It would be great if you can forward this to the nl80211 developers.

Closed this report because: "reporting it here in bugzilla, as the responsible developers are unlikely to see this, as most of them afaics are not following bugzilla (and don't even get a copy of bugs filed here)"

No need to close this; and FWIW, I briefly mentioned this to the developers already, without effect: https://lore.kernel.org/all/a51f223f-18ac-4d67-9120-8da1c169b7eb@leemhuis.info/
A bisection result would make the difference: https://docs.kernel.org/admin-guide/verify-bugs-and-bisect-regressions.html
But before doing one you might want to run a search like https://lore.kernel.org/all/?q=net%2Fwireless%2Fnl80211.c to see if fixes are in the works already.

Again thanks for this information. I've already tried to check the nl80211 code, but it is really complex. My work is to code some penetration testing tools that use this stack: https://github.com/ZerBea/hcxdumptool
I can't reach the developers here and I don't want to waste your time with this problem, which affects only a few penetration testers. But I'm sure we will get more issue reports when the most common penetration testing distributions move to this kernel. Again, thanks for your kind help.

(In reply to Michael from comment #8)
> I can't reach the developers here and I don't want to waste your time
Don't worry about that, that's what I'm here for; just bisect the problem please, otherwise it might never be fixed; and it should, see https://docs.kernel.org/process/handling-regressions.html

Created attachment 306404
bisect output of discovered commit
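
The lore.kernel.org search suggested earlier in this thread is just the source file path passed as a URL-encoded `q` parameter. A minimal sketch of building such a link for any path, assuming the query parameter behaves as in the link quoted above; `lore_search_url` is a hypothetical helper name, not part of any lore tooling:

```python
from urllib.parse import urlencode

# Archive root taken from the search link quoted in this thread.
LORE_ALL = "https://lore.kernel.org/all/"


def lore_search_url(query: str) -> str:
    """Return a lore search URL for the given query string (hypothetical helper)."""
    # urlencode percent-encodes "/" as %2F, matching the link in the thread.
    return LORE_ALL + "?" + urlencode({"q": query})


print(lore_search_url("net/wireless/nl80211.c"))
```

Swapping in another path (for example a file under net/mac80211/) gives the equivalent search for that file.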
reopened as recommended: https://bugzilla.kernel.org/show_bug.cgi?id=218884#c7
bisected as suggested: https://bugzilla.kernel.org/show_bug.cgi?id=218884#c9

# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
# status: waiting for bad commit, 1 good commit known
# bad: [a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6] Linux 6.9
git bisect bad a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
# bad: [480e035fc4c714fb5536e64ab9db04fedc89e910] Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
git bisect bad 480e035fc4c714fb5536e64ab9db04fedc89e910
# bad: [9187210eee7d87eea37b45ea93454a88681894a4] Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect bad 9187210eee7d87eea37b45ea93454a88681894a4
# good: [a01c9fe32378636ae65bec8047b5de3fdb2ba5c8] Merge tag 'nfsd-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
git bisect good a01c9fe32378636ae65bec8047b5de3fdb2ba5c8
# bad: [ca61ba3885274a684c83d8a538eb77b30e38ee92] Merge branch 'rework-genet-mdioclocking'
git bisect bad ca61ba3885274a684c83d8a538eb77b30e38ee92
# good: [f42822f22b1c5f72c7e3497d9683f379ab0c5fe4] bnxt_en: Use firmware provided maximum filter counts.
git bisect good f42822f22b1c5f72c7e3497d9683f379ab0c5fe4
# bad: [e10cd2ddd89e8b3e61b49247067e79f7debec2f1] wifi: rtw89: load BB parameters to PHY-1
git bisect bad e10cd2ddd89e8b3e61b49247067e79f7debec2f1
# good: [2594e4d9e1a2d79bf7bb262974abaf5ef153e371] wifi: iwlwifi: prepare for reading SAR tables from UEFI
git bisect good 2594e4d9e1a2d79bf7bb262974abaf5ef153e371
# bad: [719036ae06d4bfdb65139e3947a8404dec298bc5] wifi: cfg80211: move puncturing validation code
git bisect bad 719036ae06d4bfdb65139e3947a8404dec298bc5
# good: [1209f487d452ff7e822dec30661fd6b5163fb8cf] wifi: rtl8xxxu: Add TP-Link TL-WN823N V2
git bisect good 1209f487d452ff7e822dec30661fd6b5163fb8cf
# good: [4dbd964f33aab6f99891b9610ad4b36cc215be0d] wifi: rtw89: 8922a: add chip_ops::rfk_hw_init
git bisect good 4dbd964f33aab6f99891b9610ad4b36cc215be0d
# good: [2fd53eb04c492eb9a2b06f994b36e5cf34ba7541] wifi: mac80211: remove unused MAX_MSG_LEN define
git bisect good 2fd53eb04c492eb9a2b06f994b36e5cf34ba7541
# bad: [0a44dfc070749514b804ccac0b1fd38718f7daa1] wifi: mac80211: simplify non-chanctx drivers
git bisect bad 0a44dfc070749514b804ccac0b1fd38718f7daa1
# good: [61f0261131c8dc2beeb6b34781a54788221081e9] wifi: mac80211: clean up band switch in duration
git bisect good 61f0261131c8dc2beeb6b34781a54788221081e9
# good: [2d9698dd32d086e47b8bff3df4322cc017c17b55] wifi: mac80211: clean up HE 6 GHz and EHT chandef parsing
git bisect good 2d9698dd32d086e47b8bff3df4322cc017c17b55
# first bad commit: [0a44dfc070749514b804ccac0b1fd38718f7daa1] wifi: mac80211: simplify non-chanctx drivers

attached identified commit as requested: https://bugzilla.kernel.org/show_bug.cgi?id=218884#c10

When I started this bug report it wasn't clear to me whether it is a single bug (with a big impact on monitor mode) or several bugs. I still suspect a single bug with many consequences (channel switching broken, packet injection broken), due to massive changes in net/mac80211.
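
The bisect log above follows the usual `git bisect log` layout: a `git bisect good|bad <hash>` line per verdict and a final `# first bad commit:` comment. A minimal sketch of pulling the culprit and the verdict sequence out of such a log, assuming that layout; `summarize_bisect_log` is a hypothetical helper, and `SAMPLE_LOG` is an abridged excerpt of the log in this report:

```python
import re

# One verdict per "git bisect good|bad <hash>" line.
VERDICT = re.compile(r"^git bisect (good|bad) ([0-9a-f]{7,40})$")
# The culprit is reported in a trailing comment line.
FIRST_BAD = re.compile(r"^# first bad commit: \[([0-9a-f]{7,40})\] (.*)$")


def summarize_bisect_log(text):
    """Return ((hash, subject) or None, [(verdict, hash), ...]) for a bisect log."""
    first_bad = None
    verdicts = []
    for raw in text.splitlines():
        line = raw.strip()
        match = VERDICT.match(line)
        if match:
            verdicts.append((match.group(1), match.group(2)))
            continue
        match = FIRST_BAD.match(line)
        if match:
            first_bad = (match.group(1), match.group(2))
    return first_bad, verdicts


# Abridged excerpt of the log from this report (most steps omitted).
SAMPLE_LOG = """\
# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
# bad: [a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6] Linux 6.9
git bisect bad a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
# bad: [0a44dfc070749514b804ccac0b1fd38718f7daa1] wifi: mac80211: simplify non-chanctx drivers
git bisect bad 0a44dfc070749514b804ccac0b1fd38718f7daa1
# first bad commit: [0a44dfc070749514b804ccac0b1fd38718f7daa1] wifi: mac80211: simplify non-chanctx drivers
"""

first_bad, verdicts = summarize_bisect_log(SAMPLE_LOG)
print(first_bad)
print(len(verdicts), "verdicts recorded")
```

A saved log can also be fed back to git with `git bisect replay <logfile>` to reproduce the same bisection state on another machine.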
Now I'm waiting for a patch to figure out if they are related to each other or not.

What driver are you using? Fixes for mt76 and rtlwifi that are related to that commit are heading towards mainline already:
https://lore.kernel.org/all/1fabb8e4-adf3-47ae-8462-8aea963bc2a5@gmail.com/
https://lore.kernel.org/all/20240528142308.3f7db1821e68.I531135d7ad76331a50244d6d5288e14aa9668390@changeid/

For this test: mt7601u
But all drivers are affected: https://github.com/ZerBea/hcxdumptool/discussions/454
Even a self-compiled out-of-tree driver is affected: https://github.com/lwfinger/rtw88
Unfortunately my hardware is slow and bisecting a kernel is far beyond its capabilities. It took me several days to bisect the kernel.

I took a closer look at the fixes you have mentioned. If they fix the problem, it is mandatory to apply the same change to all wifi device drivers in the Linux kernel tree. The 2 patches you have mentioned above fixed 2 drivers which had been forgotten in the faulty commit. The mt7601u driver was already modified in the faulty commit:

 const struct ieee80211_ops mt7601u_ops = {
+	.add_chanctx = ieee80211_emulate_add_chanctx,
+	.remove_chanctx = ieee80211_emulate_remove_chanctx,
+	.change_chanctx = ieee80211_emulate_change_chanctx,
+	.switch_vif_chanctx = ieee80211_emulate_switch_vif_chanctx,
 	.tx = mt7601u_tx,
 	.wake_tx_queue = ieee80211_handle_wake_tx_queue,
 	.start = mt7601u_start,
diff --git a/drivers/net/wireless/purelifi/plfxlc/mac.c b/drivers/net/wireless/purelifi/plfxlc/mac.c
index 506d2f31efb5..7a1b27764f53 100644
--- a/drivers/net/wireless/purelifi/plfxlc/mac.c
+++ b/drivers/net/wireless/purelifi/plfxlc/mac.c
@@ -685,6 +685,10 @@ static int plfxlc_set_rts_threshold(struct ieee80211_hw *hw, u32 value)
 }

So it should work, but it doesn't. Like all tested drivers, it is still affected by the bug. The patches you've mentioned above do not fix my reported bug.
I'm monitoring the Linux Wireless mailing list and I have noticed massive changes to mac80211 and cfg80211 nearly every day. Let's mark monitor mode of kernel 6.9.x as broken and set focus on 6.10?

I was asked to test:
$ uname -r
6.10.0-rc3-1-git
As expected, nothing has changed. Switching a channel and packet injection are still broken.

Might be worth giving the patch in this message a try (albeit I'm not totally sure if it is related to this issue as well): https://lore.kernel.org/all/7869b9b29b6796c95fd5af649e4bd6696e56dcaf.camel@sipsolutions.net/

Great work. The patch is working under these test conditions:
$ uname -r
6.10.0-rc3-1-git-dirty
$ lsusb
ID 7392:7710 Edimax Technology Co., Ltd Edimax Wi-Fi
$ hcxdumptool -l
0 4 74da38eb45fc 66c5d3c23aa0 * wlp5s0f4u2 mt7601u NETLINK
$ sudo hcxdumptool -i wlp5s0f4u2 --rcascan=active
...
0 ERROR(s) during runtime
233 Packet(s) captured by kernel
0 Packet(s) dropped by kernel
76 PROBERESPONSE(s) captured

I looked at the history and I fully agree with Ping-Ke Shih:
"We have a draft fix of rtw88 driver for RTL8821CE, but as mentioned some drivers are affected, so I don't plan to send out the patch. Instead we are looking for the fix of cfg80211/mac80211."
https://lore.kernel.org/all/0e65ca6b471b4186a370b9a57de11abe@realtek.com/
This issue affects all drivers, and fixing it in cfg80211/mac80211 is the right way. If helpful, I can test other devices/drivers, except the TP-Link TL-WN8200ND v3 (ID 2357:0126 TP-Link 802.11n NIC), which is affected by a driver problem: https://bugzilla.kernel.org/show_bug.cgi?id=218528
Best regards
Michael

"Savyasaachi reports that scanning for other stations in monitor mode does not work anymore with his RTL8821CE wireless network card for linux kernels after 6.8.9."
https://lore.kernel.org/all/chwoymvpzwtbmzryrlitpwmta5j6mtndocxsyqvdyikqu63lon@gfds653hkknl/
This device is now working, too.

I'm closing this report because a working patch is available.