Created attachment 298801 [details] journalctl $ journalctl (see attachment): Sep 14 19:30:33 r61i kernel: BUG: KFENCE: use-after-free read in ieee80211_crypto_ccmp_decrypt+0x1d9/0x3d0 [mac80211] ... $ lspci -v 03:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection (rev 02) Subsystem: Intel Corporation ThinkPad T60/R60e/X60s/R61 Physical Slot: 3 Flags: bus master, fast devsel, latency 0, IRQ 29 Memory at dfcff000 (32-bit, non-prefetchable) [size=4K] Capabilities: <access denied> Kernel driver in use: iwl3945 Kernel modules: iwl3945 $ lsmod |grep mac80211 mac80211 1183744 2 iwl3945,iwlegacy libarc4 16384 1 mac80211 cfg80211 1040384 3 iwl3945,iwlegacy,mac80211 $ grep -ir ieee80211_crypto_ccmp_decrypt Downloads/linux-5.14.3 Downloads/linux-5.14.3/net/mac80211/rx.c: result = ieee80211_crypto_ccmp_decrypt( Downloads/linux-5.14.3/net/mac80211/rx.c: result = ieee80211_crypto_ccmp_decrypt( Downloads/linux-5.14.3/net/mac80211/wpa.c:ieee80211_crypto_ccmp_decrypt(struct ieee80211_rx_data *rx, Downloads/linux-5.14.3/net/mac80211/wpa.h:ieee80211_crypto_ccmp_decrypt(struct ieee80211_rx_data *rx,
Tested live with Arch Linux kernels: first bad: 5.12.9-arch1-1 last good: 5.12.8-arch1-1 Based on kernel.org's linux-stable repo: $ git diff --stat v5.12.9..v5.12.8 -- net/mac80211 net/mac80211/ieee80211_i.h | 36 +++++++---- net/mac80211/iface.c | 11 +++- net/mac80211/key.c | 7 --- net/mac80211/key.h | 2 - net/mac80211/rx.c | 150 ++++++++++----------------------------------- net/mac80211/sta_info.c | 6 +- net/mac80211/sta_info.h | 33 +--------- net/mac80211/wpa.c | 13 ++-- 8 files changed, 70 insertions(+), 188 deletions(-) $ git diff v5.12.9..v5.12.8 -- net/mac80211 <code diffs in detail>
Bisection based on kernel.org's linux-stable repo. first bad: 5.12.9 last good: 5.12.8 $ 1f0bf30c01d3f4de7d6c5e27b102a808c5646676 is the first bad commit commit 1f0bf30c01d3f4de7d6c5e27b102a808c5646676 Author: Johannes Berg <johannes.berg@intel.com> Date: Tue May 11 20:02:48 2021 +0200 mac80211: check defrag PN against current frame commit bf30ca922a0c0176007e074b0acc77ed345e9990 upstream. As pointed out by Mathy Vanhoef, we implement the RX PN check on fragmented frames incorrectly - we check against the last received PN prior to the new frame, rather than to the one in this frame itself. Prior patches addressed the security issue here, but in order to be able to reason better about the code, fix it to really compare against the current frame's PN, not the last stored one. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210511200110.bfbc340ff071.Id0b690e581da7d03d76df90bb0e3fd55930bc8a0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> net/mac80211/ieee80211_i.h | 11 +++++++++-- net/mac80211/rx.c | 5 ++--- net/mac80211/wpa.c | 13 +++++++++---- 3 files changed, 20 insertions(+), 9 deletions(-) $ The bisect.log file: git bisect start # bad: [5a973f1903850a6257856e72959354eb856085ae] Linux 5.12.9 git bisect bad 5a973f1903850a6257856e72959354eb856085ae # good: [cfb3ea79045ad3c7bcaa0036b5a66609ccdadffe] Linux 5.12.8 git bisect good cfb3ea79045ad3c7bcaa0036b5a66609ccdadffe # bad: [bd72115939fe98bc8c80ad518ef024403eaff53d] net: dsa: sja1105: fix VL lookup command packing for P/Q/R/S git bisect bad bd72115939fe98bc8c80ad518ef024403eaff53d # bad: [1650540ad65358e54eccfeba9830827d40f2bb6d] mei: request autosuspend after sending rx flow control git bisect bad 1650540ad65358e54eccfeba9830827d40f2bb6d # good: [22e7b1aecdeaabd8b3c0c6520d50a81e3dc4950b] mac80211: properly handle A-MSDUs that start with an RFC 1042 header git bisect good 22e7b1aecdeaabd8b3c0c6520d50a81e3dc4950b # bad: [25721a38c9711603758e469ec8f33e8ae0136262] drm/amd/pm: correct MGpuFanBoost setting git bisect bad 25721a38c9711603758e469ec8f33e8ae0136262 # bad: [75ba7513a5ad9cd1a89c3d8f520e86a4c0f8457e] ath10k: drop fragments with multicast DA for PCIe git bisect bad 75ba7513a5ad9cd1a89c3d8f520e86a4c0f8457e # bad: [1f0bf30c01d3f4de7d6c5e27b102a808c5646676] mac80211: check defrag PN against current frame git bisect bad 1f0bf30c01d3f4de7d6c5e27b102a808c5646676 # good: [26ed5ca7b58f4fd466c825a48c564e940a75cfd7] mac80211: drop A-MSDUs on old ciphers git bisect good 26ed5ca7b58f4fd466c825a48c564e940a75cfd7 # good: [935ab28b02fb1598748e19e7cfba062ded143a22] mac80211: add fragment cache to sta_info git bisect good 935ab28b02fb1598748e19e7cfba062ded143a22 # first bad commit: [1f0bf30c01d3f4de7d6c5e27b102a808c5646676] mac80211: check defrag PN against current frame $ At this point I pass it to you guys as you are more knowledgeable.
I think we just need to reload 'hdr'.
(In reply to Johannes Berg from comment #3) > I think we just need to reload 'hdr'. I tested 5.12.9 (first bad kernel) with your patch proposal (by e-mail). It is good. You can submit an official patch to mainline, and get it backported to stable kernels (the bug creates an opportunity for an attack). Let us know the link to the fix commit.
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=94513069eb549737bcfc3d988d6ed4da948a2de8
Actually, because I just did further analysis on this, I don't really see how this causes any later issues (apart from the KASAN/KCSAN reports). Not that I want to detract from the bug, it's clearly a bug I added there, but still, it seems not likely to cause problems: 1) In most cases, we don't do a reallocation in the first place, I'm not sure why that happens with iwlegacy - not a good situation for performance anyway. For iwlwifi, for example, this will never happen - the SKB is large enough and we have HW crypto, so we don't linearize, only pull the header - but the driver even already pulled it. 2) When a reallocation does happen, the worst is that we *read* a single bit (the fragment bit) in the wrong place. Since it's kernel memory, that would still be mapped, so it won't crash, it'll just read a bit from a bad place. This has basically three consequences: a) The bit goes 0 -> 1 - in this case we just do a superfluous memcpy() b) The bit goes 1 -> 0 - in this case we'd skip the memcpy(), leaving the data there zeroes (rx_data is 0 initialized on stack), resulting in the frame being dropped because the PNs aren't sequential - the if (entry->check_sequential_pn) code in ieee80211_rx_h_defragment(). c) Theoretically, this could be used as a leak of that one bit, trying to get something reallocated to this space while a fragmented frame is processed and then leaking the bit that was there by observing whether the frame got dropped or not (the "1 -> 0" case vs. "1 -> 1" depending on how the memory was reused). Given that the memory area is reused immediately and you'd have to race that to get between the free and the bit check, this seems rather difficult. Did I miss something?