Bug 216711

Summary: Regression: null SSID reported by systemd-networkd when connected to a WPA3 network
Product: Networking Reporter: Yohan Prod'homme (kernel+bugzilla)
Component: Wireless Assignee: networking_wireless (networking_wireless)
Status: RESOLVED CODE_FIX    
Severity: normal CC: benwolsieffer, dev.mbornand, jlinenkohl, mail
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: >=5.19.2 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: Bisect log
[PATCH] Set ssid when authenticating
[PATCH v3] Set ssid when authenticating
[PATCH v4] Set ssid when authenticating
[PATCH wireless v5] wifi: cfg80211: Set SSID if it is not already set

Description Yohan Prod'homme 2022-11-19 15:36:02 UTC
Created attachment 303236 [details]
Bisect log

Starting with linux 5.19.2, systemd-networkd reports a "(null)" SSID when connected to WPA3 network.

Steps to reproduce:
1. Get a system with systemd-networkd (maybe this issue can be exhibited with other methods, but using systemd-networkd is the only one I know; I haven't found how to get this data directly from the kernel)
2. Connect to a wireless network using WPA3.
3. Run "networkctl status <wlan interface>", and observe that "WiFi access point" shows "(null)" instead of the network's SSID.
4. If you try again with a WPA2 network, the SSID is shown as expected. WPA3 networks also behave as expected on kernel <=5.19.1

This bug also affects linux 6.0 (tested on 6.0.8).


Bisection identified the following commit as the culprit:
(full bisect log attached)

7a53ad13c09150076b7ddde96c2dfc5622c90b45 is the first bad commit
commit 7a53ad13c09150076b7ddde96c2dfc5622c90b45
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Thu Apr 14 16:50:57 2022 +0200

    wifi: cfg80211: do some rework towards MLO link APIs
    
    [ Upstream commit 7b0a0e3c3a88260b6fcb017e49f198463aa62ed1 ]
    
    In order to support multi-link operation with multiple links,
    start adding some APIs. The notable addition here is to have
    the link ID in a new nl80211 attribute, that will be used to
    differentiate the links in many nl80211 operations.
    
    So far, this patch adds the netlink NL80211_ATTR_MLO_LINK_ID
    attribute (as well as the NL80211_ATTR_MLO_LINKS attribute)
    and plugs it through the system in some places, checking the
    validity etc. along with other infrastructure needed for it.
    
    For now, I've decided to include only the over-the-air link
    ID in the API. I know we discussed that we eventually need to
    have to have other ways of identifying a link, but for local
    AP mode and auth/assoc commands as well as set_key etc. we'll
    use the OTA ID.
    
    Also included in this patch is some refactoring of the data
    structures in struct wireless_dev, splitting for the first
    time the data into type dependent pieces, to make reasoning
    about these things easier.
    
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 drivers/net/wireless/ath/ath6kl/cfg80211.c         |   6 +-
 drivers/net/wireless/ath/wil6210/cfg80211.c        |   9 +-
 .../broadcom/brcm80211/brcmfmac/cfg80211.c         |   4 +-
 drivers/net/wireless/marvell/libertas/mesh.c       |  10 +-
 drivers/net/wireless/marvell/mwifiex/11h.c         |   2 +-
 drivers/net/wireless/marvell/mwifiex/cfg80211.c    |  18 +-
 drivers/net/wireless/microchip/wilc1000/cfg80211.c |   3 +-
 drivers/net/wireless/quantenna/qtnfmac/cfg80211.c  |  14 +-
 drivers/net/wireless/quantenna/qtnfmac/commands.c  |   2 +-
 drivers/net/wireless/quantenna/qtnfmac/event.c     |  15 +-
 drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c  |   4 +-
 include/linux/ieee80211.h                          |   3 +
 include/net/cfg80211.h                             |  99 +++-
 include/uapi/linux/nl80211.h                       |  28 +
 net/mac80211/cfg.c                                 |   8 +-
 net/mac80211/mlme.c                                |   2 +-
 net/wireless/ap.c                                  |  46 +-
 net/wireless/chan.c                                | 196 +++++--
 net/wireless/core.c                                |  28 +-
 net/wireless/core.h                                |  13 +-
 net/wireless/ibss.c                                |  57 +-
 net/wireless/mesh.c                                |  31 +-
 net/wireless/mlme.c                                |  74 +--
 net/wireless/nl80211.c                             | 623 +++++++++++++++------
 net/wireless/ocb.c                                 |   5 +-
 net/wireless/rdev-ops.h                            |  32 +-
 net/wireless/reg.c                                 | 139 +++--
 net/wireless/scan.c                                |   8 +-
 net/wireless/sme.c                                 | 102 ++--
 net/wireless/trace.h                               |  86 ++-
 net/wireless/util.c                                |  44 +-
 net/wireless/wext-compat.c                         |  48 +-
 net/wireless/wext-sme.c                            |  29 +-
 33 files changed, 1255 insertions(+), 533 deletions(-)




Original Arch Linux bug report: https://bugs.archlinux.org/task/75709
2 systemd issues related to this bug: https://github.com/systemd/systemd/issues/24411 and https://github.com/systemd/systemd/issues/24585
Comment 1 Sebastian S. 2022-11-23 11:54:36 UTC
I'm facing the same problem with a WPA2 network since 5.19.2, is this really specific to WPA3?
Comment 2 Yohan Prod'homme 2022-11-23 18:50:29 UTC
I observed the bug on a WPA3 network (the single WPA3 network I know/can access currently) and all other (WPA2) networks I can connect to are reporting the SSID as intended. Disabling WPA3 on the AP also "fixes" the bug, so I thought it may be related to WPA3. But the cause may also be something else.

If this can help (I forgot to include this in the bug description), my wireless card is an Intel 7260AC, the AP is an ISP-provided CPE (I don't have much information about it), and I use iwd to connect to the network.
Comment 3 Justin Linenkohl 2022-11-24 09:42:54 UTC
Zoddo:  thank you for your recent post on github as I received that too.  I think we are trying to connect the dots as to where this "SSID null" bug is coming from and these details are insightful.  Thank you!  I may need to do more testing on my end (to compare WPA2, WPA3, etc) as I have a variety of systems and cards to leverage.  That seems quite suspicious though as I didn't see where the protocol would be relevant based on code review, testing, etc.

At the risk of distracting others / cross-posting, I do want to put these here for anyone else on this bug hunt for context.  Thank you!

https://bugzilla.kernel.org/show_bug.cgi?id=216397#c8

https://github.com/systemd/systemd/issues/24585
Comment 4 Johannes Berg 2022-12-05 10:50:11 UTC
Does this happen, by chance, *only* in the stable kernel (5.19)?

That commit really shouldn't have been backported in the first place.

It certainly works for me in 6.1-rc.
Comment 5 Justin Linenkohl 2022-12-05 11:08:35 UTC
Created attachment 303359 [details]
attachment-12334-0.html

Hello Johannes!

Yes.  Stable, vanilla code.  Thank you for looking.  What can I test for
you?

-Justin



Comment 6 Ben Wolsieffer 2022-12-07 04:14:54 UTC
I am running into this while connected to a mixed WPA2/WPA3 network, but this particular client is using WPA2. It started happening when I updated from 5.19.0 to 6.0.11, and I'm still able to reproduce it with 6.1-rc3.
Comment 7 Ben Wolsieffer 2022-12-08 21:03:44 UTC
The issue still occurs with 6.1-rc8 as well.
Comment 8 Basti 2023-01-23 21:33:54 UTC
I discovered that I'm also affected by this bug, which broke parts of my networking setup. A quick fix would be to use the BSSID (which seems to work), but that is not optimal for my use case.

I could track it down to possibly being related to the encryption used, as "WiFi access point" is only "null" if WPA3 is involved. I get a proper SSID in `networkctl status wlanX` if it's WPA2 only, but with WPA2/WPA3 combined mode (AVM/FritzBox) it goes back to null, regardless of wpa_supplicant or iwd, 2.4 or 5 GHz.

I will try to confirm whether it is reproducible on a WPA3-only network once I have access to one, but would appreciate feedback if someone already knows.

kernel: 6.0.19 | systemd: 
```
systemd 252 (252.4)
+PAM +AUDIT -SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK -XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified
```

Just passing on additional information as far as I can (as recommended here: https://github.com/systemd/systemd/issues/24585#issuecomment-1400803183). Is there anything I can help with or test to resolve this issue?

Thanks for your help in advance!
Comment 9 Marc Bornand 2023-02-10 11:10:08 UTC
Being also affected by this bug, I tried to do some debugging and made some discoveries:

1. When this bug occurs, wdev->u.client.ssid_len is 0 in nl80211_send_iface.

2. I found only one place where wdev->u.client.ssid is set: in cfg80211_connect. But with iwd and WPA3 this function never gets called. Instead, at least for my setup, it sends NL80211_CMD_AUTHENTICATE and NL80211_CMD_ASSOCIATE.

3. The reason for the regression is that before 7b0a0e3c3a88 nl80211_send_iface did not get the ssid from the wireless_dev struct directly; it used ieee80211_bss_get_elem.

I see two possibilities:
1. use something like ieee80211_bss_get_elem in nl80211_send_iface again.

2. set the wdev->u.client.ssid in cfg80211_mlme_auth and reset it in cfg80211_mlme_deauth when the reason is leaving.

I personally prefer the second but I am not certain it covers all corner cases.

I will try to write a patch in the next days and try to test the second option, if there is no technical reason not to do so.
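
To make the findings above concrete, here is a small userspace model in C. It is only a sketch and not kernel code: `struct model_wdev` and all `model_*` functions are hypothetical stand-ins for the cached SSID in wdev->u.client, the connect and authenticate paths, and the interface report. The `cache_ssid` flag is an assumption of this sketch used to contrast the regressed behaviour with the second possibility.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_SSID_LEN 32

/* Hypothetical stand-in for the client part of struct wireless_dev. */
struct model_wdev {
    unsigned char ssid[MODEL_MAX_SSID_LEN];
    size_t ssid_len;
};

/* Store an SSID in the cache, truncating to the maximum length. */
static void model_set_ssid(struct model_wdev *wdev, const char *ssid)
{
    size_t len;

    assert(wdev && ssid);
    len = strlen(ssid);
    if (len > MODEL_MAX_SSID_LEN)
        len = MODEL_MAX_SSID_LEN;
    memcpy(wdev->ssid, ssid, len);
    wdev->ssid_len = len;
}

/* Models the connect path, the only place that caches the SSID. */
static void model_connect(struct model_wdev *wdev, const char *ssid)
{
    model_set_ssid(wdev, ssid);
}

/*
 * Models the separate authenticate path.
 * cache_ssid == 0: regressed behaviour, the cache is left untouched.
 * cache_ssid != 0: second possibility, the SSID is cached here as well.
 */
static void model_authenticate(struct model_wdev *wdev, const char *ssid,
                               int cache_ssid)
{
    if (cache_ssid)
        model_set_ssid(wdev, ssid);
}

/* Models leaving the network: the cached SSID is reset. */
static void model_deauth(struct model_wdev *wdev)
{
    wdev->ssid_len = 0;
}

/*
 * Models the interface report: the SSID is only included when the
 * cached length is non-zero. Returns 1 if an SSID was written to out.
 */
static int model_report_ssid(const struct model_wdev *wdev, char *out,
                             size_t out_size)
{
    if (wdev->ssid_len == 0 || wdev->ssid_len >= out_size)
        return 0;
    memcpy(out, wdev->ssid, wdev->ssid_len);
    out[wdev->ssid_len] = '\0';
    return 1;
}
```

In this model, authenticating with `cache_ssid == 0` leaves the report without an SSID, which corresponds to the "(null)" shown by networkctl, while `cache_ssid != 0` or going through `model_connect` makes the SSID appear.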
Comment 10 Marc Bornand 2023-02-10 20:42:36 UTC
Created attachment 303710 [details]
[PATCH] Set ssid when authenticating

This patch is based on 6.2-rc7.
I tested it with iwd, an Intel Corporation Wi-Fi 6 AX201 device and my phone as wpa3 ap.
Comment 11 Basti 2023-02-13 15:28:58 UTC
@Marc works fine for me also on 6.1.10 tested against AVM FritzBox WPA2/3 combi-mode
Comment 12 Marc Bornand 2023-02-13 22:09:34 UTC
Created attachment 303722 [details]
[PATCH v3] Set ssid when authenticating

Here is a new patch
that basically does the same but in a better place, more details on the mailing list.

Tests are welcome for this new patch.
If it helps someone, the changes are also available under https://gitlab.com/mBornand/linux.git on the new_ssid_fix branch.
Comment 13 Marc Bornand 2023-02-14 13:16:56 UTC
Created attachment 303724 [details]
[PATCH v4] Set ssid when authenticating

Just corrected some mistakes pointed out by Johannes
Comment 14 Marc Bornand 2023-02-15 08:52:27 UTC
Created attachment 303734 [details]
[PATCH wireless v5] wifi: cfg80211: Set SSID if it is not already set

Just style and form corrections
Comment 15 Basti 2023-03-10 11:30:03 UTC
I have been running v5 without problems since it was released in mid-February. How can we move forward?
Comment 16 Marc Bornand 2023-03-10 11:33:51 UTC
It's in 6.3-rc1, 6.2.3 and the latest 6.1
Comment 17 Basti 2023-03-12 15:05:13 UTC
Thanks for letting me know :) How can I keep track of patch-status normally? I'm really new to the kernel-project :v:
Comment 18 Yohan Prod'homme 2023-03-12 21:48:28 UTC
Closing this bug report since the fix was committed[1] and it's working :) 

I'm also interested if someone has an answer to Basti's question. As a reporter, I was in Cc to the patch submitted to the ML, but wasn't aware it got committed and then merged into Linus' repo until I got the notification[2] from Greg that the fix was being backported to 6.2/6.1.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c38c701851011c94ce3be1ccb3593678d2933fd8
[2] https://marc.info/?l=linux-stable-commits&m=167818332018362&w=2
Comment 19 Yohan Prod'homme 2023-03-25 17:46:44 UTC
I closed this bug after testing the fix on a WPA3 AP... But I just had the issue again with another WPA3 AP on 6.2.8 (which includes Marc's patch).

systemd-networkd returns "WiFi access point: (null) (70:fc:8f:4b:55:f2)".

To be complete, this AP is the same one that initially exhibited the issue and the one I used when I made the bisect that was posted in this bug report.

I'm open to test/check anything that can help to identify the cause. Note that I've access to this AP only every few weekends, so I may take some time to get back with results.
Comment 20 Marc Bornand 2023-03-31 11:26:54 UTC
On my side the patch works (just sending this comment with it).
The interesting part is that with this hardware I got the same bisect that Yohan got.

With some information I could try to search a little bit more, but maybe not run tests.

What software do you use? wpa_supplicant, iwd or something else?
Could you do some kernel debugging? The things I would look for are in the __cfg80211_connect_result function. The lines the patch adds contain a for_each_valid_link loop. Does it iterate at all on your setup? If yes, is there a link for which the memcpy in the loop gets executed, and if not, is ssid==NULL or ssid->datalen==0?

I don't know if this will help, but does it work on other Linux versions with the patch?
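
The questions above can be modelled as a small userspace sketch in C that counts what a few printk() calls in that loop would reveal. This is not the patch itself: `struct sketch_link`, `struct sketch_elem`, `struct sketch_trace` and `sketch_fill_ssid` are hypothetical names, and the upper bound check on the length is an assumption added for safety in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_SSID_LEN 32

/* Hypothetical stand-in for an SSID information element. */
struct sketch_elem {
    size_t datalen;
    const unsigned char *data;
};

/* Hypothetical per-link state: validity flag and optional SSID element. */
struct sketch_link {
    int valid;
    const struct sketch_elem *ssid; /* NULL if no SSID element was found */
};

/* Counters answering the debugging questions. */
struct sketch_trace {
    int links_visited; /* did the loop iterate at all? */
    int null_elems;    /* links where ssid == NULL */
    int empty_elems;   /* links where ssid->datalen == 0 */
    int copied;        /* 1 if the memcpy was executed */
};

/* Walk the valid links and copy the first usable SSID into dst. */
static struct sketch_trace sketch_fill_ssid(const struct sketch_link *links,
                                            size_t n_links,
                                            unsigned char *dst,
                                            size_t *dst_len)
{
    struct sketch_trace t = {0, 0, 0, 0};
    size_t i;

    assert(links && dst && dst_len);

    for (i = 0; i < n_links; i++) {
        const struct sketch_elem *ssid;

        if (!links[i].valid)
            continue; /* skipped, like an invalid link */
        t.links_visited++;

        ssid = links[i].ssid;
        if (!ssid) {
            t.null_elems++;
            continue;
        }
        if (!ssid->datalen) {
            t.empty_elems++;
            continue;
        }
        if (ssid->datalen > SKETCH_MAX_SSID_LEN)
            continue; /* defensive bound, an assumption of this sketch */

        memcpy(dst, ssid->data, ssid->datalen);
        *dst_len = ssid->datalen;
        t.copied = 1;
        break;
    }
    return t;
}
```

If `links_visited` stays 0, the loop never ran; if it is non-zero but `copied` is 0, the `null_elems` and `empty_elems` counters show which of the two conditions prevented the copy.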
Comment 21 Yohan Prod'homme 2023-03-31 22:17:37 UTC
I use iwd. I don't have any experience with debugging in the kernel tbh, but I can try to add a few printk() around, unless you know an easier method.

I think I will get back with some results during the next weekend (I'm not near any WPA3 AP I can connect to, currently).
Comment 22 Yohan Prod'homme 2023-06-11 09:30:12 UTC
I'm reclosing the bug because I've not been able to reproduce the issue again after my comment on March 25th, so I'm not sure it was a kernel issue this time.

The original issue is definitely fixed.