Bug 97181
Created attachment 174951 [details]
journalctl output when successfully connecting with -9 firmware
please try with this firmware: https://git.kernel.org/cgit/linux/kernel/git/iwlwifi/linux-firmware.git/plain/iwlwifi-7260-12.ucode Hi Emmanuel, Thanks for your reply. Before: iwlwifi 0000:02:00.0: loaded firmware version 25.15.12.0 op_mode iwlmvm Your link: iwlwifi 0000:02:00.0: loaded firmware version 25.17.12.0 op_mode iwlmvm No change in behaviour -- I cannot connect with either one. please provide the dmesg output without any debug flag. Please record tracing: sudo trace-cmd record -e iwlwifi -e iwlwifi_msg -e mac80211 -e cfg80211 I get this error: /sys/kernel/debug/tracing/events/cfg80211/filter /sys/kernel/debug/tracing/events/*/cfg80211/filter /sys/kernel/debug/tracing/events/mac80211/filter /sys/kernel/debug/tracing/events/*/mac80211/filter /sys/kernel/debug/tracing/events/iwlwifi_msg/filter /sys/kernel/debug/tracing/events/*/iwlwifi_msg/filter trace-cmd: No such file or directory No events enabled with iwlwifi_msg I guess the distro people who made my kernel didn't enable trace support m(? will see about compiling my own kernel in the next days... Created attachment 175141 [details]
backport based version
You can try the backport based driver I am attaching.
Created attachment 175211 [details]
trace file with newest driver/firmware
hi, thank you for the backport. I installed it and created the attached trace with trace-cmd record -e iwlwifi -e iwlwifi_msg -e mac80211 -e cfg80211
I hope it can help you to fix this annoying bug of mine.
Ok - I can't see why all this is happening. Please record a new trace and attach the dmesg output (without any debug flags) I'd also need the logs of the supplicant with -dd to make it verbose. Thanks. for now here is the dmesg log that matches the above trace: [21161.906569] iwlwifi 0000:02:00.0: L1 Enabled - LTR Enabled [21161.907175] iwlwifi 0000:02:00.0: L1 Enabled - LTR Enabled [21162.029875] iwlwifi 0000:02:00.0: L1 Enabled - LTR Enabled [21162.030479] iwlwifi 0000:02:00.0: L1 Enabled - LTR Enabled [21166.648611] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) [21166.753330] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 2/3) [21166.857427] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 3/3) [21166.961554] wlp2s0: association with 54:e6:fc:e5:53:5c timed out [21180.164180] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) [21180.268300] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 2/3) [21180.372348] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 3/3) [21180.476434] wlp2s0: association with 54:e6:fc:e5:53:5c timed out [21194.769966] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) [21194.875093] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 2/3) [21194.979193] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 3/3) [21195.081272] wlp2s0: association with 54:e6:fc:e5:53:5c timed out [21217.772526] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) [21217.879633] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 2/3) [21217.983725] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 3/3) [21218.088825] wlp2s0: association with 54:e6:fc:e5:53:5c timed out [21240.773098] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) [21240.876210] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 2/3) [21240.978324] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 3/3) [21241.082407] wlp2s0: association with 54:e6:fc:e5:53:5c timed out [21260.207799] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) [21260.311938] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 2/3) [21260.418022] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 3/3) [21260.522106] wlp2s0: association with 54:e6:fc:e5:53:5c timed out [21270.637224] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) [21270.741345] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 2/3) [21270.845485] wlp2s0: associate with 54:e6:fc:e5:53:5c (try 3/3) [21270.947542] wlp2s0: association with 54:e6:fc:e5:53:5c timed out Here is the matching output in wpa_supplicant.log wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: Trying to associate with 54:e6:fc:e5:53:5c (SSID='TP-LINK_E5535C' freq=2462 MHz) wlp2s0: Authentication with 54:e6:fc:e5:53:5c timed out. wlp2s0: CTRL-EVENT-DISCONNECTED bssid=54:e6:fc:e5:53:5c reason=3 locally_generated=1 wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: Trying to associate with 54:e6:fc:e5:53:5c (SSID='TP-LINK_E5535C' freq=2462 MHz) wlp2s0: Authentication with 54:e6:fc:e5:53:5c timed out. wlp2s0: CTRL-EVENT-DISCONNECTED bssid=54:e6:fc:e5:53:5c reason=3 locally_generated=1 wlp2s0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="TP-LINK_E5535C" auth_failures=1 duration=10 reason=CONN_FAILED wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: Trying to associate with 54:e6:fc:e5:53:5c (SSID='TP-LINK_E5535C' freq=2462 MHz) wlp2s0: Authentication with 54:e6:fc:e5:53:5c timed out. wlp2s0: CTRL-EVENT-DISCONNECTED bssid=54:e6:fc:e5:53:5c reason=3 locally_generated=1 wlp2s0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="TP-LINK_E5535C" auth_failures=1 duration=10 reason=CONN_FAILED wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: Trying to associate with 54:e6:fc:e5:53:5c (SSID='TP-LINK_E5535C' freq=2462 MHz) wlp2s0: Authentication with 54:e6:fc:e5:53:5c timed out. wlp2s0: CTRL-EVENT-DISCONNECTED bssid=54:e6:fc:e5:53:5c reason=3 locally_generated=1 wlp2s0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="TP-LINK_E5535C" auth_failures=1 duration=10 reason=CONN_FAILED wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: CTRL-EVENT-SSID-REENABLED id=0 ssid="TP-LINK_E5535C" wlp2s0: Trying to associate with 54:e6:fc:e5:53:5c (SSID='TP-LINK_E5535C' freq=2462 MHz) wlp2s0: CTRL-EVENT-DISCONNECTED bssid=54:e6:fc:e5:53:5c reason=3 locally_generated=1 wlp2s0: Trying to associate with 54:e6:fc:e5:53:5c (SSID='TP-LINK_E5535C' freq=2462 MHz) wlp2s0: Association request to the driver failed wlp2s0: Authentication with 54:e6:fc:e5:53:5c timed out. wlp2s0: CTRL-EVENT-DISCONNECTED bssid=54:e6:fc:e5:53:5c reason=3 locally_generated=1 wlp2s0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="TP-LINK_E5535C" auth_failures=1 duration=10 reason=CONN_FAILED wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: CTRL-EVENT-SSID-REENABLED id=0 ssid="TP-LINK_E5535C" wlp2s0: Trying to associate with 54:e6:fc:e5:53:5c (SSID='TP-LINK_E5535C' freq=2462 MHz) wlp2s0: CTRL-EVENT-DISCONNECTED bssid=54:e6:fc:e5:53:5c reason=3 locally_generated=1 wlp2s0: CTRL-EVENT-SCAN-STARTED wlp2s0: Trying to associate with 54:e6:fc:e5:53:5c (SSID='TP-LINK_E5535C' freq=2462 MHz) wlp2s0: CTRL-EVENT-DISCONNECTED bssid=54:e6:fc:e5:53:5c reason=3 locally_generated=1 I will try to make a new/more traces if needed, but can you give me a little hand here what and how I need to mess with supplicant to add this -dd flag? do I need to change the unit file and restart it? your supplicant is doing something weird I think. What is its version? can you try this: diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/iwlwifi/mvm/mac80211.c index dda887f..72e1b7a 100644 --- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c +++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c @@ -455,9 +455,11 @@ int iwl_mvm_mac_setup_register(struct iwl_mvm *mvm) hw->flags |= IEEE80211_HW_MFP_CAPABLE; hw->flags |= IEEE80211_SINGLE_HW_SCAN_ON_ALL_BANDS; +#if 0 hw->wiphy->features |= NL80211_FEATURE_SCHED_SCAN_RANDOM_MAC_ADDR | NL80211_FEATURE_SCAN_RANDOM_MAC_ADDR; +#endif hw->sta_data_size = sizeof(struct iwl_mvm_sta); hw->vif_data_size = sizeof(struct iwl_mvm_vif); @@ -544,12 +546,14 @@ int iwl_mvm_mac_setup_register(struct iwl_mvm *mvm) else hw->wiphy->flags &= ~WIPHY_FLAG_PS_ON_BY_DEFAULT; +#if 0 hw->wiphy->flags |= WIPHY_FLAG_SUPPORTS_SCHED_SCAN; hw->wiphy->max_sched_scan_ssids = PROBE_OPTION_MAX; hw->wiphy->max_match_sets = IWL_SCAN_MAX_PROFILES; /* we create the 802.11 header and zero length SSID IE. */ hw->wiphy->max_sched_scan_ie_len = SCAN_OFFLOAD_PROBE_REQ_SIZE - 24 - 2; +#endif hw->wiphy->features |= NL80211_FEATURE_P2P_GO_CTWIN | NL80211_FEATURE_LOW_PRIORITY_SCAN | your patch didn't change anything (still cannot connect) I attached a new trace (before your patch) and the wpa_supplicant log (it's wpa_supplicant 2.2) Created attachment 175661 [details]
another trace with wpa_supplicant -dd
Created attachment 175671 [details]
supplicant log with -dd
Created attachment 176001 [details] Core10 firmware with trigger on auth timeout please take this firmware and work with the backport driver I sent earlier. When you'll have an authentication timeout, a firmware dump will be created. Please collect the data by: cat /sys/devices/virtual/devcoredump/devcd1/data > iwl-1.dump echo 1 > /sys/devices/virtual/devcoredump/devcd1/data You will probably have 5 dumps until it will stop dumping the data. Please attach the dumps to the bug and I will transfer them to the firmware team. Please take the time to check our privacy note here: https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi#privacy_aspects thanks Hi Emmanuel, I attach the dump, there was only one, I fear there might be an issue. See the dmesg log, it says Sorry - debug buffer is only 4096K while you requested 65536K. Created attachment 176451 [details]
Firmware dump do debug this issue
Created attachment 176461 [details]
dmesg before saving the firmware dump
(In reply to ailin.nemui+kernel from comment #15) > Hi Emmanuel, > > I attach the dump, there was only one, I fear there might be an issue. See > the dmesg log, it says Sorry - debug buffer is only 4096K while you > requested 65536K. This is fine. But the fact that you have in error interrupt after the collection is strange. Thanks - I'll send the data to the firmware team. Created attachment 176731 [details]
Core10 firmware with trigger on auth timeout
Hi,
the firmware team asked if it is possible to reproduce and get a new dump with the firmware attached.
It will dump more debug info.
thanks you.
Created attachment 176771 [details]
new dump
find the new dump attached. weird to me that it is even smaller than before ;-)
thanks. Created attachment 176791 [details]
Core10 firmware with trigger on auth timeout
I am sorry, I made a stupid mistake.
I'll have to ask you to repro with the firmware attached.
Sorry about that.
Created attachment 176881 [details]
new dump
good luck
this one worked thanks. All this debug infra is still quite new and it has some glitches from time to time ;) I experience the same problem in kernel 3.19.0. Are you using Bluetooth? Can you try the backport based version attached to this bug? that worked! Thanks! I'm not using bluetooth, although I found that it was switched on. (In reply to mail@telenczuk.pl from comment #28) > that worked! Thanks! > Ok - this is really strange. We are investigating. I could reproduce but only if I am far away from the AP. Can you please try to go back to the previous driver (just delete the updates dir in /lib/modules/`uname -r`/) and let me know if you can reproduce the problem when you are really close to the AP? thanks. Created attachment 178431 [details]
iwlwifi with BT AG reverted
Ok - I think I found the problem.
Can you please try the backport tree attached?
You should be able to reproduce the problem on that tree.
Created attachment 178441 [details]
fix
Then, please apply the patch attached to the backport tree. And let me know what happens.
I hope that this will fix the problem.
Thank you.
Moving this bug to fixed. I will wait a few days to get confirmations. Is this supposed to fix my issue as well? Yes (In reply to ailin.nemui+kernel from comment #33) > Is this supposed to fix my issue as well? I just noticed that your AP is sending deauth with a bad sequence number. we should be able to cope with that even if your AP is really at fault here. I need to investigate this a little more. Ok we had 2 bugs here. @ailin.nemui+kernel@gmail.com can you try: https://p.sipsolutions.net/d21ea08039a75985.txt thanks. All the fixes have been merged to the master branch of: https://git.kernel.org/cgit/linux/kernel/git/iwlwifi/backport-iwlwifi.git/log/ please test this. So sad, it still doesn't work :-( I tried with the backport-iwlwifi from git together with the firmwares from git, but it just doesn't connect. I also turned off bluetooth but that didn't change anything. *Cries* Created attachment 178521 [details]
dmesg log from latest driver
It worked for me. one curious thing I noticed is that on the working driver, the dmesg says wlp2s0: authenticate with 54:e6:fc:e5:53:5c wlp2s0: send auth to 54:e6:fc:e5:53:5c (try 1/3) wlp2s0: authenticated wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) wlp2s0: RX AssocResp from 54:e6:fc:e5:53:5c (capab=0x431 status=0 aid=2) wlp2s0: associated i.e. it first says authenticate, whereas on the broken ones it never says authenticate, only wlp2s0: associate with 54:e6:fc:e5:53:5c (try 1/3) wlp2s0: deauthenticated from 54:e6:fc:e5:53:5c while associating (Reason: 9=STA_REQ_ASSOC_WITHOUT_AUTH) please record tracing with the latest backport tree. I also would like to see the supplicant output (with -dd). thanks. I looked again at your older supplicant logs and it looks strange. can you please run your supplicant with -T while you record kernel tracing? thanks. can you please make sure you have only one instance of the supplicant running? It seems from your logs from comment 13 that you had two supplicants running which is surely a problem. When you want to add debug level to the supplicant, you need to make sure that the daemon spawned by the OS isn't running. Note that the network manager will re-start the supplicant when you kill it. You'd have to stop the network manager as well. Created attachment 178561 [details]
disable vendor commands
Johannes found a bug in supplicant 2.2:
can you try the patch attached and apply it to the backprot tree to work around the bug in the supplicant?
Thanks.
For posterity, here's a description: we have a vendor command 11, and it's causing the following: * subcmd 11 interpreted as qca radar detect * get_capa tries to call it, but obviously it can't * get_capa returns an error, so wpa_s->drv_flags don't get set, however drv->capa.flags *is* set * wpa_s calls associate() to do a connect(), but driver_nl80211 knows SME is supported so thinks this is pure associcate() wpa_s's commit 5f9c92f8f7063b8e35118b6b4d39d10a46fea20f fixed the first step here Hi, on a whim, I updated wpa_supplicant to 2.4 prior to receiving all of your comments, and indeed this solves the issue! I can now connect again. Your explanation sounds highly likely as the underlying issue. However, now my dmesg log is getting spammed like in the attachment, does this tell you anything? Maybe it should be moved to another issue though. Created attachment 178571 [details]
dmesg log with lots of microcode errors
btw, the bug was only fixed in supplicant master branch, not released yet, so versions 2.2 through 2.4 are affected. Interesting, I don't think 2.4 fixes this bug? but perhaps it has some slightly different behaviour somewhere along the line so it doesn't affect it as much ... oh, weird. 2.4 makes it connect again. but maybe there is a more severe issue combined with 2.2 The Microcode errors are unrelated. You can downgrade to -13.ucode to check it is better. fyi it's exactly the same with -13.ucode Please open a new bug. thank you. FYI - the Microcode errors have been fixed in our latest driver. Please pull from backport. |
Created attachment 174941 [details] journalctl output when trying to connect with -12 firmware Since I installed the newest firmware I am unable to connect to the wireless network. For now I can only show you some output from journalctl, I will try to help if you tell me what to do but I'm more a regular user than kernel developer. Using my distribution provided kernel compiles (openSUSE 4.0.0-2.g53b125b-desktop) Going back to -9 makes the network connect again. I also attached that journalctl output in the noproblem log file.