Bug 78101
Summary: | iwlwifi AC 7260: No association and the time event is over - MWG100216251 | ||
---|---|---|---|
Product: | Drivers | Reporter: | Johannes Stezenbach (js) |
Component: | network-wireless | Assignee: | drivers_network-wireless (drivers_network-wireless) |
Status: | CLOSED WILL_NOT_FIX | ||
Severity: | normal | CC: | alessandro.zucca01, aroesler.privat, cachobot, h.judt, ilw, jackc, js, kernel, leho, maggu2810, marco.caminati, patrakov, stuart.stent |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.15 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
lspci -vvv
trace-cmd record -e iwlwifi -e cfg80211 -e mac80211 (unpatched kernel 3.15, running wpa_supplicant manually)
beacons captured on client
print info
dmesg
dmesg after AP reboot
bad beacon
good beacon
FW that doesn't drop the beacon |
I need some data on the AP. What AP do you have? More importantly, what is its beacon interval? A trace-cmd output will let us know the answer to the second question:
trace-cmd record -e iwlwifi -e cfg80211 -e mac80211

The beacon interval on the office AP (TP-Link TL-WR1043ND v1.8 with vendor firmware, Atheros AR9132) is 100, as reported after connecting with the hack patch from the bug description:
$ iw dev wlp4s0 link
Connected to f8:d1:11:39:1a:8c (on wlp4s0)
    SSID: foo
    freq: 2462
    RX: 13135 bytes (103 packets)
    TX: 2033 bytes (19 packets)
    signal: -43 dBm
    tx bitrate: 1.0 MBit/s
    bss flags: CTS-protection short-preamble short-slot-time
    dtim period: 0
    beacon int: 100
Will try to get the trace-cmd output asap (with unpatched kernel).

dtim period 0?? hmm... sounds like a bug in iw... - or our driver...

Created attachment 139951 [details]
trace-cmd record -e iwlwifi -e cfg80211 -e mac80211
unpatched kernel 3.15, running wpa_supplicant manually
Created attachment 139961 [details]
beacons captured on client
iw phy phy0 interface add mon0 type monitor
ip link set up dev mon0
iw dev mon0 set freq 2462
tcpdump -i mon0 -s10000 -w cap.pcap
then used wireshark to extract the beacons only,
hope it is useful
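The "extract the beacons only" step above was done by hand in wireshark. As a rough illustration of what that filter amounts to, here is a minimal Python sketch. It assumes each captured packet starts with a radiotap header (usual for Linux monitor interfaces, but not confirmed for this capture), and `is_beacon` is a hypothetical helper name, not part of any tool used in this bug.

```python
import struct


def is_beacon(packet: bytes) -> bool:
    """Return True if a radiotap-prefixed 802.11 frame is a beacon.

    Assumption: the packet starts with a radiotap header, whose first
    four bytes are version (u8), pad (u8) and total header length (le16).
    """
    if len(packet) < 4:
        return False
    version, _pad, rt_len = struct.unpack_from("<BBH", packet, 0)
    if version != 0 or len(packet) < rt_len + 2:
        return False
    fc0 = packet[rt_len]            # first byte of the 802.11 frame control field
    frame_type = (fc0 >> 2) & 0x3   # 0 = management
    subtype = (fc0 >> 4) & 0xF      # 8 = beacon, 5 = probe response
    return frame_type == 0 and subtype == 8


# Synthetic frames for illustration only (not taken from the attached capture).
radiotap = struct.pack("<BBHI", 0, 0, 8, 0)   # minimal 8-byte radiotap header
beacon = radiotap + bytes([0x80, 0x00]) + bytes(22)
probe_resp = radiotap + bytes([0x50, 0x00]) + bytes(22)

beacon_flag = is_beacon(beacon)
probe_flag = is_beacon(probe_resp)
```

The same test corresponds to the wireshark display filter on management subtype 8; doing it in a script is only convenient when many captures have to be compared.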
Did you have some time to look at the trace + beacons? It seems the dtim period 0 is not from the beacon (it has dtim per 1), but it is in the trace:
drv_bss_info_changed: phy0 vif:wlp4s0(2) assoc:1 aid:9 cts:1 shortpre:1 shortslot:1 dtimper:0 bcnint:100 assoc_cap:0x431 basic_rates:0xf enable_beacon:0 ht_operation_mode:0
Maybe you need some additional trace to find out where the dtimper:0 comes from? (Even if the AP were buggy and would send DTIM period 0, there is a check in ieee80211_set_associated() that should catch it.)

no - no time. I am travelling and very busy right now. This is why I asked you to open a bug. So that I can have a real tracking and not just a mail in my inbox.

Hi, I haven't looked at the traces yet - but I doubt they'll help me now that I looked a bit more at the code. Can you try this:
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index e37b97d..661dc8b 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -4438,6 +4438,8 @@ int ieee80211_mgd_assoc(struct ieee80211_sub_if_data *sdata,
 			sdata->vif.bss_conf.sync_device_ts =
 				bss->device_ts_beacon;
 			sdata->vif.bss_conf.sync_dtim_count = dtim_count;
+			sdata->vif.bss_conf.dtim_period =
+				ifmgd->dtim_period ? : 1;
 		}
 	} else {
 		assoc_data->timeout = jiffies;

Hi, I removed my workaround hack and added your change (linux-3.15), and also added a printk for ifmgd->dtim_period. It works, I can connect, and the printk shows ifmgd->dtim_period is 1, the value expected from the beacon. (i.e. the "? : 1" is not needed in my case) :)

Ok - thanks for the testing. patch published

Created attachment 140971 [details]
print info
Hi again,
so I sent my patch and the maintainer asked a few questions that I couldn't answer because I don't fully understand how my patch solves your problem.
I attached a patch - please apply it and reproduce the issue without the fix.
This will shed more light on what is going on.
My patch is very likely to be incomplete.
Thank you for your help.
Created attachment 141021 [details]
dmesg
Today I found your previous patch didn't fix the issue. Maybe on the
day I tested the environmental conditions changed, or I made a mistake.
(However, I used the machine with the patch applied so I'm sure
wifi worked, and I triple checked I backed out my hack patch in
iwl_mvm_te_handle_notif().) Sigh...
I'm attaching the requested dmesg, but I guess it is not so interesting now.
(just ran wpa_supplicant manually for short time).
It looks like no beacon was received during the connection
attempt, but I captured packets on a second machine, and it shows
the beacons (although some are missing).
what you sent *is* useful - I just want to understand what you had. You had my patch from comment 12 obviously, but had you the patch from comment 8? I'll look at the logs later tonight (hopefully).

from what I see here - you really seem not to hear any beacon from your AP... which is really weird...

The log is without the patch from comment 8, I thought "reproduce the issue without the fix" means to back it out. I also tested with the fix, it didn't make a difference today. Didn't keep the log, though, but if I've seen it correctly all the debug prints only printed zeros. Thanks,
PS: I had also run tcpdump on a monitor interface (like in comment 5) during the connection attempt, no beacons were captured. Since the packet capture taken on the other machine also had missing beacons (roughly 50% missing), maybe some neighbouring device is interfering. (the distance to the AP is ~4 meters).

So the root cause could be related to reception quality? However, other management frames get through so I'm not sure how this is possible except if the firmware is acting funny.

well - some APs are sometimes problematic. We've seen APs stop sending beacons for a few seconds... But in your case, it seems really severe... A bit too severe to be possible. Have you tried to reboot the AP?

Created attachment 141091 [details]
dmesg after AP reboot
After AP reboot, it can connect (without any fix patch).
Attached is the dmesg with the debug patch applied.
So it seems the AP is part of the problem, however all other
devices can connect. Maybe the issue is simply the timeout
for the "No association and the time event is over already..."
check is too short, the driver should wait longer for a beacon?
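To put rough numbers on the "timeout is too short" idea: the window involved is the session protection time event, which the driver diff quoted further down this thread computes as the minimum of a cap and a constant plus the beacon interval. Below is a minimal Python sketch of that arithmetic. The cap values 500 and 400 are assumptions (they are not stated anywhere in this report), `session_protection_ms` is a hypothetical helper name, and the beacon interval is added as-is, the way the quoted C code does.

```python
def session_protection_ms(beacon_int: int, base: int, cap: int) -> int:
    """Mirror of min(CAP, base + beacon_int) from the quoted driver diff."""
    return min(cap, base + beacon_int)


# Assumed caps, standing in for IWL_MVM_TE_SESSION_PROTECTION_{MAX,MIN}_TIME_MS.
MAX_TIME_MS = 500
MIN_TIME_MS = 400

beacon_int = 100  # bcnint:100 as seen in the trace for this AP

# (duration, min_duration) with the constants before and after the test patch.
before = (session_protection_ms(beacon_int, 200, MAX_TIME_MS),
          session_protection_ms(beacon_int, 100, MIN_TIME_MS))
after = (session_protection_ms(beacon_int, 300, MAX_TIME_MS),
         session_protection_ms(beacon_int, 250, MIN_TIME_MS))

print("before:", before, "after:", after)
```

Under these assumptions the window grows by about 100 ms, which only helps if a valid beacon arrives late; it cannot help if the beacon is never delivered to the driver at all.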
I noticed the AP was set to automatic channel selection and selects a different channel on each boot, so I set it back to channel 11 and also tried a few other channels: It could still connect, so the issue seems not related to channel setting. The download speed (from local http server) varies wildly between 5MB/s and 100K/s, while a tiny rt2800usb dongle seems to yield more consistent and on average faster download speed. However, my Thinkpad X230 (Ultimate-N 6300 AGN) is consistently slow with this AP (~100K/s) and sometimes even stalls completely for a few seconds. Maybe of interest, "iw phy" reports "Available Antennas: TX 0 RX 0" in both cases, while the Yoga has two antennas (2x2) and the X230 has three (3x3). Any idea about it? BTW, the AP is a TP-Link TL-WR1043ND v1.8 running vendor firmware. (Atheros AR9132)

please don't mix issues in the same bug. From my point of view - it seems that this issue was an issue with your AP.

Sorry, I was just dumping some information gathered during testing before it gets lost. Agreed, the AP seems to be at least part of the problem, however other devices can connect without problem. I hope it is of interest to you to find out how you could improve the driver to work better with shitty APs. I can only offer to do testing and tracing. Anyway, since I have a workaround (see bottom of issue description), you can close the bug if no one else sees the same issue.

I have the same problem as Johannes: I never manage to associate to my AP longer than one second or so.
dmesg says:
cfg80211: Calling CRDA to update world regulatory domain
wlan0: authenticate with xxx
wlan0: send auth to xxx (try 1/3)
wlan0: authenticated
iwlwifi 0000:01:00.0 wlan0: disabling HT as WMM/QoS is not supported by the AP
iwlwifi 0000:01:00.0 wlan0: disabling VHT as WMM/QoS is not supported by the AP
wlan0: associate with xxx (try 1/3)
wlan0: RX AssocResp from xxx (capab=0x1 status=0 aid=4)
wlan0: associated
iwlwifi 0000:01:00.0: No association and the time event is over already...
wlan0: Connection to AP xxx lost
cfg80211: Calling CRDA to update world regulatory domain
Note that:
1) I rebooted my AP, didn't fix.
2) All other devices I tried in the many years I've been using this AP worked.
3) It is an old AP, you can find details attached.
Given that, I ask that this bug is reopened.
Model: HomePortal 1000SW
Serial Number: 442111004305
Hardware Version: 2700-000303-004
Software Version: 3.5.15

What kernel version do you have?

(In reply to Emmanuel Grumbach from comment #24)
> What kernel version do you have?
I built backports-3.16-rc1-1 on a 3.8.13 kernel version.

@caminati Can you please record tracing?
sudo trace-cmd record -e iwlwifi -e mac80211 -e iwlwifi_msg
You can send the data privately to me if you prefer.

After being away for some time the issue reappeared when I came back; currently the AP uptime is 23 days. I know you think it is an AP issue and I could just reboot it again, but in case you want me to test/debug why the beacons are not received, let me know. As I mentioned before the AP works for anyone else, office mates using a large zoo of Android devices for development.

(In reply to Emmanuel Grumbach from comment #26)
> @caminati
>
> Can you please record tracing?
>
> sudo trace-cmd record -e iwlwifi -e mac80211 -e iwlwifi_msg
>
> You can send the data privately to me if you prefer.
Thanks for your attention. Sadly, I get "debugfs is not configured on this kernel". At the moment, I have neither time nor gear to rebuild the kernel.
I hope I will in the near future, in which case I will keep this bug thread updated.

I am trying to find a way to be more robust to this case. Patch will follow...

Can you try this?
diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
index c49b08c..0d17b44 100644
--- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
@@ -2229,9 +2229,9 @@ static void iwl_mvm_mac_mgd_prepare_tx(struct ieee80211_hw *hw,
 {
 	struct iwl_mvm *mvm = IWL_MAC80211_GET_MVM(hw);
 	u32 duration = min(IWL_MVM_TE_SESSION_PROTECTION_MAX_TIME_MS,
-			   200 + vif->bss_conf.beacon_int);
+			   300 + vif->bss_conf.beacon_int);
 	u32 min_duration = min(IWL_MVM_TE_SESSION_PROTECTION_MIN_TIME_MS,
-			       100 + vif->bss_conf.beacon_int);
+			       250 + vif->bss_conf.beacon_int);
 
 	if (WARN_ON_ONCE(vif->bss_conf.assoc))
 		return;
This is just a quick attempt to make it more robust. I am working on a more generic way. But I'd like to know if this helps, to know if we are going in the right direction. Thanks.

It seems I have this problem too (linux-3.15.2 and older). Sometimes my wireless is lucky and some connection works (there are several APs with the same ESSID in range), but most times I find in dmesg:
iwlwifi 0000:03:00.0: No association and the time event is over already...
wlan0: Connection to AP xx:xx:xx:xx:xx:xx lost
wlan0: direct probe to xx:xx:xx:xx:xx:xx (try 2/3)
wlan0: direct probe to xx:xx:xx:xx:xx:xx (try 3/3)
wlan0: authentication with xx:xx:xx:xx:xx:xx timed out
wlan0: authenticate with yy:yy:yy:yy:yy:yy
wlan0: direct probe to yy:yy:yy:yy:yy:yy (try 1/3)
wlan0: direct probe to yy:yy:yy:yy:yy:yy (try 2/3)
wlan0: direct probe to yy:yy:yy:yy:yy:yy (try 3/3)
I'm going to try your patches and hope they help me too.

The fw session protection change in iwl_mvm_mac_mgd_prepare_tx() made no difference for me. FWIW, I wanted to check fw_rx_stats in iwlmvm debugfs directory, but all values are zero.
Comment in fw-api.h for struct iwl_notif_statistics indicates the stats might only be sent while associated, so I temporarily added the workaround described at the end of this bug's description. But the values are still zero. I notice the REPLY_STATISTICS_CMD 0x9c mentioned in the comment is removed from the command enum. Something wrong here?

After some experimenting, I found the device stops receiving beacons when it tries to connect, but receives beacons on monitor interface when the managed interface is down. I.e.:
iw phy phy0 interface add mon0 type monitor
ip link set up dev mon0
iw dev mon0 set channel 7
tcpdump -i mon0
-> tcpdump receives beacons (actually using wireshark)
iw dev wlan0 set channel 7
ip link set up dev wlan0
(also tried: iw dev wlan0 connect foo 2442)
-> tcpdump stops receiving beacons
iw dev wlan0 scan
-> tcpdump receives some beacons from other AP, but not from our AP; it also receives probe responses, also from our AP
ip link set down dev wlan0
-> tcpdump resumes receiving beacons
Any idea about it? And why would this behaviour change when I reboot the AP??? (I should reboot the AP to confirm, but since it might take days or weeks until the issue re-appears I'm not doing it yet.) (Doing the same experiment using Ralink usb dongle works as expected, receiving beacons all the time.)

We had a problem with beacon filtering - but that is disabled in 3.15: http://lxr.free-electrons.com/source/drivers/net/wireless/iwlwifi/mvm/mac80211.c#L829
When you have a STATION vif, we don't let anything through unless you scan / or try to associate. Try to scan on wlan0 and you'll see packets on mon0. After all, you don't want to get interrupts for any packet in the air when you are not scanning / associating. Note that the monitor interface is a real promiscuous mode when it is the only interface active. If you have a STATION interface, it'll just copy the packets coming from the STATION. I am not sure about the last sentence, but this is what I remember.
I have no reason to think it'll help, but can you please try this firmware? https://git.kernel.org/cgit/linux/kernel/git/egrumbach/linux-firmware.git/plain/iwlwifi-7260-9.ucode?h=Core6&id=59a2c0aa8b9e26533d01d153a9be2c5f61cc0d62 Thanks.

The experimental firmware v25.223.9.0 does not change the behaviour. FWIW I'm currently using kernel 3.16.0-rc6. At this point I think the best way forward would be to reboot the AP to see if the previous result can be reproduced and the connection can be established. And capture beacons and probe responses before and after AP reboot for comparison. Any other idea what to test? (I think monitor mode shows that beacons can be received so there is no issue with signal quality or antenna setup. More likely it is an issue with beacon filtering? I added the same && false as in 3.15 to 3.16.0-rc6 but without effect.)

Thanks for testing. Let me know what happens after you reboot the AP. I have a new debug mechanism in 3.17 that can be very useful, but for that I'd need the FW team and they are typically very busy (not that I am not).

Hi, you will not be surprised: I'm here as I have the same issue. The kernel I'm using is 3.13.0-32-generic from Ubuntu 14.04 x86_64. If you ask me to debug stuff please give me a very easy howto, as I'm not as deep into it ;-). What I want to add is an observation I made: I'm working at 3 locations with 3 access points. All 3 of them have the issue, but:
* 2 Fritz!Boxes (7390 and 7490) are more stable than a Sophos AP30.
* When I'm sitting directly next to one of my Fritz!Boxes I can work much longer (several hours) without connection losses and reconnection mostly works. At a greater distance I ran into the issue quite fast (within a few minutes) and got the connection back only by unloading the modules and reloading them.
Maybe this helps. If not: Just ignore it ;-). Andreas

After AP reboot the connection works.
I've taken network captures with wireshark before and after reboot, both using mon0 on the iwlwifi and on a Ralink USB dongle (just in case). I also captured the WPA2 connection, it shows one beacon is received right after the key exchange, then the next beacon ~5 seconds later (apparently beacon filter at work).

After inspecting the captures, I found the beacon sent by the AP appears to be corrupt. I guess the firmware drops it, but the corrupt beacon still has enough valid information to make the connection work with other devices.

Created attachment 144411 [details]
bad beacon
corrupted beacon before AP reboot, one 00 byte is missing
at the end of the WMM/WME element, causing the following
elements (including HT capabilities) to be ignored
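To see why one missing byte hides everything that follows, it helps to recall that the tagged part of a beacon is a plain sequence of (ID, length, payload) elements with no resynchronisation marker. The Python sketch below is an illustration only: the bytes are synthetic, not taken from the attached captures, `parse_ies` is a hypothetical helper, and it covers only the element list (the fixed timestamp, beacon interval and capability fields that precede it are left out).

```python
def parse_ies(buf: bytes):
    """Walk 802.11 information elements: 1-byte ID, 1-byte length, payload.

    Returns (elements, ok); ok is False if a declared length overruns the
    buffer or trailing bytes are left over.
    """
    elements, pos = [], 0
    while pos + 2 <= len(buf):
        eid, length = buf[pos], buf[pos + 1]
        payload = buf[pos + 2 : pos + 2 + length]
        if len(payload) < length:
            return elements, False   # element claims more bytes than remain
        elements.append((eid, payload))
        pos += 2 + length
    return elements, pos == len(buf)


# Synthetic elements (illustrative values, not the AP's real beacon).
ssid = bytes([0x00, 0x03]) + b"foo"
wmm_payload = (bytes.fromhex("0050f2020101")        # OUI, type, subtype, version
               + bytes([0x80, 0x00])                # QoS info, reserved
               + bytes.fromhex("03a40000 27a40000 42435e00 62322f00"))  # 4 AC records
ht_payload = bytes([0x6e, 0x11, 0x03, 0xff, 0xff]) + bytes(21)
ht_caps = bytes([0x2d, len(ht_payload)]) + ht_payload

good = ssid + bytes([0xdd, len(wmm_payload)]) + wmm_payload + ht_caps
# Same declared length (24), but the final 00 byte of the WMM element is missing.
bad = ssid + bytes([0xdd, len(wmm_payload)]) + wmm_payload[:-1] + ht_caps

good_elems, good_ok = parse_ies(good)
bad_elems, bad_ok = parse_ies(bad)
good_ids = [eid for eid, _ in good_elems]
bad_ids = [eid for eid, _ in bad_elems]
```

In the damaged buffer the WMM element swallows the HT capabilities ID byte (0x2d) as its last payload byte, so the parser then reads the HT length byte as an element ID and never finds element 45. A strict parser rejects the frame at that point; a lenient one keeps the elements before the damage, which matches the observation that other devices still connect.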
Created attachment 144421 [details]
good beacon
good beacon after AP reboot
Ok - that was fruitful. The bad beacon is really messed up. You should check if you can upgrade the AP's firmware. Since the beacon is broken, the Intel firmware will throw it away (unless we are in sniffer mode). This answers all the questions :) Changing our firmware to let the beacons through is not trivial (just asked the FW team). I am not saying we won't make that change, I am just saying it will be a long shot.

FWIW, I found there was indeed a firmware update for the AP available. I installed it and the issue reappeared after just two days of uptime :-( It means if you have a 7260 AC firmware update I could still test it. (The particular hardware revision of the AP has issues with OpenWRT, otherwise I would install it right away.)

can you please sniff for the beacon again - to make sure that we are having the same issue?

I did it already, the error is exactly the same: The last (zero) byte of the WMM/WME IE is missing, i.e. the next IEs start one byte too early and thus can't be decoded.

FWIW: thanks to this input - I talked to people here and we will try to see what we can do in these cases. It will take time though...

I will close this bug as 3rd party. After all, it has been clearly proven that the bug is on the AP side. FW team has started to look for a way not to drop the beacon - not sure it will find one though...

Thank you! One idea: is the dropped beacon related to the beacon filter, i.e. could the issue be fixed by enabling the beacon filter only after the first beacon has been received? I'm not sure how the firmware works, is there a possibility to disable the beacon filter completely for testing? (I'm assuming this could be tested on driver level without firmware change.)

the problem is not the power save feature called beacon filtering. The problem is that the beacon is so broken that the firmware drops the packet because it can't find the IEs in the right place.
So basically, we need to change the firmware to be more permissive, but this is very risky, because then there is code that will run on a broken beacon. Assumptions that were made no longer hold. This is why it is not simple at all.

Created attachment 149071 [details]
FW that doesn't drop the beacon
Please try this firmware.
I can't promise that the "fix" will be delivered to the code base and that we will be able to formally release this "fix".
The firmware can connect, but the connection speed is very slow (like several seconds just for a DNS lookup) and the connection is unstable.
[ 1.319821] iwlwifi 0000:04:00.0: loaded firmware version 23.214.9.0 op_mode iwlmvm
...
[ 9.811460] wlan1: authenticate with f8:d1:11:39:1a:8c
[ 9.814692] wlan1: send auth to f8:d1:11:39:1a:8c (try 1/3)
[ 9.826796] wlan1: authenticated
[ 9.826913] wlan1: associating with AP with corrupt beacon
[ 9.829009] wlan1: associate with f8:d1:11:39:1a:8c (try 1/3)
[ 9.835252] wlan1: RX AssocResp from f8:d1:11:39:1a:8c (capab=0x31 status=0 aid=1)
[ 9.836292] wlan1: associated
BTW, I found the "corrupt beacon" message is from mac80211/mlme.c, it sets the IEEE80211_BSS_CORRUPT_BEACON flag. Seems this problem is not so uncommon...

try to disable power save:
sudo iw wlan1 set power_save off

That works much better, download speed fluctuates between 100KB/s and 1MB/s. Usable. Thanks!

yeah... but ... This is not something that we can live with... This AP is just ... bad. And we have no way to detect how bad it is and disable power save when we face such a bad AP... Even if the FW team integrates the change they made for you (and this is far from being obvious), we still have a big challenge with your AP... Note that other vendors seem not to implement power save (not that I have any real data - but how could they get 2.4Mps?).

Some data points I remember from past tests:
- when I rebooted the AP, the driver worked with acceptable speed even without power save disabled
- when I applied the workaround given at the end of the issue description, speed was also acceptable (but I wonder how power save could work without beacon reception, maybe it was disabled implicitly?)
- Android devices seem to have no issues with the AP, I think (hope) these generally have power save enabled
Thus I think the AP is buggy but not completely broken.
Maybe the experimental firmware forwards the broken beacon but disables the internal processing related to power save, e.g. never sends queued frames? IIRC the rt2800usb driver does not support power save (or rather the USB hardware interface doesn't, I think the rt2800pci devices support power save).

Here's another one struggling with the AC7260 on kernel 3.16.1. At a relatively large conference (200 people in the room), the adapter initially connected, but the connection was very intermittent. After I did rmmod iwlmvm and re-modprobe'd, the adapter wouldn't connect at all. It's not possible to reboot APs at conferences etc. The adapter has been working fine at home and office and mostly everywhere else though, I'd say 95+% uptime. Kinda sucks to run into the issue unexpectedly like this though. Currently I'm at a nearby shop buying a couple of backup USB wifi adapters. They have rt2800usb and rtl8192uc-based stuff for sale here so I'll be able to test the same environment with those and can report back.

@Leho - if you don't see the exact same print (No association and the time event is over) please don't report in this bug. Find another one, or open a new one.

Emmanuel, it is the exact same thing. The very same message "No association and the time event is over" is there, association fails after WPA password entry etc. Apologies for not being more detailed about it, it's difficult to provide logs in the middle of the day right now. I do have the two additional USB adapters available now. This provided an additional discovery immediately. When this non-connection state happened with iwlwifi, I suspended the machine, went to the shop, started testing the RT5370 and RTL8192UC based USB adapters. Set up a Galaxy S5 based hotspot and NONE of the 3 wifi adapters were able to finish the connection, getting stuck in exactly the same place (well, based on what shows in dmesg). This seems to indicate that the whole wifi stack gets into a confused state of some sort?
Wired network connection worked fine in the shop. Cold rebooted and wifi connectivity was restored. After coming back from a reboot, all adapters connected to the S5 hotspot without issues. Suspended machine, walked back to conference. Now I'm sitting at the conference, trying out the rt2800usb based "148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter". It connected to the AP but traffic throughput was very intermittent, just like with iwlwifi. I have just cold rebooted with /etc/modprobe.d/wifi conf including "blacklist iwlwifi iwlmvm". Connection has been now running with great speed on top of rt2800usb without any apparent issues.

First thing I'd try is to disable power save. I think that quite a few vendors (can't really say that too loud :)) don't implement power save. And this can avoid lots of issues with broken APs... So please do just like in Comment#53

Aha. I missed that it was a post-connection iw command. But this is a bit confusing overall, because modinfo iwlwifi says:
...
vermagic: 3.16.1 SMP preempt mod_unload
...
parm: power_save:enable WiFi power management (default: disable) (bool)
...
And hence I have been living with the assumption that power save has always been disabled by default and therefore any "disable power_save" advice doesn't apply. What's the truth here?

yeah iwlwifi also has a powersave parameter - this is an old legacy :) By default iwlmvm will have powersave enabled, iwldvm will not (well not aggressive powersave). I know, confusing... :)

Can someone update on this bug with the latest firmware we released: https://git.kernel.org/cgit/linux/kernel/git/egrumbach/linux-firmware.git/plain/iwlwifi-7260-9.ucode?id=1f9f9df353b11c9ea0130dfe68053aaaee376df3
I don't think that it should be fixed. But OTOH, it is worth checking. Note that since this bug is caused by an AP bug - it is very low priority.

The new firmware can't connect (while the one from comment 51 can).
[ 483.981326] iwlwifi 0000:04:00.0: loaded firmware version 25.228.9.0 op_mode iwlmvm
[ 495.217051] wlan1: authenticate with f8:d1:11:39:1a:8c
[ 495.220118] wlan1: send auth to f8:d1:11:39:1a:8c (try 1/3)
[ 495.221872] wlan1: authenticated
[ 495.222514] wlan1: associate with f8:d1:11:39:1a:8c (try 1/3)
[ 495.226764] wlan1: RX AssocResp from f8:d1:11:39:1a:8c (capab=0x31 status=0 aid=2)
[ 495.229703] wlan1: associated
[ 495.527188] iwlwifi 0000:04:00.0: No association and the time event is over already...
[ 495.527225] wlan1: Connection to AP f8:d1:11:39:1a:8c lost
# sha1sum /lib/firmware/iwlwifi-7260-9.ucode
98fb865e5f0c7b2bf52dc5a1ee77a0752eea75ad /lib/firmware/iwlwifi-7260-9.ucode

after a long discussion with the firmware team, they can't integrate the change that they made into the code base for the moment. Closing the bug as Will not Fix.

I have a Sony Xperia L with the latest stock firmware, and the only way to connect to the wifi hotspot on this device from my Latitude E7440 with 7260 is to use the firmware in comment #51. Other devices (for example another notebook with another wifi card) connect to the Xperia hotspot w/o any problem. Is it possible to implement it as an option of the driver? I know changes in firmware are still necessary.

please try the latest firmware from: https://git.kernel.org/cgit/linux/kernel/git/iwlwifi/linux-firmware.git/
You'll need -10 or -12. We added a few relaxations (for problems seen with AirPort Extreme). But it still won't help for the broken beacon attached in this bug. Please give it a try.

Debian unstable, linux-image-3.19.0-trunk-686-pae, latest -12 firmware: CAN'T CONNECT

Sorry, nothing can be done from the driver.

And from firmware?

I am afraid that won't happen...

It's sad. IMHO it's against the "be conservative in what you send and liberal in what you receive" rule. OK, I can throw my poor Sony phone in the trash and get a newer one (again) or do the same with the Intel wifi card. But I would prefer to make the software compatible.
I understand fixing the real cause would be better but it's not possible now. Sony will never release a fix, and unofficial ROMs with newer Android have other bugs and are still in beta stage.

Guys, I have noticed that if I get into the "No association and the time event is over already..." cycle, restarting wpa_supplicant (2.3, but probably earlier too) helps. This is what I do:
modprobe -rv iwlmvm
# keeps NetworkManager from auto-relaunching
systemctl stop NetworkManager
pkill -e wpa
modprobe -v iwlwifi
systemctl start NetworkManager
Just upgraded to wpa_supplicant-2.4, we'll see how this does.

@Leho - you are restarting the whole driver. And if that helps, it means that you are not suffering from the same bug as Johannes.

Yes, in the example I'm restarting everything. NetworkManager didn't auto-restart before <1.0.0, but it seems to now. Either way, previous isolation attempts have proven that on this system, it has been wpa_supplicant alone responsible for... something. Would be interesting to hear from others, not sure where else if not on this bug though.

Emmanuel, is there any chance that you have a newer version of the firmware with the patch from Comment#51? Currently using Linux 4.1 and it won't accept anything under ucode-10. Thank you.

No.

For some (but not all) cases of this AP brokenness, this helps:
modprobe iwlwifi 11n_disable=1
(tested in Domicilio Lorenzo hotel in Davao City, PH - one of their access points reliably triggers the issue)

I had some TP-Link APs with OpenWRT. Unstable with all firmware versions, module parameters (11n disable for example) etc. After months I changed the AP to a Mikrotik router (Atheros AR9300). Unstable again. Then I changed "disconnect timeout" to 15s on the Mikrotik and voila, it works w/o probs even with default iwlwifi settings. I think the Intel card/firmware/module is buggy or too strict for some APs. Barely usable in some configurations. |
Created attachment 139941 [details]
lspci -vvv

A new Thinkpad Yoga 20CD00AMGE with Intel(R) Dual Band Wireless AC 7260 fails to connect to the AP in the office, while it works on my home AP. WPA2 is used in both cases. I tried with kernel 3.14.7 and 3.15. There is a high density of APs and clients and lots of traffic in the office environment. The failure is caused by:
[ 32.200142] iwlwifi 0000:04:00.0: No association and the time event is over already...

Some more detail from dmesg:
[ 1.161462] iwlwifi 0000:04:00.0: irq 61 for MSI/MSI-X
[ 1.164277] iwlwifi 0000:04:00.0: loaded firmware version 22.24.8.0 op_mode iwlmvm
[ 1.180348] iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0x144
[ 1.180651] iwlwifi 0000:04:00.0: L1 Enabled; Disabling L0S
[ 1.180885] iwlwifi 0000:04:00.0: L1 Enabled; Disabling L0S

lspci -vn:
04:00.0 0280: 8086:08b2 (rev 83)
Subsystem: 8086:4270
Full lspci -vvv output attached.

wpa_supplicant retries in an endless loop:
[ 28.181020] cfg80211: Calling CRDA to update world regulatory domain
[ 31.890168] wlp4s0: authenticate with f8:d1:11:39:1a:8c
[ 31.892972] wlp4s0: send auth to f8:d1:11:39:1a:8c (try 1/3)
[ 31.903458] wlp4s0: authenticated
[ 31.927239] wlp4s0: associate with f8:d1:11:39:1a:8c (try 1/3)
[ 31.963374] wlp4s0: RX AssocResp from f8:d1:11:39:1a:8c (capab=0x431 status=0 aid=5)
[ 31.968857] wlp4s0: associated
[ 32.200142] iwlwifi 0000:04:00.0: No association and the time event is over already...
[ 32.200154] wlp4s0: Connection to AP f8:d1:11:39:1a:8c lost
[ 32.270241] cfg80211: Calling CRDA to update world regulatory domain
(repeats)

For testing I used a bare Arch Linux installation and ran wpa_supplicant manually:
echo 'network={ ssid="foo" psk="bar" }' >w
wpa_supplicant -Dnl80211 -iwlp4s0 -cw -d
...
wlp4s0: State: ASSOCIATED -> 4WAY_HANDSHAKE
...
wlp4s0: WPA: Key negotiation completed with f8:d1:11:39:1a:8c [PTK=CCMP GTK=CCMP]
...
wlp4s0: CTRL-EVENT-CONNECTED - Connection to f8:d1:11:39:1a:8c completed [id=0 id_str=]
...
nl80211: Drv Event 20 (NL80211_CMD_DEL_STATION) received for wlp4s0
nl80211: Delete station f8:d1:11:39:1a:8c
nl80211: Drv Event 39 (NL80211_CMD_DEAUTHENTICATE) received for wlp4s0
nl80211: Deauthenticate event
wlp4s0: Event DEAUTH (12) received
wlp4s0: Deauthentication notification
wlp4s0: * reason 4 (locally generated)
wlp4s0: * address f8:d1:11:39:1a:8c
Deauthentication frame IE(s) - hexdump(len=0): [NULL]
wlp4s0: CTRL-EVENT-DISCONNECTED bssid=f8:d1:11:39:1a:8c reason=4 locally_generated=1
wlp4s0: Auto connect enabled: try to reconnect (wps=0 wpa_state=9)

I also tried "iw dev wlp4s0 set power_save off", it did not help.

Not really knowing what I'm doing I simply tried to comment out the check causing the disconnect:
drivers/net/wireless/iwlwifi/mvm/time-event.c:iwl_mvm_te_handle_notif()
iwl_mvm_te_check_disconnect(mvm, te_data->vif, "No association and the time event is over already...");
And lo and behold, it can connect.
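The dmesg timestamps in this description show how quickly the driver gives up after association. A small Python sketch that measures that gap from a log excerpt follows; `association_gaps` is a hypothetical helper written for this illustration, and it assumes the default dmesg timestamp format seen in the report.

```python
import re

# Matches "[ 31.968857] message" style dmesg lines.
LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")
FAILURE = "No association and the time event is over already"


def association_gaps(dmesg: str):
    """Return seconds between each 'associated' line and the next failure print."""
    gaps, assoc_ts = [], None
    for raw in dmesg.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if msg.endswith(": associated"):
            assoc_ts = ts
        elif FAILURE in msg and assoc_ts is not None:
            gaps.append(ts - assoc_ts)
            assoc_ts = None
    return gaps


# Lines copied from the dmesg excerpt in the description.
sample = """
[ 31.903458] wlp4s0: authenticated
[ 31.927239] wlp4s0: associate with f8:d1:11:39:1a:8c (try 1/3)
[ 31.963374] wlp4s0: RX AssocResp from f8:d1:11:39:1a:8c (capab=0x431 status=0 aid=5)
[ 31.968857] wlp4s0: associated
[ 32.200142] iwlwifi 0000:04:00.0: No association and the time event is over already...
[ 32.200154] wlp4s0: Connection to AP f8:d1:11:39:1a:8c lost
"""

gaps = association_gaps(sample)
print([round(g, 3) for g in gaps])
```

For this excerpt the gap is roughly 0.23 s, i.e. only a couple of beacon intervals at bcnint 100, so a single dropped or rejected beacon is enough to trigger the disconnect.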