Latest working kernel version: 2.6.26
Earliest failing kernel version: 2.6.28
Distribution: Debian Sid
Hardware Environment: iwl3945 (Thinkpad R61)
Software Environment: wpa_supplicant 0.6.4-3, wireless-tools 29-1.1

Problem Description:
After some time (apparently random, ranging from several minutes to several hours) the following kernel message appears and I am disconnected from the AP:

  [31719.537037] wlan0: No ProbeResp from current AP 00:1b:fc:57:a9:90 - assume out of range

After that, I can no longer connect and have to remove the whole mac80211 stack from the kernel with:

  ifdown --force wlan0
  rmmod iwl3945 mac80211 lib80211 cfg80211

Then I load the modules again and reconnect.

I am using the following wpa_supplicant options:

  key_mgmt=WPA-PSK
  proto=RSN
  pairwise=CCMP
  group=CCMP

Googling suggests that I am not alone in this and that several other users are experiencing it with 2.6.27, but I cannot confirm that. Some suggest increasing IEEE80211_MONITORING_INTERVAL in net/mac80211/mlme.c to more than 2HZ, which I am going to try right after this report.

Steps to reproduce:
Connect to an AP with the wpa_supplicant configuration above and wait. I will not be able to test with an open wifi in the near future.
Created attachment 20051: Kernel configuration

Attached kernel configuration. I built the kernel with make-kpkg.
Created attachment 20052: lspci on my machine

Attached lspci output showing the exact hardware configuration.
> Some suggests
> increasing IEEE80211_MONITORING_INTERVAL in net/mac80211/mlme.c to more than
> 2HZ, which I am going to try right after this report.

Did not help.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/200500 suggests that this bug is not iwl3945 specific and is probably also present in the kernel I reported as "Latest working".
http://lkml.indiana.edu/hypermail/linux/kernel/0807.0/1712.html seems to describe the very same problem, yet the only solution the author suggests is to disable the problematic section.

By the way, this is my `iwlist wlan0 scan` output:

  Cell 01 - Address: 00:1B:FC:57:A9:90
            ESSID:""
            Mode:Master
            Channel:1
            Frequency:2.412 GHz (Channel 1)
            Quality=66/100  Signal level:-67 dBm  Noise level=-127 dBm
            Encryption key:on
            IE: Unknown: 0003000000
            IE: Unknown: 010C82848B8C12969824B048606C
            IE: Unknown: 030101
            IE: Unknown: 050400010000
            IE: Unknown: 2A0100
            IE: Unknown: 2F0100
            IE: IEEE 802.11i/WPA2 Version 1
                Group Cipher : CCMP
                Pairwise Ciphers (1) : CCMP
                Authentication Suites (1) : PSK
            IE: Unknown: DD060010180200F0
            Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 6 Mb/s; 9 Mb/s
                      11 Mb/s; 12 Mb/s; 18 Mb/s; 24 Mb/s; 36 Mb/s
                      48 Mb/s; 54 Mb/s
            Extra:tsf=0000031e2722e188
            Extra: Last beacon: 72ms ago

and this is `iwconfig wlan0`:

  wlan0     IEEE 802.11abg  ESSID:"PJK"
            Mode:Managed  Frequency:2.412 GHz  Access Point: 00:1B:FC:57:A9:90
            Bit Rate=54 Mb/s   Tx-Power=15 dBm
            Retry min limit:7   RTS thr:off   Fragment thr=2352 B
            Encryption key:XXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXX [2]   Security mode:open
            Power Management:off
            Link Quality=74/100  Signal level:-60 dBm  Noise level=-127 dBm
            Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
            Tx excessive retries:0  Invalid misc:0   Missed beacon:0

(the latter after a fresh reconnect)
We have seen some drivers filtering beacons that triggered this sort of problem, but I don't recall iwl3945 being one of them. Anyway, we probably should react better to this just in case the AP really has moved out of range... I'll Cc Johannes and Reinette in case they have other/better insight.
We're a little over-zealous here: if we send a probe request and never get an answer, we simply assume the AP is out of range. We don't try again or anything. Kalle's beacon filtering work will clean up some parts of this and potentially remove this code entirely.
The beacon filtering work mentioned by Johannes has been merged into wireless-testing. Could you please test with this new code?
Doesn't actually help -- I see this bug on my system too now (only with hidden SSIDs!) and the new code will say "beacon loss detected" and then behave very similarly.
I'm cloning wireless-testing 8f2487d3f1b445e20aebba2cb7b20f1896b94f6f right now and will test with both visible and hidden SSID after.
I see the same problem with iwl3945 (Thinkpad T60 with Intel wireless) on vanilla 2.6.29. Sometimes every 10 minutes, sometimes once every 2 hours or so, a message like the following appears in syslog or dmesg:

  wlan0: No ProbeResp from current AP 00:18:3a:95:d2:72 - assume out of range

and then the wireless connection gets dropped. My router is an open router (no encryption, public SSID). At first I thought the router might have been moved behind a brick wall, but moving it even nearer (at most it is only 20 feet away) made no difference.

I don't see the problem with 2.6.27.4 (I haven't tried any kernel in between).
Correction: I just saw the problem occur with 2.6.27.4. So it's not strictly a regression for me, but it happens far more often on 2.6.29 than on 2.6.27.4
I apologize for such a late response; I found myself unexpectedly busy. I see the same problem on wireless-testing (as of 8f2487d3f1b445e20aebba2cb7b20f1896b94f6f). I see some "beacon loss from AP 00:..." messages and then "no probe response from AP 00:...", followed by "deauthenticated (Reason: 7)" or sometimes "Reason: 3" about 10 seconds later. Is there anything else to try?
To be exact:

  [ 6035.013039] wlan0: beacon loss from AP 00:1b:fc:57:a9:90 - sending probe request
  [ 6037.013044] wlan0: no probe response from AP 00:1b:fc:57:a9:90 - disassociating
  [ 6047.829310] wlan0: deauthenticated (Reason: 7)

And this happens both with hidden and visible SSID.
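The timestamps in the log above show how quickly the link is given up. A minimal Python sketch (not part of any kernel tooling; the helper names `parse` and `gaps` are made up for illustration) that extracts the dmesg timestamps and computes the time between consecutive events:

```python
import re
from decimal import Decimal

# Log lines copied from the comment above.
LOG = """\
[ 6035.013039] wlan0: beacon loss from AP 00:1b:fc:57:a9:90 - sending probe request
[ 6037.013044] wlan0: no probe response from AP 00:1b:fc:57:a9:90 - disassociating
[ 6047.829310] wlan0: deauthenticated (Reason: 7)
"""

# dmesg prefixes each line with "[seconds.microseconds]" since boot.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\S+):\s+(.*)$")


def parse(log):
    """Return a list of (timestamp, interface, message) tuples."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            stamp, iface, message = match.groups()
            # Decimal keeps the microsecond digits exact.
            events.append((Decimal(stamp), iface, message))
    return events


def gaps(events):
    """Return the seconds elapsed between each pair of consecutive events."""
    return [later[0] - earlier[0] for earlier, later in zip(events, events[1:])]


if __name__ == "__main__":
    events = parse(LOG)
    for gap, (_, _, message) in zip(gaps(events), events[1:]):
        print(f"+{gap}s  {message}")
```

For this log the gaps are 2.000005 s (probe request to disassociation) and 10.816266 s (disassociation to deauthentication), so the station waits only about two seconds for a probe response before giving up.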
This could be similar to the issue discussed in http://marc.info/?l=linux-wireless&m=123983768518133&w=2 - could you please try to increase the value of IEEE80211_MONITORING_INTERVAL found in net/mac80211/mlme.c? One user reported that 10 seconds worked. Could you please try that also?

  #define IEEE80211_MONITORING_INTERVAL (10 * HZ)
There is now a patch in wireless-testing that addresses the problem. Could you please try with:

commit 3b6dc5a431e4fef35717cba53544a95209f49b68
Author: Kalle Valo <kalle.valo@iki.fi>
Date:   Sun Apr 19 08:47:19 2009 +0300

    mac80211: fix beacon loss detection after scan

    Currently beacon loss detection triggers after a scan. A probe request
    is sent and a message like this is printed to the log:

    wlan0: beacon loss from AP 00:12:17:e7:98:de - sending probe request

    But in fact there is no beacon loss, the beacons are just not received
    because of the ongoing scan. Fix it by updating last_beacon after the
    scan has finished.

    Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
    Signed-off-by: Kalle Valo <kalle.valo@iki.fi>
    Acked-by: Johannes Berg <johannes@sipsolutions.net>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
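To illustrate the idea in the commit message, here is a simplified user-space model in Python. It is not the mac80211 code: the `Station` class, the `run` helper and the 2-second `BEACON_LOSS_TIMEOUT` are assumptions made up for this sketch. It only shows why a stale last_beacon timestamp looks like beacon loss right after a scan, and how refreshing it when the scan finishes avoids the false alarm.

```python
# Simplified user-space model, NOT the mac80211 code. Times are in seconds.
BEACON_LOSS_TIMEOUT = 2.0  # hypothetical threshold, chosen for illustration


class Station:
    def __init__(self, now=0.0):
        self.last_beacon = now

    def beacon_received(self, now):
        """Record the arrival time of a beacon from the current AP."""
        self.last_beacon = now

    def scan_finished(self, now, refresh_last_beacon):
        """Called when a scan ends.

        With refresh_last_beacon=True the timestamp is reset, mirroring the
        idea of "updating last_beacon after the scan has finished".
        """
        if refresh_last_beacon:
            self.last_beacon = now

    def beacon_loss_detected(self, now):
        """True if no beacon has been recorded within the timeout."""
        return now - self.last_beacon > BEACON_LOSS_TIMEOUT


def run(refresh_last_beacon, scan_duration=5.0):
    """Simulate one scan and report whether beacon loss fires right after it."""
    sta = Station(now=0.0)
    sta.beacon_received(now=1.0)    # last beacon heard before the scan
    scan_end = 1.0 + scan_duration  # during the scan no beacons are recorded
    sta.scan_finished(scan_end, refresh_last_beacon)
    return sta.beacon_loss_detected(now=scan_end + 0.1)


if __name__ == "__main__":
    print("without refresh:", run(refresh_last_beacon=False))
    print("with refresh:   ", run(refresh_last_beacon=True))
```

In this model a 5-second scan triggers a false beacon loss unless the timestamp is refreshed. A scan shorter than the timeout never triggers it either way.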
> Currently beacon loss detection triggers after a scan. A probe request
> is sent and a message like this is printed to the log:

No scans happen on my laptop (I don't run Network Manager) -- unless dhcp scans itself? -- and I don't use any encryption, but I still get "No ProbeResp" messages:

  wlan0: No ProbeResp from current AP 00:18:3a:95:d2:72 - assume out of range

That message is always associated with this 'iwconfig wlan0' output:

  wlan0     IEEE 802.11abg  ESSID:"CoppsHillTerrace"
            Mode:Managed  Frequency:2.412 GHz  Access Point: Not-Associated
            Tx-Power=15 dBm
            Retry min limit:7   RTS thr:off   Fragment thr=2352 B
            Encryption key:off
            Power Management:off
            Link Quality:0  Signal level:0  Noise level:0
            Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
            Tx excessive retries:0  Invalid misc:0   Missed beacon:0

Could this be a separate problem?
Could somebody more familiar with mac80211 please look at this issue?

Summary: Associations are being dropped frequently (2.6.27.4 also dropped them, but not as often as 2.6.29). The symptom is the familiar "No ProbeResp from current AP" message, even when the AP is 20 feet away. The original reporter of this bug has not responded to requests to test recent mac80211 changes in this area.

Thank you very much.
Please accept, once again, my apologies for such a late response.

I am running linux-2.6.30-rc3-wl and _I_am_not_sure_ whether this problem still occurs in its original form. I will investigate further tonight and return with dmesg output.

With 2.6.28 I had to remove the whole stack from the kernel in order to get the wifi running again. Now it only takes a reconfiguration of the interface. I have gotten used to running a ping-gw loop that reconfigures the interface when this happens, and since the reconfiguration takes only about 5 seconds, I don't usually notice any problems.
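The actual ping-gw loop was not posted. A minimal sketch of such a watchdog in Python could look like the following. The gateway address, the 5-second interval and all function names are assumptions for illustration; it also assumes the iputils `ping` (where `-W` is a timeout in seconds) and Debian's `ifdown`/`ifup`, and it needs root to reconfigure the interface.

```python
import subprocess
import time

GATEWAY = "192.168.1.1"  # hypothetical gateway address, adjust as needed
INTERFACE = "wlan0"


def gateway_reachable(gateway=GATEWAY):
    """Send one ICMP echo request; True if the gateway answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", gateway],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def reconfigure(interface=INTERFACE):
    """Take the interface down and bring it back up (needs root)."""
    subprocess.run(["ifdown", "--force", interface])
    subprocess.run(["ifup", interface])


def watchdog(check=gateway_reachable, recover=reconfigure,
             interval=5.0, max_checks=None, sleep=time.sleep):
    """Call recover() whenever check() fails; return the number of recoveries.

    max_checks=None loops forever; a finite value is useful for testing.
    """
    recoveries = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if not check():
            recover()
            recoveries += 1
        sleep(interval)
    return recoveries
```

Calling `watchdog()` with the defaults loops forever and reconfigures wlan0 each time a single ping to the gateway fails. Passing `check`, `recover` and `sleep` as parameters lets the loop be exercised without touching the network.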
Alright, with 2.6.30-rc3-wl I was not able to reproduce the problem. The AP was not hiding its ESSID, so I can't say for certain that the problem is solved, but it definitely seems that way.
Closing on basis of comment 19 -- please reopen if the issue reappears...thanks!
> Closing on basis of comment 19 -- please reopen if the issue
> reappears...thanks!

I just tested 2.6.30-rc5-wl (commit bf2c6a38af60), and I see the same problem as with 2.6.29 (and 2.6.27.4, though less often):

  wlan0: no probe response from AP 00:18:3a:95:d2:72 - disassociating

There's no encryption, the SSID is not hidden, and the router is about 20 feet away, so the signal strength is high.

I don't think it's due to the router. As further evidence in that direction (though not conclusive), I had the same 'no probe response' problem with a router at MIT (with public SSID and no encryption) -- though that was with vanilla -rc5, not with the wireless-testing changes.

I don't run Network Manager or other programs that regularly scan. There's just one scan when the interface is brought up, to find all the nearby SSIDs.

The laptop is a Thinkpad T60 with Intel wireless and graphics. Here is the lspci -v output for the wireless card:

  03:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection (rev 02)
          Subsystem: Intel Corporation Device 1010
          Flags: bus master, fast devsel, latency 0, IRQ 30
          Memory at edf00000 (32-bit, non-prefetchable) [size=4K]
          Capabilities: [c8] Power Management version 2
          Capabilities: [d0] MSI: Mask- 64bit+ Count=1/1 Enable+
          Capabilities: [e0] Express Legacy Endpoint, MSI 00
          Capabilities: [100] Advanced Error Reporting
          Capabilities: [140] Device Serial Number 22-2f-25-ff-ff-02-13-00
          Kernel driver in use: iwl3945
(In reply to comment #21)
> > Closing on basis of comment 19 -- please reopen if the issue
> > reappears...thanks!
>
> I just tested 2.6.30-rc5-wl (commit bf2c6a38af60), and I see the same
> problem as with 2.6.29 (and 2.6.27.4, though less often):
>
> wlan0: no probe response from AP 00:18:3a:95:d2:72 - disassociating

There's no reason to believe that to be an actual bug or regression though. Wireless links are not perfect, so occasionally they will drop enough packets for us to assume that the AP has died. Sometimes APs even do interrupt beaconing for a short period of time.

One point to take away from this is that we should try to reassociate in that case. However, a large percentage of users simply use NM with wpa_supplicant for all their networks, in which case wpa_supplicant handles such reassociations and everything just works _despite_ the occasional hiccup in the connection. Therefore, despite the fact that this has been on our todo list for ages, nobody has bothered to fix it in the kernel.

This bug was concerned with that happening all the time, and then the reassociation failing, presumably due to a bug in the driver that caused the hardware to stop functioning, thereby causing _both_ the probe response timeout _and_ the reconnection failure.

I don't consider an occasional probe response timeout as you are reporting to be a bug -- at best it's a feature request to reconnect afterwards, but since there's an easy workaround (run wpa_supplicant) I'm not inclined to work on that.
> I don't consider an occasional probe response timeout as you are
> reporting to be a bug -- at best it's a feature request to reconnect
> afterwards, but since there's an easy workaround (run wpa_supplicant)
> I'm not inclined to work on that.

Prompted by that suggestion, I finally learnt how to use wpa_supplicant, and installed a minimal wpa_supplicant configuration (without using NM). It seems to be working and reassociates as needed with the AP.