Bug 16462
Summary: | unable to connect to hidden SSID AP on legal channel 13 | ||
---|---|---|---|
Product: | Networking | Reporter: | Daniel J Blueman (daniel.blueman) |
Component: | Wireless | Assignee: | Johannes Berg (johannes) |
Status: | CLOSED CODE_FIX | ||
Severity: | high | CC: | andreas.neiser, huang.jin, johannes, linville, maciej.rutecki, networking_wireless, rjw, wey-yi.w.guy |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.35-rc6 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 16055 | ||
Attachments: |
debug attaching to AP on channel 11 (success)
debug attaching to AP on channel 13 (failure) Trace from attaching to channel 13 AP with 2.6.35 debug attaching to AP on channel 13 (failure) Trace from attaching to channel 13 AP with 2.6.35 test script trace and dmesg logs with failing and working AP setup patch for 2.6.35.2 failing AP config for ath9k patch to enable probe-after-rx on 2.4 GHz |
Description
Daniel J Blueman
2010-07-25 17:06:11 UTC
Based on the provided logs, I would say this is _not_ a case of any regulatory enforcement by the kernel -- it appears to be changing to the requested channel and simply not authenticating. Hopefully the iwlwifi people have some ideas? we will look into this, we usually don't test channel 12/13 on our regular testcases, but we do have one AP can beacon on channel 12/13 and we will try to re-pro the problem here. Thanks Wey Thanks Wey. To simplify matters, I can reproduce the issue with this kernel: http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.35-rc6-maverick/linux-image-2.6.35-020635rc6-generic_2.6.35-020635rc6_amd64.deb If it fails to reproduce there, I can quickly gather debugging as you prefer. yes and please, if you could run load the module with debug=0x43fff, then send us the log, it will be very helpful for us to understand what the problem is. Thanks Wey Ok, comparing the debug goodness from the AP on channel 11 (working) and channel 13 (failing), it is clear that in the latter, iwlagn is sending commands to the MAC controller with channel 1 specified, ie: * without RXON_FILTER_ASSOC_MSK * channel = 1 * bssid = 00:00:00:00:00:00 iwl_print_rx_config_cmd u16 channel: 0x1 This directly explains why we see the timeouts, as the radio is tuned to the wrong channel. Attaching full logs in working and non-working case. Created attachment 27268 [details]
debug attaching to AP on channel 11 (success)
Created attachment 27269 [details]
debug attaching to AP on channel 13 (failure)
the following are normal behavior and you ca nsee on both success and failure case: * without RXON_FILTER_ASSOC_MSK * channel = 1 * bssid = 00:00:00:00:00:00 iwl_print_rx_config_cmd u16 channel: 0x1 Thanks for the log, we are working on it. Wey Indeed these messages are normal and expected behaviour, but in the working case (AP on channel 11), we consistently see channel 11 being specified and in the failing case (AP on channel 13), we consistently see channel 1 being specified. Thanks for your help guys! All the better if the fix for this regression is suitable for inclusion into 2.6.35-final, as the impact is significant in many countries. Hi Daniel, Thanks for the help, it is good information and wea are looking into it now and we might need your help for testing once we have the the fix for it. Wey I was able to reproduce this once, but then not again. Any idea why that may be? FWIW, going back to channel 1 is normal, but it should still go to channel 13 obviously. This puts a different light on it though, I was suspecting a bug elsewhere. Daniel, can you try to gather tracing data? To do that see if you can compile iwlwifi with CONFIG_IWLWIFI_DEVICE_TRACING, then obtain trace-cmd from git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git and run "trace-cmd record -e iwlwifi" as root/sudo. Thanks Johannes. I no longer have access to an AP with channels 12 and 13 to test with, so was trying to run my Debian AP at home on channel 13. I'll try to get around the world regulatory domain set in it's EEPROM (despite being purchased in the UK), preventing legal use of these channels. If I succeed, I'll grab the trace. Intel will be trying to reproduce this anyway. Note that the interesting thing with channel 1 being erroneously used, rather than 13, is: 13 % len(US channels) == 1. Yeah, in case it wasn't clear, I do work for Intel, but now I can no longer reproduce this for some reason even with the same hardware and microcode you use. :-) Will have to go back to your kernel version too I guess. Hm. I can't even reproduce with that kernel :-( I reproduced this against my ath9k AP on channel 13 with 2.6.35-final. Trace attached; let me know if you need further testing/information. Created attachment 27316 [details]
Trace from attaching to channel 13 AP with 2.6.35
Thanks. I see no evidence of it even trying to connect in this log, which is very odd. It did print out the "timed out" messages again, right? Created attachment 27317 [details]
debug attaching to AP on channel 13 (failure)
Created attachment 27318 [details]
Trace from attaching to channel 13 AP with 2.6.35
updated trace and iwlagn debug messages, captured at the time of tracing
Johannes: thanks for your time so far. Do the last attachments provide the information you're looking for? Just let me know if you need a different pattern of debug or capture etc. Thanks, Daniel I can't quite figure out the issue, and I can only reproduce it exactly once every day :-( I have a vague feeling this could be related to the RTS/CTS mess, could you try this change? --- a/drivers/net/wireless/iwlwifi/iwl-agn-hcmd.c +++ b/drivers/net/wireless/iwlwifi/iwl-agn-hcmd.c @@ -212,7 +212,11 @@ static void iwlagn_chain_noise_reset(struct iwl_priv *priv) static void iwlagn_rts_tx_cmd_flag(struct ieee80211_tx_info *info, __le32 *tx_flags) { - *tx_flags |= TX_CMD_FLG_RTS_CTS_MSK; + if ((info->control.rates[0].flags & IEEE80211_TX_RC_USE_RTS_CTS) || + (info->control.rates[0].flags & IEEE80211_TX_RC_USE_CTS_PROTECT)) + *tx_flags |= TX_CMD_FLG_RTS_CTS_MSK; + else + *tx_flags &= ~TX_CMD_FLG_RTS_CTS_MSK; } /* Calc max signal level (dBm) among 3 possible receivers */ Hi Johannes, this patch doesn't help. The debug still shows the RX_CONFIG_COMMAND as having channel 1 in it's payload, so the radio will be incorrectly listening on that channel for a reply from the AP on channel 13. Thanks for testing. I think you mean RXON? But I'm still not convinced that is the error, since I do see it go to channel 13 for the assoc. And it will normally go to channel 1 again after trying (which isn't really necessary, but still). BTW, your second trace data doesn't contain the association sequence, it only seems to contain scanning? Oh also, did this work on 2.6.34, if you ever tried? We occasionally seem to be running into a similar bug on 5 GHz channels after going from 2.6.34 to 2.6.35. Still poking at straws ... but I found another stupid bug in the driver, can you try this patch? diff --git a/drivers/net/wireless/iwlwifi/iwl-power.c b/drivers/net/wireless/iwlwifi/iwl-power.c index da1f2ae..a444d0e 100644 --- a/drivers/net/wireless/iwlwifi/iwl-power.c +++ b/drivers/net/wireless/iwlwifi/iwl-power.c @@ -284,13 +284,12 @@ int iwl_power_update_mode(struct iwl_priv *priv, bool force) priv->hw->conf.flags & IEEE80211_CONF_IDLE) iwl_static_sleep_cmd(priv, &cmd, IWL_POWER_INDEX_5, 20); else if (priv->cfg->ops->lib->tt_ops.lower_power_detection && - priv->cfg->ops->lib->tt_ops.tt_power_mode) { - if (priv->cfg->ops->lib->tt_ops.lower_power_detection(priv)) { - /* in thermal throttling low power state */ - iwl_static_sleep_cmd(priv, &cmd, - priv->cfg->ops->lib->tt_ops.tt_power_mode(priv), - dtimper); - } + priv->cfg->ops->lib->tt_ops.tt_power_mode && + priv->cfg->ops->lib->tt_ops.lower_power_detection(priv)) { + /* in thermal throttling low power state */ + iwl_static_sleep_cmd(priv, &cmd, + priv->cfg->ops->lib->tt_ops.tt_power_mode(priv), + dtimper); } else if (!enabled) iwl_power_sleep_cam_cmd(priv, &cmd); else if (priv->power_data.debug_sleep_level_override >= 0) Oh wait, I think that's not relevant to 2.6.35? Yes, I meant RXON. Trying 2.6.34, this does not work either. There is no member 'lower_power_detection', so indeed I can't apply this patch against 2.6.35. The trace and dmesg output were captured together; this suggests the MAC wasn't sent a command perhaps? Or should I capture both again? U iwl_mac_config enter to channel 13 changed 0x100 U iwl_power_sleep_cam_cmd Sleep command for CAM U iwl_mac_config Not re-sending same RXON configuration. U iwl_mac_config leave wlan0: direct probe to 00:25:d3:5e:ef:62 (try 1) ieee80211 phy1: I iwl_mac_tx enter ieee80211 phy1: I iwl_mac_tx leave wlan0: direct probe to 00:25:d3:5e:ef:62 (try 2) ieee80211 phy1: I iwl_mac_tx enter ieee80211 phy1: I iwl_mac_tx leave wlan0: direct probe to 00:25:d3:5e:ef:62 (try 3) ieee80211 phy1: I iwl_mac_tx enter ieee80211 phy1: I iwl_mac_tx leave wlan0: direct probe to 00:25:d3:5e:ef:62 timed out Unable to reproduce the problem consistently in the lab. Configurations: ----------------- Intel(R) Ultimate N WiFi Lin 5300 AGN, REV-0x24 Kernel: 2.6.35-rc6 uCode: 8.24.2.12 Platform: Calpella AP: CISCO 1200 Test ---------------- Repeatedly switch between two networks (one of them is on channel 12). See attached scripts. Enough wait time is given between commands. Result ---------------- Ran for a 100 times. Only 2% of test runs failed to associate with unspecified errors. Reproducibility is very low. Created attachment 27367 [details]
test script
Test script used to reproduce the issue. Result: cannot reproduce the issue (only 2% failures)
*** Bug 16619 has been marked as a duplicate of this bug. *** Created attachment 27510 [details]
trace and dmesg logs with failing and working AP setup
Hope this helps in debugging the problem!
After 4 tries the connection works if AP with Channel 3 is used, AP with channel 12 fails 4 out of 4 (so all "failing" files show failed attempts, I think). trace-cmd (see file trace.dat) was run during ./euro_loop.sh. dmesg with debug info supplied after each connection try. leading numbers under logs/ show which try of euro_loops.sh (which was modified for my needs).
No WPA encryption was used. Please don't hesitate if you need further information.
Andreas, thanks for the information. Aaron, could we re-run the test with euro_loop.sh script provided by Andreas? Thanks Wey @Wey,Aaron: I didn't change anything in euro_loop.sh concerning reproducing the bug. I just added some lines to collect the logs (and provided so you can see when the logs are created). Hopefully someone can analyze the trace. Hi Andreas, looks like you have 2 APs AP1: 00:21:29:b5:e9:7b, channel #3 AP2: 00:04:0e:6b:80:e6, channel #12 all the success cases are connect to AP1, but never AP2. Correct? what if you change AP2 to different channel (say channel #6), and ran the same test, is that works? Thanks Wey Yes, exactly, I worked with 2 APs and AP1 was the working one. It's a bit tedious to setup the two APs again, do you really need that information? AFAIK it's quite obvious since when I change AP2 to channel 11 (the current setup), wireless works flawlessly (not one "probe" line in dmesg). Tell me if you need a definitive answer, then I will reconstruct the setup. no problem, I just need to make sure it is the setting you have. Thanks for the help, I do not see anything stand-out in the log file, but I will try re-produce the issues here. Wey btw, what kernel version you are using? Wey Can you please pick up this patch and see if it helps? http://git.kernel.org/?p=linux/kernel/git/linville/wireless-2.6.git;a=patch;h=94597ab23ea10b3bdcba534be00a9f7b35791c07 I'm using ArchLinux Kernel 2.6.35.2 (mostly vanilla sources), but the patch does not apply: patching file drivers/net/wireless/iwlwifi/iwl-1000.c Hunk #1 succeeded at 252 (offset -8 lines). patching file drivers/net/wireless/iwlwifi/iwl-3945.c Hunk #1 succeeded at 915 (offset 146 lines). Hunk #2 succeeded at 2849 (offset 148 lines). patching file drivers/net/wireless/iwlwifi/iwl-4965.c Hunk #1 succeeded at 2178 (offset -45 lines). patching file drivers/net/wireless/iwlwifi/iwl-5000.c Hunk #1 succeeded at 489 (offset -17 lines). Hunk #2 succeeded at 549 (offset 12 lines). Hunk #3 succeeded at 611 (offset 14 lines). Hunk #4 FAILED at 628. Hunk #5 FAILED at 659. 2 out of 5 hunks FAILED -- saving rejects to file drivers/net/wireless/iwlwifi/iwl-5000.c.rej patching file drivers/net/wireless/iwlwifi/iwl-6000.c Hunk #1 succeeded at 431 (offset 50 lines). Hunk #2 succeeded at 578 (offset 89 lines). Hunk #3 succeeded at 650 (offset 87 lines). Hunk #4 FAILED at 637. Hunk #5 FAILED at 714. Hunk #6 FAILED at 821. Hunk #7 FAILED at 859. Hunk #8 FAILED at 933. 5 out of 8 hunks FAILED -- saving rejects to file drivers/net/wireless/iwlwifi/iwl-6000.c.rej patching file drivers/net/wireless/iwlwifi/iwl-agn-hcmd.c Hunk #1 FAILED at 211. Hunk #2 succeeded at 266 (offset -2 lines). 1 out of 2 hunks FAILED -- saving rejects to file drivers/net/wireless/iwlwifi/iwl-agn-hcmd.c.rej patching file drivers/net/wireless/iwlwifi/iwl-agn-tx.c Hunk #1 succeeded at 376 (offset -3 lines). Hunk #2 succeeded at 450 with fuzz 2 (offset -3 lines). patching file drivers/net/wireless/iwlwifi/iwl-agn.c Hunk #1 succeeded at 200 (offset -2 lines). Hunk #2 succeeded at 1577 (offset -34 lines). Hunk #3 FAILED at 3421. Hunk #4 FAILED at 3486. 2 out of 4 hunks FAILED -- saving rejects to file drivers/net/wireless/iwlwifi/iwl-agn.c.rej patching file drivers/net/wireless/iwlwifi/iwl-core.c Hunk #1 succeeded at 400 (offset -1 lines). Hunk #2 succeeded at 1953 (offset 67 lines). patching file drivers/net/wireless/iwlwifi/iwl-core.h Hunk #1 succeeded at 102 (offset -2 lines). Hunk #2 succeeded at 241 (offset -9 lines). Hunk #3 succeeded at 310 (offset -9 lines). Hunk #4 succeeded at 376 (offset -15 lines). patching file drivers/net/wireless/iwlwifi/iwl3945-base.c Hunk #1 succeeded at 434 (offset -1 lines). if it's ok for you, you can get the latest build from http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=summary which has the patch already. Wey Previously, I tested against APs on channel 13. Re-testing on channel 12, I can connect as expected on APs on channel 12. Tested with 2.6.34.4, 2.6.35.2 and 2.6.36-rc1. Thus the title of this bug is incorrect. Thus the issue I experience is only with connecting to APs on channel 13. Johannes, were you testing against an AP on channel 12 or 13? Well, but for me it's channel 12 which is not working. I can test tomorrow if channel 13 works for me or not. Yeah, it's not consistent here. It typically reproduces exactly once in the morning for me, and after that starts working. I have no idea why :-) I'll provide a patch against 2.6.35 since it needs to go there anyway. @Johannes: Maybe this helps in reproducing the bug: Well, if I was connected to the WLAN successfully on channel 11, then change the AP to channel 12/13 and then reconnect, it works, but not flawlessly: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready wlan0: no IPv6 routers present wlan0: deauthenticated from 00:04:0e:6b:80:e6 (Reason: 1) iwlagn 0000:03:00.0: ACTIVATE a non DRIVER active station id 0 addr 00:04:0e:6b:80:e6 cfg80211: Calling CRDA to update world regulatory domain wlan0: authenticate with 00:04:0e:6b:80:e6 (try 1) wlan0: authenticate with 00:04:0e:6b:80:e6 (try 2) wlan0: authenticate with 00:04:0e:6b:80:e6 (try 3) wlan0: authentication with 00:04:0e:6b:80:e6 timed out wlan0: direct probe to 00:04:0e:6b:80:e6 (try 1) wlan0: direct probe to 00:04:0e:6b:80:e6 (try 2) wlan0: direct probe responded wlan0: authenticate with 00:04:0e:6b:80:e6 (try 1) wlan0: authenticated wlan0: associate with 00:04:0e:6b:80:e6 (try 1) wlan0: associate with 00:04:0e:6b:80:e6 (try 2) wlan0: RX AssocResp from 00:04:0e:6b:80:e6 (capab=0x451 status=0 aid=1) wlan0: associated You see that there are some "tries"... BUT: If you do `rmmod iwlagn && modprobe iwlagn` and try to reconnect on channel 12 or 13 (@Daniel: both channels don't work for me), it fails constantly. After the failing tries on ch12/13, changing the AP to ch11 brings the wireless back working by just reconnecting (no reload of iwlagn needed). It's good that you prepare a working patch against 2.6.35, I downloaded ~300 MB of wireless-testing and I don't know how to handle it. Is there a good howto how to compile just iwlagn and it's dependencies? Andreas, check out http://wireless.kernel.org/en/users/Download. Yesterday's release should work fine. Created attachment 27518 [details]
patch for 2.6.35.2
Please try this patch on 2.6.35.2.
@Johannes: Well, now with your patch the channel 12 works (YES!), but when I switch to ch13, the WLAN is not discovered (`sudo iw wlan0 scan` doesn't show it). When I switch back from ch13 to ch12, I have to reload the module in order to see my ssid again! Before the patch, the WLAN appeared on both channels (12&13) but a connection timed out. Do you need some debug info? Uh oh. But this doesn't change scanning, how often did you try to scan? Just once? Scanning on channels 12/13 isn't as reliable as on other channels since they're scanned passively. Also, it may be worthwhile to actually turn off your AP or your wireless for more than 15 seconds when you switch channel from 12 to 13 or vice versa, because otherwise I'm not sure what happens with the cached scan entries. Well, I tried the scan several times, and I'm quite convinced of the fact that ch13 does not even show up (as described). Unfortunately, I don't have time to concentrate on this problem further for now (in three weeks things look different, if the problem still persists;) since switching to channel 11 solves the problem for me (at least temporarily). I hope you can figure out what made the regression, there seems to be a lot changes between .34 and .35 concerning iwlagn... I wished I was a better kernel hacker, it's really interesting to dig into... but now other priorities came up. If there's something which could be really important to test, don't hesitate, maybe I find some time nevertheless. Excellent - retesting on channel 13 on 2.6.36-rc3 shows the issue is resolved. Fantastic work Johannes et al! Daniel Excellent - retesting on channel 13 on 2.6.36-rc3 shows the issue is resolved. Fantastic work Johannes et al! Daniel Fixed in mainline by commit 94597ab23ea10b3bdcba534be00a9f7b35791c07 . I spoke too soon, and thought it did work once previously once, I have not been able to connect other times. The behaviour is the same with -rc4. Again, changing the AP channel to 12, I can connect immediately; moving back to channel 13, I cannot connect. Can you clarify how long you waited between switching the AP channel and trying again, and if maybe it works with a module reload? Hi Johannes - I went further to include powering off the client to ensure there was no state after the AP was set to channel 13. When the client could not connect, I verified my Android handset could. If there is any further technical/debug etc information needed (including from the AP), just let me know and thanks for your help! Daniel On Monday, September 20, 2010, Daniel J Blueman wrote:
>
> On 20 September 2010 20:57, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.34 and 2.6.35.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.34 and 2.6.35. Please verify if it still should
> > be listed and let the tracking team know (either way).
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16462
> > Subject : unable to connect to AP on legal channel 13
> > Submitter : Daniel J Blueman <daniel.blueman@gmail.com>
> > Date : 2010-07-25 17:06 (58 days old)
>
> I have verified this issue still exists in 2.6.36-rc4, alas.
The issue still occurs on 2.6.36-rc5. Thanks, and sorry about the delay. I need to find time to reproduce this again, but it's been really busy for a while now. Still strange that you can use 12 but not 13 :-( Darn, I can't reproduce this, it associates just fine on channel 13: $ iw wlan0 scan dump BSS 00:24:01:38:ac:0a (on wlan0) -- associated TSF: 2576791452 usec (0d, 00:42:56) freq: 2472 beacon interval: 100 capability: ESS Privacy ShortPreamble ShortSlotTime (0x0431) signal: -22.00 dBm last seen: 301460 ms ago SSID: jo2ghz Supported rates: 1.0* 2.0* 5.5* 11.0* DS Parameter set: channel 13 ERP: <no flags> Extended supported rates: 6.0 9.0 12.0 18.0 24.0 36.0 48.0 54.0 RSN: * Version: 1 * Group cipher: TKIP * Pairwise ciphers: TKIP CCMP * Authentication suites: PSK * Capabilities: (0x0000) WPA: * Version: 1 * Group cipher: TKIP * Pairwise ciphers: TKIP CCMP * Authentication suites: PSK WMM: * Parameter version 1 * BE: CW 15-1023, AIFSN 3 * BK: CW 15-1023, AIFSN 7 * VI: CW 7-15, AIFSN 2, TXOP 3008 usec * VO: CW 3-7, AIFSN 2, TXOP 1504 usec You do see the AP in your scan results, right? Can you paste the output of "iw wlan0 scan dump" for the correct AP, like I did above? Bingo. With ignore_broadcast_ssid=1 or 2 in hostapd.conf on the AP, I can't connect when the AP is on channel 13. When I advertise the AP with ignore_broadcast_ssid=0, I connect as expected. This would explain the behaviour I saw when I though it was fixed (since I was fiddling with the configuration). Testing multiple times, this does relate to the SSID being zerod or the SSID length being 0. With ignore_broadcast_ssid=0 (can connect): $ iw wlan0 scan dump BSS 00:25:d3:5e:ef:62 (on wlan0) TSF: 91033984 usec (0d, 00:01:31) freq: 2472 beacon interval: 100 capability: ESS Privacy ShortPreamble ShortSlotTime (0x0431) signal: -34.00 dBm last seen: 4970 ms ago SSID: An error has occurred Supported rates: 1.0* 2.0* 5.5* 11.0* 6.0 9.0 12.0 18.0 DS Parameter set: channel 13 ERP: <no flags> Extended supported rates: 24.0 36.0 48.0 54.0 RSN: * Version: 1 * Group cipher: CCMP * Pairwise ciphers: CCMP * Authentication suites: PSK PSK/SHA-256 * Capabilities: MFP-capable (0x0080) * bogus tail data (6): 00 00 00 0f ac 06 HT capabilities: Capabilities: 0x11ce HT20/HT40 SM Power Save disabled RX HT40 SGI TX STBC RX STBC 1-stream Max AMSDU length: 7935 bytes DSSS/CCK HT40 Maximum RX AMPDU length 65535 bytes (exponent: 0x003) Minimum RX AMPDU time spacing: No restriction (0x00) HT RX MCS rate indexes supported: 0-15 HT TX MCS rate indexes are undefined With ignore_broadcast_ssid=1 (can't connect): $ iw wlan0 scan dump BSS 00:25:d3:5e:ef:62 (on wlan0) TSF: 10547584 usec (0d, 00:00:10) freq: 2472 beacon interval: 100 capability: ESS Privacy ShortPreamble ShortSlotTime (0x0431) signal: -35.00 dBm last seen: 640 ms ago SSID: Supported rates: 1.0* 2.0* 5.5* 11.0* 6.0 9.0 12.0 18.0 DS Parameter set: channel 13 ERP: <no flags> Extended supported rates: 24.0 36.0 48.0 54.0 RSN: * Version: 1 * Group cipher: CCMP * Pairwise ciphers: CCMP * Authentication suites: PSK PSK/SHA-256 * Capabilities: MFP-capable (0x0080) * bogus tail data (6): 00 00 00 0f ac 06 HT capabilities: Capabilities: 0x11ce HT20/HT40 SM Power Save disabled RX HT40 SGI TX STBC RX STBC 1-stream Max AMSDU length: 7935 bytes DSSS/CCK HT40 Maximum RX AMPDU length 65535 bytes (exponent: 0x003) Minimum RX AMPDU time spacing: No restriction (0x00) HT RX MCS rate indexes supported: 0-15 HT TX MCS rate indexes are undefined With ignore_broadcast_ssid=2 (can't connect): $ iw wlan0 scan dump BSS 00:25:d3:5e:ef:62 (on wlan0) TSF: 66355583 usec (0d, 00:01:06) freq: 2472 beacon interval: 100 capability: ESS Privacy ShortPreamble ShortSlotTime (0x0431) signal: -32.00 dBm last seen: 8180 ms ago SSID: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 Supported rates: 1.0* 2.0* 5.5* 11.0* 6.0 9.0 12.0 18.0 DS Parameter set: channel 13 ERP: <no flags> Extended supported rates: 24.0 36.0 48.0 54.0 RSN: * Version: 1 * Group cipher: CCMP * Pairwise ciphers: CCMP * Authentication suites: PSK PSK/SHA-256 * Capabilities: MFP-capable (0x0080) * bogus tail data (6): 00 00 00 0f ac 06 HT capabilities: Capabilities: 0x11ce HT20/HT40 SM Power Save disabled RX HT40 SGI TX STBC RX STBC 1-stream Max AMSDU length: 7935 bytes DSSS/CCK HT40 Maximum RX AMPDU length 65535 bytes (exponent: 0x003) Minimum RX AMPDU time spacing: No restriction (0x00) HT RX MCS rate indexes supported: 0-15 HT TX MCS rate indexes are undefined I hope that helps to reproduce it. Thanks, Daniel Created attachment 31702 [details]
failing AP config for ath9k
Ah, hidden SSID... that could be an issue, I'm surprised it works on channel 12! But I'll try to reproduce this. Ok, it now reproduces here! Let's see channel 12... Same on channel 12 though, but oh well :-) I'll have to see if our scan is set up correctly. So the problem clearly is that you're on a channel that we mark passive, and we don't actively scan there, but connecting to a hidden SSID requires active scanning ... We try to actually be better in iwlwifi and send an active probe when we receive a beacon, but either there's something wrong with that logic (I'll investigate) or it's just not reliable enough because of timing issues etc. In my test, it connected after a long time of scanning (maybe 10 minutes later), which would point to a timing issue since there's no way it can connect if it never sent probe requests. I'm not sure what to do about this. I can try to see if there's anything about the passive-to-active logic we can do, but for a passive channel you can't really expect hidden SSIDs to work well. Try this ... --- linux-2.6.orig/drivers/net/wireless/iwlwifi/iwl-agn-lib.c 2010-09-28 21:56:19.000000000 +0200 +++ linux-2.6/drivers/net/wireless/iwlwifi/iwl-agn-lib.c 2010-09-28 21:56:31.000000000 +0200 @@ -1290,7 +1290,7 @@ void iwlagn_request_scan(struct iwl_priv rate = IWL_RATE_1M_PLCP; rate_flags = RATE_MCS_CCK_MSK; } - scan->good_CRC_th = IWL_GOOD_CRC_TH_DISABLED; + scan->good_CRC_th = IWL_GOOD_CRC_TH_DEFAULT; break; case IEEE80211_BAND_5GHZ: rate = IWL_RATE_6M_PLCP; Fantastic - this works consistently across the matrix of all unhidden, hidden #1, hidden #2 and passive and initiated connections, on channel 12 and 13. Thank you very much Johannes! It's probably a good idea to CC stable@kernel.org on your patch too, so distros releasing with 2.6.35.x will get this fix. Yeah, but wanted to test it out first -- helped here but I wasn't sure if it'd help generally :-) I'll go submit the fix with your reported-by and tested-by, thanks! Hmmm. I can't see how this can be a regression. I just traced the code all the way back to the original code drop of iwlwifi (b481de9c) and the value has always been zero. Or maybe I'm confused and this particular thing wasn't even a regression... Anyway, we'll need to test this patch more carefully, since the active-after-passive thing is more intended to be used for 5 GHz scanning. Created attachment 31862 [details]
patch to enable probe-after-rx on 2.4 GHz
Daniel, can you try this patch please? I hope it applies. It should be equivalent to your other patch, but I want to make totally sure before I add your tested-by tag.
Great - the new patch is tested and works as expected on 2.6.36-rc6. Thanks, Daniel Patch : https://bugzilla.kernel.org/attachment.cgi?id=31862 Handled-By : Johannes Berg <johannes.berg@intel.com> On Monday, October 11, 2010, Berg, Johannes wrote:
> Rafael,
>
> I believe that we have fixed the part of this bug that was a regression,
> namely being unable to connect to a non-hidden SSID, in 2.6.35 commit
> 3ee378d7e0dfdb9bb. Being able to connect to a hidden SSID I believe isn't a
> regression, and a patch for that is on its way upstream for 2.6.37. I think
> you should stop tracking this as a regression.
|