Bug 58561 - iwlwifi stalling on "changed bandwidth" events
Status: CLOSED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: Wireless
Hardware: All Linux
Importance: P1 normal
Assignee: networking_wireless@kernel-bugs.osdl.org
URL: https://bbs.archlinux.org/viewtopic.p...
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-20 21:06 UTC by Kai Hendry
Modified: 2014-02-19 22:43 UTC

See Also:
Kernel Version: 3.9.2
Subsystem:
Regression: No
Bisected commit-id:



Description Kai Hendry 2013-05-20 21:06:29 UTC
A couple of Arch Linux users are experiencing this on 3.9.x:
https://bbs.archlinux.org/viewtopic.php?pid=1275620
Comment 1 Ryan Davis 2013-06-21 12:05:46 UTC
Same issue under Debian Sid with an Atheros AR9285.
Booting a 3.8 kernel solves the problem.
Comment 2 Cormac Cannon 2013-06-25 07:52:12 UTC
I don't think this is an iwlwifi issue -- I'm having the same issue with a Ralink USB wifi adapter (rt2800usb driver) on Arch Linux running a 3.9.x kernel.  I wonder if it might be related to this mac80211 patch (not that I have any idea what I'm talking about :)

https://patchwork.kernel.org/patch/2124351/
Comment 3 Ryan Davis 2013-06-25 11:37:39 UTC
Ooooh, that looks promising.
If it was an issue prompted by how certain routers interact with the driver, it would explain why it wasn't caught...
Comment 4 Cormac Cannon 2013-06-25 12:01:02 UTC
Should we change the bug description to something more general so? Or is that up to a moderator to do?

'Wireless connections stalling on "changed bandwidth" events' maybe? 

Reverting to the latest 3.8.x kernel solves the problem for me too, BTW.
Comment 5 Ryan Davis 2013-06-28 06:00:55 UTC
Dunno... but this is a regression, since it worked in an earlier version and reverting restores function. I can only get so far into that patch because I don't understand exactly what it is supposed to do, but what seems to be happening is this:

Computer makes contact with access point.
AP says 'Use this channel over here, it's cooler.'
Computer says 'That's not OK' and disconnects with the error message.

The old behavior (if I am reading the comments correctly) was to ignore the request to change and keep the same channel regardless.

So... not sure how to debug this beyond that point.
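
The sequence described in this comment can be sketched as a toy model. This is only an illustration of the comment's reading of the behaviour, not kernel code: the function names, the `strict` flag and the list of known widths are all hypothetical, and the thread later suggests the change being announced is a bandwidth change rather than a move to another channel.

```python
# Hypothetical, simplified model of a station reacting to a beacon.
# None of these names are kernel identifiers.
KNOWN_WIDTHS_MHZ = (20, 40, 80, 160)


def is_usable(width_mhz, max_width_mhz):
    """Assumed check: the width is a known one and within what we can do."""
    return width_mhz in KNOWN_WIDTHS_MHZ and width_mhz <= max_width_mhz


def react_to_beacon(current_width_mhz, advertised_width_mhz, max_width_mhz, strict):
    """Return (action, width in use afterwards); None means the link is gone."""
    if advertised_width_mhz == current_width_mhz:
        return ("keep", current_width_mhz)
    if is_usable(advertised_width_mhz, max_width_mhz):
        return ("switch", advertised_width_mhz)
    # The two behaviours discussed in this thread differ only in this branch.
    if strict:
        return ("disconnect", None)        # what 3.9.x appears to do
    return ("ignore", current_width_mhz)   # the older behaviour, as read here
```

With `strict=True` an unusable announcement drops the link; with `strict=False` the station stays on its current configuration, which matches the symptom that reverting to 3.8.x restores connectivity.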
Comment 6 Mohammad AlSaleh 2013-06-30 22:45:10 UTC
Bug still present in 3.10-rc7.
Comment 7 Oleksii Shevchuk 2013-07-28 11:08:57 UTC
Same in 3.10.1
Comment 8 Oleksii Shevchuk 2013-07-28 11:18:52 UTC
--- /usr/src/linux-3.10/net/mac80211/mlme.c.orig	2013-07-28 14:16:49.840827598 +0300
+++ /usr/src/linux-3.10/net/mac80211/mlme.c	2013-07-28 14:17:59.517462466 +0300
@@ -446,9 +446,9 @@
 				      IEEE80211_STA_DISABLE_160MHZ)) ||
 	    !cfg80211_chandef_valid(&chandef)) {
 		sdata_info(sdata,
-			   "AP %pM changed bandwidth in a way we can't support - disconnect\n",
+			   "AP %pM changed bandwidth in a way we can't support - ignoring\n",
 			   ifmgd->bssid);
-		return -EINVAL;
+		return 0;
 	}
 
 	switch (chandef.width) {

Works for me
Comment 10 Stanislaw Gruszka 2013-07-28 12:57:50 UTC
> http://permalink.gmane.org/gmane.linux.kernel.wireless.general/111214
This patch seems to be unrelated to the bug reported here.

I think the problem here is that the AP sends incorrect HT information in its beacon frames, similar to this RH bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=981445

Johannes proposed a patch for that issue (similar to the patch from comment 8):
http://p.sipsolutions.net/9d1dd0734d2c3a7a.txt
Comment 11 Dmitry S. Demin 2013-10-01 11:47:01 UTC
Same problem on Gentoo with kernel 3.10.13 and a Zyxel Keenetic Ultra AP.
Johannes' patch didn't help.
The patch from comment #8 helped. Thanks, Oleksii.
Comment 12 Emmanuel Grumbach 2013-12-08 09:02:14 UTC
Can this be closed?
Comment 13 Nate Carlson 2014-02-18 15:39:53 UTC
Hi,

I see this is closed as 'CLOSED CODE_FIX', but I am having this issue on the following kernels:
* Ubuntu's 3.11.0-17-generic kernel
* 3.12.11
* Ubuntu's 3.11.0-17-generic with the compat drivers from linux-next dated 2014/Feb-10.

I am using an Intel 7260 card against a Cisco 3702i controller-based AP.

I get the following immediately on trying to connect to an SSID on this controller:

Feb 18 09:16:47 knight kernel: [   56.802562] wlan1: AP 18:e7:28:42:5e:25 changed bandwidth, new config is 5180 MHz, width 2 (5190/0 MHz)
Feb 18 09:16:47 knight kernel: [   56.802571] wlan1: AP 18:e7:28:42:5e:25 changed bandwidth in a way we can't support - disconnect

I applied the patch at http://p.sipsolutions.net/9d1dd0734d2c3a7a.txt (listed in Comment 10), and it resolved the issue for me:

Feb 18 09:25:30 knight kernel: [  579.782099] wlan1: AP 18:e7:28:42:5e:25 seems to have broken HT/VHT support, disable bandwidth tracking

..so I'm curious if there was actually a fix applied to the kernel for this?

Note that I am using an enterprise-grade AP, and have support with Cisco for it, so if someone can explain what the AP is doing wrong, I can open a ticket with them to try to get it fixed. But having a working workaround in the kernel would also be nice, since we all know how long that can take..  ;)
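
For anyone trying to read the first log line quoted in this comment, here is a small sketch that pulls the numbers out of it. The regular expression is written against that one line only, and the width table is my reading of `enum nl80211_chan_width` in the kernel's nl80211.h (codes 0 to 5), so treat both as assumptions; `parse_bandwidth_change` is a hypothetical helper name.

```python
import re

# Width codes as I understand enum nl80211_chan_width (assumption, codes 0-5 only).
CHAN_WIDTHS = {
    0: "20 MHz (no HT)",
    1: "20 MHz",
    2: "40 MHz",
    3: "80 MHz",
    4: "80+80 MHz",
    5: "160 MHz",
}

# Pattern modelled on the "changed bandwidth, new config is ..." line above.
LOG_RE = re.compile(
    r"(?P<iface>\w+): AP (?P<bssid>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) "
    r"changed bandwidth, new config is (?P<control>\d+) MHz, "
    r"width (?P<width>\d+) \((?P<cf1>\d+)/(?P<cf2>\d+) MHz\)"
)


def parse_bandwidth_change(line):
    """Return the fields of a 'changed bandwidth' line, or None if no match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    code = int(m.group("width"))
    return {
        "iface": m.group("iface"),
        "bssid": m.group("bssid"),
        "control_mhz": int(m.group("control")),
        "width": CHAN_WIDTHS.get(code, "unknown code %d" % code),
        "center1_mhz": int(m.group("cf1")),
        "center2_mhz": int(m.group("cf2")),
    }
```

Read this way, the line says the AP announced a 40 MHz channel with control frequency 5180 MHz and centre frequency 5190 MHz, and the next line is the station refusing that configuration.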
Comment 14 Luca Coelho 2014-02-19 07:38:21 UTC
It seems that Johannes never submitted this patch.  Not as it is, at least.

And this is not really an iwlwifi bug, but a mac80211 bug (actually an AP bug that needs to be worked around in mac80211).

I have pinged Johannes and he promised to take a look into this soon.
Comment 15 Johannes Berg 2014-02-19 07:50:18 UTC
We applied this patch at the time: http://p.sipsolutions.net/06427e0e6847cf9b.txt which is in the kernel starting from 3.11.

I think your bug report actually confirms my suspicion that my patch 9d1dd... wasn't actually the right thing to do.

Please open a new bug with your information (I'm pretty sure it's different and would like to keep this one with the old info) and also give me the output of

 iw wlan0 scan dump -b
and
 iw wlan0 scan dump -u

for the AP in question (only that BSSID is fine).

(unfortunately, it seems I didn't make it possible to specify both -u and -b)
Comment 16 Johannes Berg 2014-02-19 07:53:59 UTC
Maybe you could instead do

 iw wlan0 interface add moni0 type monitor flags none
 ip link set moni0 up
 tcpdump -i moni0 -s0 -w /tmp/assoc.pcap

to capture the frames exchanged during association - I could see more precisely what's going on than even in the scan results.
Comment 17 Johannes Berg 2014-02-19 07:55:01 UTC
Last request, I hope - it would be good to have the output of "iw list" of your device, and maybe "iw event -t -f" during association as well.
Comment 18 Nate Carlson 2014-02-19 08:59:57 UTC
Thanks -- I'll open up a new bug when I'm at the office with the requested info (along with details on setup/etc of course), and toss the link in here.  :)
Comment 19 Nate Carlson 2014-02-19 22:43:49 UTC
New bug opened - https://bugzilla.kernel.org/show_bug.cgi?id=70881

Thanks!
