Bug 78581

Summary: [ath9k_htc] AR9271 (TL-WN722N) does not pass any network traffic after successful authentication
Product: Drivers Reporter: Natrio (natrio)
Component: network-wireless Assignee: drivers_network-wireless (drivers_network-wireless)
Status: NEW ---    
Severity: normal CC: 747tke+4thw6c, ath9k-devel, berndkuhls, bugs, js, lightstream, linux, linville, muhomor.d, projectdelphai, radek, raquacontact, salva.liebana, stanislav.mamontov, tom, yeled.nova
Priority: P1    
Hardware: IA-32   
OS: Linux   
Kernel Version: 3.15.1 Tree: Mainline
Regression: No
Attachments: revert with manual fixup for 3.16

Description Natrio 2014-06-21 08:17:14 UTC
After a kernel upgrade from 3.14 to 3.15, the AR9271 (TL-WN722N) does not pass any network traffic after successful authentication.

In AP mode, the server's wlan0 interface (AR9271) appears to be UP, but the client's interface is in the NO-CARRIER,DORMANT state.
On the AP, hostapd shows a successful connection from the client:
hostapd[2409]: wlan0: STA c8:d1:5e:9a:45:3f IEEE 802.11: authenticated
hostapd[2409]: wlan0: STA c8:d1:5e:9a:45:3f IEEE 802.11: associated (aid 1)
hostapd[2409]: wlan0: STA c8:d1:5e:9a:45:3f RADIUS: starting accounting session 53A53168-00000004
hostapd[2409]: wlan0: STA c8:d1:5e:9a:45:3f WPA: pairwise key handshake completed (RSN)
hostapd[2409]: wlan0: STA c8:d1:5e:9a:45:3f IEEE 802.11: authenticated
hostapd[2409]: wlan0: STA c8:d1:5e:9a:45:3f IEEE 802.11: associated (aid 1)
hostapd[2409]: wlan0: STA c8:d1:5e:9a:45:3f RADIUS: starting accounting session 53A53168-00000005
hostapd[2409]: wlan0: STA c8:d1:5e:9a:45:3f WPA: pairwise key handshake completed (RSN)
but the client can't get an IP address, because the server doesn't receive any network requests.

Same situation in client mode of this device:
https://bbs.archlinux.org/viewtopic.php?pid=1428106

With linux-3.14.6 and linux-3.10.44 the bug does not occur with the same userspace software and the same configs.

Arch Linux, i686
Comment 1 Natrio 2014-06-21 12:12:54 UTC
ath9k_htc on x86_64: low speed
https://bugs.archlinux.org/task/40905
Comment 2 Martin Dratva 2014-06-21 21:29:05 UTC
I have the same HW; my wifi works, but at low speed. Downgrading the kernel solves the issue.
Comment 3 Damjan Georgievski 2014-06-24 18:29:43 UTC
I have the Tplink WN-722N hardware, ArchLinux 64bit, 3.15.1 kernel from repo.

From what I've seen, the wifi stick associates with the access point (I'm using WPA-PSK, btw). I can see the DHCP server on the AP receiving DHCP requests, and it replies, but on the PC tcpdump shows some garbled packets instead of the DHCP replies.

Setting the modprobe option "options ath9k_htc nohwcrypt=1" fixes the above issue, but I haven't tested performance.

ps.
the stick uses the htc_9271.fw firmware which is provided by the
linux-firmware-20140603.a4f3bc0-1 package
Comment 4 Natrio 2014-06-25 05:16:54 UTC
Thank you, Damjan Georgievski :)

(ArchLinux 32bit, 3.15.1 kernel from repo)
With "options ath9k_htc nohwcrypt=1" it works!
I tested it in AP mode, and the speed is now also normal: 20 Mbit/s download from the WN722N to an Android phone.
Comment 5 Natrio 2014-06-29 08:17:59 UTC
Possible buggy commit reported on Arch Linux bugtracker:
https://bugs.archlinux.org/task/40905#comment124815

>> Comment by I Said Socks (socks) - Saturday, 28 June 2014, 20:17 GMT
> Which ML should I post if (I think) I've identified the foul commit?
> Does the kernel bugzilla get attention from devs? (I don't have an account
> there either.)
> Anyway, I'll put the info here first for those concerned.
> This is my first bisect so I may or may not have done it right. For me, it's
> this commit that introduced the bug:

commit 88daf80dcca19ff995cc263592426f734a9702f3
Merge: 010d3c3 35582ad
Author: John W. Linville <linville@tuxdriver.com>
Date: Thu Feb 20 15:02:02 2014 -0500

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem

> Interestingly, neither of the parents has the bug.
> Please verify / resend. Thanks.
 
> Edit: sent mail, not sure if the right places >.>
> http://permalink.gmane.org/gmane.linux.kernel.wireless.general/125277
Comment 6 John W. Linville 2014-06-30 14:15:13 UTC
Maybe we are the victims of some cleanups or other patch bombing run that came through the net tree?

> git diff --stat 35582ad9d342..88daf80 net/wireless/ net/mac80211/
> drivers/net/wireless/
 drivers/net/wireless/ath/ath5k/phy.c           |  2 +-
 drivers/net/wireless/ath/wil6210/pcie_bus.c    | 32 +++++++++++++++-----------------
 drivers/net/wireless/hostap/hostap_proc.c      |  2 +-
 drivers/net/wireless/iwlwifi/dvm/mac80211.c    | 22 ++++++++++++++++++++--
 drivers/net/wireless/iwlwifi/iwl-drv.c         |  2 +-
 drivers/net/wireless/iwlwifi/iwl-modparams.h   | 11 +++++++----
 drivers/net/wireless/iwlwifi/mvm/mac80211.c    | 22 ++++++++++++++++++++--
 drivers/net/wireless/mwifiex/main.c            |  2 +-
 drivers/net/wireless/rtl818x/rtl8187/rtl8187.h | 10 ++++++++--
 drivers/net/wireless/rtlwifi/ps.c              |  2 +-
 drivers/net/wireless/rtlwifi/rtl8192ce/hw.c    | 18 ++++++++++++++++--
 net/mac80211/iface.c                           |  6 ++++--
 net/wireless/chan.c                            |  2 --
 13 files changed, 95 insertions(+), 38 deletions(-)
Comment 7 John W. Linville 2014-06-30 14:17:02 UTC
I don't see any likely candidates in that set of diffs...
Comment 8 Christopher Waid 2014-07-22 18:42:23 UTC
I can confirm this bug applies to 3.15.1 through 3.15.3 with our USB N adapters (Penguin Wireless N adapters TPE-N150USB and TPE-N150USBL, which use the same chipset).

Here is what's worked for us and some of our customers with these two adapters:

Open up a terminal and use your favourite text editor to add the following line to /etc/modprobe.d/ath9k_htc.conf:

1. Run the following from a terminal (or the equivalent for your favourite distribution):

sudo nano /etc/modprobe.d/ath9k_htc.conf

2. Add the following line and save:

options ath9k_htc nohwcrypt=1

3. Reboot
Comment 9 Johannes Stezenbach 2014-09-08 17:18:02 UTC
This issue is still present in 3.16.2, so I bisected it:

# first bad commit: [341b29b9cd2fa470f2a2a55d7ef07cc167be93da] ath9k_htc: use ath9k_cmn_rx_skb_postprocess

I tried to revert this on top of 3.16, but this did not fix it.
Any ideas what to try next?
Comment 10 Johannes Stezenbach 2014-09-08 19:41:04 UTC
After some more testing, reverting these two commits
on top of 3.15 fixes the issue.  The reverts do not apply cleanly
on 3.16, but with manual fixup they also worked.

341b29b9cd2f ath9k_htc: use ath9k_cmn_rx_skb_postprocess
c8ec0f5c9bc4 ath9k_htc: remove useless memcpy
Comment 11 Johannes Stezenbach 2014-09-08 19:42:00 UTC
Created attachment 149501
revert with manual fixup for 3.16
Comment 12 Bernd Kuhls 2014-09-08 20:22:18 UTC
(In reply to Johannes Stezenbach from comment #11)
> Created attachment 149501
> revert with manual fixup for 3.16

Hi, I also suffered from this bug with kernel 3.16.1 and I am happy to say that your patch fixes the problem using

Bus 001 Device 009: ID 0cf3:7015 Atheros Communications, Inc. TP-Link TL-WN821N v3 802.11n [Atheros AR7010+AR9287]
Bus 001 Device 008: ID 0cf3:7015 Atheros Communications, Inc. TP-Link TL-WN821N v3 802.11n [Atheros AR7010+AR9287]
Comment 13 Oleksij Rempel 2014-09-08 20:51:13 UTC
Hi,

Can you please provide more information about the encryption used?
I can't reproduce this issue with my AP.
And can you please test the latest firmware? There is one hwcrypt-related fix. Precompiled blobs can be found here:
https://github.com/olerem/ath9k-htc-firmware-blob
Comment 14 Johannes Stezenbach 2014-09-09 05:25:50 UTC
I was using the htc_9271.fw firmware from linux-firmware git repo,
with a TL-WN722N and WPA2.  Now I tried your new firmware: works.
Went back to the old firmware: still works!?!
(I tried a few times and unplugged the TL-WN722N before each try.)

Is it possible your new firmware changes something permanently
in the device, e.g. EEPROM settings?
Comment 15 Oleksij Rempel 2014-09-09 06:19:10 UTC
(In reply to Johannes Stezenbach from comment #14)
> Is it possible your new firmware changes something permanently
> in the device, e.g. EEPROM settings?

No.
Comment 16 Johannes Stezenbach 2014-09-09 08:02:43 UTC
OK, while it was surprising to me that it works today with
unpatched 3.16.2 and old firmware (previously it didn't even
get a DHCP response in many, many tries, so it was easy to bisect),
I noticed the connection is slow.
Simple test downloading a tar.xz from kernel.org via 16 Mbit/s DSL:

unpatched, nohwcrypt=0: ~80KB/s
unpatched, nohwcrypt=1: ~1.6MB/s
patched,   nohwcrypt=0: ~1.7MB/s
Comment 17 Oleksij Rempel 2014-09-09 08:19:45 UTC
What about the new firmware?
Comment 18 Johannes Stezenbach 2014-09-09 11:12:28 UTC
The results are the same with both old and new firmware.
Comment 19 Johannes Stezenbach 2014-09-09 12:21:32 UTC
For lack of better ideas I rebooted: Now I'm back to
the original behaviour where I get no DHCP response
with unpatched 3.16.2 and nohwcrypt=0.
No difference between old and new firmware.

Previously I tried "modprobe ath9k_htc -r", which also unloads
the mac80211 and cfg80211 modules, and then replugged the TL-WN722N.
This did not have the same effect as a reboot.
Comment 20 Oleksij Rempel 2014-09-10 19:15:12 UTC
I'll try to investigate this issue in the next few days. Right now I need to bisect a regression in the current wireless master branch.
Comment 21 Johannes Stezenbach 2014-09-11 08:38:03 UTC
I read a bit through the code and found one thing which explains
both the issue and the random reproducibility behaviour in
comments 14 and 16.  Currently I can't reboot to fully confirm;
I only did the speed test as in comment 16. (After overnight
hibernate I'm back to the state as in comment 14.)

--- a/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c
+++ b/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c
@@ -978,7 +978,7 @@ static bool ath9k_rx_prepare(struct ath9k_htc_priv *priv,
        struct ath_hw *ah = common->ah;
        struct ath_htc_rx_status *rxstatus;
        struct ath_rx_status rx_stats;
-       bool decrypt_error;
+       bool decrypt_error = false;

        if (skb->len < HTC_RX_FRAME_HEADER_SIZE) {
                ath_err(common, "Corrupted RX frame, dropping (len: %d)\n",

This one change on top of v3.16 seems to fix it.
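
To illustrate why the missing initializer matters, here is a minimal standalone sketch. It assumes, as the diff above suggests, that the accept helper only ever sets the flag to true on a failure and never clears it. The names rx_accept_sketch and rx_prepare_sketch are hypothetical stand-ins, not the driver's actual functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a helper such as ath9k_cmn_rx_accept().
 * Assumption: it writes *decrypt_error only when a decryption failure is
 * detected and leaves the caller's variable untouched otherwise.
 */
bool rx_accept_sketch(bool hw_decrypt_failed, bool *decrypt_error)
{
	assert(decrypt_error != NULL);

	if (hw_decrypt_failed)
		*decrypt_error = true;

	return true; /* the frame is accepted either way in this sketch */
}

/*
 * Hypothetical stand-in for the caller. Returns true when the frame may be
 * treated as successfully decrypted by the hardware.
 */
bool rx_prepare_sketch(bool hw_decrypt_failed)
{
	/*
	 * The initializer is the whole fix. Without it the variable holds an
	 * indeterminate stack value, so a good frame could be flagged as a
	 * decrypt error depending on what was on the stack before.
	 */
	bool decrypt_error = false;

	if (!rx_accept_sketch(hw_decrypt_failed, &decrypt_error))
		return false;

	return !decrypt_error;
}
```

With the initializer in place, rx_prepare_sketch(false) reports a clean frame and rx_prepare_sketch(true) reports a decrypt error. A result that depends on leftover stack contents would also be consistent with the behaviour changing across reboots and hibernation.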

Another thing caught my eye, it is unrelated to this bug but
I'm reporting it here for my convenience ;-)
In ath9k_cmn_rx_skb_postprocess():

	hdrlen = ieee80211_get_hdrlen_from_skb(skb);
	fc = hdr->frame_control;
	padpos = ieee80211_hdrlen(fc);

Now padpos == hdrlen, except ieee80211_get_hdrlen_from_skb()
checks the skb->len and might fail.  This is confusing.  If the
skb->len check is needed, shouldn't it be done before calling
ath9k_cmn_rx_accept()?
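
A simplified sketch of the difference between the two calls, for readers who don't have the mac80211 helpers in mind. It models data frames only (management and control frames are out of scope), uses plain integers instead of an skb, and the function names are hypothetical. The bit masks follow the IEEE 802.11 frame control layout in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame control bits (IEEE 802.11), host byte order. */
#define SK_FCTL_FTYPE     0x000c
#define SK_FTYPE_DATA     0x0008
#define SK_STYPE_QOS_DATA 0x0080
#define SK_FCTL_TODS      0x0100
#define SK_FCTL_FROMDS    0x0200

/*
 * Header length derived from the frame control field alone, in the spirit
 * of ieee80211_hdrlen(). Only data frames are modelled in this sketch.
 */
unsigned int hdrlen_from_fc_sketch(uint16_t fc)
{
	unsigned int len = 24; /* basic three-address header */

	if ((fc & SK_FCTL_FTYPE) == SK_FTYPE_DATA) {
		/* ToDS and FromDS both set: a fourth address follows */
		if ((fc & (SK_FCTL_TODS | SK_FCTL_FROMDS)) ==
		    (SK_FCTL_TODS | SK_FCTL_FROMDS))
			len += 6;
		/* QoS data frames carry a 2-byte QoS control field */
		if (fc & SK_STYPE_QOS_DATA)
			len += 2;
	}
	return len;
}

/*
 * Length-checked variant, in the spirit of ieee80211_get_hdrlen_from_skb():
 * returns 0 when the frame is too short to hold the header.
 */
unsigned int hdrlen_checked_sketch(uint16_t fc, size_t frame_len)
{
	unsigned int len;

	assert(frame_len <= SIZE_MAX / 2); /* sanity check for the sketch */

	if (frame_len < 10)
		return 0;

	len = hdrlen_from_fc_sketch(fc);
	if (len > frame_len)
		return 0;

	return len;
}
```

For a QoS data frame (fc 0x0088) with only 20 bytes received, the checked variant returns 0 while the unchecked one returns 26, which is the case where padpos and hdrlen disagree.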
Comment 22 Oleksij Rempel 2014-09-12 06:22:22 UTC
Hi Johannes,

Can you please test your suggestion? Right now I can't reproduce this issue; my AR9271 gets a stable 12 MB/s for RX or for TX. In both directions at the same time, TX is not so good.
Comment 23 Johannes Stezenbach 2014-09-12 12:35:24 UTC
I rebooted and retested and can confirm the one-liner fixes the issue.
Comment 24 Oleksij Rempel 2014-09-12 13:15:55 UTC
Great!
Can you please provide a patch and send it to the following addresses:
linville@tuxdriver.com
linux-wireless@vger.kernel.org
ath9k-devel@lists.ath9k.org
Comment 25 Salvador 2016-01-31 22:19:42 UTC
Same issue in Lubuntu 15.10, 64-bit. I need help, please... I can't use Lubuntu without an internet connection. My onboard network device connects perfectly, but it doesn't get the signal that my TL-WN722N with the ath9k_htc driver does. The TL-WN722N connects, but the speed is very slow, only 1mb/s. I tried "options ath9k_htc nohwcrypt=1", but it did not resolve the issue. How do I change the firmware? I don't know. All the other users seem to have resolved the problem with "options ath9k_htc nohwcrypt=1", but that is not my situation. Sorry for my poor English...
Comment 26 Salvador 2016-01-31 22:21:17 UTC
hahahaha a lot of grammar errors
Comment 27 Salvador 2016-01-31 22:21:58 UTC
my email is salva.liebana@gmail.com