Bug 39082

Summary: Non-functional wireless-n on Intel Corporation Ultimate N WiFi Link 5300
Product: Drivers
Reporter: Del (delonly)
Component: network-wireless
Assignee: drivers_network-wireless (drivers_network-wireless)
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: high
CC: alan, linville, stf_xl, wey-yi.w.guy
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.38-8 and 3.0
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Syslog output

Description Del 2011-07-09 18:05:01 UTC
This bug seems to coincide with some of the reports in Bug 16691, which for reasons beyond me was closed.

The wireless works flawlessly in g-mode, so my current workaround is to add a file iwlagn.conf containing the line:
install iwlagn /sbin/modprobe --ignore-install iwlagn 11n_disable=1
to the /etc/modprobe.d/ folder.

In n-mode the throughput is frequently around 10KB/s. Reconnecting to the access point can boost it up to 7MB/s for a while, before it deteriorates again.

I have stock Kubuntu 11.04 installed with what seems to be the latest microcode from Intel. I have also tested kernel 3.0 from the upcoming Ubuntu 11.10, with the same abysmal performance on wireless-n, while g-mode works perfectly there too.

I am testing against a router running OpenWrt straight from trunk on a TP-Link TL-WR1043ND (ref. http://wiki.openwrt.org/toh/tp-link/tl-wr1043nd?s) with ath9k, which works nicely with the plethora of other wireless devices in the house (including a wireless-n Broadcom chip in a laptop and three Android phones).

iwconfig from the laptop (with 11n enabled):
wlan2     IEEE 802.11abgn  ESSID:"Openwrt"  
          Mode:Managed  Frequency:2.437 GHz  Access Point: zzzzzzzzzzzz   
          Bit Rate=120 Mb/s   Tx-Power=15 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=55/70  Signal level=-55 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:1  Invalid misc:50   Missed beacon:0

iwconfig from the router
wlan0     IEEE 802.11bgn  Mode:Master  Frequency:2.437 GHz  Tx-Power=27 dBm   
          RTS thr:off   Fragment thr:off
          Power Management:on


Output from /var/log/syslog after loading the iwlagn module with 11n enabled and network transfer running at abysmal speed is attached.
Comment 1 Del 2011-07-09 18:07:10 UTC
Created attachment 65102 [details]
Syslog output
Comment 2 Stanislaw Gruszka 2011-07-25 15:14:28 UTC
Your syslog contains:
[ 1069.085874] iwlagn 0000:08:00.0: low ack count detected, restart firmware

This problem is fixed by b7977ffaab5187ad75edaf04ac854615cea93828
"iwlwifi: add {ack,plpc}_check module parameters"

Regarding 3.0 not working, this could be another problem introduced somewhere between 2.6.38 and 3.0, or introduced earlier but masked by other problems. I suggest applying these three commits on top of 2.6.38 (or making sure they are already in):

> 42b70a5f6d18165a075d189d1bee82fad7cdbf29 "iwlagn: use cts-to-self protection
> on 5000 adapters series"
> bfd36103ec26599557c2bd3225a1f1c9267f8fcb "iwlagn: fix "Received BA when not
> expected""
> b7977ffaab5187ad75edaf04ac854615cea93828 "iwlwifi: add {ack,plpc}_check module
> parameters"

and see if that works. Then bisect, applying these 3 patches at each step. If 2.6.38 works, bisect between 2.6.38 and 3.0; if not, bisect between the latest working kernel and 2.6.38. I do not see any other way to solve this problem. Unfortunately this requires some skill in backporting patches (small help:
a backport of "iwlagn: use cts-to-self protection on 5000 adapters series" for 2.6.38 is here:
>
> http://pkgs.fedoraproject.org/gitweb/?p=kernel.git;a=blob_plain;f=iwlagn-use-cts-to-self-protection-on-5000-adapters-series.patch;h=6743f90571564265ae52ebd154920649ccfccd0f;hb=refs/heads/f15 
)
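
The bisection suggested above follows the usual binary-search idea. Below is a minimal Python sketch of that idea only, not of `git bisect` itself. The names `first_bad`, `revisions` and `is_good` are hypothetical; in practice `is_good` would stand for "check out this kernel revision, apply the three patches, build, boot, and test wireless-n by hand". It assumes the first revision is known good, the last is known bad, and that every revision after the first bad one is also bad.

```python
def first_bad(revisions, is_good):
    """Binary search for the first bad revision in an ordered list.

    Assumes revisions[0] is good, revisions[-1] is bad, and that once a
    revision is bad every later one is bad as well.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(revisions[mid]):
            lo = mid  # the breakage was introduced after mid
        else:
            hi = mid  # the breakage is at mid or earlier
    return revisions[hi]


# Hypothetical stand-in for real kernel revisions: r0..r9, broken from r6 on.
revs = ["r%d" % i for i in range(10)]
print(first_bad(revs, lambda r: int(r[1:]) < 6))  # prints r6
```

Each step halves the remaining range, so even a few thousand commits between two kernels need only a dozen or so build-and-test rounds.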
Comment 3 Del 2011-09-04 16:08:18 UTC
Thanks for your time! The laptop is finally in my hands again, so I can do more testing. The error message seems to be gone; I guess a kernel update for 11.04 has fixed it. Still, the problem persists, although it is more sporadic now. At times the wireless gets really slow or even halts completely. The messages from syslog when this happens now look like this:
Sep  4 17:51:22 sunny kernel: [ 4973.513793] iwlagn 0000:08:00.0: iwlagn_tx_agg_start on ra = d8:5d:4c:bb:22:b4 tid = 0
Sep  4 17:51:53 sunny kernel: [ 5004.614858] iwlagn 0000:08:00.0: Aggregation not enabled for tid 0 because load = 0
Sep  4 17:52:00 sunny kernel: [ 5011.212903] iwlagn 0000:08:00.0: Aggregation not enabled for tid 0 because load = 3
Sep  4 17:52:07 sunny kernel: [ 5018.336088] iwlagn 0000:08:00.0: Aggregation not enabled for tid 0 because load = 3
Sep  4 17:52:20 sunny kernel: [ 5030.906939] iwlagn 0000:08:00.0: Aggregation not enabled for tid 0 because load = 1
Sep  4 17:52:26 sunny kernel: [ 5037.585962] iwlagn 0000:08:00.0: Aggregation not enabled for tid 0 because load = 5

Reconnecting the wireless typically fixes the problem temporarily.
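
For anyone comparing similar logs, here is a minimal Python sketch that pulls the tid and load values out of the "Aggregation not enabled" messages. The regular expression is an assumption based only on the lines quoted above; other kernel versions may word the message differently. The function name `aggregation_refusals` is made up for this example.

```python
import re

# Message format assumed from the syslog lines quoted in this comment.
AGG_REFUSED = re.compile(
    r"iwlagn \S+: Aggregation not enabled for tid (\d+) because load = (\d+)"
)


def aggregation_refusals(lines):
    """Return a (tid, load) pair for every 'Aggregation not enabled' line."""
    hits = []
    for line in lines:
        match = AGG_REFUSED.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits


sample = [
    "Sep  4 17:51:22 sunny kernel: [ 4973.513793] iwlagn 0000:08:00.0: "
    "iwlagn_tx_agg_start on ra = d8:5d:4c:bb:22:b4 tid = 0",
    "Sep  4 17:51:53 sunny kernel: [ 5004.614858] iwlagn 0000:08:00.0: "
    "Aggregation not enabled for tid 0 because load = 0",
    "Sep  4 17:52:00 sunny kernel: [ 5011.212903] iwlagn 0000:08:00.0: "
    "Aggregation not enabled for tid 0 because load = 3",
]
print(aggregation_refusals(sample))  # prints [(0, 0), (0, 3)]
```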

I am not afraid of testing or back-porting patches, especially when given good pointers :)

I have also updated the router to the latest OpenWrt trunk after getting a bug fixed there; it is now running at the legal 20dBm.
Comment 4 Del 2011-09-04 17:12:57 UTC
Just checked with the latest beta of Kubuntu, running kernel 3.0.0-9. The problem with degrading and stalling wireless is still there, with the same log messages as right above. However, seriously degraded wireless speed also occurs with no log messages. All testing has been done with the power cable attached, i.e., with the power profile set to performance.
Comment 5 Del 2011-09-05 18:43:49 UTC
Did some more testing. Previously I had the router set to HT20; this evening I tried HT40-.

I have a Broadcom wireless device here too, which worked nicely (except when in power save) with HT40-.

The behaviour of iwlagn is still the same: periodically abysmal speed and stalls/disconnects during file download. One more log message surfaced during a slow connection:
Sep  5 20:32:55 sunny kernel: [ 2761.694324] iwlagn 0000:08:00.0: Aggregation not enabled for tid 0 because load = 5
Sep  5 20:32:57 sunny kernel: [ 2763.638620] iwlagn 0000:08:00.0: iwlagn_tx_agg_start on ra = d8:5d:4c:bb:22:b4 tid = 0
Sep  5 20:33:00 sunny kernel: [ 2767.268226] iwlagn 0000:08:00.0: Aggregation not enabled for tid 0 because load = 5
Sep  5 20:33:03 sunny kernel: [ 2770.345517] iwlagn 0000:08:00.0: iwlagn_tx_agg_start on ra = d8:5d:4c:bb:22:b4 tid = 0

During slow connection/stall I got the following:
root@sunny:~$ iw dev wlan2 link
Connected to d8:5d:4c:bb:22:b4 (on wlan2)
        SSID: Backfire
        freq: 2462
        RX: 771828164 bytes (508433 packets)
        TX: 20099314 bytes (229006 packets)
        signal: -47 dBm
        tx bitrate: 12.0 MBit/s

During normal operation with 4-7MB/s (yes, that is bytes not bits) or more, I get the following output (again with HT40-):
root@sunny:~$ iw dev wlan2 link
Connected to d8:5d:4c:bb:22:b4 (on wlan2)
        SSID: Backfire
        freq: 2462
        RX: 775054557 bytes (510605 packets)
        TX: 20193443 bytes (230085 packets)
        signal: -46 dBm
        tx bitrate: 150.0 MBit/s MCS 7 40Mhz short GI
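
The two `iw dev wlan2 link` outputs above differ mainly in the tx bitrate line: the stalled link reports a rate without an MCS index, the healthy one reports MCS 7. Below is a minimal Python sketch that extracts this from the text output. The function names are made up for this example, and the parsing assumes the line format shown in this comment, which other `iw` versions may not share. It also converts the reported 4-7MB/s into Mbit/s for comparison with the link rate.

```python
import re

# Line format assumed from the 'iw dev wlan2 link' output quoted above.
TX_BITRATE = re.compile(r"tx bitrate:[ \t]*([\d.]+)[ \t]*MBit/s(?:[ \t]+MCS[ \t]+(\d+))?")


def parse_tx_bitrate(iw_link_output):
    """Return (rate_in_mbit, mcs_index_or_None), or None if no rate is found."""
    match = TX_BITRATE.search(iw_link_output)
    if match is None:
        return None
    mcs = int(match.group(2)) if match.group(2) is not None else None
    return float(match.group(1)), mcs


def mbyte_to_mbit(mbyte_per_s):
    """Convert megabytes per second to megabits per second."""
    return mbyte_per_s * 8


stalled = """Connected to d8:5d:4c:bb:22:b4 (on wlan2)
        signal: -47 dBm
        tx bitrate: 12.0 MBit/s
"""
normal = """Connected to d8:5d:4c:bb:22:b4 (on wlan2)
        signal: -46 dBm
        tx bitrate: 150.0 MBit/s MCS 7 40Mhz short GI
"""

print(parse_tx_bitrate(stalled))  # prints (12.0, None)
print(parse_tx_bitrate(normal))   # prints (150.0, 7)
print(mbyte_to_mbit(4), mbyte_to_mbit(7))  # prints 32 56
```

So the 4-7MB/s seen during normal operation corresponds to 32-56 Mbit/s of payload on a link negotiated at 150 Mbit/s.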
Comment 6 wey-yi.w.guy 2011-09-06 14:48:03 UTC
Can you try this commit: dd5b6d0a2059027366028630746d951b1e1e24b3

commit dd5b6d0a2059027366028630746d951b1e1e24b3
Author: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Date:   Thu Aug 25 23:10:55 2011 -0700

    iwlagn: enable 11n aggregation without checking traffic load
    
    Enable HT aggregation when it reach reasonable traffic without
    checking traffic load which delay enabling the aggregation and lower
    the throughput
    
    but this behavior can be overwrite by module parameter
    
    this address
    https://bugzilla.kernel.org/show_bug.cgi?id=40042
    
    Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>

You can get it from John Linville's wireless-next tree.

Thanks
Wey
Comment 7 Del 2011-09-07 21:11:36 UTC
Thanks for the patch. I added the two lines manually at the correct location to the driver for the 2.6.38 kernel, then I got by with simply compiling the driver, and replacing the stock version of the driver. Please inform me if I need to apply the patch to a newer kernel.

From the testing I have done, it does seem to improve the stability of the wireless speed significantly. However, slow connections and stalls still occur, even with excellent signal strength. I am not sure how much help there is in the log messages either. Sometimes no messages appear while the wireless performance goes down the drain; other times there is the occasional line similar to this:
Sep  7 22:52:53 sunny kernel: [ 3104.096535] iwlagn 0000:08:00.0: iwlagn_tx_agg_start on ra = d8:5d:4c:bb:22:b4 tid = 0

Did catch this output during stalls too:
Sep  7 22:27:20 sunny kernel: [ 1571.040790] iwlagn 0000:08:00.0: iwlagn_tx_agg_start on ra = d8:5d:4c:bb:22:b4 tid = 0
Sep  7 22:27:31 sunny kernel: [ 1581.610055] iwlagn 0000:08:00.0: Queue 11 stuck for 10000 ms.
Sep  7 22:27:31 sunny kernel: [ 1581.610062] iwlagn 0000:08:00.0: On demand firmware reload
Sep  7 22:27:31 sunny kernel: [ 1581.645578] iwlagn 0000:08:00.0: Stopping AGG while state not ON or starting
Sep  7 22:27:31 sunny kernel: [ 1581.645587] iwlagn 0000:08:00.0: queue number out of range: 0, must be 10 to 19
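
A minimal Python sketch for spotting the stuck-queue events in a log like the one above. The message format is assumed from these quoted lines only, and the function name `stuck_queue_events` is made up for this example.

```python
import re

# Format assumed from the "Queue 11 stuck for 10000 ms." line quoted above.
STUCK = re.compile(r"\[\s*(\d+\.\d+)\] iwlagn \S+: Queue (\d+) stuck for (\d+) ms")


def stuck_queue_events(lines):
    """Return (kernel_time_s, queue, stuck_ms) for each stuck-queue message."""
    events = []
    for line in lines:
        match = STUCK.search(line)
        if match:
            events.append(
                (float(match.group(1)), int(match.group(2)), int(match.group(3)))
            )
    return events


sample = [
    "Sep  7 22:27:20 sunny kernel: [ 1571.040790] iwlagn 0000:08:00.0: "
    "iwlagn_tx_agg_start on ra = d8:5d:4c:bb:22:b4 tid = 0",
    "Sep  7 22:27:31 sunny kernel: [ 1581.610055] iwlagn 0000:08:00.0: "
    "Queue 11 stuck for 10000 ms.",
    "Sep  7 22:27:31 sunny kernel: [ 1581.610062] iwlagn 0000:08:00.0: "
    "On demand firmware reload",
]
print(stuck_queue_events(sample))  # prints [(1581.610055, 11, 10000)]
```

In the excerpt above, the stuck-queue message arrives roughly 10.6 seconds (kernel time) after the preceding `iwlagn_tx_agg_start` line and is immediately followed by the firmware reload.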

But as I stated, it has improved after the two-line patch you provided: I have transferred my 3.2GB test file three times without issues, and it never completed without issues prior to the patch. However, trying to copy the file once more, I got abysmal speed and needed to reconnect the wifi to return the speed to normal. Sorry for not being more helpful.
If I were to guess, I would say that the connection gets very fragile if the signal strength is not excellent. I will need more time to test that, though.
Comment 8 wey-yi.w.guy 2011-09-07 22:28:57 UTC
Could you also try the module parameter below?
"$ sudo modprobe iwlagn wd_disable=1"

I am not sure whether it will help or make things even worse, but it would be great if you could help me test it.

Thanks
Wey
Comment 9 Stanislaw Gruszka 2011-09-08 12:53:24 UTC
You can also check this Fedora patch, which is not yet applied upstream:
https://bugzilla.redhat.com/attachment.cgi?id=519535
Comment 10 Alan 2012-08-24 15:20:01 UTC
If this is still a problem with recent kernels, please re-open/update. Thanks.