Bug 198351

Summary: iwlwifi: 9260: very slow WAN-LAN transfers at 5Ghz
Product: Drivers Reporter: sergeidanilov (sergeidanilov)
Component: network-wireless Assignee: DO NOT USE - assign "network-wireless-intel" component instead (linuxwifi)
Status: CLOSED CODE_FIX    
Severity: high CC: acelan, chenhan.hsiao.tw, hancockrwd, luca, minori, sergeidanilov, stf_xl
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 4.15.0-rc6 Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg output
trace-cmd output while download file at 5Ghz
2.4Ghz over ftp
Air sniff during downloading file from nas connected through ethernet
Air sniff during google speedtest execution in browser
Re-uploaded lan-wan transfer with amsdu_size=3
lan-wan transfer with 160Mhz disabled
iw dev scan result of Huawei E5885 (affected by bug)
iw dev scan result of D-Link DIR-806A (not affected by bug)
Missing patch

Description sergeidanilov 2018-01-04 12:00:36 UTC
Created attachment 273399 [details]
dmesg output

I recently updated my 8260 card to 9260 in laptop.

After that, the download speed from my NAS, which is connected over Ethernet to the router, dropped significantly at 5GHz.
It was around 40MB/s with the 8260 and dropped to 4MB/s with the 9260.

Download speeds from the internet (Google Drive) fully saturate my internet connection, e.g. ~15MB/s.
So it looks like the issue exists only for local WAN-LAN connections.

Also, the download speed at 2.4GHz is around 10MB/s, which is faster than the 5GHz one.

I'm using the most recent firmware version, 34.0.0,
and the most recent kernel, 4.15-rc6 (I got the same behavior on 4.14.10 though).
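
Since this report quotes throughput in MB/s while later comments use Mbit/s, a small conversion helper may make the figures easier to compare. This is only an illustrative sketch; the function name `mbytes_to_mbits` is hypothetical and the rates are the ones quoted in the description above.

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert megabytes per second to megabits per second (1 byte = 8 bits)."""
    return mbytes_per_s * 8


# Rates quoted in the description, in MB/s.
REPORTED = [
    ("8260 at 5GHz", 40),
    ("9260 at 5GHz", 4),
    ("9260 at 2.4GHz", 10),
    ("internet download", 15),
]

for label, rate in REPORTED:
    print(f"{label}: {rate} MB/s = {mbytes_to_mbits(rate):.0f} Mbit/s")
```

On this scale the 9260 at 5GHz is delivering roughly a tenth of what the 8260 did on the same link.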
Comment 1 sergeidanilov 2018-01-04 12:29:35 UTC
Created attachment 273401 [details]
trace-cmd output while download file at 5Ghz

Added output of
sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e iwlwifi_msg
while downloading file at 5Ghz over ftp
Comment 2 sergeidanilov 2018-01-04 12:30:35 UTC
Created attachment 273403 [details]
2.4Ghz over ftp

Added output of
sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e iwlwifi_msg
while downloading file at 2.4Ghz over ftp
Comment 3 AceLan Kao 2018-01-10 02:48:35 UTC
I'm not sure I have the same issue as yours.
From my observation, the speed issue only happens on tx.
And I also did the test you mentioned to measure the internet speed and have the same result as yours.

I use speedtest-cli to measure the internet speed
   Download/Upload = 69.41 Mbits/s / 8.61 Mbit/s

And use iperf to measure intranet speed with a server connected over ethernet cable
   Download/Upload = 78.6 Mbits/s / 1.51 Mbit/s

So, it looks like the speed issue only happens on intranet.

acelan@u-Kabylake-Client-platform ~ % speedtest-cli                 
Retrieving speedtest.net configuration...
Testing from Taiwan Internet Gateway (175.41.48.77)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Asia Pacific Telecom (Taipei) [3.82 km]: 1.635 ms
Testing download speed................................................................................
Download: 69.41 Mbit/s
Testing upload speed................................................................................................
Upload: 8.61 Mbit/s 

acelan@u-Kabylake-Client-platform ~ % iperf -c 10.101.46.219 -er                                                         
------------------------------------------------------------
Server listening on TCP port 5001 with pid 27531
Read buffer size: 1.44 KByte
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.101.46.219, TCP port 5001 with pid 27531
Write buffer size:  128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  5] local 10.101.46.137 port 53328 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-13.16 sec  2.38 MBytes  1.51 Mbits/sec  1/0         86       15K/137450 us
[  4] local 10.101.46.137 port 5001 connected with 10.101.46.219 port 51230
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.21 sec  95.8 MBytes  78.6 Mbits/sec  69327    212:96:139:89:972:364:197:67258
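
The bidirectional iperf output above can be hard to read at a glance. The following is a minimal sketch, assuming the summary-line layout shown in this comment, that pulls the bandwidth per stream ID out of such output. The names `SUMMARY`, `TO_MBITS` and `parse_iperf` are hypothetical helpers, not part of iperf. Following the reading given in this comment, ID 5 is the client (upload) stream and ID 4 the reverse (download) stream.

```python
import re

# Matches an iperf summary line such as:
# [  5] 0.00-13.16 sec  2.38 MBytes  1.51 Mbits/sec ...
SUMMARY = re.compile(
    r"\[\s*(?P<id>\d+)\]\s+[\d.]+-[\d.]+ sec\s+"
    r"[\d.]+ [KMG]?Bytes\s+"
    r"(?P<rate>[\d.]+) (?P<unit>[KMG]?)bits/sec"
)

# Multipliers that normalize every rate to Mbit/s.
TO_MBITS = {"": 1e-6, "K": 1e-3, "M": 1.0, "G": 1e3}


def parse_iperf(text):
    """Return {stream_id: bandwidth in Mbit/s} for each summary line found."""
    rates = {}
    for match in SUMMARY.finditer(text):
        rates[int(match["id"])] = float(match["rate"]) * TO_MBITS[match["unit"]]
    return rates


SAMPLE = """\
[  5] 0.00-13.16 sec  2.38 MBytes  1.51 Mbits/sec  1/0         86       15K/137450 us
[  4] 0.00-10.21 sec  95.8 MBytes  78.6 Mbits/sec  69327    212:96:139:89:972:364:197:67258
"""

rates = parse_iperf(SAMPLE)
print(f"upload (ID 5): {rates[5]} Mbit/s, download (ID 4): {rates[4]} Mbit/s")
```

Header lines and the "connected with" lines are skipped because they do not match the interval/transfer/bandwidth pattern.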
Comment 4 Emmanuel Grumbach 2018-01-15 08:50:26 UTC
Does this help?


diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
index c2388694979f..2f6f94f65160 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
@@ -470,7 +470,7 @@ int iwl_mvm_mac_setup_register(struct iwl_mvm *mvm)
        ieee80211_hw_set(hw, CHANCTX_STA_CSA);
        ieee80211_hw_set(hw, SUPPORT_FAST_XMIT);
        ieee80211_hw_set(hw, SUPPORTS_CLONED_SKBS);
-       ieee80211_hw_set(hw, SUPPORTS_AMSDU_IN_AMPDU);
+//     ieee80211_hw_set(hw, SUPPORTS_AMSDU_IN_AMPDU);
        ieee80211_hw_set(hw, NEEDS_UNIQUE_STA_ADDR);
 
        if (iwl_mvm_has_tlc_offload(mvm)) {
Comment 5 sergeidanilov 2018-01-15 10:14:40 UTC
yes, that helps!
thanks!
I get full speed now
Comment 6 Emmanuel Grumbach 2018-01-15 10:45:47 UTC
*** Bug 198417 has been marked as a duplicate of this bug. ***
Comment 7 Emmanuel Grumbach 2018-01-15 10:47:35 UTC
This is really not a patch we can apply.

This was a debug step. We do not want to disable AMSDU in AMPDU.

Do you have a separate Linux machine with an Intel wireless device?
I'd like you to capture an air sniffer to see what happens.


Also, does it happen without security configured?
Comment 8 sergeidanilov 2018-01-16 06:26:14 UTC
Yes,
I have one more machine with Intel wifi.
Should I just sniff traffic while downloading a file?

Speed becomes a little better when I turn off security.
It's around 5-6MB/s, but still far from the 60MB/s I get with AMSDU turned off.
Comment 9 Emmanuel Grumbach 2018-01-16 06:31:53 UTC
Yes please, sniff while downloading a file.

Please read: https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi#about_the_monitorsniffer_mode

Please sniff with security disabled so that we can see what happens inside the packet.

Note: this means that we will be able to see pretty much everything inside the packet. Please transfer data that you don't mind others seeing, or encrypt it using our PGP keys.

The data will be fairly big but you can compress it fairly easily.
Comment 10 AceLan Kao 2018-01-16 06:43:56 UTC
Emmanuel,

Sorry, the fix doesn't help on my side, and it looks like things are getting worse.
I can get some good results with v4.15-rc8, but after applying the patch, the results are all bad.

Before starting the test, I compiled and tried the latest v4.15-rc8 kernel. Here are the results; the speed is not very stable.
acelan@u-Kabylake-Client-platform ~ % iperf -c 10.101.46.219 -er
------------------------------------------------------------
Server listening on TCP port 5001 with pid 1475
Read buffer size: 1.44 KByte
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.101.46.219, TCP port 5001 with pid 1475
Write buffer size:  128 KByte
TCP window size:  110 KByte (default)
------------------------------------------------------------
[  5] local 10.101.46.137 port 35228 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-11.00 sec   896 KBytes   667 Kbits/sec  1/0         65        7K/242977 us
[  4] local 10.101.46.137 port 5001 connected with 10.101.46.219 port 47680
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.30 sec   104 MBytes  84.6 Mbits/sec  75198    106:106:264:356:741:410:398:72817

I tested again and again; the results are pretty much the same as below.
acelan@u-Kabylake-Client-platform ~ % iperf -c 10.101.46.219 -er
------------------------------------------------------------
Server listening on TCP port 5001 with pid 1592
Read buffer size: 1.44 KByte
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.101.46.219, TCP port 5001 with pid 1592
Write buffer size:  128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  5] local 10.101.46.137 port 35254 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-10.50 sec  39.9 MBytes  31.9 Mbits/sec  1/0        233      197K/9774 us
[  4] local 10.101.46.137 port 5001 connected with 10.101.46.219 port 47744
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.95 sec  55.1 MBytes  42.2 Mbits/sec  39902    8:20:67:102:391:544:529:38241

After applying the patch (on v4.15-rc8), the results are bad and similar to the symptom I described in comment #3.
acelan@u-Kabylake-Client-platform ~ % iperf -c 10.101.46.219 -er
------------------------------------------------------------
Server listening on TCP port 5001 with pid 1495
Read buffer size: 1.44 KByte
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.101.46.219, TCP port 5001 with pid 1495
Write buffer size:  128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  5] local 10.101.46.137 port 36442 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-10.67 sec   512 KBytes   393 Kbits/sec  1/0         36        8K/2403 us
[  4] local 10.101.46.137 port 5001 connected with 10.101.46.219 port 48456
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.41 sec  99.0 MBytes  79.8 Mbits/sec  71658    255:84:205:259:402:626:249:69578

acelan@u-Kabylake-Client-platform ~ % iperf -c 10.101.46.219 -er
------------------------------------------------------------
Server listening on TCP port 5001 with pid 1513
Read buffer size: 1.44 KByte
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.101.46.219, TCP port 5001 with pid 1513
Write buffer size:  128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  5] local 10.101.46.137 port 36446 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-11.06 sec   768 KBytes   569 Kbits/sec  1/0         62        2K/127231 us
[  4] local 10.101.46.137 port 5001 connected with 10.101.46.219 port 48466
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.30 sec   101 MBytes  82.1 Mbits/sec  73030    165:132:234:284:645:435:304:70831

acelan@u-Kabylake-Client-platform ~ % iperf -c 10.101.46.219 -er
------------------------------------------------------------
Server listening on TCP port 5001 with pid 1540
Read buffer size: 1.44 KByte
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.101.46.219, TCP port 5001 with pid 1540
Write buffer size:  128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  5] local 10.101.46.137 port 36452 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-11.91 sec   640 KBytes   440 Kbits/sec  1/0         52        4K/18030 us
[  4] local 10.101.46.137 port 5001 connected with 10.101.46.219 port 48488
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.32 sec   109 MBytes  88.5 Mbits/sec  78820    242:179:286:309:497:392:320:76595

acelan@u-Kabylake-Client-platform ~ % iperf -c 10.101.46.219 -er
------------------------------------------------------------
Server listening on TCP port 5001 with pid 1580
Read buffer size: 1.44 KByte
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.101.46.219, TCP port 5001 with pid 1580
Write buffer size:  128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  5] local 10.101.46.137 port 36460 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-10.65 sec   896 KBytes   689 Kbits/sec  1/0         62        4K/84454 us
[  4] local 10.101.46.137 port 5001 connected with 10.101.46.219 port 48494
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.18 sec   120 MBytes  98.7 Mbits/sec  86713    177:139:308:416:761:448:432:84032
Comment 11 Emmanuel Grumbach 2018-01-16 06:49:45 UTC
@Acelan

[  5] local 10.101.46.137 port 35228 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-11.00 sec   896 KBytes   667 Kbits/sec  1/0         65        7K/242977 us
[  4] local 10.101.46.137 port 5001 connected with 10.101.46.219 port 47680
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.30 sec   104 MBytes  84.6 Mbits/sec  75198    106:106:264:356:741:410:398:72817

Sorry, but I am not used to this kind of output. What is the real throughput here?

Can you please record a sniffer capture on an open connection?

Thanks.
Comment 12 Emmanuel Grumbach 2018-01-17 07:39:59 UTC
Ok, I understand that these are bidirectional tests. Can you please run unidirectional tests only? I won't fix / change anything but it'll be less confusing for me :)

Did you have a chance to record tracing / air sniffer?

What AP do you use?

thanks.
Comment 13 sergeidanilov 2018-01-18 06:53:45 UTC
Created attachment 273665 [details]
Air sniff during downloading file from nas connected through ethernet
Comment 14 sergeidanilov 2018-01-18 06:54:21 UTC
Created attachment 273667 [details]
Air sniff during google speedtest execution in browser
Comment 15 sergeidanilov 2018-01-18 06:57:40 UTC
I configured a second Intel wifi card and did an air sniff of the unsecured wifi channel.
I also turned on debugging in the kernel for the 9260.

I captured and attached 2 files:
1. For an ftp transfer from my NAS. Speed was about 4MB/s out of a possible ~60MB/s.
2. For running the Google speed test in the browser. Speed was also about 4MB/s out of a possible ~16MB/s.
Comment 16 Emmanuel Grumbach 2018-01-18 07:00:10 UTC
What device did you use for the capture?
Looks like I can't see the data packets :(
Comment 17 Emmanuel Grumbach 2018-01-18 07:02:14 UTC
@Sergei,

What AP do you use?
Comment 18 sergeidanilov 2018-01-18 07:20:54 UTC
The AP is a Netgear R7800,
and an 8260 is the wifi card used for monitoring.

Actually, I just realized I forgot to use amsdu_size=3 on the monitoring wifi card. I'm going to update the logs now with that option turned on.

Maybe I configured the monitoring card incorrectly? I used the following sequence of commands on the monitoring card, from this guide:
https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging

nmcli radio wifi off
rfkill unblock wifi
iw wlp2s0 set type monitor
ifconfig wlp2s0 up
#my AP is using channel 36
iw wlp2s0 set freq 5180
tcpdump -i wlp2s0 -w capture.pcap
Comment 19 Emmanuel Grumbach 2018-01-18 07:28:18 UTC
amsdu_size=3 is critical here.

What is the bandwidth of the connection?

You can check this with iw wlp2s0 link on the 9260 system.
Comment 20 sergeidanilov 2018-01-18 07:32:08 UTC
Created attachment 273669 [details]
Re-uploaded lan-wan transfer with amsdu_size=3
Comment 21 sergeidanilov 2018-01-18 07:33:42 UTC
For 9260:
sudo iw wlp3s0 link
Connected to b0:b9:8a:4f:28:ad (on wlp3s0)
        SSID: Gl_5g
        freq: 5180
        RX: 215306789 bytes (150077 packets)
        TX: 5738571 bytes (42070 packets)
        signal: -54 dBm
        tx bitrate: 130.0 MBit/s VHT-MCS 0 160MHz short GI VHT-NSS 2

        bss flags:      short-slot-time
        dtim period:    2
        beacon int:     100
Comment 22 Emmanuel Grumbach 2018-01-18 07:37:57 UTC
oh wow, you're using 160MHz... That won't be able to be captured by 8260.

Your AP is very new :)

Ok - so please reduce the bandwidth of your AP and let's see if that helps (I doubt it will), but then at least, 8260 will be able to hear what happens in the air...
Comment 23 sergeidanilov 2018-01-18 07:39:39 UTC
There is an option to disable 160Mhz on AP. I will try to use it now.
Comment 24 sergeidanilov 2018-01-18 07:53:12 UTC
Created attachment 273671 [details]
lan-wan transfer with 160Mhz disabled

I re-uploaded the file after disabling 160MHz.
My 9260 now shows a regular 80MHz.

sudo iw wlp3s0 link
Connected to b0:b9:8a:4f:28:ad (on wlp3s0)
        SSID: Gl_5g
        freq: 5180
        RX: 101948 bytes (470 packets)
        TX: 106443 bytes (366 packets)
        signal: -47 dBm
        tx bitrate: 65.1 MBit/s VHT-MCS 0 80MHz short GI VHT-NSS 2

        bss flags:      short-slot-time
        dtim period:    2
        beacon int:     100
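
The tx bitrate lines in the two `iw ... link` outputs can be cross-checked against the standard VHT PHY rate formula (data subcarriers x bits per subcarrier x coding rate x spatial streams, divided by the symbol time). The sketch below is an illustration only; the helper name `vht_rate_mbps` is hypothetical. It assumes the usual 802.11ac parameters: MCS 0 is BPSK (1 bit per subcarrier) at coding rate 1/2, and a short-GI symbol lasts 3.6 us.

```python
# Data subcarriers per channel width (MHz) for VHT.
DATA_SUBCARRIERS = {20: 52, 40: 108, 80: 234, 160: 468}


def vht_rate_mbps(data_subcarriers, bits_per_subcarrier, coding_rate,
                  spatial_streams, short_gi=True):
    """PHY rate in Mbit/s: data bits per OFDM symbol divided by symbol time in us."""
    symbol_us = 3.6 if short_gi else 4.0
    bits_per_symbol = (data_subcarriers * bits_per_subcarrier
                       * coding_rate * spatial_streams)
    return bits_per_symbol / symbol_us


# VHT-MCS 0 (BPSK, rate 1/2), VHT-NSS 2, short GI, as reported by iw.
for width in (160, 80):
    rate = vht_rate_mbps(DATA_SUBCARRIERS[width], 1, 0.5, 2)
    print(f"{width} MHz: {rate:.1f} Mbit/s")
```

This gives 130.0 Mbit/s for 160MHz, matching the earlier output, and 65.0 Mbit/s for 80MHz, close to the 65.1 MBit/s shown here; the small difference is presumably rounding on the reporting side (an assumption, not verified). Either way, both outputs show the link at the lowest MCS at the moment they were taken.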
Comment 25 Emmanuel Grumbach 2018-01-18 07:58:12 UTC
ok - you're going to hate me...

the configuration of the 8260 is not sufficient.


It is tuned to 20MHz only.
To tune it to 80MHz, you need:

sudo iw wlp2s0 set freq 5180 80 5210
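
For anyone repeating this on a different channel, the three numbers in that command are the control channel frequency, the width, and the center frequency of the 80MHz block. A minimal sketch of the arithmetic follows; the helper names are hypothetical, and it assumes the standard 5GHz channel plan (frequency = 5000 + 5 x channel, 80MHz blocks starting at channels 36, 52, 100, 116, 132 and 149).

```python
# First 20 MHz channel of each 80 MHz block in the 5 GHz band.
BLOCK_STARTS_80MHZ = (36, 52, 100, 116, 132, 149)


def channel_to_freq(channel):
    """Center frequency in MHz of a 5 GHz channel number."""
    return 5000 + 5 * channel


def center_freq_80mhz(channel):
    """Center frequency in MHz of the 80 MHz block containing the channel."""
    for start in BLOCK_STARTS_80MHZ:
        if channel in range(start, start + 13, 4):
            # The block center sits 6 channel numbers above its first channel.
            return channel_to_freq(start + 6)
    raise ValueError(f"channel {channel} is not part of an 80 MHz block")


def monitor_freq_args(channel):
    """Arguments for 'iw <dev> set freq' to monitor at 80 MHz."""
    return f"{channel_to_freq(channel)} 80 {center_freq_80mhz(channel)}"


# Channel 36, as used by the AP in this report.
print("sudo iw wlp2s0 set freq", monitor_freq_args(36))
```

For channel 36 this reproduces the `5180 80 5210` arguments given above.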
Comment 26 Emmanuel Grumbach 2018-01-18 07:58:43 UTC
And please mark all the irrelevant attachments as obsolete. It will avoid confusion.

Thanks.
Comment 27 sergeidanilov 2018-01-18 08:15:38 UTC
It looks like adding the extra options on the 8260 made the monitoring work correctly now!

The newly captured file is significantly bigger and didn't fit in Bugzilla.
Here is a link:
https://drive.google.com/file/d/1hT3DyZn04caog3nRE69izicWzYABjdUO/view?usp=sharing

Also the 8260 iwlwifi crashed during tcpdump with "Microcode SW error detected", but as far as I read it's expected behavior.
Comment 28 Emmanuel Grumbach 2018-01-18 08:21:15 UTC
Err... no :)
Getting a FW crash during monitor isn't expected :)

But that's a separate bug.

Looking at the capture now.
Comment 29 Emmanuel Grumbach 2018-01-18 08:30:33 UTC
I understand that the transfer you captured had bad throughput, right?

The sniffer capture looks fine.
Wireshark gets confused about the TCP retransmissions because of the WiFi aggregations, but overall, it looks fine...
I need to dig deeper though.
Comment 30 sergeidanilov 2018-01-18 08:40:34 UTC
Yes,
it has a bad throughput of 4MB/s on average.
On the same laptop I get about 60MB/s after rebooting into a kernel with SUPPORTS_AMSDU_IN_AMPDU disabled.
Comment 31 Emmanuel Grumbach 2018-01-18 13:30:04 UTC
I spent a fair amount of time on the sniffer capture.

can you please check what happens with this:

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
index 3de5498f6cfa..484e97f7e9cc 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
@@ -335,6 +335,8 @@ static bool iwl_mvm_is_dup(struct ieee80211_sta *sta, int queue,
        struct iwl_mvm_rxq_dup_data *dup_data;
        u8 tid, sub_frame_idx;
 
+       return false;
+
        if (WARN_ON(IS_ERR_OR_NULL(sta)))
                return false;
 


Thanks.
Comment 32 sergeidanilov 2018-01-19 07:47:34 UTC
@Emmanuel,
In which kernel version do you have the iwl_mvm_is_dup function?
I don't see it in either linux-4.14.10 or linux-4.15-rc6.

There is only one reference, in a comment:
intel # grep -R "iwl_mvm_is_dup" .
./iwlwifi/mvm/sta.c:             * iwl_mvm_is_dup() since the lower 4 bits are the fragment
Comment 33 Emmanuel Grumbach 2018-01-19 09:15:49 UTC
ok - this is a problem.. We'll check offline.

Can you please install our backport driver - please take the master branch:

https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/core_release#how_to_install_the_driver
Comment 34 Emmanuel Grumbach 2018-01-19 09:18:04 UTC
ah sorry.
No, all is fine.
In the kernel, it is called: iwl_mvm_is_nonagg_dup

A patch hasn't been upstreamed yet.
Comment 35 Emmanuel Grumbach 2018-01-19 13:07:39 UTC
I'll send a proper patch on Sunday.
Comment 36 sergeidanilov 2018-01-20 05:37:17 UTC
I patched iwl_mvm_is_nonagg_dup, but didn't see any difference in speed.
Here is a capture with the patched kernel:
https://drive.google.com/file/d/1hT3DyZn04caog3nRE69izicWzYABjdUO/view?usp=sharing
Comment 37 Emmanuel Grumbach 2018-01-25 14:01:54 UTC
I understand that the AP used by the submitter is a Netgear R7800. Do the other people CC'ed on the bug see the problem with the same AP?
Comment 38 sergeidanilov 2018-01-25 19:22:57 UTC
@Emmanuel,
I have this issue on two APs, with slightly different side effects:
Netgear R7800: both transfers from the local NAS and downloads from the internet are slow.
Netgear R7000: only transfers from the local NAS are slow.
Comment 39 Minori Hiraoka 2018-01-26 02:07:43 UTC
@Emmanuel Grumbach,
No, I have this problem on another AP. The one I'm currently using is a Huawei E5885, an LTE mobile router with Ethernet WAN capability. Approx. 3Mbps up/down.
Interestingly, I don't have this problem on the other AP I have (D-Link DIR-806A).
Neither the patch to iwl_mvm_is_nonagg_dup nor the one to iwl_mvm_mac_setup_register solved the issue; there was no performance difference.
I would really like to provide a packet capture dump, but I don't have any other hardware capable of 5GHz capture right now.

I will upload iw dev scan results for each AP.
Comment 40 Minori Hiraoka 2018-01-26 02:08:18 UTC
Created attachment 273863 [details]
iw dev scan result of Huawei E5885 (affected by bug)
Comment 41 Minori Hiraoka 2018-01-26 02:08:53 UTC
Created attachment 273865 [details]
iw dev scan result of D-Link DIR-806A (not affected by bug)
Comment 42 Emmanuel Grumbach 2018-01-26 04:34:42 UTC
Please try to install the latest backport-based driver and see if that makes a difference.

https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/core_release#how_to_install_the_driver

I'll try to find time next week to dig into this, but I can't promise.
This week has been very busy and I couldn't find much time for this issue yet.
Sorry.
Comment 43 Emmanuel Grumbach 2018-01-26 04:35:19 UTC
Of course, don't take the Core28 branch from the backport tree, but the master branch.

Thanks.
Comment 44 Minori Hiraoka 2018-01-26 05:42:46 UTC
@Emmanuel Grumbach,
Installing the backport driver pretty much solved my issue. I now get 70Mbps on both download and upload.
Since I don't have another 5GHz device capable of reaching the 9260's full speed, and my LTE router is limited to a 100Mbps PHY link rate via Ethernet, I cannot test whether this patch really solved all slowdown issues. Maybe the other people CC'ed here can test in a better environment?
Comment 45 Luca Coelho 2018-01-26 15:27:06 UTC
Created attachment 273873 [details]
Missing patch

I've been trying to find out the differences between upstream and our internal tree and I found some differences.  The first suspect is this patch.  I'm not very confident it will fix the problem, but it's one step ahead.

Can you please try it and report if the problem is reproduced?

I also found out that we enabled HW TX checksum offload upstream from the beginning, while in our internal tree we left it disabled until things stabilized.  So you could also try to disable it like this:

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/constants.h b/drivers/net/wireless/intel/iwlwifi/mvm/constants.h
index 976640fed334..e1ae94114af7 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/constants.h
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/constants.h
@@ -108,7 +108,7 @@
 #define IWL_MVM_RS_80_20_FAR_RANGE_TWEAK       1
 #define IWL_MVM_TOF_IS_RESPONDER               0
 #define IWL_MVM_SW_TX_CSUM_OFFLOAD             0
-#define IWL_MVM_HW_CSUM_DISABLE                        0
+#define IWL_MVM_HW_CSUM_DISABLE                        1
 #define IWL_MVM_PARSE_NVM                      0
 #define IWL_MVM_RS_NUM_TRY_BEFORE_ANT_TOGGLE    1
 #define IWL_MVM_RS_HT_VHT_RETRIES_PER_RATE      2


...and report if it helps or not.  I suspect we may still be missing some of the checksum stabilization patches that we implemented later...
Comment 46 Minori Hiraoka 2018-01-26 15:52:56 UTC
@Luca Coelho,

After applying that patch to stock 4.15.0-rc9 kernel iwlwifi driver, performance is still bad.
Comment 47 Robert Hancock 2018-01-27 03:34:22 UTC
I just received an Intel 9260 card and I am also getting poor download performance under Linux using 4.14.15 kernel under Fedora 27. I'm seeing about 1-5 Mbps of download throughput. Upload throughput is around 400 Mbps. With the previous 7260 card I was getting about 400-500 Mbps in both directions in the same location and AP.

From looking at Wireshark, from the number of TCP dup acks and retransmissions it appears that there is heavy packet loss in the downstream direction while upstream looks fine.

Under Windows 10, throughput appears reasonable (300 Mbps or so) so it's not a hardware problem.

I'm using a Netgear R7800 AP, running OpenWRT though. 80MHz channel width.
Comment 48 Minori Hiraoka 2018-01-27 03:43:43 UTC
@Luca Coelho,

Scratch comment 46. I was too sleepy when testing that patch and accidentally applied only the patch in the comment, not the important one in the attachment.

To correct: applying both the patch containing "#define IWL_MVM_HW_CSUM_DISABLE" and "[PATCH] iwlwifi: mvm: fix security bug in PN checking" to vanilla 4.15.0-rc9 solves the speed issue (for me).
Comment 49 Minori Hiraoka 2018-01-27 03:55:22 UTC
@Robert Hancock

Can you try testing the above patches (comment 45)?
I think they solve the issue for me, but I'm not sure they solve every issue, since my connection is physically limited to 100Mbps, unlike your environment. I can see a small difference between download and upload speed in my environment (60-70Mbps down, 90-100Mbps up), but I'm not sure whether this is just a limit of my router or a bug in the iwlwifi module.
Comment 50 Robert Hancock 2018-01-27 04:56:56 UTC
I haven't tested just those patches against upstream, but I did test the master branch of the iwlwifi-backport driver and confirmed that I get the expected speeds with that version (up to 500 Mbps download). I may be able to try isolating those two patches later, but based on your tests it certainly seems like the "iwlwifi: mvm: fix security bug in PN checking" patch Luca attached fixes the issue.
Comment 51 Luca Coelho 2018-01-27 09:41:23 UTC
Great, thanks for testing and the feedback!

Would it be possible for you to try both patches separately? They affect two different features and it would be nice to know which one caused the problem so we can fix it in stable releases without making unnecessary changes.
Comment 52 Minori Hiraoka 2018-01-27 11:02:29 UTC
I applied only "iwlwifi: mvm: fix security bug in PN checking", not "IWL_MVM_HW_CSUM_DISABLE" on vanilla 4.15.0 rc9, and 9260 performs great.
Comment 53 Luca Coelho 2018-01-27 14:14:56 UTC
Great! I'll queue this patch for stable releases.  Unless Linus decides to release rc10 this weekend, we don't have the time to get it into 4.15 anymore (but hopefully into stable 4.15.1).

Thanks everyone for the help debugging this!
Comment 54 Robert Hancock 2018-01-27 16:24:55 UTC
Sounds good. Hopefully headed for 4.14 stable branch as well?
Comment 55 Luca Coelho 2018-01-27 16:29:51 UTC
Yes, very likely.  These devices are supported since 4.13, but 4.13 AFAIK is EOL, so 4.14 and 4.15 will be the ones to get the fix.
Comment 56 sergeidanilov 2018-01-30 06:45:48 UTC
I can confirm iwlwifi-mvm-fix-security-bug-in-PN-checking fixes the speed problem both for local NAS downloads and internet downloads.
The 9260 now fully saturates my NAS, which tops out at about 70MB/s.
Comment 57 Robert Hancock 2018-02-03 04:10:52 UTC
Has this patch gotten pushed upstream yet? I haven't noticed it in any trees on git.kernel.org.
Comment 58 AceLan Kao 2018-02-05 06:53:54 UTC
Sorry for the late response, I was off for 2 weeks.
The fix in comment #45 doesn't work on my side; it looks like I have a different issue from yours. I applied the iwlwifi-mvm-fix-security-bug-in-PN-checking patch on top of the latest kernel tree but had no luck; the result is the same.
3527799 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

TX Speed is only 605 Kbits/sec
[  5] local 10.101.46.132 port 37104 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-10.40 sec   768 KBytes   605 Kbits/sec  1/0         40        4K/7813 us

RX speed is up to 102 Mbits/sec
[  4] local 10.101.46.132 port 5001 connected with 10.101.46.219 port 48938
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.23 sec   125 MBytes   102 Mbits/sec  90311    313:313:293:274:697:200:177:88044

So, I filed a new bug and described my issue and the latest status there. If you have a Tx speed issue, please check https://bugzilla.kernel.org/show_bug.cgi?id=198677
Comment 59 Luca Coelho 2018-02-05 12:23:08 UTC
@AceLan did you actually apply the *patch* I attached on comment #45, namely attachment 273873 [details]?

The HW_CSUM flag was not really the culprit, I just tried to check two suspects at the same time.
Comment 60 AceLan Kao 2018-02-06 02:37:23 UTC
I believe I applied the patch and booted the machine with the right kernel, and the patch doesn't help on my side.

acelan@tangerine:~/COD/linux$ git log --oneline
99fbeab (HEAD -> refs/heads/BUILD.iwlwifi) configs (based on Ubuntu-4.14.0-12.14)
abdf386 debian changelog
baf7bf1 UBUNTU: SAUCE: (no-up) disable -pie when gcc has it enabled by default
4b9c8e4 UBUNTU: SAUCE: tools/hv/lsvmbus -- add manual page
ab514cf UBUNTU: SAUCE: add vmlinux.strip to BOOT_TARGETS1 on powerpc
7c5807f base packaging
b0b65f3 (refs/heads/master) iwlwifi: mvm: fix security bug in PN checking
3527799 (refs/remotes/origin/master, refs/remotes/origin/HEAD) Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
0a646e9 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
f74a127 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
64b2868 Merge tag 'for-linus-20180204' of git://git.kernel.dk/linux-block

acelan@u-Kabylake-Client-platform ~ % dmesg | head
[    0.000000] Linux version 4.15.0-iwlwifi-generic (acelan@tangerine) (gcc version 7.2.0 (Ubuntu 7.2.0-18ubuntu2)) #201802050127 SMP Mon Feb 5 06:29:27 UTC 2018
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.15.0-iwlwifi-generic root=UUID=af76e6ee-bca7-4fdd-87bc-d18f691fe3b1 ro i915.alpha_support=1 quiet splash vt.handoff=7

acelan@u-Kabylake-Client-platform ~ % iperf -c 10.101.46.219 -er
------------------------------------------------------------
Server listening on TCP port 5001 with pid 1672
Read buffer size: 1.44 KByte
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.101.46.219, TCP port 5001 with pid 1672
Write buffer size:  128 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  5] local 10.101.46.132 port 47234 connected with 10.101.46.219 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  5] 0.00-10.36 sec  1.62 MBytes  1.32 Mbits/sec  1/0         61        4K/7164 us
[  4] local 10.101.46.132 port 5001 connected with 10.101.46.219 port 52580
[ ID] Interval        Transfer    Bandwidth       Reads   Dist(bin=0.2K)
[  4] 0.00-10.15 sec   122 MBytes   101 Mbits/sec  88242    249:239:117:94:1461:192:120:85770
Comment 61 Stanislaw Gruszka 2018-02-07 09:49:17 UTC
When will the fix for this be submitted? I cannot see it in the iwlwifi-fixes tree.
Comment 62 Luca Coelho 2018-02-10 07:30:09 UTC
The patch has now been submitted upstream; it's in the iwlwifi-fixes/master tree.  A pull request for wireless-drivers will be sent early next week and it should reach 4.16-rc2 and stable releases after that.