Bug 90571 - rtlwifi: rtl8192ce: slow throughput, high rtt
Summary: rtlwifi: rtl8192ce: slow throughput, high rtt
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: Wireless
Hardware: All Linux
Importance: P1 normal
Assignee: networking_wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-02 08:10 UTC by suse
Modified: 2015-01-10 07:58 UTC

See Also:
Kernel Version: 3.18.1
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg 3.17.4 (58.17 KB, text/plain), 2015-01-02 08:10 UTC, suse
dmesg 3.18.1 (59.75 KB, text/plain), 2015-01-02 08:11 UTC, suse
iperf 3.17.4 (1.29 KB, text/plain), 2015-01-02 08:11 UTC, suse
iperf 3.18.1 (1.29 KB, text/plain), 2015-01-02 08:12 UTC, suse
ping 3.17.4 (4.96 KB, text/plain), 2015-01-02 08:12 UTC, suse
ping 3.18.1 (5.44 KB, text/plain), 2015-01-02 08:12 UTC, suse
plot of continuous requests, leading to a saw tooth pattern (127.61 KB, image/png), 2015-01-09 12:20 UTC, e-kernel
iperf 3.19-rc3 (1.28 KB, text/plain), 2015-01-09 22:24 UTC, suse
ping 3.19-rc3 (5.73 KB, text/plain), 2015-01-09 22:24 UTC, suse
Patch for 3.19-rc3 and 3.18 to avoid crash when memory is low (4.80 KB, patch), 2015-01-09 22:47 UTC, Larry Finger
response time graph for 3.19-rc3 (48.25 KB, image/png), 2015-01-09 23:43 UTC, e-kernel
response time graph for 3.17.6 (41.90 KB, image/png), 2015-01-09 23:45 UTC, e-kernel

Description suse 2015-01-02 08:10:52 UTC
Created attachment 162231 [details]
dmesg 3.17.4

With kernel 3.18.1, ping times to the access point are very high and vary widely (2 ms to 2000 ms, with a high standard deviation).
Throughput is also very low: 5 Mbits/sec instead of 25 Mbits/sec.

With kernel 3.17.4, performance was good.

The firmware is from 9 Oct 2014 and its md5 is
fd118c183ad9e11060a6e575b472280e  /lib/firmware/rtlwifi/rtl8192cfw.bin

Test results and dmesg are attached.
(using openSUSE Tumbleweed 32-bit)
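
For anyone who wants to check that they are testing with the same firmware blob, here is a minimal Python sketch that hashes the file and compares it with the md5 quoted above. The helper name `md5_of_file` is made up for illustration; the path and checksum are the ones from the description.

```python
import hashlib
from pathlib import Path

# Checksum and path as quoted in the bug description.
REPORTED_MD5 = "fd118c183ad9e11060a6e575b472280e"
FIRMWARE_PATH = Path("/lib/firmware/rtlwifi/rtl8192cfw.bin")


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    if FIRMWARE_PATH.exists():
        local = md5_of_file(FIRMWARE_PATH)
        verdict = "matches" if local == REPORTED_MD5 else "differs from"
        print(f"{FIRMWARE_PATH}: {local} ({verdict} the reported firmware)")
    else:
        print(f"{FIRMWARE_PATH} not found on this system")
```

A mismatch only means the firmware file differs from the reporter's; by itself it says nothing about the cause of the slowdown.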
Comment 1 suse 2015-01-02 08:11:28 UTC
Created attachment 162241 [details]
dmesg 3.18.1
Comment 2 suse 2015-01-02 08:11:55 UTC
Created attachment 162251 [details]
iperf 3.17.4
Comment 3 suse 2015-01-02 08:12:11 UTC
Created attachment 162261 [details]
iperf 3.18.1
Comment 4 suse 2015-01-02 08:12:25 UTC
Created attachment 162271 [details]
ping 3.17.4
Comment 5 suse 2015-01-02 08:12:53 UTC
Created attachment 162281 [details]
ping 3.18.1
Comment 6 e-kernel 2015-01-09 12:18:58 UTC
I can confirm this bug. Also, the Invalid Misc count reported by iwconfig increases by 1 on each latency spike.

There is also an interesting pattern regarding the retransmits.
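
One way to correlate the counter with latency spikes is to read it before and after each ping. The following is only a rough Python sketch: it assumes the counter appears in the iwconfig output as `Invalid misc:<number>`, and the two sample lines are invented for illustration, not taken from the attachments.

```python
import re

# Assumed format of the counter in iwconfig output: "Invalid misc:<number>".
INVALID_MISC_RE = re.compile(r"Invalid misc:\s*(\d+)")


def invalid_misc_count(iwconfig_output):
    """Extract the 'Invalid misc' counter, or None if it is not present."""
    match = INVALID_MISC_RE.search(iwconfig_output)
    return int(match.group(1)) if match else None


# Illustrative samples only (hypothetical values).
SAMPLE_BEFORE = "          Tx excessive retries:0  Invalid misc:41   Missed beacon:0"
SAMPLE_AFTER = "          Tx excessive retries:0  Invalid misc:42   Missed beacon:0"

delta = invalid_misc_count(SAMPLE_AFTER) - invalid_misc_count(SAMPLE_BEFORE)
print(f"Invalid misc increased by {delta}")
```

In practice the two strings would come from running iwconfig on the wireless interface around each ping, so that every increment can be matched against the round-trip time measured at that moment.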
Comment 7 e-kernel 2015-01-09 12:20:58 UTC
Created attachment 162941 [details]
plot of continuous requests, leading to a saw tooth pattern
Comment 8 e-kernel 2015-01-09 12:22:41 UTC
I experience this bug with Arch Linux. Kernel 3.17.6 works just fine, 3.18.1 has the bug.
Comment 9 Larry Finger 2015-01-09 21:18:50 UTC
Sorry, but I cannot duplicate this. I am using kernel 3.19-rc3. There are some additional patches past 3.18.1, but none of them should affect ping response times. Pinging off my AP, I get

--- 192.168.1.1 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 99147ms
rtt min/avg/max/mdev = 1.948/4.587/151.101/14.778 ms
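
To compare ping runs across kernels without reading each attachment by eye, the summary line can be parsed mechanically. This is a small Python sketch; the function name `parse_rtt_summary` is hypothetical, and the input is the summary line shown above.

```python
import re

# Matches the iputils-style summary line, e.g.
# "rtt min/avg/max/mdev = 1.948/4.587/151.101/14.778 ms"
SUMMARY_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_rtt_summary(text):
    """Return min/avg/max/mdev in milliseconds as a dict, or None if absent."""
    match = SUMMARY_RE.search(text)
    if match is None:
        return None
    keys = ("min", "avg", "max", "mdev")
    return dict(zip(keys, (float(value) for value in match.groups())))


line = "rtt min/avg/max/mdev = 1.948/4.587/151.101/14.778 ms"
stats = parse_rtt_summary(line)
print(stats)
```

The `mdev` field is the one to watch here, since the report is about the spread of the round-trip times as much as their average.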

As for throughput, iperf does not work for me for some reason, so I use netperf. The Perl script does multiple RX and TX tests and does some rudimentary statistical analysis. Connecting to a server yields the following:

finger@linux:~/rtl8723bu> ~/netperf_stats.pl desktop 3
Pass 1
TCP_MAERTS Test:  35.29 39.46 35.61 29.13 27.25 37.54 33.65 40.10 41.31 23.34
RX Results: max 41.31, min 23.34. Mean 34.27(5.65)

TCP_STREAM Test:  39.83 35.16 28.38 26.19 31.56 30.78 34.94 30.90 41.07 27.29
TX Results: max 41.07, min 26.19. Mean 32.61(4.80)

Pass 2
TCP_MAERTS Test:  36.77 31.02 28.56 31.47 25.82 26.96 26.28 26.18 27.62 30.66
RX Results: max 36.77, min 25.82. Mean 29.13(3.24)

TCP_STREAM Test:  35.83 31.84 36.20 41.31 45.40 46.89 21.37 40.41 53.46 27.56
TX Results: max 53.46, min 21.37. Mean 38.03(9.06)

Pass 3
TCP_MAERTS Test:  28.41 45.44 44.70 38.62 35.82 36.28 40.41 40.28 42.81 40.78
RX Results: max 45.44, min 28.41. Mean 39.35(4.73)

TCP_STREAM Test:  54.81 50.22 39.32 51.38 57.27 56.17 52.64 49.76 57.11 58.02
TX Results: max 58.02, min 39.32. Mean 52.67(5.30)

Pass 4
TCP_MAERTS Test:  45.40 44.83 35.95 50.12 52.39 42.70 55.61 48.56 43.04 46.67
RX Results: max 55.61, min 35.95. Mean 46.53(5.26)

TCP_STREAM Test:  30.34 46.02 38.50 49.77 46.75 43.91 46.68 45.14 37.26 46.00
TX Results: max 49.77, min 30.34. Mean 43.04(5.57)

Pass 5
TCP_MAERTS Test:  50.60 52.37 55.98 61.23 56.12 33.78 49.68 40.32 41.37 25.63
RX Results: max 61.23, min 25.63. Mean 46.71(10.59)

TCP_STREAM Test:  29.51 31.84 36.88 37.28 44.81 47.80 51.17 54.01 54.50 55.48
TX Results: max 55.48, min 29.51. Mean 44.33(9.28)

RX Overall min 23.34, max 61.23
TX Overall min 21.37, max 58.02
Average RX 39.20, TX 42.13

All numbers are in Mbps. For unknown reasons, the Realtek drivers do show some variability, but the average values are acceptable for an 802.11n interface.
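
The Perl script itself is not shown, so the following Python sketch is only a guess at how each "Results" line is derived. It assumes the value in parentheses is the population standard deviation, which is consistent with the Pass 1 RX samples; the function name `summarize` is made up.

```python
import statistics


def summarize(samples):
    """Return max, min, mean and population standard deviation of the samples."""
    return {
        "max": max(samples),
        "min": min(samples),
        "mean": statistics.mean(samples),
        # Assumption: the script reports the population (not sample) std dev.
        "stdev": statistics.pstdev(samples),
    }


# Pass 1 TCP_MAERTS (RX) samples from the output above, in Mbps.
pass1_rx = [35.29, 39.46, 35.61, 29.13, 27.25, 37.54, 33.65, 40.10, 41.31, 23.34]

s = summarize(pass1_rx)
print(
    f"RX Results: max {s['max']:.2f}, min {s['min']:.2f}. "
    f"Mean {s['mean']:.2f}({s['stdev']:.2f})"
)
# prints: RX Results: max 41.31, min 23.34. Mean 34.27(5.65)
```

With only ten samples per test the spread is itself quite uncertain, so differences of a few Mbps between passes should not be over-interpreted.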
Comment 10 e-kernel 2015-01-09 21:21:14 UTC
(In reply to Larry Finger from comment #9)
> Sorry, but I cannot duplicate this. I am using kernel 3.19-rc3. There are
> some additional patches past 3.18.1, but none of them should affect ping
> response times.

Is there anything I/We could do to help debug this?
Currently this makes 3.18.* unusable for me. I could try compiling 3.19 myself if this would help.
Comment 11 e-kernel 2015-01-09 21:22:48 UTC
In addition to this: I get an increasing Invalid Misc count in the iwconfig output. Loading the kernel module with debug=5 does not show anything, but I would like to investigate what those actually are.
Comment 12 Larry Finger 2015-01-09 22:15:27 UTC
Please build 3.19-rc3. I will post all the patches that I use that are not yet in that version.

I too get some "Invalid misc" packets, but I do not know where they come from.
Comment 13 suse 2015-01-09 22:23:01 UTC
I tried 3.19-rc3 and the ping times are now comparable to 3.17.4.
The throughput is now better than with 3.18, but it isn't as good as with 3.17.
It's now only 15.4 Mbits/sec instead of the (practical) maximum of 24.2 Mbits/sec (with 3.17.4).
Comment 14 suse 2015-01-09 22:24:05 UTC
Created attachment 163001 [details]
iperf 3.19-rc3
Comment 15 suse 2015-01-09 22:24:27 UTC
Created attachment 163011 [details]
ping 3.19-rc3
Comment 16 suse 2015-01-09 22:28:58 UTC
Larry, could you change your access point to use only 802.11g instead of 802.11n?
Then your values would be more comparable to mine. (My router doesn't support 802.11n.)
You should then get 3 MByte/sec, or 24 Mbps.
Comment 17 Larry Finger 2015-01-09 22:44:51 UTC
I switched to a different AP that is 802.11g and uses WPA1 encryption. My summary is

RX Overall min 11.74, max 15.57
TX Overall min  9.00, max 17.07
Average RX 14.55, TX 14.91

I'm getting about what you do with 3.19-rc3. As I expected, none of the new patches make any difference. I just pulled the 3.17.4 source, and I'll see what that gives me.

There is one patch that fixes a nasty bug that occurs when memory is low or fragmented. You should apply that to any 3.18 versions or 3.19-rc3.
Comment 18 Larry Finger 2015-01-09 22:47:10 UTC
Created attachment 163021 [details]
Patch for 3.19-rc3 and 3.18 to avoid crash when memory is low
Comment 19 e-kernel 2015-01-09 23:42:52 UTC
I just installed 3.19-rc3 and my response times are normal as well, although there appears to be a lot more fluctuation.
Should I build 3.17.6 and apply each 3.18.* rtl8192ce related patch until the issue reoccurs?

I'll attach response time graphs for 3.19-rc3 and 3.17.6.
Comment 20 e-kernel 2015-01-09 23:43:47 UTC
Created attachment 163031 [details]
response time graph for 3.19-rc3
Comment 21 e-kernel 2015-01-09 23:45:13 UTC
Created attachment 163041 [details]
response time graph for 3.17.6
Comment 22 Larry Finger 2015-01-10 00:19:02 UTC
Trying to perform a bisection or incremental addition is not likely to work. In the past, Realtek kept their fixes private until they released a completely new driver. I finally convinced them that this was bad practice. The arrangement that we reached involved one massive change to match their current state, and a git repo to which they would add their fixes in an incremental manner. The massive change was what happened in 3.18.

One additional point is that rtl8192ce gets the same throughput to my 802.11g router as I get with an Intel 7260 against that same router.
Comment 23 e-kernel 2015-01-10 00:27:10 UTC
Throughput in 3.19 seemed ok to me too. But in 3.18 it was really bad because of periodic high latency as shown in https://bugzilla.kernel.org/attachment.cgi?id=162941 .
Comment 24 e-kernel 2015-01-10 00:51:12 UTC
So is there anything we can still do? Or do we just accept that it's acceptable in 3.19 and close the bug?
Comment 25 Larry Finger 2015-01-10 02:57:38 UTC
First of all, I'm not sure that the difference between 3.18 and 3.19 is due to rtlwifi or rtl8192ce, as there are very few changes in that period. Perhaps some other networking component has a bug in 3.18 that was fixed in 3.19.

I will be watching for changes pushed to me by Realtek. In the absence of any such fixes, I am able to do very little. I do not have any NDA with Realtek, and I know nothing of the internal workings of the chips.
Comment 26 suse 2015-01-10 07:58:34 UTC
I re-tested 3.19-rc3 this morning and (unlike in comment #13) the performance is now very good. I'm reaching maximum throughput values (26.1 Mbps).

I guess the air was "full" last night, and therefore the values in comment #13 weren't totally reliable.

Conclusion:
- 3.18.1 is broken (verified this morning, too).
- 3.17.4 and 3.19-rc3 are OK.

Maybe a bisect between 3.18 and 3.19-rc3 could reveal which commit restores the good performance. However, I will not have time for this in the next few days.
But I'm glad that the workaround is simple: just ignore the 3.18.x kernels. :-)
