Bug 60713

Summary: Driver rtl8192ce unable to connect with 3.10.x/3.11.x/3.12.x kernels
Product: Drivers
Reporter: Ron S. (ron_sun)
Component: network-wireless
Assignee: drivers_network-wireless (drivers_network-wireless)
Status: NEW
Severity: normal
CC: anarcat, b4nst0n, bwyazel, claudiozumbo, igg, Larry.Finger, linville, obedlink, peter, photoshop.ninja, prozac, szg00000
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 3.10.x,3.11.x,3.12
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
- Patch to improve automatic gain control
- Replacement patch
- system logs from a few connection attempts

Description Ron S. 2013-08-07 15:23:58 UTC
Hardware is Intel Core/Duo cpu & Asus PCE-n15 Wifi card.
There are no problems with any of the 3.9.x kernels up
to 3.9.11. The distributions I have tested this with are
Fedora 17 and Fedora 19.

When I try any 3.10.x kernel I see that the wifi connection
using NetworkManager is stuck in an endless loop trying to connect.
I notice that wlassistant and wifi-radar report correctly about
the available stations.

I looked into the Realtek driver source code. There appears to have
been a rewrite or a change in architecture.

Can supply screen output if requested but I believe that enough
information has been given.
Comment 1 Ron S. 2013-08-07 16:27:41 UTC
The symptom above happens with:
3.10.0
3.10.2
3.10.4
3.10.5

When moving from 3.9.11 to 3.10.0 I used my .config from
3.9.11 and did the usual make oldconfig.. I noticed that
there were a couple of new Realtek drivers for wifi which
did not apply to my hardware. Even if I go in and configure
every Realtek driver the problem still occurs.

I also have a laptop that runs 3.10.5 and wifi works, obviously
because the wifi hardware uses a different driver.
Comment 2 Ivo van Doorn 2013-08-07 18:38:32 UTC
[Removed myself from CC list, as this is a Realtek issue, not Ralink]
Comment 3 Britt Yazel 2013-08-08 06:59:06 UTC
I too have the exact issue described in the original post, though with Archlinux. The same issue shows up as endless cycling while trying to authenticate my network password. I have also tested kernels 3.10.1 through 3.10.5, all of which behave the same. Kernel 3.9.x is unaffected.
Comment 4 Gertjan van Wingerde 2013-08-08 07:02:23 UTC
[Removed myself from CC list, as this is a Realtek issue, not Ralink]
Comment 5 Britt Yazel 2013-08-08 07:03:54 UTC
The archlinux bug was reported here:

https://bugs.archlinux.org/task/36292

There may be useful information in the logs reported thus far.
Comment 6 Britt Yazel 2013-08-21 07:06:11 UTC
Is there anything I can do to assist in the squashing of this regression? Any logs that I could post?
Comment 7 Britt Yazel 2013-08-23 11:41:25 UTC
Can somebody mark this bug as a regression?
Comment 8 b4nst0n 2013-08-23 16:25:43 UTC
The same problem here, card RTL8188CE with the RTL8192CE driver. A lot of random disconnects.
Comment 9 Britt Yazel 2013-09-20 16:12:25 UTC
With kernel 3.11.1 this issue is still present. It appears to have been carried over.
Comment 10 Larry Finger 2013-09-26 19:45:06 UTC
Created attachment 109721
Patch to improve automatic gain control

The existing code for rtl8192ce and rtl8192cu did not control the gain for RX and TX very well and for some setups, the gain would end up so low that the receiver could not get info from the AP. Unfortunately, this did not happen for my site, and I never saw the problem.

According to limited testing, this patch improves the situation quite a bit. There will likely be more changes before this goes into the kernel, but please test. The patch was derived from mainline kernel 3.12-rc2, but should apply with older versions.
Comment 11 Ron S. 2013-10-03 22:00:37 UTC
After dumping those changes in kernel 3.11.3 the problem persists.

BTW Using no brackets in one line "if/else" statements is not good practice.
Comment 12 Larry Finger 2013-10-03 22:50:18 UTC
Read the instructions for kernel patches. Any if/else construction with a single statement in every branch should not use braces!
Comment 13 Ron S. 2013-10-03 23:30:06 UTC
Tested kernel 3.12-rc2 ia64. No improvement. I am running on 3.12-rc2 right now
and the only thing that does not work is the wireless.

In the industry if/elses without braces is frowned upon because it causes
problems. I have worked in companies where it is strictly forbidden.
Comment 14 Larry Finger 2013-10-04 02:03:54 UTC
When in Rome, ....

Please post the output of lspci -nnv for your device. There are several different ones, and I have no idea which you have.

I would also like to know what AP you have, the encryption method, and the distance between the AP and station.
Comment 15 Ron S. 2013-10-04 03:33:42 UTC
Rome eventually burned...

The card works with Windows 7.
$ lspci -nnv
03:00.0 Network controller [0280]: Realtek Semiconductor Co., Ltd. RTL8188CE 802.11b/g/n WiFi Adapter [10ec:8178] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:84b6]
	Flags: bus master, fast devsel, latency 0, IRQ 16
	I/O ports at 2000 [size=256]
	Memory at f0100000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: rtl8192ce

Belkin Wireless G Plus Router
WPA & WPA2
Distance: 3-4 feet
Comment 16 Ron S. 2013-10-04 03:47:10 UTC
First noticed when running 32 bit built kernel.

Since comment 11 and after:
uname -a
Linux 3.12.0-rc3 #1 SMP Thu Oct 3 16:01:55 PDT 2013 x86_64 x86_64 x86_64 GNU/Linux
Comment 17 Ron S. 2013-10-04 03:53:53 UTC
(In reply to Ron S. from comment #13)
> Tested kernel 3.12-rc2 ia64. No improvement. I am running on 3.12-rc2 right
> now
> and the only thing that does not work is the wireless.
> 
> In the industry if/elses without braces is frowned upon because it causes
> problems. I have worked in companies where it is strictly forbidden.

Correction: Should be "kernel 3.12-rc3"
Comment 18 Larry Finger 2013-10-04 04:56:11 UTC
I have several different cards that use the driver. I am now testing with the same as you:

0e:00.0 Network controller [0280]: Realtek Semiconductor Co., Ltd. RTL8188CE 802.11b/g/n WiFi Adapter [10ec:8178] (rev 01)
        Subsystem: Realtek Semiconductor Co., Ltd. RTL8188CE 802.11b/g/n WiFi Adapter [10ec:8178]
        Flags: bus master, fast devsel, latency 0, IRQ 19
        I/O ports at 4000 [size=256]
        Memory at f8000000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: rtl8192ce


My kernel is 3.12-rc3 from the wireless-testing git tree with my patches applied. My AP is a Netgear WNDR3400 AP with WPA2 encryption with a distance of 6 feet from AP to STA.

Connection was no problem, and netperf gives the following results:

TCP_MAERTS Test:  58.51 57.71 56.71 38.50 63.89 42.55 55.79 51.72 52.52 62.25
RX Results: max 63.89, min 38.50. Mean 54.02(7.68)

TCP_STREAM Test:  30.49 26.19 27.95 29.75 30.69 29.74 32.06 24.26 27.80 27.64
TX Results: max 32.06, min 24.26. Mean 28.66(2.22)

Each sample was a 3 second run with the server connected via 100 Mbps wired connection.
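
The summary lines above can be re-derived from the ten samples. A minimal sketch in Python, assuming the figure in parentheses is the population standard deviation (an inference from the numbers, not something stated in this comment):

```python
from statistics import mean, pstdev

# netperf samples copied from the two test lines above (units as reported by netperf)
rx_samples = [58.51, 57.71, 56.71, 38.50, 63.89, 42.55, 55.79, 51.72, 52.52, 62.25]
tx_samples = [30.49, 26.19, 27.95, 29.75, 30.69, 29.74, 32.06, 24.26, 27.80, 27.64]


def summarize(samples):
    """Return (max, min, mean, population standard deviation)."""
    return max(samples), min(samples), mean(samples), pstdev(samples)


for label, samples in (("RX", rx_samples), ("TX", tx_samples)):
    hi, lo, avg, sd = summarize(samples)
    print(f"{label} Results: max {hi:.2f}, min {lo:.2f}. Mean {avg:.2f}({sd:.2f})")
```

With the sample standard deviation (statistics.stdev) the parenthesized values would come out larger, so the population form is the closer match to the numbers reported here.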

finger@larrylap:~/wireless-testing-save> sudo iwlist scan
wlan4     Scan completed :
          Cell 01 - Address: 20:E5:2A:01:F7:EA
                    Channel:6
                    Frequency:2.437 GHz (Channel 6)
                    Quality=68/70  Signal level=-42 dBm  
                    Encryption key:on
                    ESSID:"NETGEAR81"
                    Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s; 18 Mb/s
                              24 Mb/s; 36 Mb/s; 54 Mb/s
                    Bit Rates:6 Mb/s; 9 Mb/s; 12 Mb/s; 48 Mb/s
                    Mode:Master
                    Extra:tsf=0000007e215bdc9f
                    Extra: Last beacon: 72ms ago
                    IE: IEEE 802.11i/WPA2 Version 1
                        Group Cipher : CCMP
                        Pairwise Ciphers (1) : CCMP
                        Authentication Suites (1) : PSK

From dmesg: rtl8192ce: Using firmware rtlwifi/rtl8192cfw.bin

The md5sum for the firmware:
748944fbffd3b08b5b1929bb6c7fc537  /lib/firmware/rtlwifi/rtl8192cfw.bin

I did discover that there is a problem connecting to a WPA1 AP. That was fixed earlier, but I must have broken it again.
Comment 19 Ron S. 2013-10-04 14:17:35 UTC
Do you mean to say that 3.12-rc3 already has your patches?

From dmesg:
]$ dmesg | grep rtl
[   14.796594] rtl8192ce:_rtl92ce_read_chip_version():<0-0> Chip Version ID: B_CHIP_92C
[   14.811276] rtl8192ce: Using firmware rtlwifi/rtl8192cfw.bin
[   15.127813] ieee80211 phy0: Selected rate control algorithm 'rtl_rc'
[   15.128035] rtlwifi: wireless switch is on

$ sudo iwlist scan
p6p1      Interface doesn't support scanning.

wlp3s0    Scan completed :
          Cell 01 - Address: 20:E5:64:12:79:90
                    Channel:1
                    Frequency:2.412 GHz (Channel 1)
                    Quality=44/70  Signal level=-66 dBm  
                    Encryption key:on
                    ESSID:"white2"
                    Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s; 18 Mb/s
                              24 Mb/s; 36 Mb/s; 54 Mb/s
                    Bit Rates:6 Mb/s; 9 Mb/s; 12 Mb/s; 48 Mb/s
                    Mode:Master
                    Extra:tsf=00000011be4a8a62
                    Extra: Last beacon: 5437ms ago
                    IE: Unknown: 0006776869746532
                    IE: Unknown: 010882848B962430486C
                    IE: Unknown: 030101
                    IE: Unknown: 2A0100
                    IE: Unknown: 2F0100
                    IE: IEEE 802.11i/WPA2 Version 1
                        Group Cipher : TKIP
                        Pairwise Ciphers (2) : CCMP TKIP
                        Authentication Suites (1) : PSK
                    IE: Unknown: 32040C121860
                    IE: Unknown: 2D1A7C181BFFFF000000000000000000000000000000000000000000
                    IE: Unknown: 3D1601081500000000000000000000000000000000000000
                    IE: Unknown: 4A0E14000A002C01C800140005001900
                    IE: Unknown: 7F0101
                    IE: Unknown: DD090010180201F02C0000
                    IE: WPA Version 1
                        Group Cipher : TKIP
                        Pairwise Ciphers (2) : CCMP TKIP
                        Authentication Suites (1) : PSK
                    IE: Unknown: DD180050F2020101800003A4000027A4000042435E0062322F00
Comment 20 Peter Wu 2013-11-02 16:00:42 UTC
I can observe similar issues with a 10ec:8176 (also using the rtl8192ce driver). Distribution: Arch Linux, using netctl (not NetworkManager).

 - It can connect with my Android phone (i9300) as AP. The phone lies next to the PC.
 - Although the adapter can associate with an AP, it cannot authenticate to a more distant router (approx. 7 meters? with 2 walls). Signal is OK because a phone, tablet and laptop can connect to it.

iwlist scan results for this AP (while associated):
        freq: 2462
        beacon interval: 100 TUs
        capability: ESS Privacy ShortSlotTime (0x0411)
        signal: -58.00 dBm
        last seen: 263 ms ago
        Information elements from Probe Response frame:
        SSID: <snip>
        Supported rates: 1.0* 2.0* 5.5* 11.0* 9.0 18.0 36.0 54.0 
        DS Parameter set: channel 11
        ERP: Barker_Preamble_Mode
        Extended supported rates: 6.0 12.0 24.0 48.0 
        HT capabilities:
                Capabilities: 0x6c
                        HT20
                        SM Power Save disabled
                        RX HT20 SGI
                        RX HT40 SGI
                        No RX STBC
                        Max AMSDU length: 3839 bytes
                        No DSSS/CCK HT40
                Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
                Minimum RX AMPDU time spacing: 4 usec (0x05)
                HT RX MCS rate indexes supported: 0-15
                HT TX MCS rate indexes are undefined
        HT operation:
                 * primary channel: 11
                 * secondary channel offset: no secondary
                 * STA channel width: 20 MHz
                 * RIFS: 0
                 * HT protection: no
                 * non-GF present: 1
                 * OBSS non-GF present: 0
                 * dual beacon: 0
                 * dual CTS protection: 0
                 * STBC beacon: 0
                 * L-SIG TXOP Prot: 0
                 * PCO active: 0
                 * PCO phase: 0
        Secondary Channel Offset: no secondary (0)
        RSN:     * Version: 1
                 * Group cipher: CCMP
                 * Pairwise ciphers: CCMP
                 * Authentication suites: PSK
                 * Capabilities: PreAuth (0x0001)
        WMM:     * Parameter version 1
                 * BE: CW 15-1023, AIFSN 3
                 * BK: CW 15-1023, AIFSN 7
                 * VI: CW 7-15, AIFSN 2, TXOP 3008 usec
                 * VO: CW 3-7, AIFSN 2, TXOP 1504 usec
        BSS Load:
                 * station count: 1024
                 * channel utilisation: 18/255
                 * available admission capacity: 4730 [*32us]
        Extended capabilities: HT Information Exchange Supported
... WPS ...
        Country: NL     Environment: Indoor/Outdoor
                Channels [1 - 13] @ 16 dBm

This list was generated a while ago, either on 3.9 or 3.10. With 3.9 (and even later versions?) I could establish a connection after some trials in which I did a lot of things that I cannot remember exactly (think of reloading kernel modules, changing regulatory settings with `iw reg`, setting module parameters wrt PM, restarting the AP, manually running wpa_supplicant, ...).

Using a second Wi-Fi adapter, I found that although the first EAPOL message gets acknowledged by the station, the station does not reply with a second EAPOL message. Can you observe the same?

When I am near this AP, I will conduct more tests besides the rtl with AP case:
- phone with AP
- rtl with phone (as AP)

Larry, if you need a packet capture, I can provide if it can be kept private.
Comment 21 Peter Wu 2013-11-03 15:12:47 UTC
v3.12-rc7-111-g9581b7d + patch from comment 10 did not improve the situation.

I have tested four devices now:

- Problematic AP (hereafter referred to as "AP")
- i9300 Android phone as AP ("PhoneAP")
- rtl8192ce station ("RTL")
- iPad as station ("iPad")


Differences between RTL+AP and RTL+PhoneAP:
- AP uses channel 11, PhoneAP uses channel 6.
- AP advertises HT40 and HT20, PhoneAP only advertises HT20.

- After ACKing the association response, AP sends an IEEE 802.11 management frame of type Action. This is a Radio Measurement request, to which RTL responds with a Radio Measurement error. This exchange does not occur with RTL+PhoneAP.

Packet flows:
(RTL <-> AP; acks are omitted)
< AssocResp
< Radio Measurement
Radio Measurement Error >
< EAPOL KEY 1/4
< Action: Add Block Ack request
Action: Add Block Ack response (successful) >
< Block Ack req
Block Ack >
< EAPOL KEY 1/4 [not acked!]
< EAPOL KEY 1/4 [not acked!]
... (after five seconds of retransmissions, and retransmissions with increased seq number) ...
Deauth (disassociated due to inactivity) >

Compare it to a working RTL <-> PhoneAP (acks are again omitted):
< AssocResp
< Action: Add Block Ack request
< EAPOL KEY 1/4
Action: Add Block Ack response (successful) >
< Block Ack req
Block Ack >
EAPOL KEY 2/4 >
< EAPOL KEY 3/4
EAPOL KEY 4/4 >

Or working iPad <-> AP:
< AssocResp
< Ack
< EAPOL KEY 1/4
Null function >
Comment 22 Peter Wu 2013-11-03 15:18:29 UTC
I misclicked, here is the iPad <-> AP case:
< AssocResp
< Ack
< EAPOL KEY 1/4
Null function >
< EAPOL KEY 1/4 (retransmit)
Ack >
Null function (retransmit) >
< Ack
< Action: Add Block Ack request (dialog token 0x5c)
< Ack
Action: Add Block Ack response (successful) >
< Action: Add Block Ack request (dialog token 0x01)
< Ack
EAPOL 2/4 >
...
(success)
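
The traces in comments 21 and 22 differ mainly in how far the four-way handshake gets. As a rough aid for comparing such traces, the following Python sketch extracts the EAPOL frames from text written in the same notation (a leading `<` for AP to station, a trailing `>` for station to AP). The helper names are made up for illustration, and the failing trace leaves out the retransmission summary line.

```python
import re

# Traces copied from comment 21; "<" marks AP -> station, a trailing ">" marks station -> AP.
FAILING_RTL_AP = """\
< AssocResp
< Radio Measurement
Radio Measurement Error >
< EAPOL KEY 1/4
< Action: Add Block Ack request
Action: Add Block Ack response (successful) >
< Block Ack req
Block Ack >
< EAPOL KEY 1/4 [not acked!]
< EAPOL KEY 1/4 [not acked!]
Deauth (disassociated due to inactivity) >
"""

WORKING_RTL_PHONEAP = """\
< AssocResp
< Action: Add Block Ack request
< EAPOL KEY 1/4
Action: Add Block Ack response (successful) >
< Block Ack req
Block Ack >
EAPOL KEY 2/4 >
< EAPOL KEY 3/4
EAPOL KEY 4/4 >
"""

# Matches both "EAPOL KEY 2/4" and the shorter "EAPOL 2/4" spelling.
EAPOL_RE = re.compile(r"EAPOL(?: KEY)? (\d)/4")


def eapol_messages(trace):
    """Return a list of (sender, message_number) for each EAPOL frame."""
    found = []
    for line in trace.splitlines():
        line = line.strip()
        match = EAPOL_RE.search(line)
        if match is None:
            continue
        sender = "ap" if line.startswith("<") else "sta"
        found.append((sender, int(match.group(1))))
    return found


def last_handshake_step(trace):
    """Highest EAPOL message number seen in the trace, or 0 if there is none."""
    return max((number for _, number in eapol_messages(trace)), default=0)


print(last_handshake_step(FAILING_RTL_AP), last_handshake_step(WORKING_RTL_PHONEAP))
```

In the failing trace only message 1/4 from the AP ever shows up, while the working trace reaches 4/4, which matches the observation that the station never answers with message 2/4.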
Comment 23 Larry Finger 2013-11-03 15:39:16 UTC
Created attachment 113191
Replacement patch

This patch is a composite of a set of patches that will be sent to wireless-testing as soon as I am finished testing. The changes were primarily aimed at rtl8192cu; however, as that driver shares some components with rtl8192ce, it also improved the latter.
Comment 24 Peter Wu 2013-11-03 20:26:00 UTC
With attachment 113191 on top of v3.12-rc7-111-g9581b7d, the WPA2-PSK handshake completes and ARP traffic is answered, but ICMP ping does not get transmitted. Again, I used a secondary adapter to monitor the air.

RTL <-> AP
ARP Who-has >
< Ack
< is-at
Ack >
Action: Add Block Ack Request >
< Action: Add Block Ack Response (successful)
Ack >
weird IEEE 802.11 Data frame (unknown LLC+data) with fc.flags.more_data flag set (Data is buffered for STA at AP)>
(long silence, no ICMP or anything)

In one capture sample, after 17 seconds I got:
Action: Delete Block Ack (reason: Requested from peer STA as it does not want to use the mechanism) >
< Ack

In the second capture sample, after 3-4 seconds, RTL sends out ICMPv6 router solicitations to ff02::2, but ICMPv4 ping still does not show up. Another 14 seconds later, I decided to deauth as I did not get any response with ping.

In the radiotap header, I read about -20 dBm for RTL, so that should be OK. (-61 dBm for AP). I can also observe lots of retransmissions, especially from RTL -> AP (another mobile device in this capture (coincidence) has no such issues). Could this be too low TX power?

Let me know if the information I provide is useful or not, it takes me some time to gather and type this information.
Comment 25 Peter Wu 2013-11-03 23:36:48 UTC
What is rtl_is_special_data used for? When should it return true/false? Can this be documented?

Callers:
- _rtl_receive_one: rtl_is_special_data(hw, skb, false); (retval ignored; for logging??)
- _rtl_rc_get_highest_rix: rtl_is_special_data(rtlpriv->mac80211.hw, skb, true)
- rtl_tx_status: rtl_is_special_data(mac->hw, skb, true)

My guess is that a true value indicates that the transmission rate should be reduced and a false value the opposite. A quick look at the rtlwifi and iwldvm (iwlwifi) code confused me even more, so I stopped looking.
Comment 26 Larry Finger 2013-11-04 00:58:13 UTC
That routine detects those packets that *must* succeed. If the routine returns true, they are transmitted at the lowest rate. If false, the normal rate is used.

Of course, it can be documented; however, that routine has been in the kernel since 2.6.38, and this is the first call to document it.
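
The behaviour described above can be sketched in a few lines of Python. This is a toy model, not driver code: the set of frame kinds treated as "must succeed" is an assumption for illustration (this thread only mentions BOOTP over UDP explicitly), and the function names are hypothetical.

```python
# Legacy 802.11b/g rates in Mbit/s, as listed in the scan output earlier in this thread.
RATES_MBPS = [1, 2, 5.5, 11, 6, 9, 12, 18, 24, 36, 48, 54]

# Assumption for illustration only: which frame kinds count as "must succeed".
SPECIAL_KINDS = {"eapol", "arp", "dhcp"}


def is_special_data(kind):
    """Toy stand-in for rtl_is_special_data(): True for frames that must get through."""
    return kind in SPECIAL_KINDS


def pick_tx_rate(kind, normal_rate, rates=RATES_MBPS):
    """Use the lowest rate for special frames, otherwise the rate-control choice."""
    if is_special_data(kind):
        return min(rates)
    return normal_rate


print(pick_tx_rate("eapol", 54), pick_tx_rate("icmp", 54))
```

The point of the design is robustness: frames that the connection depends on are sent at the most conservative rate, at the cost of airtime.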
Comment 27 Peter Wu 2013-11-04 13:45:51 UTC
I don't know this driver very well, thanks for documenting it here.

Is the use of ip->ihl safe?

    udp = (struct udphdr *)((u8 *)ip + (ip->ihl << 2));

This IP header length field is taken from the frame. Worst case is 60 bytes (which should still be in the skb memory), but it might read the wrong results.
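
To make the concern concrete, here is a minimal Python sketch of the same offset arithmetic with explicit bounds checks. It is not the driver's code: the function name is hypothetical, and the checks shown are one possible way to guard the lookup, assuming the buffer starts at the IPv4 header.

```python
import struct

MIN_IPV4_HEADER_LEN = 20
UDP_HEADER_LEN = 8
IPPROTO_UDP = 17


def udp_ports(ip_packet):
    """Return (src_port, dst_port) of a UDP-in-IPv4 packet, or None if malformed."""
    if len(ip_packet) < MIN_IPV4_HEADER_LEN:
        return None
    version = ip_packet[0] >> 4
    ihl = ip_packet[0] & 0x0F      # header length in 32-bit words, taken from the frame
    header_len = ihl << 2          # same arithmetic as (ip->ihl << 2); at most 60 bytes
    if version != 4 or ihl < 5:
        return None
    if ip_packet[9] != IPPROTO_UDP:
        return None
    if len(ip_packet) < header_len + UDP_HEADER_LEN:
        return None                # the claimed header would push the UDP header out of bounds
    return struct.unpack_from("!HH", ip_packet, header_len)


# A 20-byte IPv4 header (ihl = 5, protocol = UDP) followed by an 8-byte UDP header.
good = bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 64, 17, 0, 0, 10, 0, 0, 1, 10, 0, 0, 2])
good += struct.pack("!HHHH", 68, 67, 8, 0)
bad = bytes([0x4F]) + good[1:]     # same bytes, but claiming a 60-byte header

print(udp_ports(good), udp_ports(bad))
```

Without the length check, the second packet would make the parser read port numbers from beyond the data that was actually received.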

On-topic: can you reply to comment 24 please? If you have more patches, I would be glad to test them.
Comment 28 Larry Finger 2013-11-04 17:30:55 UTC
I think that usage is safe. That data is coming from the skb we are investigating, and we only use that stuff if we have a UDP frame. In addition, it will only affect BOOTP protocol. Are you having trouble with BOOTP?

As to comment #24, do you have the pcap files from wireshark or kismet? If so, please post them some place where I can reach them. My ISP has not implemented IPv6, thus that is not heavily tested.

As most of the recent changes have been fixing problems with TX gain, I certainly hope that it is OK. The parameter in question is cur_igvalue. You can see what values are being used if you load the module with the 'debug=2' option. The allowable range for this parameter is 0x1f to 0x3e, and it seems to be inversely proportional to the radio gain. My RTL8188CE seems to have 0x27+/-2 for that parameter.
Comment 29 Peter Wu 2013-11-04 18:41:27 UTC
No BOOTP, no IPv6 router, I have so far only seen ARP. This is a static IP configuration, so no UDP/DHCP either. There are no fancy auto-updaters, NetworkManager, webbrowsers, chat applications or whatever network-accessing applications active. Just wpa_supplicant + ip + iw + ping.

I have several wireshark capture files, is the latest one where I can authenticate (but where ICMP ping fails) enough? As the packets are encrypted with WPA2, you presumably need the PSK.

I'll later mail you a link to a tarball with the capture files (only including station and AP, ok?) and PSK. It is not top-secret military data, but please keep it private. For that capture, I do not have a journal with debug > 0, however the below igvalues are taken from the journal with your first patch:

Nov 03 14:17:16 [cut] cur_igvalue = 0x20, pre_igvalue = 0x0, back_val = 10
Nov 03 14:17:17 [cut] cur_igvalue = 0x22, pre_igvalue = 0x22, back_val = 10
Nov 03 14:17:18 [cut] cur_igvalue = 0x1e, pre_igvalue = 0x24, back_val = 12
Nov 03 14:17:20 [cut] cur_igvalue = 0x29, pre_igvalue = 0x20, back_val = 10
Nov 03 14:17:22 [cut] cur_igvalue = 0x1e, pre_igvalue = 0x2b, back_val = 10
Nov 03 14:17:24 [cut] cur_igvalue = 0x1e, pre_igvalue = 0x20, back_val = 12
Nov 03 14:17:26 [cut] cur_igvalue = 0x1e, pre_igvalue = 0x20, back_val = 12
Nov 03 14:17:28 [cut] cur_igvalue = 0x20, pre_igvalue = 0x0, back_val = 10
Nov 03 14:17:29 [cut] cur_igvalue = 0x22, pre_igvalue = 0x22, back_val = 10

Do you still need debug=2 with your second patch?
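
The cur_igvalue readings above can be checked against the 0x1f to 0x3e range quoted in comment 28 with a short script. This is a sketch; the regular expression assumes the log format shown in this comment, and the helper names are made up.

```python
import re

IG_MIN, IG_MAX = 0x1F, 0x3E  # allowable range as quoted in comment 28

LOG = """\
Nov 03 14:17:16 [cut] cur_igvalue = 0x20, pre_igvalue = 0x0, back_val = 10
Nov 03 14:17:17 [cut] cur_igvalue = 0x22, pre_igvalue = 0x22, back_val = 10
Nov 03 14:17:18 [cut] cur_igvalue = 0x1e, pre_igvalue = 0x24, back_val = 12
Nov 03 14:17:20 [cut] cur_igvalue = 0x29, pre_igvalue = 0x20, back_val = 10
Nov 03 14:17:22 [cut] cur_igvalue = 0x1e, pre_igvalue = 0x2b, back_val = 10
Nov 03 14:17:24 [cut] cur_igvalue = 0x1e, pre_igvalue = 0x20, back_val = 12
Nov 03 14:17:26 [cut] cur_igvalue = 0x1e, pre_igvalue = 0x20, back_val = 12
Nov 03 14:17:28 [cut] cur_igvalue = 0x20, pre_igvalue = 0x0, back_val = 10
Nov 03 14:17:29 [cut] cur_igvalue = 0x22, pre_igvalue = 0x22, back_val = 10
"""

CUR_IG_RE = re.compile(r"cur_igvalue = (0x[0-9a-fA-F]+)")


def cur_igvalues(log):
    """Extract every cur_igvalue reading from the log text as an integer."""
    return [int(match.group(1), 16) for match in CUR_IG_RE.finditer(log)]


def out_of_range(values, low=IG_MIN, high=IG_MAX):
    """Return the readings that fall outside the inclusive [low, high] range."""
    return [value for value in values if not low <= value <= high]


values = cur_igvalues(LOG)
print([hex(v) for v in values])
print([hex(v) for v in out_of_range(values)])
```

On the lines quoted above, this flags the four 0x1e readings, which sit just below the lower bound quoted in comment 28.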
Comment 30 Larry Finger 2013-11-04 19:56:45 UTC
Yes, please send the debug=2 output with your current patch.

The indication is that your gain is near the maximum (lowest value of cur_igvalue). It is not easy for me to test at 7 meters as the battery in my laptop is bad, but I will try it.

My suspicion is that these devices have poor performance under low signal conditions. Mine is about 2m from the AP and I have a pair of 7 dBi antennas attached, yet the gain is near maximum.

I'll see what I can do about moving the computer to try to match your setup.
Comment 31 Larry Finger 2013-11-04 21:08:58 UTC
OK, at the moment I am about 12m from the AP with the signal traversing 2 walls that have dry wall on one side and stucco on the other. Cur_igvalue is about 0x20, which is definitely lower than the value I had when the computer was 2m from the AP. The indicated signal strength from iwconfig is -64 dBm. Throughput is reduced to 23 Mbps RX and 8 Mbps TX, but no problems connecting.
Comment 32 Peter Wu 2013-11-04 21:20:33 UTC
Larry, I have just sent the details by e-mail. I doubt that the issue is TX power; the Ralink capture adapter that is a few cm away from the usb (plugged into the same machine as the RTL adapter) cannot see any packets in the air.
Comment 33 Larry Finger 2013-11-04 21:30:23 UTC
Perhaps it is RX power, and rtl8192ce is never seeing the incoming packets.

Have you tried wireshark on the Ralink card and tcpdump on the Realtek one? If that doesn't work, I could get you a patch that dumps every incoming packet. The log would get messy, but it would not have to run very long.
Comment 34 Peter Wu 2013-11-04 21:40:47 UTC
I haven't tried to run tcpdump on RTL after connecting. What I can remember is that when I tried to use monitor mode on a vanilla kernel before, I could not see traffic of other stations, only broadcast traffic. That might be related to this bug...

That monitor mode was set-up as follows:
- No wpa_supplicant nor associated to any network.
iw wlan0 set type monitor
ip link set wlan0 up
iw wlan0 set channel 11
dumpcap -w out.pcapng -i wlan0

As an alternative, I also tried to add a secondary interface, still without luck:
iw wlan0 interface add mon0 type monitor
ip link set mon0 up
dumpcap -i mon0

And also just: dumpcap -i wlan0 -k

I cannot do more tests today, perhaps tomorrow. Does the above information provide any new insights?
Comment 35 igg 2013-11-16 17:19:14 UTC
I had the same problem with EW-7811Un 802.11n Wireless Adapter [Realtek RTL8188CUS]. Could scan and connect to AP with iw, but no dhcp, no ping to gateway with manual config.

This is 3.10.19 from git on archlinux/arm5/olinuxino.

I successfully applied the replacement patch (attachment 113191) to 3.10.19, and networking is now functional.

Some more info in case it's useful.  These steps worked as before:
[root@BBD9000-06 ~]# iw wlan0 connect 'Radio Free Cathilya'
[root@BBD9000-06 ~]# [  197.470000] wlan0: authenticate with 00:16:01:84:d3:cc
[  197.550000] wlan0: send auth to 00:16:01:84:d3:cc (try 1/3)
[  197.560000] wlan0: authenticated
[  197.580000] wlan0: associate with 00:16:01:84:d3:cc (try 1/3)
[  197.600000] wlan0: RX AssocResp from 00:16:01:84:d3:cc (capab=0x401 status=0 aid=5)
[  197.640000] wlan0: associated

[root@BBD9000-06 ~]# iw wlan0 link
Connected to 00:16:01:84:d3:cc (on wlan0)
        SSID: Radio Free Cathilya
        freq: 2437
        RX: 14624 bytes (236 packets)
        TX: 104 bytes (2 packets)
        signal: -85 dBm
        tx bitrate: 1.0 MBit/s

        bss flags:      short-preamble short-slot-time
        dtim period:    0
        beacon int:     100

Many packets are being received (>100/s), none transmitted.  Doesn't seem to end.
Before the patch, at this point dhcpcd would time out, could not ping gateway with manual IP.

With the patch, dhcp now works:
[root@BBD9000-06 ~]# dhcpcd wlan0
dhcpcd[129]: version 6.1.0 starting
dhcpcd[129]: all: IPv6 kernel autoconf disabled
dhcpcd[129]: wlan0: IPv6 kernel autoconf disabled
dhcpcd[129]: wlan0: rebinding lease of 10.0.1.142
dhcpcd[129]: wlan0: leased 10.0.1.142 for 86400 seconds
dhcpcd[129]: wlan0: adding host route to 10.0.1.142 via 127.0.0.1
dhcpcd[129]: wlan0: adding route to 10.0.1.0/24
dhcpcd[129]: wlan0: adding default route via 10.0.1.1
dhcpcd[129]: forked to background, child pid 225
[root@BBD9000-06 ~]# ifconfig wlan0
wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.1.142  netmask 255.255.255.0  broadcast 10.0.1.255
        ether 80:1f:02:b5:cd:61  txqueuelen 1000  (Ethernet)
        RX packets 5  bytes 835 (835.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7  bytes 795 (795.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

running iperf server, with laptop as client for an hour:
[  5]  0.0-3600.9 sec  1.24 GBytes  2.95 Mbits/sec
[  6]  0.0-3605.6 sec   224 MBytes   521 Kbits/sec

The transmission rate is ~5x lower than receive for the duration of the test, which it did survive. AP is ~7m diagonally through a wood floor.

Pulling the USB dongle out:
[root@BBD9000-06 ~]# [ 4228.190000] usb 1-1: USB disconnect, device number 2
[ 4228.250000] wlan0: deauthenticating from 00:16:01:84:d3:cc by local choice (reason=3)
[ 4228.290000] rtlwifi: reg 0x102, usbctrl_vendorreq TimeOut! status:0xffffffed value=0x69543432
[ 4228.320000] rtlwifi: reg 0x422, usbctrl_vendorreq TimeOut! status:0xffffffed value=0x3990727
[ 4228.320000] rtlwifi: reg 0x542, usbctrl_vendorreq TimeOut! status:0xffffffed value=0x1b039c
[ 4228.350000] rtlwifi: reg 0x608, usbctrl_vendorreq TimeOut! status:0xffffffed value=0x0
[ 4228.380000] cfg80211: Calling CRDA to update world regulatory domain

plugging it back in:
[root@BBD9000-06 ~]# [ 4245.130000] usb 1-1: new high-speed USB device number 3 using ci_hdrc
[ 4245.290000] rtl8192cu: Chip version 0x10
[ 4245.770000] rtl8192cu: MAC address: 80:1f:02:b5:cd:61
[ 4245.780000] rtl8192cu: Board Type 0
[ 4245.800000] rtlwifi: rx_max_size 15360, rx_urb_num 8, in_ep 1
[ 4245.820000] rtl8192cu: Loading firmware rtlwifi/rtl8192cufw_TMSC.bin
[ 4245.890000] rtlwifi: wireless switch is on

Manual re-setup required:
[root@BBD9000-06 ~]# ip link set wlan0 up
[ 4320.450000] rtl8192cu: MAC auto ON okay!
[ 4320.700000] rtl8192cu: Tx queue select: 0x05

Then, iw wlan0 connect and dhcpcd work.

This is quite awesome!  The number of people affected by this bug are legion.
Comment 36 Ron S. 2013-11-28 05:04:21 UTC
(In reply to igg from comment #35)
 
> This is quite awesome!  The number of people affected by this bug are legion.

A Bug that repeats itself every time is in a sense "already fixed". The
handling of this is quite ridiculous. I'm going to fix it myself when I
get the time and forget about all this nonsense here.
Comment 37 Eric 2013-12-15 04:20:19 UTC
My laptop is an i686, not an x86-64, but the problem is the same. The chipset is a Realtek 8188ce and I'm using the 8192ce kernel module.

It looks like my laptop can connect to the router, but cannot receive the IP-address. I've tested this with "dhcpcd wlan0" and also with a fixed IP-address (ip addr add ...) on an open connection (not password protected).
The tests were successful with kernel version 3.9.x, but always fail with version >= 3.10.x.

The last kernel version I tried was 3.12.4, but I had to downgrade all the way back to version 3.9.x.
Comment 38 obedlink 2014-02-14 23:00:11 UTC
The problem persists with kernel 3.14-rc2.

I also have an Asus PCE-N15 wireless card, and it can connect to the router. When I log in to the router (from another PC), I can see that the wireless connection is active and enabled.

From the PC with the Asus PCE-N15, however, I cannot access the router or any internet page. How is it possible that the wifi card can get an IP but cannot ping the router?
Comment 39 Peter Wu 2014-02-16 15:03:51 UTC
Since applying some patches[1][2] on top of v3.14-rc2-267-g9398a10, I am so far always able to connect. ping works, but only 20 seconds after connect. There is still a lot of packet loss, but at least it connects. I will keep monitoring with dumpcap and ping.

A possible reason why the patch "fixes" the bug is that rtlpci->receive_config is overwritten before it is used. Some notes:

- In [rtl92c_]init_sw_vars, rtlpci->receive_config gets initialized (called via rtl_pci_probe).
- _rtl92ce_init_mac writes receive_config to REG_RCR which is called via rtl92ce_hw_init.
- After calling _rtl92ce_init_mac in [rtl92ce_]hw_init, rtlpci->receive_config gets re-initialized from the REG_RCR register (called via rtl_pci_start, via rtl_op_start when an interface is brought up).
- Before patch[1], [rtl92ce_]set_ch[ec]k_bssid would read directly from RCR instead of using the previous stored rtlpci->receive_config value.

So, it appears to be working, but I am not fully certain why. If rtl92ce_set_check_bssid gets called between init_sw_vars and init_mac, then I understand why the bug appears to be gone. Otherwise, no idea. Please test these patches (at least[1]) and report your findings.

 [1]: http://www.spinics.net/lists/linux-wireless/msg118693.html
 [2]: http://www.spinics.net/lists/linux-wireless/msg118694.html
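
The hypothesis in this comment, that rtl92ce_set_check_bssid may run between init_sw_vars and init_mac, can be illustrated with a toy model in Python. It is not driver code and rests on several assumptions: the bit values are hypothetical, the register is assumed to read as 0 before init_mac programs it, and the setter is assumed to store the value it writes back into the cached copy.

```python
# Hypothetical bit values; the real REG_RCR layout is not given in this thread.
ACCEPT_DATA = 0x1
ACCEPT_MGMT = 0x2
CHECK_BSSID = 0x4


class ToyDevice:
    def __init__(self):
        self.reg_rcr = 0         # assumed register content before init_mac
        self.receive_config = 0  # driver-side cached copy

    def init_sw_vars(self):
        # The cached configuration is initialized first.
        self.receive_config = ACCEPT_DATA | ACCEPT_MGMT

    def init_mac(self):
        # The cached configuration is written to the register.
        self.reg_rcr = self.receive_config

    def set_check_bssid(self, use_cached):
        # Read-modify-write, starting from either the cache or the register.
        base = self.receive_config if use_cached else self.reg_rcr
        value = base | CHECK_BSSID
        self.reg_rcr = value
        self.receive_config = value  # assumption: the setter also updates the cache


def run(use_cached):
    dev = ToyDevice()
    dev.init_sw_vars()
    dev.set_check_bssid(use_cached)  # hypothetical early call, before init_mac
    dev.init_mac()
    return dev.reg_rcr


print(hex(run(use_cached=False)), hex(run(use_cached=True)))
```

Under these assumptions, starting from the register drops the accept bits that init_sw_vars had set, while starting from the cached value keeps them. Whether the real call order matches this model is exactly what the comment leaves open.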
Comment 40 Tore Anderson 2014-02-23 20:11:56 UTC
Created attachment 127181
system logs from a few connection attempts

I have a TP-LINK TL-WN8200ND wireless adapter (rtl8192cu chipset/driver) that's failing to associate with any AP I try. It always ends up with the following in the kernel logs:

wlp0s29f7u4: authenticate with 02:1a:11:f5:68:2f
wlp0s29f7u4: direct probe to 02:1a:11:f5:68:2f (try 1/3)
wlp0s29f7u4: direct probe to 02:1a:11:f5:68:2f (try 2/3)
wlp0s29f7u4: direct probe to 02:1a:11:f5:68:2f (try 3/3)
wlp0s29f7u4: authentication with 02:1a:11:f5:68:2f timed out

I'm certain it's not caused by poor signal; even if the network adapter and the APs are physically in contact, it fails. A laptop with an iwlwifi adapter in it has no problems getting on-line. The adapter has no problems seeing all the wireless networks in range, it's just connecting to them that fails.

I'm attaching a full log of trying unsuccessfully two times to connect to an unsecured AP, and then another time to another WPA2-PSK secured AP, also unsuccessfully. It was done on Linux 3.14-rc3, with the rtlwifi driver from today's wireless-next.git tree (which already includes attachment 113191). In addition, "rtlwifi: Convert drivers to use new API for ieee80211_is_robust_mgmt_frame()" from http://patchwork.ozlabs.org/patch/323212/ is included as that was necessary for the wireless-next tree to compile, plus all three patches from Peter Wu's "rtlwifi promiscious mode fix and cleanup" series (http://thread.gmane.org/gmane.linux.kernel/1648348).

I know the hardware itself is good as it works well in Windows. It also works in Linux using the ndiswrapper driver, but with truly abysmal throughput.

I'm very willing to help test further patches, just let me know...

lsusb output follows:

Bus 002 Device 004: ID 2357:0100  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x2357 
  idProduct          0x0100 
  bcdDevice            2.00
  iManufacturer           1 Realtek
  iProduct                2 802.11n WLAN Adapter
  iSerial                 3 00e04c000001
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           46
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower              500mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           4
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x03  EP 3 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x84  EP 4 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               1
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0000
  (Bus Powered)
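As a reading aid for the endpoint descriptors above, the following Python sketch decodes bEndpointAddress and bmAttributes the way lsusb presents them: bit 7 of the address is the direction, the low four bits are the endpoint number, and the low two bits of bmAttributes select the transfer type. The function name is hypothetical.

```python
# Transfer types indexed by bits 1..0 of bmAttributes.
TRANSFER_TYPES = ("Control", "Isochronous", "Bulk", "Interrupt")


def decode_endpoint(b_endpoint_address, bm_attributes):
    """Return (endpoint number, direction, transfer type) for one endpoint."""
    number = b_endpoint_address & 0x0F
    direction = "IN" if b_endpoint_address & 0x80 else "OUT"
    transfer = TRANSFER_TYPES[bm_attributes & 0x03]
    return number, direction, transfer


# The four endpoints listed in the lsusb output above.
for address, attributes in ((0x81, 2), (0x02, 2), (0x03, 2), (0x84, 3)):
    print(decode_endpoint(address, attributes))
```

So this adapter exposes one bulk IN endpoint, two bulk OUT endpoints and one interrupt IN endpoint.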

Tore
Comment 41 Peter Wu 2014-02-23 21:36:22 UTC
Ok, I have checked again and it appears that my patches do not change this situation at all. Tested with an unpatched 3.12.9-1-ARCH kernel, and ping/ssh/curl example.com still work. The inability to capture other packets in monitor mode made me (wrongly) assume that I was unable to see any packets.

(Another possibility is that the patches did make a difference, but that the changes written to registers were kept during power cycles.)

By the way, you seem to have a different device. It uses rtl8192cu, a USB driver, while this bug is about rtl8192ce (PCI Express). My device uses the rtlwifi/rtl8192cfwU_B.bin firmware; yours uses rtlwifi/rtl8192cufw_TMSC.bin.
Comment 42 Larry Finger 2014-02-24 00:48:13 UTC
If you unplug the device (for USB) or power off the computer, no changes will be kept.

Please note carefully what driver/device you are using. There is some similarity between the RTL8192CE (PCIe) and RTL8192CU (USB), but the drivers are quite different.
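One way to check which driver is bound to an interface is to follow the device/driver symlink under /sys/class/net. A minimal Python sketch follows; the function name and the sysfs_root parameter are hypothetical, the latter existing only so the lookup can be tried against a test directory.

```python
import os


def bound_driver(iface, sysfs_root="/sys"):
    """Return the name of the kernel driver bound to a network interface.

    Returns None if the interface does not exist or has no driver link
    (for example, a purely virtual interface).
    """
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    try:
        target = os.readlink(link)
    except OSError:
        return None
    # The link target ends in the driver directory name.
    return os.path.basename(target)
```

For example, bound_driver("wlp0s29f7u4") would be expected to return the driver name for the adapter from comment 40 on the reporter's machine; the interface name will differ on other systems.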
Comment 43 Tore Anderson 2014-02-24 06:35:25 UTC
(In reply to Larry Finger from comment #42)
> Please note carefully what driver/device you are using. There is some
> similarity between the RTL8192CE (PCIe) and RTL8192CU (USB), but the drivers
> are quite different.

Just to be clear, the device I'm talking about in comment #40 is the CU variant. I am aware that the title of this bug is about the CE variant, but as the CU one has been discussed in several comments I thought it appropriate to comment here, rather than open a new bug specifically for the CU variant.

If you would like me to do so, however, please let me know.

Tore
Comment 44 anarcat 2015-11-23 15:08:21 UTC
This bug is getting old, but was still a problem for me in Debian Jessie 8.x. This was with kernel 3.16, which was affected by this bug according to the original description. I also had some problems with 3.2, but things were generally better.

I had originally filed this in the Debian bug tracker against the Realtek firmware, but I believe the problem now lies more in the kernel:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776816

Also, I am happy to report that an upgrade to the 4.2 kernel fixed wifi for me here. I have been running without problems with the new kernel for almost 24 hours now and things are much, much better. So maybe this bug report could be marked as fixed on some kernel versions.