Bug 9143

Summary: rtl8187 very unreliable
Product: Drivers Reporter: Fabian Zeindl (fabian)
Component: network-wireless Assignee: drivers_network-wireless (drivers_network-wireless)
Status: REJECTED INSUFFICIENT_DATA    
Severity: normal CC: dvd100, hauke, htl10, jarosser06, kernel, mdshort, ng.ehahn, ttimo
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.23 Subsystem:
Regression: --- Bisected commit-id:
Attachments: config of my kernel
0001-rtl8187-do-not-report-ACKs-if-USB-Tx-status-is-non.patch

Description Fabian Zeindl 2007-10-11 08:50:47 UTC
Hi,
I use the new rtl8187 driver for my onboard WLAN chip. I can associate with access points perfectly and always get an IP, and sometimes TCP/IP works, but the connection randomly drops. When that happens I can still ping hosts, but cannot connect to them.
Comment 1 John W. Linville 2007-10-30 13:40:22 UTC
Please post the contents of /var/log/messages when your connection drops.

It is possible that your AP is disconnecting you due to inactivity.  Are you using wpa_supplicant or NetworkManager?  They will attempt to reconnect automatically.
Comment 2 Fabian Zeindl 2007-10-31 02:36:33 UTC
I cannot really reproduce the behaviour at the moment.
If I recall correctly, there were lines like these in syslog:

Oct 31 10:31:30 desktop kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Oct 31 10:31:32 desktop kernel: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Oct 31 10:31:35 desktop kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Oct 31 10:31:36 desktop kernel: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Oct 31 10:31:37 desktop kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Oct 31 10:31:38 desktop kernel: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
and so on.

The problem I have at the moment is that the connection is very slow, although it says 60-70% connection. In particular, everything that has to do with UDP, like DNS and Zeroconf, doesn't work very well.

I use NetworkManager.
Comment 3 Fabian Zeindl 2008-01-03 06:22:33 UTC
Anything new here? This still doesn't work at all.

I usually use NetworkManager, but for debugging purposes I shut it down and did everything manually:

1.) Wlan Configuration

> iwconfig wlan0 essid "MySSID"; iwconfig

  wlan0     IEEE 802.11g  ESSID:"MySSID"  
            Mode:Managed  Frequency:2.432 GHz  Access Point: 00:14:BF:37:2E:E2   
            Bit Rate=1 Mb/s   
            Retry min limit:7   RTS thr:off   Fragment thr=2346 B   
            Link Quality=41/64  Signal level=6/65  
            Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
            Tx excessive retries:0  Invalid misc:0   Missed beacon:0

/var/log/messages says here:
  Jan  3 15:11:05 fabian-desktop kernel: wlan0: Initial auth_alg=0
  Jan  3 15:11:05 fabian-desktop kernel: wlan0: authenticate with AP   00:14:bf:37:2e:e2
  Jan  3 15:11:05 fabian-desktop kernel: wlan0: RX authentication from 00:14:bf:37:2e:e2 (alg=0 transaction=2 status=0)
  Jan  3 15:11:05 fabian-desktop kernel: wlan0: authenticated
  Jan  3 15:11:05 fabian-desktop kernel: wlan0: associate with AP 00:14:bf:37:2e:e2
  Jan  3 15:11:05 fabian-desktop kernel: wlan0: RX AssocResp from 00:14:bf:37:2e:e2 (capab=0x401 status=0 aid=1)
  Jan  3 15:11:05 fabian-desktop kernel: wlan0: associated
  Jan  3 15:11:05 fabian-desktop kernel: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
  Jan  3 15:11:15 fabian-desktop kernel: wlan0: no IPv6 routers present


2.) IP Configuration

> dhclient wlan0

  There is already a pid file /var/run/dhclient.pid with pid 6776
  killed old client process, removed PID file
  Internet Systems Consortium DHCP Client V3.0.5
  Copyright 2004-2006 Internet Systems Consortium.
  All rights reserved.
  For info, please visit http://www.isc.org/sw/dhcp/

  wmaster0: unknown hardware address type 801
  wmaster0: unknown hardware address type 801
  Listening on LPF/wlan0/00:15:af:0c:cb:5b
  Sending on   LPF/wlan0/00:15:af:0c:cb:5b
  Sending on   Socket/fallback
  DHCPREQUEST on wlan0 to 255.255.255.255 port 67
  DHCPACK from 192.168.1.1
  bound to 192.168.1.138 -- renewal in 18069 seconds.


3.) Test with ping:

> ping google.at

  PING google.at (216.239.59.104) 56(84) bytes of data.
  64 bytes from 216.239.59.104: icmp_seq=3 ttl=241 time=51.7 ms
  64 bytes from 216.239.59.104: icmp_seq=5 ttl=241 time=82.6 ms
  64 bytes from 216.239.59.104: icmp_seq=7 ttl=241 time=50.3 ms

  --- google.at ping statistics ---
  7 packets transmitted, 3 received, 57% packet loss, time 29380ms
  rtt min/avg/max/mdev = 50.389/61.612/82.699/14.921 ms


It takes 30 seconds for three pings. SSH etc. won't work at all, since it isn't able to resolve hostnames.
The accesspoint works fine for every other wlan-device in the room, using exactly the same approach.
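
For anyone who wants to track the loss figure over time rather than eyeballing it, here is a minimal sketch that pulls the percentage out of a ping summary line like the one above. The function name `ping_loss` is made up for illustration, and the pattern assumes the "N% packet loss" wording shown in the output above.

```shell
# Hypothetical helper: read ping output on stdin and print the
# packet loss percentage from the summary line (prints nothing if
# no such line is found).
ping_loss() {
    sed -n 's/.* \([0-9.][0-9.]*\)% packet loss.*/\1/p'
}

# Example with the summary line quoted above:
echo "7 packets transmitted, 3 received, 57% packet loss, time 29380ms" | ping_loss
```

In real use it would be fed from ping directly, e.g. `ping -c 10 192.168.1.1 | ping_loss`, where the address is just the gateway from the DHCP output above.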
Comment 4 John W. Linville 2008-01-03 06:55:06 UTC
Which kernel are you using?  Have you tried wireless-2.6#everything?

Have you tried forcing the rate to a slow setting?

   iwconfig wlan0 rate 11M

Do these steps improve the situation?
Comment 5 Fabian Zeindl 2008-01-03 07:35:02 UTC
I use 2.6.23.12.
Setting the rate didn't help, and it had no effect on the output of iwconfig either: Bit Rate= stayed at values like 2 or (sometimes) 11.

I don't know what you mean by wireless-2.6#everything, but I've attached my config.
Comment 6 Fabian Zeindl 2008-01-03 07:35:22 UTC
Created attachment 14270 [details]
config of my kernel
Comment 7 Fabian Zeindl 2008-01-04 07:36:42 UTC
Anything I can do?
Comment 8 Fabian Zeindl 2008-03-29 06:04:45 UTC
What other info do you need?
Comment 9 John W. Linville 2008-03-29 06:46:19 UTC
Unfortunately what we need is more people available to work on bug fixes... :-(
Comment 10 Fabian Zeindl 2008-03-29 06:56:38 UTC
Okay... Thanks for the answer.
Comment 11 Michael Short 2008-09-25 21:20:09 UTC
Any recent news on this issue? I am having this exact same problem with the 2.6.27-4 kernel.

It should be noted here that the driver at http://www.datanorth.net/~cuervo/rtl8187b/ (patched) functions without these problems on the 2.6.24-19 kernel (though I wasn't able to recompile this for the 2.6.27-4 kernel). However, while WEP seems to work fine, there is no WPA support and the signal strength is incorrectly reported. It may be possible to compare the two drivers and discover the underlying issue.
Comment 12 John W. Linville 2008-10-03 07:47:07 UTC
*** Bug 11680 has been marked as a duplicate of this bug. ***
Comment 13 Michael Short 2008-10-03 09:04:33 UTC
I have found an article on ArchWiki which mentions this problem:

http://wiki.archlinux.org/index.php/Rtl8187_wireless#What_to_do_if_your_connection_always_times_out.3F
Comment 14 Michael Short 2008-10-03 10:04:38 UTC
The workaround works!

To stabilize your wireless access you can use:

# iwconfig wlan1 rate 5.5M fixed

But this is far from a solution.
Comment 15 John W. Linville 2008-10-03 10:19:49 UTC
That suggests that rtl8187 is not interacting properly with the rate scaling algorithm, which is probably not a surprise... :-(
Comment 16 Michael Short 2008-10-03 10:52:32 UTC
I am wondering if this problem coincides with a separate problem I have been experiencing on my laptop. I get the following in dmesg after booting up and connecting:

[  424.436030] APIC error on CPU0: 00(40)
[  424.436045] APIC error on CPU1: 00(40)
[  876.772013] APIC error on CPU1: 40(40)
[  876.772028] APIC error on CPU0: 40(40)
[ 2148.093025] APIC error on CPU0: 40(40)
[ 2148.093042] APIC error on CPU1: 40(40)

I've only started getting these after updating to 2.6.27-rc8
Comment 17 Hin-Tak Leung 2008-10-03 11:54:07 UTC
The APIC error message can be disabled with "noapic" in grub conf, it is harmless and unrelated.
Comment 18 Michael Short 2008-10-03 14:15:11 UTC
The workaround worked for a few hours, but then my connection dropped entirely and I was disconnected from the AP. After that I was unable to see any local APs.

Here's the output from dmesg:
[  121.166373] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[  121.170878] wlan1: authenticated
[  121.170885] wlan1: associate with AP 00:0d:67:0a:ab:37
[  121.176878] wlan1: RX AssocResp from 00:0d:67:0a:ab:37 (capab=0x421 status=0 aid=7)
[  121.176883] wlan1: associated
[  121.178152] ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[  131.580039] wlan1: no IPv6 routers present
[  155.072031] wlan1: no IPv6 routers present
[  424.436030] APIC error on CPU0: 00(40)
[  424.436045] APIC error on CPU1: 00(40)
[  876.772013] APIC error on CPU1: 40(40)
[  876.772028] APIC error on CPU0: 40(40)
[ 2148.093025] APIC error on CPU0: 40(40)
[ 2148.093042] APIC error on CPU1: 40(40)
[ 3374.692021] APIC error on CPU0: 40(40)
[ 3374.692035] APIC error on CPU1: 40(40)
[ 3503.736019] APIC error on CPU1: 40(40)
[ 3503.736032] APIC error on CPU0: 40(40)
[ 6002.854986] wlan1: deauthenticated
[ 6003.852059] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6003.868253] wlan1: authenticated
[ 6003.868268] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 6003.884003] wlan1: RX ReassocResp from 00:0d:67:0a:ab:37 (capab=0x421 status=0 aid=13)
[ 6003.884019] wlan1: associated
[ 6009.795258] wlan1: No ProbeResp from current AP 00:0d:67:0a:ab:37 - assume out of range
[ 6012.922160] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6013.290915] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6013.290969] wlan1: authenticated
[ 6013.290976] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 6013.488062] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 6013.490170] wlan1: deauthenticated
[ 6014.740070] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6014.940059] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6014.942578] wlan1: authenticated
[ 6014.942591] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 6014.948598] wlan1: RX ReassocResp from 00:0d:67:0a:ab:37 (capab=0x421 status=0 aid=13)
[ 6014.948609] wlan1: associated
[ 6935.736053] wlan1: No ProbeResp from current AP 00:0d:67:0a:ab:37 - assume out of range
Comment 19 Hauke Mehrtens 2008-10-05 05:12:15 UTC
I have got the same problem with wireless-testing 2008-09-30 and many older versions.

Are you all using an rtl8187 chip built into your motherboard? I have an Asus P5B Deluxe WiFi with this chip onboard.
With the rate control set to auto it stays fixed at 54 Mbit/s and I do not get many packets through the connection, but after setting it to 11M it works and most of the time I get ~1 Mbit/s through the connection. Sometimes it is more than 4 Mbit/s. With all rate settings iwconfig shows me a link quality like this: "Link Quality=16/100  Signal level:65/65"

Roughly every 24 hours of use, the connection drops and I cannot reconnect. Even removing and re-inserting the module does not help; only restarting the computer works.
Comment 20 Erik Hahn 2008-10-05 05:58:50 UTC
(In reply to comment #19)
> Are you all using a rtl8187 chip build on your motherboard? I have got an Asus
> p5B Deluxe Wifi with this chip onboard.
I've got the chip in a USB stick.

> Round about every 24 hours using the device the connection is dropped and I can
> not reconnect. Even reinsmodding the module does not work, only restarting the
> computer works.
Sometimes it works better if you also reinsert mac80211
Comment 21 John W. Linville 2008-10-07 12:36:17 UTC
The rtl8187 driver blindly reports all Tx packets as being ACKed, which falsely encourages the rate scaling algorithms to scale speeds up. :-(

The vendor-supplied driver treats USB Tx failures as non-ACKed frames, and I don't see why we shouldn't as well.  What isn't clear is if this is sufficient data to force a more realistic view of the world onto the rate scaling algorithms. :-)
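
To see why blanket ACK reporting hurts, here is a toy model (not mac80211 code, and not the patch). Everything in it is an assumption made for illustration: the simplified rate list, a link that only carries frames at 36M or below, and a controller that steps up one rate on every reported ACK and down one rate on every reported failure.

```shell
RATES="1 2 5.5 11 18 24 36 48 54"   # simplified rate list in Mb/s, slowest first
NRATES=9                            # number of entries in RATES
MAX_GOOD=7                          # index of 36M: assumed fastest rate the link carries

# simulate MODE FRAMES
#   MODE "blind":  every frame is reported to the controller as ACKed
#   MODE "honest": lost frames are reported as not ACKed
# Prints the final rate and how many frames were really delivered.
simulate() {
    mode=$1
    frames=$2
    idx=1
    delivered=0
    n=0
    while [ "$n" -lt "$frames" ]; do
        n=$((n + 1))
        # The (assumed) channel decides whether the frame gets through.
        if [ "$idx" -le "$MAX_GOOD" ]; then
            ok=1
            delivered=$((delivered + 1))
        else
            ok=0
        fi
        # What the driver tells the rate controller.
        if [ "$mode" = "blind" ]; then
            reported=1
        else
            reported=$ok
        fi
        # Toy controller: one step up on a reported ACK, one step down otherwise.
        if [ "$reported" -eq 1 ]; then
            if [ "$idx" -lt "$NRATES" ]; then idx=$((idx + 1)); fi
        else
            if [ "$idx" -gt 1 ]; then idx=$((idx - 1)); fi
        fi
    done
    echo "$(echo "$RATES" | cut -d' ' -f"$idx")M $delivered/$frames"
}

simulate blind 100
simulate honest 100
```

Under these assumptions the blind run ends up pinned at 54M having delivered only 7 of 100 frames, while the honest run hovers around 36M and delivers 53. Real rate control is far less naive, but the feedback problem is the same.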

Anyone care to try the following patch?
Comment 22 John W. Linville 2008-10-07 12:37:00 UTC
Created attachment 18195 [details]
0001-rtl8187-do-not-report-ACKs-if-USB-Tx-status-is-non.patch

Do not report ACKs if USB Tx fails.
Comment 23 Hin-Tak Leung 2008-10-07 14:22:25 UTC
Hmm, the patch looks "obviously alright" but it seems to break badly, and I don't know why. I grafted tag v2.6.27-rc8 (from wireless-testing) and the patch above onto 2.6.26.5-54.fc9.x86_64 from koji, plus reverting "rtl8187: Fix for TX sequence number problem" (to make it work with the 2.6.26.5-54.fc9 headers). Hmm, perhaps I should ask which tag of wireless-testing corresponds to 2.6.26.5-54.fc9, to save me unpacking the src rpm (or for similar problems in the future).
Comment 24 John W. Linville 2008-10-08 12:24:03 UTC
No idea of what wireless-testing commit corresponds to which Fedora kernel release anymore -- sorry!  The patch is pretty simple, I'm sure you can hack it into place. :-)

Of course, I don't see it getting any non-zero USB Tx status anyway... :-(
Comment 25 Hin-Tak Leung 2008-10-08 14:24:52 UTC
(In reply to comment #24)
Yes, the patch is simple enough, what I don't understand is why it makes my connectivity dramatically worse; and I was wondering about my lazy graft (and subtle mismatch with mac80211, etc).

It is simple enough to just get the src rpm and unpack it to build a matching kernel module - I just feel a little stupid downloading and unpacking the kernel tree when I have most of the content of any kernel src rpm in wireless-testing's git clone :-). I'll give it another go.
Comment 26 Hin-Tak Leung 2008-10-10 13:48:17 UTC
Tried properly to build a matching kernel module with the patch - connectivity failed after a few minutes, while bit rate, link quality and signal level don't change much (link +-1, and the same for signal level).
          Bit Rate=54 Mb/s   Tx-Power=27 dBm   
          Link Quality=45/100  Signal level:50/65

I am afraid the patch seems to be bad, but I cannot see why it is bad...
Comment 27 Michael Short 2008-10-12 11:57:54 UTC
I've written a script that allows me to start/restore a connection in a way that always works. When the connection dies, just run the script again!

#!/bin/bash
sudo iwconfig wlan0 ap off
sudo ifconfig wlan0 down
sleep 2
sudo ifconfig wlan0 up
sudo iwconfig wlan0 rate 11M fixed
sudo iwconfig wlan0 essid "MyNetworkEssid"
sudo iwconfig wlan0 ap auto
sleep 5
sudo dhcpcd wlan0
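
If re-running it by hand gets tedious, the same idea can be wrapped in a small watchdog. This is only a sketch: `watch_link` is a made-up name, and the probe and reconnect commands are passed in as arguments, so nothing here is specific to rtl8187.

```shell
# watch_link PROBE RECONNECT CHECKS INTERVAL
#   PROBE      command that succeeds while the link is up
#   RECONNECT  command to run when the probe fails
#   CHECKS     how many times to probe before returning
#   INTERVAL   seconds to sleep between probes
# Prints the number of times RECONNECT was run.
watch_link() {
    probe=$1
    reconnect=$2
    checks=$3
    interval=$4
    restarts=0
    i=0
    while [ "$i" -lt "$checks" ]; do
        i=$((i + 1))
        # $probe and $reconnect are left unquoted on purpose so that
        # "ping -c 1 host" splits into a command and its arguments.
        if ! $probe >/dev/null 2>&1; then
            $reconnect >/dev/null 2>&1
            restarts=$((restarts + 1))
        fi
        sleep "$interval"
    done
    echo "$restarts"
}

# Dry demonstration with stand-in commands: the probe always fails,
# so the (no-op) reconnect command runs on every check.
watch_link false true 3 0
```

A real invocation might look like `watch_link "ping -c 1 192.168.1.1" ./reconnect.sh 1000 30`, where `reconnect.sh` is the script above saved under a hypothetical name and the address stands in for your gateway.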
Comment 28 Timothee Besset 2008-11-15 13:37:13 UTC
any progress on this?

backlinking to ubuntu's corresponding bug:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/182473/

having regular problems with this driver with 2.6.27 .. should the version be updated (it says 2.6.23 atm)?
Comment 29 Lee Dowling 2008-12-24 18:50:46 UTC
I get this problem with 2.6.27 and 2.6.28, with an inbuilt rtl8187 USB in my Clevo laptop.

Basically connections die silently.  Testing tells me that any static rates up to 36M are fine; anything over that and packets start randomly being lost without any warnings or errors, eventually resulting in disassociation.  You can literally see it with "ping": set the rate past 36M and pings start being heavily lost (<1% success); put it back down and pings return with < 7ms RTT.  Very repeatable. At first I thought the card was broken, but Windows doesn't have similar problems in the exact same environment.

For the moment, I've locked my rate to 36M and it works just fine (OpenVPN running constantly for hours etc.) but if I use anything more or try "Auto" (where it will try to ramp up past 36M and then die), I get these symptoms.
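
For anyone who wants to repeat this test on their own hardware, the procedure can be sketched roughly as below. The interface name, the ping target and the rate list are placeholders, and by default the commands are only printed (dry run); set RUN to an empty string to really execute them, which needs root.

```shell
IFACE=${IFACE:-wlan0}          # placeholder interface name
TARGET=${TARGET:-192.168.1.1}  # placeholder host to ping, e.g. the AP
RUN=${RUN-echo}                # "echo" = dry run; empty = really execute

# sweep_rates RATE...: fix each rate in turn, ping, then restore auto.
sweep_rates() {
    for rate in "$@"; do
        $RUN iwconfig "$IFACE" rate "${rate}M" fixed
        $RUN ping -c 20 "$TARGET"
    done
    $RUN iwconfig "$IFACE" rate auto
}

# Dry run over a few rates around the 36M boundary described above.
sweep_rates 24 36 48 54
```

Comparing the packet loss line of each ping run should show where the link falls over, if it behaves like mine.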

I'm connecting to an AP that has no problems with other hardware (rt2500 or Madwifi cards on other machines), no encryption (I use OpenVPN to secure a WPA/WEP-less wireless network) and doesn't do anything "unusual".

This is obviously quite a serious problem because by default a lot of distros will autodetect this card and then ramp it up past its working rates, so people just assume the card is dead under Linux.
Comment 30 Erik Hahn 2008-12-25 06:38:07 UTC
Reliability seems to depend on what you're doing. After 'iwconfig wlan0 rate 1M' it hardly ever dies when only doing stuff like browsing. Downloads via HTTP/FTP are also pretty reliable, but with BitTorrent it's unusable.
Comment 31 Hin-Tak Leung 2008-12-25 12:42:48 UTC
There are a lot of changes in wireless-testing - can you try compat-wireless (http://linuxwireless.org/en/users/Download) and see if it works better?

And please do not fix a hard rate with compat-wireless. The rate control mechanism should do its job. 
Comment 32 Davide Totaro 2008-12-27 11:51:57 UTC
Just tested the latest compat-wireless (under a 2.6.28 kernel).
Now the bit rate is higher and more stable, but the problems remain: if I open amule, bittorrent, emesene or pidgin, the connection dies. But now reconnecting is possible without rmmod/modprobe of rtl8187.
Comment 33 John W. Linville 2009-03-02 10:00:11 UTC
This bug is ancient, and the description seems no more specific than "rtl8187 has problems sometimes".  Can anyone tell me why this bug should not be closed?  If you are having problems, you should open something with a more useful and specific description.
Comment 34 Hin-Tak Leung 2009-03-02 15:40:40 UTC
Larry did some work recently on queue depth etc., which should improve BitTorrent-type usage.
Comment 35 Fabian Zeindl 2009-03-06 04:23:06 UTC
Hi,

 I reported this bug and I just noticed that my wireless hardware has the same issues under WindowsXP using the official driver. So I think either my hardware is faulty or it simply has some kind of bug.
Comment 36 John W. Linville 2009-03-06 05:40:02 UTC
Alright...on the basis of comment 35 and the fact that this bug has become too polluted to be useful, I'm going to close this one.  Others watching this bug, please open a new bug for whatever specific problems you might experience...thanks!
Comment 37 Timothee Besset 2009-03-06 15:57:07 UTC
*sigh* .. for the people who still use this POS hardware anyway .. ended up buying a different card some months after this was initially opened