Bug 9143
Summary: | rtl8187 very unreliable | | |
---|---|---|---|
Product: | Drivers | Reporter: | Fabian Zeindl (fabian) |
Component: | network-wireless | Assignee: | drivers_network-wireless (drivers_network-wireless) |
Status: | REJECTED INSUFFICIENT_DATA | | |
Severity: | normal | CC: | dvd100, hauke, htl10, jarosser06, kernel, mdshort, ng.ehahn, ttimo |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 2.6.23 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | config of my kernel; 0001-rtl8187-do-not-report-ACKs-if-USB-Tx-status-is-non.patch | | |
Description
Fabian Zeindl
2007-10-11 08:50:47 UTC
Please post the contents of /var/log/messages when your connection drops. It is possible that your AP is disconnecting you due to inactivity. Are you using wpa_supplicant or NetworkManager? They will attempt to reconnect automatically.

I cannot really reproduce the behaviour at the moment. If I recall correctly, there were lines like these in syslog:

Oct 31 10:31:30 desktop kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Oct 31 10:31:32 desktop kernel: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Oct 31 10:31:35 desktop kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Oct 31 10:31:36 desktop kernel: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Oct 31 10:31:37 desktop kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Oct 31 10:31:38 desktop kernel: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

and so on. The problem I have at the moment is that the connection is very slow, although it says 60-70% connection. In particular, everything that has to do with UDP, like DNS and Zeroconf, doesn't work very well. I use NetworkManager.

Anything new here?

This still doesn't work at all. I usually use NetworkManager, but for debugging purposes I shut it down and did everything manually:

1.) Wlan Configuration

> iwconfig wlan0 essid "MySSID"; iwconfig
wlan0     IEEE 802.11g  ESSID:"MySSID"
          Mode:Managed  Frequency:2.432 GHz  Access Point: 00:14:BF:37:2E:E2
          Bit Rate=1 Mb/s
          Retry min limit:7   RTS thr:off   Fragment thr=2346 B
          Link Quality=41/64  Signal level=6/65
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

/var/log/messages says here:

Jan 3 15:11:05 fabian-desktop kernel: wlan0: Initial auth_alg=0
Jan 3 15:11:05 fabian-desktop kernel: wlan0: authenticate with AP 00:14:bf:37:2e:e2
Jan 3 15:11:05 fabian-desktop kernel: wlan0: RX authentication from 00:14:bf:37:2e:e2 (alg=0 transaction=2 status=0)
Jan 3 15:11:05 fabian-desktop kernel: wlan0: authenticated
Jan 3 15:11:05 fabian-desktop kernel: wlan0: associate with AP 00:14:bf:37:2e:e2
Jan 3 15:11:05 fabian-desktop kernel: wlan0: RX AssocResp from 00:14:bf:37:2e:e2 (capab=0x401 status=0 aid=1)
Jan 3 15:11:05 fabian-desktop kernel: wlan0: associated
Jan 3 15:11:05 fabian-desktop kernel: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Jan 3 15:11:15 fabian-desktop kernel: wlan0: no IPv6 routers present

2.) IP Configuration

> dhclient wlan0
There is already a pid file /var/run/dhclient.pid with pid 6776
killed old client process, removed PID file
Internet Systems Consortium DHCP Client V3.0.5
Copyright 2004-2006 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/sw/dhcp/

wmaster0: unknown hardware address type 801
wmaster0: unknown hardware address type 801
Listening on LPF/wlan0/00:15:af:0c:cb:5b
Sending on LPF/wlan0/00:15:af:0c:cb:5b
Sending on Socket/fallback
DHCPREQUEST on wlan0 to 255.255.255.255 port 67
DHCPACK from 192.168.1.1
bound to 192.168.1.138 -- renewal in 18069 seconds.

3.) Test with ping:

> ping google.at
PING google.at (216.239.59.104) 56(84) bytes of data.
64 bytes from 216.239.59.104: icmp_seq=3 ttl=241 time=51.7 ms
64 bytes from 216.239.59.104: icmp_seq=5 ttl=241 time=82.6 ms
64 bytes from 216.239.59.104: icmp_seq=7 ttl=241 time=50.3 ms

--- google.at ping statistics ---
7 packets transmitted, 3 received, 57% packet loss, time 29380ms
rtt min/avg/max/mdev = 50.389/61.612/82.699/14.921 ms

It takes 30 seconds for three pings. SSH etc. won't work at all, since it isn't able to resolve hostnames. The access point works fine for every other WLAN device in the room, using exactly the same approach.

Which kernel are you using? Have you tried wireless-2.6#everything? Have you tried forcing the rate to a slow setting?

iwconfig wlan0 rate 11M

Do these steps improve the situation?

I use 2.6.23.12. Setting the rate didn't help (and it didn't have an impact on the output of iwconfig either; "Bit Rate=" stayed at values like 2 or, sometimes, 11). I don't know what you mean by wireless-2.6#everything, but I am attaching my config.

Created attachment 14270 [details]
config of my kernel
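For anyone who wants to put a number on the loss seen in the ping test above, here is a minimal sketch that recomputes the loss percentage from ping's summary line. The function name `loss_from_summary` is made up for illustration, and it assumes the iputils summary format quoted in the test ("N packets transmitted, M received, ..."); other ping implementations may word the line differently.

```shell
#!/bin/sh
# loss_from_summary (hypothetical helper): read ping output on stdin and
# print the packet loss as an integer percentage, truncated toward zero.
# Assumes a summary line of the form "7 packets transmitted, 3 received, ...".
loss_from_summary() {
    awk '/packets transmitted/ {
        tx = $1   # packets sent
        rx = $4   # packets received
        if (tx > 0) printf "%d\n", (tx - rx) * 100 / tx
    }'
}

# Example (not run automatically):
#   ping -c 7 google.at | loss_from_summary
```

For the run above, 7 sent and 3 received gives (7 - 3) * 100 / 7, i.e. about 57%, which matches what ping itself printed.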
Anything I can do? What other info do you need?

Unfortunately what we need is more people available to work on bug fixes... :-(

Okay... Thanks for the answer.

Any recent news on this issue? I am having this exact same problem with the 2.6.27-4 kernel. It should be noted here that the driver at http://www.datanorth.net/~cuervo/rtl8187b/ (patched) functions without these problems on the 2.6.24-19 kernel (though I wasn't able to recompile it for the 2.6.27-4 kernel). However, while WEP seems to work fine, there is no WPA support and the signal strength is incorrectly reported. It may be possible to compare the two drivers and discover the underlying issue.

*** Bug 11680 has been marked as a duplicate of this bug. ***

I have found an article on ArchWiki which mentions this problem: http://wiki.archlinux.org/index.php/Rtl8187_wireless#What_to_do_if_your_connection_always_times_out.3F

The workaround works! To stabilize your wireless access you can use:

# iwconfig wlan1 rate 5.5M fixed

But this is far from a solution.

That suggests that rtl8187 is not interacting properly with the rate scaling algorithm, which is probably not a surprise... :-(

I am wondering if this problem coincides with a separate problem I have been experiencing on my laptop. I get the following in dmesg after booting up and connecting:

[ 424.436030] APIC error on CPU0: 00(40)
[ 424.436045] APIC error on CPU1: 00(40)
[ 876.772013] APIC error on CPU1: 40(40)
[ 876.772028] APIC error on CPU0: 40(40)
[ 2148.093025] APIC error on CPU0: 40(40)
[ 2148.093042] APIC error on CPU1: 40(40)

I've only started getting these after updating to 2.6.27-rc8.

The APIC error message can be disabled with "noapic" in the grub conf; it is harmless and unrelated.

The workaround worked for a few hours, but then my connection dropped entirely and I was disconnected from the AP. After that I was unable to see any local APs. Here's the output from dmesg:

[ 121.166373] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 121.170878] wlan1: authenticated
[ 121.170885] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 121.176878] wlan1: RX AssocResp from 00:0d:67:0a:ab:37 (capab=0x421 status=0 aid=7)
[ 121.176883] wlan1: associated
[ 121.178152] ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[ 131.580039] wlan1: no IPv6 routers present
[ 155.072031] wlan1: no IPv6 routers present
[ 424.436030] APIC error on CPU0: 00(40)
[ 424.436045] APIC error on CPU1: 00(40)
[ 876.772013] APIC error on CPU1: 40(40)
[ 876.772028] APIC error on CPU0: 40(40)
[ 2148.093025] APIC error on CPU0: 40(40)
[ 2148.093042] APIC error on CPU1: 40(40)
[ 3374.692021] APIC error on CPU0: 40(40)
[ 3374.692035] APIC error on CPU1: 40(40)
[ 3503.736019] APIC error on CPU1: 40(40)
[ 3503.736032] APIC error on CPU0: 40(40)
[ 6002.854986] wlan1: deauthenticated
[ 6003.852059] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6003.868253] wlan1: authenticated
[ 6003.868268] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 6003.884003] wlan1: RX ReassocResp from 00:0d:67:0a:ab:37 (capab=0x421 status=0 aid=13)
[ 6003.884019] wlan1: associated
[ 6009.795258] wlan1: No ProbeResp from current AP 00:0d:67:0a:ab:37 - assume out of range
[ 6012.922160] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6013.290915] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6013.290969] wlan1: authenticated
[ 6013.290976] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 6013.488062] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 6013.490170] wlan1: deauthenticated
[ 6014.740070] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6014.940059] wlan1: authenticate with AP 00:0d:67:0a:ab:37
[ 6014.942578] wlan1: authenticated
[ 6014.942591] wlan1: associate with AP 00:0d:67:0a:ab:37
[ 6014.948598] wlan1: RX ReassocResp from 00:0d:67:0a:ab:37 (capab=0x421 status=0 aid=13)
[ 6014.948609] wlan1: associated
[ 6935.736053] wlan1: No ProbeResp from current AP 00:0d:67:0a:ab:37 - assume out of range

I have got the same problem with wireless-testing 2008-09-30 and many older versions. Are you all using an rtl8187 chip built into your motherboard? I have got an Asus P5B Deluxe Wifi with this chip onboard.

With the rate control set to auto, the rate stays fixed at 54 MBit/s and I do not get many packets through the connection, but when I set it to 11M it works and I get ~1 MBit/s through the connection most of the time. Sometimes it is more than 4 MBit/s.

iwconfig shows me a link quality like this with all rate settings: "Link Quality=16/100 Signal level:65/65"

Roughly every 24 hours of using the device, the connection is dropped and I cannot reconnect. Even removing and reinserting the module does not work; only restarting the computer works.

(In reply to comment #19)
> Are you all using an rtl8187 chip built into your motherboard? I have got an
> Asus P5B Deluxe Wifi with this chip onboard.

I've got the chip in a USB stick.

> Roughly every 24 hours of using the device, the connection is dropped and I
> cannot reconnect. Even removing and reinserting the module does not work;
> only restarting the computer works.

Sometimes it works better if you also reinsert mac80211.

The rtl8187 driver blindly reports all Tx packets as being ACKed, which falsely encourages the rate scaling algorithms to scale speeds up. :-( The vendor-supplied driver treats USB Tx failures as non-ACKed frames, and I don't see why we shouldn't as well. What isn't clear is whether this is sufficient data to force a more realistic view of the world onto the rate scaling algorithms. :-)

Anyone care to try the following patch?

Created attachment 18195 [details]
0001-rtl8187-do-not-report-ACKs-if-USB-Tx-status-is-non.patch
Do not report ACKs if USB Tx fails.
Hmm, the patch looks "obviously alright" but it seems to break badly, and I don't know why. I grafted it onto 2.6.26.5-54.fc9.x86_64 from koji, using tag v2.6.27-rc8 (from wireless-testing) and the patch above, plus reverting "rtl8187: Fix for TX sequence number problem" (to make it work with the 2.6.26.5-54.fc9 headers). Hmm, perhaps I should ask which tag of wireless-testing corresponds to 2.6.26.5-54.fc9, to save me unpacking the src rpm (or for a similar problem in the future).

No idea of what wireless-testing commit corresponds to which Fedora kernel release anymore -- sorry! The patch is pretty simple, I'm sure you can hack it into place. :-) Of course, I don't see it getting any non-zero USB Tx status anyway... :-(

(In reply to comment #24)
Yes, the patch is simple enough; what I don't understand is why it makes my connectivity dramatically worse, and I was wondering about my lazy graft (and a subtle mismatch with mac80211, etc.). It is simple enough to just get the src rpm and unpack it to build a matching kernel module. I just feel a little stupid downloading and unpacking the kernel tree when I have most of the content of any kernel src rpm in wireless-testing's git clone :-). I'll give it a go again.

Tried properly to build a matching kernel module with the patch. Connectivity failed after a few minutes, while bit rate, link quality and signal level didn't change much (link quality and signal level each varied by +-1).

Bit Rate=54 Mb/s Tx-Power=27 dBm
Link Quality=45/100 Signal level:50/65

I am afraid the patch seems to be bad, but I cannot see why it is bad...

I've written a script that allows me to start/restore a connection in a way that always works. When the connection dies, just run the script again!

#!/bin/bash
sudo iwconfig wlan0 ap off
sudo ifconfig wlan0 down
sleep 2
sudo ifconfig wlan0 up
sudo iwconfig wlan0 rate 11M fixed
sudo iwconfig wlan0 essid "MyNetworkEssid"
sudo iwconfig wlan0 ap auto
sleep 5
sudo dhcpcd wlan0

Any progress on this?
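The reconnect script above has to be re-run by hand each time the link dies. A rough sketch of how that could be automated follows; it is only an illustration, not something posted in this bug. The names `probe_gateway`, `check_once`, `PROBE` and `RECOVER`, the `./reconnect.sh` path, and the threshold of three failed probes are all assumptions, and `ping -W` assumes the iputils timeout option.

```shell
#!/bin/sh
# Overridable hooks (hypothetical names): PROBE tests the link, RECOVER
# restores it. RECOVER is assumed to point at a saved copy of the script above.
PROBE=${PROBE:-probe_gateway}
RECOVER=${RECOVER:-./reconnect.sh}

# probe_gateway HOST: succeed if one echo request is answered within 2 s.
probe_gateway() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# check_once HOST FAILS MAX: probe once and print the updated count of
# consecutive failures. When the count reaches MAX, run $RECOVER and reset.
check_once() {
    if "$PROBE" "$1"; then
        echo 0
        return
    fi
    fails=$(( $2 + 1 ))
    if [ "$fails" -ge "$3" ]; then
        "$RECOVER" >&2      # keep recovery output out of the printed count
        fails=0
    fi
    echo "$fails"
}

# Example loop (not run automatically), probing the gateway every 10 s:
#   fails=0
#   while sleep 10; do fails=$(check_once 192.168.1.1 "$fails" 3); done
```

Requiring several consecutive failures before reconnecting is a deliberate choice here: with the packet loss reported in this bug, a single lost ping does not mean the association is gone.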
Backlinking to Ubuntu's corresponding bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/182473/

Having regular problems with this driver with 2.6.27 .. should the version be updated (it says 2.6.23 atm)?

I get this problem with 2.6.27 and 2.6.28, with an inbuilt rtl8187 USB device in my Clevo laptop. Basically, connections die silently. Testing tells me that any static rate up to 36M is fine; anything over that and packets start being lost at random without any warnings or errors, eventually resulting in disassociation. You can literally see it with "ping": set the rate past 36M and pings start being heavily lost (<1% success); put it back down and pings return with < 7ms RTT. Very repeatable. At first I thought the card was broken, but Windows doesn't have similar problems in the exact same environment.

For the moment, I've locked my rate to 36M and it works just fine (OpenVPN running constantly for hours etc.), but if I use anything higher or try "Auto" (where it will try to ramp up past 36M and then die), I get these symptoms. I'm connecting to an AP that has no problems with other hardware (rt2500 or Madwifi cards on other machines), uses no encryption (I use OpenVPN to secure a WPA/WEP-less wireless network) and doesn't do anything "unusual".

This is obviously quite a serious problem, because by default a lot of distros will autodetect this card and then ramp it up past its working rates, so people just assume the card is dead under Linux.

Reliability seems to depend on what you're doing. After 'iwconfig wlan0 rate 1M' it hardly ever dies when only doing stuff like browsing. Downloading via http/ftp is also pretty reliable; with BitTorrent it's unusable.

There are a lot of changes in wireless-testing. Can you try compat-wireless (http://linuxwireless.org/en/users/Download) and see if it works better? And please do not fix a hard rate with compat-wireless. The rate control mechanism should do its job.
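The per-rate ping test described a few comments up (fine up to 36M, heavy loss above) is easy to repeat on other hardware. Below is a minimal sketch of such a sweep. It is a diagnostic only and deliberately pins the rate, so it runs against the advice just given for normal use. The names `sweep_rates`, `set_rate`, `measure_loss`, `SET_RATE`, `MEASURE` and `SETTLE` are made up for illustration, and the loss parsing assumes ping prints a "N% packet loss" summary.

```shell
#!/bin/sh
# Overridable hooks (hypothetical names) so the sweep can be dry-run
# without touching the interface.
SET_RATE=${SET_RATE:-set_rate}
MEASURE=${MEASURE:-measure_loss}

# set_rate IFACE RATE: pin the bit rate, as in the workaround quoted earlier.
set_rate() {
    iwconfig "$1" rate "$2" fixed
}

# measure_loss HOST: send 20 pings and print the loss percentage ping reports.
measure_loss() {
    ping -c 20 "$1" 2>/dev/null |
        sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# sweep_rates IFACE HOST RATE...: print "RATE LOSS%" for each rate in turn.
sweep_rates() {
    iface=$1; host=$2; shift 2
    for rate in "$@"; do
        "$SET_RATE" "$iface" "$rate" || continue   # skip rates the driver rejects
        sleep "${SETTLE:-2}"                       # let the new rate take effect
        echo "$rate $("$MEASURE" "$host")%"
    done
}

# Example (needs root, not run automatically):
#   sweep_rates wlan0 192.168.1.1 11M 24M 36M 48M 54M
```

Pinging the local gateway rather than an internet host keeps upstream loss out of the measurement.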
Just tested the latest compat-wireless (under a 2.6.28 kernel). Now the bit rate is higher and more stable, but the problems remain: if I open amule, BitTorrent, emesene or pidgin, the connection dies. But now reconnecting is possible without rmmod/modprobe of rtl8187.

This bug is ancient, and the description seems no more specific than "rtl8187 has problems sometimes". Can anyone tell me why this bug should not be closed? If you are having problems, you should open something with a more useful and specific description.

Larry did some work recently on queue depth etc., which should improve BitTorrent-type usage.

Hi, I reported this bug and I just noticed that my wireless hardware has the same issues under Windows XP using the official driver. So I think either my hardware is faulty or it simply has some kind of bug.

Alright...on the basis of comment 35 and the fact that this bug has become too polluted to be useful, I'm going to close this one. Others watching this bug, please open a new bug for whatever specific problems you might experience...thanks!

*sigh* .. for the people who still use this POS hardware anyway .. ended up buying a different card some months after this was initially opened