Bug 62781
Summary: | rt2800usb sluggish connection. "TX status timeout for entry" and "Got TX status for an empty queue" errors(?) | ||
---|---|---|---|
Product: | Drivers | Reporter: | Alexander Kaltsas (alexkaltsas) |
Component: | network-wireless | Assignee: | drivers_network-wireless (drivers_network-wireless) |
Status: | CLOSED OBSOLETE | ||
Severity: | high | CC: | alexkaltsas, antoni.silvestre+kernel, bhreach, dan.gebhardt, devzero, dosenfleisch, gcp, git-asdffdsa, giuseppe_stolnicu, gwingerde, haagch.christoph, IvDoorn, jarkko_korpi, jw7779, linville, m4rkusxxl, mctiew, oleg.sklyarov, paky1686, patryk, paul, ravies036, root, sebastien, serj.pilipenko, silvan.calarco, stf_xl, therealpatrobinson, tylergschmidt, webreg, zdzichu |
Priority: | P1 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 3.10.10/3.11.4 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
lspci, ver_linux, cpuinfo, lsusb, acpidump etc.
dmesg output. slow_down_txstatus_polling.patch attachment-24043-0.html still with 3.14rc5 3.13-dmesg |
Created attachment 110541 [details]
dmesg output.
The same here with two different Ralink USB NICs. This is the one I use most often:

ID 148f:3572 Ralink Technology, Corp. RT3572 Wireless Adapter

The dmesg is filled with tons of:

[ 6233.593094] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 9 in queue 2
[ 6233.932602] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping

I am experiencing this poor performance on Arch Linux x86_64 with many kernel versions, and for certain with 3.10.10 and 3.11.1.

Created attachment 111061 [details]
slow_down_txstatus_polling.patch
Does the patch make the problem go away, or at least decrease the number of messages?
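To answer that question with numbers rather than impressions, the warnings can be counted before and after applying the patch. The following is only a minimal sketch: the two patterns are copied from the messages quoted in this report, while the function name `count_warnings` and the `SAMPLE` text are illustrative assumptions, not part of the driver or of any tool.

```python
import re
from collections import Counter

# Patterns taken from the warning text quoted in this bug report.
TIMEOUT_RE = re.compile(r"TX status timeout for entry (\d+) in queue (\d+)")
EMPTY_RE = re.compile(r"Got TX status for an empty queue (\d+)")


def count_warnings(dmesg_text):
    """Count rt2800usb warnings, keyed by (kind, queue number)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        m = TIMEOUT_RE.search(line)
        if m:
            counts[("timeout", int(m.group(2)))] += 1
            continue
        m = EMPTY_RE.search(line)
        if m:
            counts[("empty_queue", int(m.group(1)))] += 1
    return counts


# Hypothetical input: two lines copied from the report above.
SAMPLE = """\
[ 6233.593094] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 9 in queue 2
[ 6233.932602] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping
"""

if __name__ == "__main__":
    for key, n in sorted(count_warnings(SAMPLE).items()):
        print(key, n)
```

Feeding it the saved output of dmesg from an unpatched and a patched run over a comparable uptime and traffic load would give two sets of counts to compare.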
I applied the patch and transferred large amounts of data (from various distances, -60 to -80 dBm). I only got 4 warnings and no noticeable speed reduction.

http://pastebin.com/raw.php?i=7Xj0mZ1F

[ 3348.177582] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[ 3630.534189] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 0 in queue 2
[ 3630.534221] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 1 in queue 2
[ 3630.534250] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 2 in queue 2
[ 3630.585582] ieee80211 phy1: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping

I have quite a few RT3072 Ralink based USB adapters. Until recently, they have been fantastic. They pick up weak signals, lock on, don't lose the connection and give excellent throughput. All my machines run 24/7 with typical uptimes measured in months. I have them running on a box with the 3.6.11-gentoo kernel, no problem, and on a box running an older 3.2-sabayon kernel (Sabayon is a Gentoo derivative), no problem. No problem means no timeouts or warnings are being listed in the messages file and they are stable. I am very pleased with these devices; they never drop the connection to the AP and give excellent throughput.

I also have them running on 2 machines with the 3.10.7-gentoo-r1 kernel, where I get a massive number of lines in /var/log/messages on both machines. On one machine it does not seem to affect stability or throughput; this machine uses the new Predictable Network Interface Names feature. On the other machine, I lose the network occasionally. I started with the new Predictable Network Interface Names feature and would lose the connection several times a day. I switched back to the old naming convention and the connection usually will not drop out more than once a day and can sometimes be stable for several days, but I still get a massive amount of messages.
I found this bug and it seems to be the problem I am having, but the patch does not solve the problem for me. The patch has not been applied to any kernel AFAICT. I upgraded the kernel, on the box that drops the network connection, to 3.10.17-gentoo; it did not fix the problem. I manually applied the patch posted and it did not solve the problem (it may have reduced the number of entries in the message log, but there is still a massive number and it did not solve the network dropout problem).

The cause of the network drop is that the adapter is not being assigned an IP address. With all the warnings and timeouts, the adapter must get reset, but for some reason it has no IP address. The IP address is manually assigned (config_wlan0="192.168.1.50/24" in /etc/conf.d/net). If you run iwconfig it looks normal, i.e., the adapter is connected to the AP, but ifconfig or ip addr show reveals it has no IP address. If I restart the adapter manually (/etc/init.d/net.wlan0 restart), the IP address gets assigned and the network starts working again.

The driver has not been revised in a couple of years, so it seems clear some changes in the recent kernels are causing this problem. I would like to get this problem resolved because these adapters worked extremely well until recently.

The first step would be to narrow the regression bound to two consecutive kernel versions (i.e. 3.n and 3.n+1).

(In reply to Stanislaw Gruszka from comment #6)
> First step would be narrow regression bound to two consecutive kernel
> versions (i.e. 3.n and 3.n+1).

I tested the 3.9.11-gentoo-r1, problem still exists. I know the 3.6 kernel does not have the problem, so I wanted to test the 3.7, 3.8 and 3.11 kernels, but Gentoo does not have them. I found that Mint 15 has 3.8, antiX 13.2 has 3.7, Ubuntu 13.04 has 3.8 and Ubuntu 13.10 has 3.11. I tried Mint 15, Ubuntu 13.04, antiX 13.2 and Ubuntu 13.10 using a live DVD/CD, and only Ubuntu 13.10 seemed to have the problem.
In summary, 3.2, 3.6, 3.7, 3.8 don't seem to have the problem but 3.9, 3.10 and 3.11 do. The problem being massive entries in the messages log file from rt2800usb (warnings, timeouts, etc).

I am glad I found this on lkml.org. I have had this issue for a long time and I have been unsure if I should report it. I have been using Linux Mint 15, Kubuntu 13.04 and now Kubuntu 13.10. I don't remember which distros had the issue, but at least Kubuntu 13.10 has it with 3.12.0-031200rc7-generic. I installed some of those distros and with the default 3.8.x.x kernel I don't see any errors, but when I upgrade the kernel I start to get them. I have had many kernels installed: 3.8, 3.9, 3.10, 3.12. If I am not mistaken, 3.8 has been the only one not showing the issue.

[13123.305805] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 12 in queue 2
[13123.305824] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 13 in queue 2
[13123.305826] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 14 in queue 2

[ 8.824096] ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[ 8.852255] ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 0005 detected
[ 8.905328] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[ 8.905574] usbcore: registered new interface driver rt2800usb

In 3.9 there is my change that makes the "TX status timeout for entry" messages print by default. Previously those messages were only printed if the kernel was compiled with the CONFIG_RT2X00_DEBUG=y option.

The "TX status timeout" message means that the HW did not give us the status of a transmitted frame. This may or may not be a problem. If those messages are printed frequently, that probably means we are not talking to the HW correctly. If it happens only randomly, i.e. the AP does not ACK a frame from time to time, this should not be a problem.
Anyway, I can move them back to DEBUG level if they flood dmesg without any actual issue (on performance or connection stability).

(In reply to bhreach from comment #5)
> On the other machine, I lose the network occasionally.

Does this problem also start to happen between 3.8 and 3.9? Please test vanilla kernels, not distribution ones, as those are patched. Since you are using Gentoo, you should have no problem building a vanilla kernel from kernel.org source :-)

I do lose the connection sometimes; it just drops. I thought the reason was an overheating mobile-based connection (the hardware I mentioned connects to a router which has a USB stick with a mobile card). Taking out the USB stick that has the mobile card and reconnecting it a bit later resumes the connection (the stick is very hot sometimes).

[ 8.824096] ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[ 8.852255] ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 0005 detected

The package that contained the USB wifi adapter reads: Telewell Tw-lan 802.11g/n. On the package it says that it can transfer up to 300 Mbps, but the highest I have seen is 150 Mbps. Is it working like it should?

I just found something. The package that came with this wifi card has no mention of Linux support, so I never even tried the CD that comes along. But now that I found this problem, I went to the Telewell support page and found this:

https://www.telewell.fi/fi/tuote/wlanbluetooth-tuotteet/WLANUSBV2/tw-wlan-802-11gn-usb-v-2

There is a Linux driver in source format!

(In reply to Stanislaw Gruszka from comment #9)
> > On the other machine, I lose the network occasionally.
> Does this problem also start to happen between 3.8 and 3.9 ? Please test
> vanilla kernels, not distribution ones as those are patched. Since you are
> using gentoo, you should have no problem to build vanilla kernel from
> kernel.org source :-)

I got the vanilla source for 3.8.13 and 3.9.11. I am testing 3.8.13 now. It could take a while.
I went back to using the Predictable Network Interface Names because I seemed to get more dropouts with them. Also, recall the dropouts are caused by the adapter having no IP address; it is connected to the AP. I have noticed that it takes a while for the IP address to get assigned. If I restart the adapter and quickly run ifconfig, it has no IP address, but if I wait a bit and run ifconfig again the address is there. I am talking about a statically assigned address, not DHCP.

Got my 1st dropout already. I forgot the main reason I switched away from the Predictable Network Interface Names. Here are the lines in dmesg:

[ 3489.710726] usb 1-5: USB disconnect, device number 3
[ 3491.723372] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0xffffffff].
[ 3491.723536] wlp0s18f2u5: deauthenticating from 00:26:62:57:28:5a by local choice (reason=3)
[ 3492.522957] usb 1-5: new high-speed USB device number 7 using ehci-pci
[ 3492.602921] hub 1-0:1.0: unable to enumerate USB device on port 5
[ 3493.102626] usb 4-5: new full-speed USB device number 2 using ohci_hcd
[ 3493.278071] usb 4-5: not running at top speed; connect to a high speed hub
[ 3493.301061] usb 4-5: New USB device found, idVendor=148f, idProduct=3072
[ 3493.301063] usb 4-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 3493.301065] usb 4-5: Product: 802.11 n WLAN
[ 3493.301066] usb 4-5: Manufacturer: Ralink
[ 3493.301067] usb 4-5: SerialNumber: 1.0
[ 3493.452396] usb 4-5: reset full-speed USB device number 2 using ohci_hcd
[ 3493.882820] ieee80211 phy2: Selected rate control algorithm 'minstrel_ht'
[ 3493.942220] systemd-udevd[21540]: renamed network interface wlan0 to wlp0s18f0u5

It changes the interface name from wlp0s18f2u5 to wlp0s18f0u5, so the network cannot work. It is not supposed to do that? Also, Gentoo does not use systemd by default yet; it is still using OpenRC. I'll go back to using the wlan naming to avoid this problem.
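The dmesg excerpt above shows the adapter dropping off bus 1-5 (ehci-pci, high-speed) and reappearing on 4-5 (ohci_hcd, full-speed). A small script can pull these transitions out of a long log. This is a rough sketch only: the two regular expressions are modelled on the lines quoted in this comment, and the function name `usb_events` and the `SAMPLE` text are made up for illustration.

```python
import re

# Both patterns follow the usbcore message wording seen in the log above.
DISCONNECT_RE = re.compile(r"usb (\S+): USB disconnect, device number (\d+)")
ATTACH_RE = re.compile(
    r"usb (\S+): new (\w+)-speed USB device number (\d+) using (\S+)"
)


def usb_events(dmesg_text):
    """Return (event, port, speed, host_driver) tuples in log order."""
    events = []
    for line in dmesg_text.splitlines():
        m = DISCONNECT_RE.search(line)
        if m:
            events.append(("disconnect", m.group(1), None, None))
            continue
        m = ATTACH_RE.search(line)
        if m:
            events.append(("attach", m.group(1), m.group(2), m.group(4)))
    return events


# Hypothetical input: four lines copied from the excerpt above.
SAMPLE = """\
[ 3489.710726] usb 1-5: USB disconnect, device number 3
[ 3492.522957] usb 1-5: new high-speed USB device number 7 using ehci-pci
[ 3492.602921] hub 1-0:1.0: unable to enumerate USB device on port 5
[ 3493.102626] usb 4-5: new full-speed USB device number 2 using ohci_hcd
"""

if __name__ == "__main__":
    for event in usb_events(SAMPLE):
        print(event)
```

An attach on a different port path or at a lower speed right after a disconnect is the pattern that also changes the predictable interface name.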
(In reply to bhreach from comment #13)
> [ 3489.710726] usb 1-5: USB disconnect, device number 3
> [ 3491.723372] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy
> [0xffffffff].
> [ 3491.723536] wlp0s18f2u5: deauthenticating from 00:26:62:57:28:5a by local
> choice (reason=3)
> [ 3492.522957] usb 1-5: new high-speed USB device number 7 using ehci-pci
> [ 3492.602921] hub 1-0:1.0: unable to enumerate USB device on port 5
> [ 3493.102626] usb 4-5: new full-speed USB device number 2 using ohci_hcd
> [ 3493.278071] usb 4-5: not running at top speed; connect to a high speed hub
> [ 3493.301061] usb 4-5: New USB device found, idVendor=148f, idProduct=3072
> [ 3493.301063] usb 4-5: New USB device strings: Mfr=1, Product=2,
> SerialNumber=3
> [ 3493.301065] usb 4-5: Product: 802.11 n WLAN
> [ 3493.301066] usb 4-5: Manufacturer: Ralink
> [ 3493.301067] usb 4-5: SerialNumber: 1.0
> [ 3493.452396] usb 4-5: reset full-speed USB device number 2 using ohci_hcd

This looks like the device was physically disconnected from one USB port (ehci-pci) and put into another USB port (ohci_hcd). What actually happened?

(In reply to bhreach from comment #13)
> It changes the interface name from wlp0s18f2u5 to wlp0s18f0u5 so the network
> cannot work. It is not supposed to do that?

That depends on your user-space scripts. I think you should disable the Predictable Network Interface Names option; it clearly does not work as expected. As long as you have no more than one Ethernet and one WiFi card, good old names like eth0 and wlan0 should work well for you ...

(In reply to Stanislaw Gruszka from comment #14)
> This looks like the device was physically disconnected from one USB port
> (ehci-pci) and put into another USB port (ohci_hcd), what actually happen?

Nothing happened. The device was not physically touched; it is still connected to the same USB port.

I switched back to the wlan naming. I have 2 wifi adapters; they are named wlan0 and wlan1. I got my 1st dropout.
[ 901.487909] usb 1-5: USB disconnect, device number 3
[ 903.555222] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0xffffffff].
[ 904.184898] usb 1-5: new high-speed USB device number 7 using ehci-pci
[ 904.264901] hub 1-0:1.0: unable to enumerate USB device on port 5
[ 904.764575] usb 4-5: new full-speed USB device number 2 using ohci_hcd
[ 904.939752] usb 4-5: not running at top speed; connect to a high speed hub
[ 904.962742] usb 4-5: New USB device found, idVendor=148f, idProduct=3072
[ 904.962745] usb 4-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 904.962746] usb 4-5: Product: 802.11 n WLAN
[ 904.962748] usb 4-5: Manufacturer: Ralink
[ 904.962749] usb 4-5: SerialNumber: 1.0
[ 905.114352] usb 4-5: reset full-speed USB device number 2 using ohci_hcd
[ 905.544466] ieee80211 phy2: Selected rate control algorithm 'minstrel_ht'
[ 908.572739] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 913.509864] wlan0: authenticate with 00:26:62:57:28:5a
[ 914.155580] wlan0: send auth to 00:26:62:57:28:5a (try 1/3)
[ 914.157611] wlan0: authenticated
[ 914.157625] rt2800usb 4-5:1.0 wlan0: disabling HT/VHT due to WEP/TKIP use
[ 914.159279] wlan0: associate with 00:26:62:57:28:5a (try 1/3)
[ 914.165612] wlan0: RX AssocResp from 00:26:62:57:28:5a (capab=0x431 status=0 aid=11)
[ 914.272543] wlan0: associated
[ 914.272560] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

Same problem, but the adapter name doesn't change. I am using IPv4 and it references IPv6; I don't know if that matters. There is no IP address assigned to wlan0, but it is associated with the AP. Network does not work. I restarted wlan0.
This was added to dmesg:

[ 2240.923612] wlan0: deauthenticating from 00:26:62:57:28:5a by local choice (reason=3)
[ 2246.047907] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 2258.869907] wlan0: authenticate with 00:26:62:57:28:5a
[ 2259.474364] wlan0: send auth to 00:26:62:57:28:5a (try 1/3)
[ 2259.476367] wlan0: authenticated
[ 2259.476381] rt2800usb 4-5:1.0 wlan0: disabling HT/VHT due to WEP/TKIP use
[ 2259.479297] wlan0: associate with 00:26:62:57:28:5a (try 1/3)
[ 2259.482371] wlan0: RX AssocResp from 00:26:62:57:28:5a (capab=0x431 status=0 aid=11)
[ 2259.589300] wlan0: associated
[ 2259.589316] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

The interface has an IP address and the network is operating normally.

(In reply to bhreach from comment #16)
> [ 901.487909] usb 1-5: USB disconnect, device number 3

This clearly indicates a disconnection. If the device was not physically removed from the port, this could be a H/W problem, a bug in the rt2x00 firmware, or a bug in the USB host driver.

Let's try to disable autosuspend by adding the kernel boot parameter below:

usbcore.autosuspend=-1

Does it prevent the issue from happening?

I don't know if this helps at all. I was playing TF2 and suddenly the connection just dropped. I have experienced that in Dota 2 too, and I thought it was a Dota issue because it happens every now and then. But usually I am able to use the net for quite a long time before issues start to happen. The connection didn't seem to reconnect, so I unplugged the wifi USB stick and the connection resumed quite fast.
[ 7935.722965] usb 2-5: USB disconnect, device number 3
[ 7937.375753] cfg80211: Calling CRDA to update world regulatory domain
[ 7937.431742] ieee80211 phy0: rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0xffffffff]
[ 7937.612141] cfg80211: World regulatory domain updated:
[ 7937.612145] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[ 7937.612147] cfg80211: (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 7937.612148] cfg80211: (2457000 KHz - 2482000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 7937.612149] cfg80211: (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
[ 7937.612150] cfg80211: (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 7937.612151] cfg80211: (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[ 7967.556815] usb 2-5: new high-speed USB device number 4 using ehci-pci
[ 7967.710466] usb 2-5: New USB device found, idVendor=148f, idProduct=3070
[ 7967.710469] usb 2-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 7967.710471] usb 2-5: Product: 802.11 n WLAN
[ 7967.710473] usb 2-5: Manufacturer: Ralink
[ 7967.710474] usb 2-5: SerialNumber: 1.0
[ 7967.825170] usb 2-5: reset high-speed USB device number 4 using ehci-pci
[ 7967.967896] ieee80211 phy1: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected
[ 7967.996056] ieee80211 phy1: rt2x00_set_rf: Info - RF chipset 0005 detected
[ 7967.996429] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[ 7968.179082] ieee80211 phy1: rt2x00lib_request_firmware: Info - Loading firmware file 'rt2870.bin'
[ 7968.185058] ieee80211 phy1: rt2x00lib_request_firmware: Info - Firmware detected - version: 0.29
[ 7968.706099] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 7970.961256] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 7972.071199] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 10 in queue 0
[ 7972.071207] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 10 in queue 0
[ 7972.071210] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 10 in queue 0
[ 7972.432197] wlan0: authenticate with MAC-ADDRESS
[ 7972.499265] wlan0: send auth to MAC-ADDRESS (try 1/3)
[ 7972.500874] wlan0: authenticated
[ 7972.503512] wlan0: associate with MAC-ADDRESS (try 1/3)
[ 7972.506893] wlan0: RX AssocResp from MAC-ADDRESS (capab=0x421 status=0 aid=1)
[ 7972.513625] wlan0: associated
[ 7972.513650] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

I just edited out the MAC address.

Using 13.2, which had a change in this driver, and it's a lot better now. Not so many annoying messages on the console. And I think I haven't had the disconnecting issue either. Speed has also been good. So far looking good.

(In reply to Stanislaw Gruszka from comment #17)
> (In reply to bhreach from comment #16)
> > [ 901.487909] usb 1-5: USB disconnect, device number 3
>
> This clearly indicate disconnection. If device was not removed from the port
> physically, this could be some H/W problem or bug in rt2x00 firmware or bug
> in USB host driver.
>
> Let's try to disable autosuspend by adding bellow kernel boot parameter:
>
> usbcore.autosuspend=-1
>
> Does it prevent issue to happen ?

usbcore.autosuspend=-1

With that as a kernel parameter, I am still getting massive amounts of messages, but no network dropouts yet (up for 11 hrs and 30 mins). It has run as long as several days with no dropouts, so we have to wait longer to be sure the dropouts are gone.

Without usbcore.autosuspend=-1:

I tried the adapter on Windows Vista and it works without problems. I tried 2 other USB adapters, 2 different brands with the same RT3072 chipset, and get the same messages and network dropouts. I tried disabling USB 3.0 on my motherboard and in the Linux kernel, both separately and together; it has no effect on the problem.
Before posting to this bug, I posted a message in the Gentoo forums and another member said he has an RT3072 chipset adapter, PCI, using the rt2800pci module, and he has no problems (i.e., no messages and no network dropouts), so it appears to be a USB issue.

Well, I have had those dropouts and those error messages. Since using the 3.12.2 kernel those messages are rarer. I posted earlier the driver that is meant for this chip. It's too complicated for me to study, but could someone look at it and see whether it is based on the Linux kernel or not? It could have clues on how to program these chips. I am downloading Steam games now and it feels like the connection is faster too. The 3.8 series kernel felt fast too.

I have the same problem on amd64. Transfer is slow, but working. And a lot of these messages. (I use that system as an AP with hostapd.)

Something else I noticed (perhaps it's related?):

strace -T ifconfig net_wlan down
> ...
> ioctl(4, SIOCGIFFLAGS, {ifr_name="net_wlan",
> ifr_flags=IFF_UP|IFF_BROADCAST|IFF_MULTICAST}) = 0 <0.000008>
> ioctl(4, SIOCSIFFLAGS, {ifr_name="net_wlan",
> ifr_flags=IFF_BROADCAST|IFF_MULTICAST}) = 0 <4.098692>
> ...

strace -T ifconfig net_wlan up
> ...
> ioctl(4, SIOCGIFFLAGS, {ifr_name="net_wlan",
> ifr_flags=IFF_BROADCAST|IFF_MULTICAST}) = 0 <0.000023>
> ioctl(4, SIOCSIFFLAGS, {ifr_name="net_wlan",
> ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0 <61.200737>
> ...

Note the long execution times! While these commands are running, the wired network does not respond either! (The setup at boot is fast.)
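The trailing `<seconds>` field that strace -T appends to each call makes it easy to pick out the slow ones automatically. Below is a minimal sketch, assuming the output has been saved with each syscall on a single line (unwrapped, without the "> " quoting used above); the function name `slow_calls`, the 1-second threshold and the `SAMPLE` text are illustrative choices.

```python
import re

# Matches "name(args) = result <seconds>" as printed by strace -T.
STRACE_T_RE = re.compile(r"^(\w+)\(.*\)\s+=\s+.*<(\d+\.\d+)>$")


def slow_calls(strace_text, threshold=1.0):
    """Return (syscall, seconds) for calls taking at least `threshold` seconds."""
    result = []
    for line in strace_text.splitlines():
        m = STRACE_T_RE.match(line.strip())
        if m and float(m.group(2)) >= threshold:
            result.append((m.group(1), float(m.group(2))))
    return result


# Hypothetical input: the four ioctl lines above, joined back to one line each.
SAMPLE = """\
ioctl(4, SIOCGIFFLAGS, {ifr_name="net_wlan", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_MULTICAST}) = 0 <0.000008>
ioctl(4, SIOCSIFFLAGS, {ifr_name="net_wlan", ifr_flags=IFF_BROADCAST|IFF_MULTICAST}) = 0 <4.098692>
ioctl(4, SIOCGIFFLAGS, {ifr_name="net_wlan", ifr_flags=IFF_BROADCAST|IFF_MULTICAST}) = 0 <0.000023>
ioctl(4, SIOCSIFFLAGS, {ifr_name="net_wlan", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0 <61.200737>
"""

if __name__ == "__main__":
    for name, seconds in slow_calls(SAMPLE):
        print(f"{name}: {seconds:.3f} s")
```

On the sample, only the two SIOCSIFFLAGS calls (interface down and interface up) exceed the threshold, which matches the observation in the comment.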
FYI, I see the rt2800usb_entry_txstatus_timeout warning on ARM (Raspberry Pi), kernel 3.10.21:

[ 7.680091] ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected
[ 7.710935] ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 5370 detected
[ 7.795971] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[ 7.843251] usbcore: registered new interface driver rt2800usb
[ 7.992990] ieee80211 phy0: rt2x00lib_request_firmware: Info - Loading firmware file 'rt2870.bin'
[ 7.995912] ieee80211 phy0: rt2x00lib_request_firmware: Info - Firmware detected - version: 0.29

No USB disconnects. After several hours of nothing, rt2800 will just start spewing out warnings.

In addition to my comment #22: on a fresh boot, no messages appear and an acceptable speed is reached. Everything looks fine. After some time (haven't found out what triggers it yet) the warnings start and the speed drops by more than a factor of 20. Even unloading all modules (rt2800usb rt2800lib rt2x00usb rt2x00lib mac80211 cfg80211) and unplugging the stick does not recover it!

Does bringing up the device (once it has slowed down, the messages were printed and you shut down the device) take very long for others as well, or am I facing a different bug?

I tracked it down to "usb_control_msg" in "rt2x00usb_vendor_request", which takes about 50 ms per call. (With over 1200 calls while starting the device, it accumulates to roughly 60 s.)

That would actually point to a problem in the USB subsystem? (The problem is, wireless works fine after a reboot... and takes a little more than one day to "break". So it takes ages to bisect or something :-/ )

(In reply to Markus from comment #24)
> After some time (havent found out what triggers it, yet) the warnings start
> and the speed drops by more than factor 20.

I am not getting that behavior. It appears to me that the more traffic on the device, the sooner the warnings start. I get no slowdown in performance; the network goes down because the device has no IP address assigned.
If I restart the device, it gets assigned an IP address and the network starts working again (that can take a few seconds). Sometimes the network goes down very soon after the warnings start and sometimes it takes several days.

I switched to kernel 3.12.3 yesterday. I noticed a much, much smaller number of warnings. It has been running 10-1/2 hours and the network has not gone down yet. It will have to run for at least several days to see if I lose the network.

There was a patch for this device, I don't remember whether it was in 3.12.2 or 3.12.3, which made those messages appear less frequently. It's in the kernel changelog. I think I have a boost in speed because of this. I feel this device drops less now. And those messages seem to come when the device has been used for a long time or under heavy traffic.

Could someone look at the Telewell drivers I was referring to? This device was sold to me as a Telewell product. The kernel recognizes it as Ralink.

(In reply to Markus from comment #25)
> I tracked it down to "usb_control_msg" in "rt2x00usb_vendor_request" which
> takes about 50ms each call. (With over 1200 calls during starting the device
> it accumulates to round about 60s.)
>
> That would actually point to a problem in the usb subsystem?

Yes, please report that to the USB maintainers. You can also try the "usbcore.autosuspend=-1" boot option as a workaround.

(In reply to Stanislaw Gruszka from comment #28)
> You can also try
> "usbcore.autosuspend=-1" boot option as workaround.

I tried that and it does not seem to have any effect on the problem (i.e., I still get a massive number of warnings and the network still drops out). In addition, after a couple of days, my USB mouse and keyboard locked up and I had to reboot.

I am using kernel 3.12.3 now and the changes you made show an improvement in the number of warnings. It has been running for 33-1/2 hours, total data throughput is 20 GB (Tx + Rx), no network dropout yet.
It has gone as long as 4-5 days without a network dropout, so I need several more days of testing to know for sure if the network dropout problem is improved or completely gone.

I upgraded the kernel to 3.12.2 or 3.12.3, cloned the linuxtv daily git and installed the linuxtv modules; the 1st boot wasn't successful. I didn't have a working net. Not sure about mouse and keyboard, I don't remember anymore. But the problem went away basically just by rebooting. I just wanted to say this because an earlier message above mentioned losing the mouse and keyboard. I have both plugged in via USB, and they are wireless.

(In reply to Stanislaw Gruszka from comment #28)
> (In reply to Markus from comment #25)
> > I tracked it down to "usb_control_msg" in "rt2x00usb_vendor_request" which
> > takes about 50ms each call. (With over 1200 calls during starting the
> device
> > it accumulates to round about 60s.)
> >
> > That would actually point to a problem in the usb subsystem?
>
> Yes, please report that to usb maintainers. You can also try
> "usbcore.autosuspend=-1" boot option as workaround.

USB_SUSPEND is disabled in the kernel. I split my issue off into bug #66841, as I seem to be facing a different bug.

I have the same problem on Udoo (ARM) with the latest next kernel 3.13-20131210 and with your latest patch. I have the same problem with the option usbcore.autosuspend=-1.

After running for 4.5 days, with 60+ GB of data throughput, the network dropped out. Same problem: the adapter has no IP address assigned. I also looked at dmesg and the massive amount of messages is back. I am going back to testing kernels; the last one tested was 3.8, still got dropouts.

I installed kernel 3.12.4 from the Ubuntu "Trusty" repo (on my "saucy" install). Log file errors diminished greatly from the 3.11 kernel. However, the connection speed was no better, with 800 kbps down.
USB is indeed a huge suspect, because in the past I have had problems with USB flash drive transfer speed (which has never been fixed in over 3 years of Ubuntu/kernel upgrades). Perhaps the issues are related. Not sure where to go next... ndiswrapper around the Windows drivers? Try to get the old Linux RealTek drivers to work with kernels 3+? Give up and use another connection method... I'll keep poking, though!

Thanks,
Dan

It is possible that random disconnections are caused by PS (Power Save). It can be disabled by:

iw dev wlan9 set power_save off

(In reply to Stanislaw Gruszka from comment #36)
> It is possible that random disconnections are caused by PS (Power Save). It
> can be disabled by:
>
> iw dev wlan9 set power_save off

iw dev wlan0 get power_save

This shows Power Save is off on my devices, so that is not my problem. Off is likely the default setting because I did not change the setting.

This is odd. I haven't changed any settings as far as I know, and I used the command above to check power saving and I have it on. What's the command for switching it off or on?

(In reply to Jarkko K from comment #39)
> What's the command for switching it off or on?

iw dev wlan9 set power_save off/on

I think you made a small typo there. It should be wlan0.

Anyway, I got lots of those warnings in dmesg. Do those warnings point to specific code? Is it so hard to track down what has changed in these drivers from 3.8 to the current state?

It's hard for me to believe that there are lots of changes in the driver-side code.

(In reply to Jarkko K from comment #42)
> Anyway I got lots of those warnings on dmesg. Does those warnings hint to
> specific code? Is it so hard to track down what has changed on these drivers
> 3.8 --> current state.
>
> Its hard for me to believe that there are lots of changes on driver side
> code.

You can draw your own conclusions, but someone with an RT3072 PCI based device using the rt2800pci driver has no warnings or network dropouts.
That leads me to believe something changed in the USB code that is causing the problem. Fixing the problem is way beyond my level of expertise. Either there is a bug in the changed USB code, or the driver's USB interface needs to be modified to accommodate the changes in the USB code.

I'm still running the 3.12.3 kernel. It ran for 4.5 days before a network dropout; I reset the device and it has been running for almost 6 days without a dropout. Throughput is 95 GB (Rx + Tx). The changes in the 3.12.3 kernel seem to have improved the situation; however, I am still getting a massive number of warning messages.

Can anyone else provide their throughput (rate, not total data amount)? Even with the newest kernel, mine is ~1 Mbps. I'm suspicious that this is due to a USB problem, perhaps with the device stuck in "full speed" mode instead of "high speed". I'll need to check when I'm at home.

How do we track changes to certain drivers between certain kernels? I am not a developer, just a user. But I admit that after someone made the change in the kernel so that it prints timeouts less frequently, wifi has worked better. But I remember that when I used the 3.8 series kernel I had good speed and I didn't see dmesg errors.

What kind of connection speed should I get?

On the package it says it can transfer up to 300 Mbps and this is 802.11g/n.

But I have never seen a speed higher than 150 in network manager.

(In reply to Jarkko K from comment #45)
> Whatkind of connection speed should I get?
>
> On the package it says It can transfer up to 300Mbps and this is 802.11g/n.
>
> But I have never seen higher than 150 speed on network manager.

There are a number of issues when it comes to speed.

1. ISPs only guarantee their maximum speed on a wired connection (not Wi-Fi).
2. Wireless speeds are very variable: max b = 11 Mb/s, max g = 54 Mb/s, max n = 150 Mb/s.
3. Typically, your router will run at the protocol of the slowest device connected to it.
4. The protocol you connect to your router with depends on several factors (proximity to the router, intervening walls, the Tx power of your device, the sensitivity of your device, Tx power of the router, the sensitivity of the router, antenna on the device, antenna on the router, the weather, other devices connected to the router, your device's capabilities, your router's capabilities, etc.).
5. Say, for example, you connect with the g protocol; that means the max 54 Mb/s will be shared among all devices connected to the router (i.e., the sum of the speeds of all devices has to be less than 54 Mb/s).
6. The iwconfig Bit Rate is NOT the speed at which you are transferring data; it is the maximum theoretical speed you could transfer data at and, even more annoying, some devices report the wrong number.
7. I use a program called wavemon to monitor wireless adapters. It is very useful; it gives realtime statistics about the device. I watch the signal level while moving the adapter to position it at the optimum point.
8. I use icewm for my desktop and it comes with a network monitor in the lower right-hand corner of the screen. Hover the mouse over it and it gives the current network transfer rate plus other statistics.

Here is a link to a Wikipedia article explaining the various 802.11 protocols:
http://en.wikipedia.org/wiki/IEEE_802.11

Here is a link about how to get 300 Mb/s on the n protocol:
http://compnetworking.about.com/od/wireless/f/80211n-300-mbps.htm

Tried to activate 300 Mbps mode but no luck. tp-link tl-mr3420.

Anyway, back to the problem. Does anyone know where I can track kernel changes? Well, I might find that myself...

How do you compare git changes? This is one way, but there has to be a better way.

http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/log/?h=linux-3.8.y&qt=grep&q=ralink

That gives some changes done to the driver in the 3.8 series.
3.9 series: http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/log/?h=linux-3.9.y&qt=grep&q=ralink
etc.

I might have found something. My PC has been up for a long time, but it has run like that for several months before and I have used it as usual. As someone mentioned earlier, it's possible to reach 300 Mbps with combined channels, which I tried. I changed the router's channel to 7 and the transfer mode to 40 MHz. Both were on auto before. I also changed some settings on the PC side, but I wasn't able to get better transfer speeds. Now the channel is still 7 and the mode 40 MHz... and after a reboot I checked dmesg and I got lots of errors. I doubt I have ever had this many before. So is it possible that the channel number or frequency (or whatever that 20/40/auto means) is causing this? Could someone else try?

You must test the throughput by actually transferring a known amount of data and measuring the time it takes. If your internet connection is fast, simply use a speedtest website. Otherwise, transfer a big file to another computer and time it. I can connect at up to 44 Mbps on my 802.11g, but that's not what I can actually achieve (1 Mbps...). My laptop, sitting next to it, can actually transfer much faster, ~20 Mbps. The problem is with the drivers or kernel, not the wifi connection speed (at least for me).

I changed the mode and the channel back to auto and I still get timeout errors, but less frequently than with the other settings. Is this normal?

Sorry for spamming this thread. I am cleaning my room and I found a guide that I didn't even remember having. It is for my card. It says, in English, that there are 2 different power-save modes:
CAM = constantly awake mode, keeps the wireless radio active even when not transferring data
PSM = power saving mode, switches the radio off when not transferring data
This card also has WMM, which prioritizes traffic such as video and streaming. It has 4 settings, and they are all power-saving modes.
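For anyone following the "transfer a known amount of data and time it" advice above, here is a minimal sketch of the arithmetic only. It assumes you note the file size in bytes and the elapsed seconds yourself; the helper name `mbps` is made up for illustration and it does not run any transfer.

```shell
#!/bin/sh
# mbps BYTES SECONDS
# Hypothetical helper: converts a transferred byte count and an elapsed
# time in seconds into megabits per second (1 Mbit = 10^6 bits).
mbps() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN {
        # refuse a zero or negative duration instead of dividing by it
        if (secs <= 0) exit 1
        printf "%.2f\n", bytes * 8 / secs / 1000000
    }'
}

# Example: an 87,000,000 byte file that took 580 seconds to copy.
mbps 87000000 580
```

Note that the result is in megabits per second, so it can be compared directly with the "Bit Rate" shown by iwconfig; a rate shown by a copy tool in MB/s or KB/s has to be multiplied by 8 first.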
I've got the following setup here:
2 virtual machines (VMware Workstation)
each machine connected to the other over ad-hoc Wi-Fi via a Ralink 3070 USB dongle
each machine also connected to the other over Ethernet via a virtual NIC

running up-to-date Arch Linux: Linux kernel 3.12.6-1, with the patches already included in this release

SCP from machine A to machine B (and vice versa) of an 87 MB .tgz file:
- via Ethernet => 4,6MBps ie 46Mbps (realistic on this virtual hardware)
- via ad-hoc => 150KBps ie 1Mbps

I still have the errors in dmesg:
[ 3261.087395] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping
[ 3261.588072] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 14 in queue 2
[ 3261.725591] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 12 in queue 2
[ 3261.728182] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 12 in queue 2
[ 3261.728270] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 12 in queue 2
[ 3261.754154] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 15 in queue 2
[ 3261.756265] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 15 in queue 2
[ 3261.756306] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 15 in queue 2
[ 3261.756364] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 0 in queue 2
[ 3261.988875] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 12 in queue 2
[ 3261.990123] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 12 in queue 2
[ 3261.990186] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 12 in queue 2
[ 3262.291448] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX
status for an empty queue 2, dropping [ 3262.465027] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping [ 3262.467897] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 6 in queue 2 [ 3262.527013] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping [ 3262.553347] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping [ 3262.647024] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping [ 3262.705471] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping [ 3263.652141] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping do you need additional information to go deeper into the analysis ? (In reply to Sebastien Bonnaire from comment #53) > I've got the following setup here : > 2 virtual machines (vmware workstation) > each machine connected to eachother with Adhoc Wifi via a Ralink 3070 USB > dongle > each machine connected also to eachother with ethernet via a virtual NIC > > running achlinux up to date : linux kernel 3.12.6-1 with the patchs already > included in this release > > SCP from machine A to machine B (and vice versa) of a 87MB .tgz file: > - via ethernet => 4,6MBps ie 46Mbps (realistic on this virtual hardware) > - via Adhoc => 150KBps ie 1Mbps I have several RT3072 devices. Don't use scp to transfer files, it is encrypted which makes it very slow. We also cannot directly compare your performance to mine because I am not using encryption for the test. I just ran an Internet performance test that transferred 34.8 MBytes at a rate of 4.83 Mbits/sec. The computer is wirelessly connected to a router which in turn is hard wire connected to the internet. 
Here are the revelant lines from iwconfig: Mode:Managed Bit Rate=36 Mb/s Tx-Power=20 dBm Retry limit:30 RTS thr=2347 B Fragment thr=2346 B Power Management:off Link Quality=37/70 Signal level=-73 dBm Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0 Tx excessive retries:95323 Invalid misc:458 Missed beacon:0 You can see I optimized performance by setting the Retry limit, RTS thr, and Fragment thr to values I found online. Power management is off. You can also see that some packets are getting dropped, likely due to weak signal and radio interference (BTW, the numbers are so high because the device has not been reset for many days). (In reply to bhreach from comment #54) > (In reply to Sebastien Bonnaire from comment #53) > > I've got the following setup here : > > 2 virtual machines (vmware workstation) > > each machine connected to eachother with Adhoc Wifi via a Ralink 3070 USB > > dongle > > each machine connected also to eachother with ethernet via a virtual NIC > > > > running achlinux up to date : linux kernel 3.12.6-1 with the patchs already > > included in this release > > > > SCP from machine A to machine B (and vice versa) of a 87MB .tgz file: > > - via ethernet => 4,6MBps ie 46Mbps (realistic on this virtual hardware) > > - via Adhoc => 150KBps ie 1Mbps > > I have several RT3072 devices. Don't use scp to transfer files, it is > encrypted which makes it very slow. We also cannot directly compare your > performance to mine because I am not using encryption for the test. > > I just ran an Internet performance test that transferred 34.8 MBytes at a > rate of 4.83 Mbits/sec. The computer is wirelessly connected to a router > which in turn is hard wire connected to the internet. 
>
> Here are the relevant lines from iwconfig:
>
> Mode:Managed
> Bit Rate=36 Mb/s Tx-Power=20 dBm
> Retry limit:30 RTS thr=2347 B Fragment thr=2346 B
> Power Management:off
> Link Quality=37/70 Signal level=-73 dBm
> Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
> Tx excessive retries:95323 Invalid misc:458 Missed beacon:0
>
> You can see I optimized performance by setting the Retry limit, RTS thr, and
> Fragment thr to values I found online. Power management is off.
>
> You can also see that some packets are getting dropped, likely due to weak
> signal and radio interference (BTW, the numbers are so high because the
> device has not been reset for many days).

We have a few differences:
- you are using an RT3072, I'm using an RT3070
- you are using managed mode, I'm using ad-hoc
- you are transferring files with no encryption, I'm using scp
- you have particular RTS and frag values

So I've changed my configuration a little to bring it closer to yours:
- I'm still using an RT3070 USB wifi dongle
- I've switched to Managed mode
- I'm still using SCP (definitely not the cause of a 1 Mbps transfer rate, as the same hardware is able to transfer the same file over the Ethernet link at more than 45 Mbps)
- I've set your RTS and Frag values in my configuration

=> this makes no change in the measured throughput
=> I still have the same error messages in dmesg:
[ 1872.023950] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 2 in queue 2
[ 1872.024289] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 3 in queue 2
[ 1872.648786] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 9 in queue 2
[ 1872.695247] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping
[ 1872.697111] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping

iwconfig output is:
wls34u1 IEEE 802.11bgn ESSID:"RPIWifiTest-WPA"
Mode:Managed Frequency:2.412 GHz Access Point: 00:25:22:40:71:C0
Bit Rate=6.5 Mb/s Tx-Power=20 dBm
Retry long limit:7 RTS thr=2347 B Fragment thr=2346 B
Encryption key:off
Power Management:off
Link Quality=70/70 Signal level=-17 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:1535 Invalid misc:1565 Missed beacon:0

I can observe the following regarding the SCP file transfer:
- I'm using the command: watch -d -n1 "iwconfig wls34u1;dmesg | tail -10"
- during the file transfer, the transfer often stalls for a few seconds, until a "Bit Rate" renegotiation (it goes from 6.5 Mbps to 13 Mbps); then the file transfer speeds up (difficult to say how much, let's say to around 5 Mbps) and slows down again 2 seconds later

I can observe the following regarding a simple ping:
- I'm using the command: ping -i 0.01 192.168.12.1 -s1000
- this is my result: around 25% of packets lost
--- 192.168.12.1 ping statistics ---
1960 packets transmitted, 1459 received, 25% packet loss, time 28325ms
rtt min/avg/max/mdev = 4.098/93.715/494.104/77.857 ms, pipe 22

(In reply to Sebastien Bonnaire from comment #55)
>iwconfig output is :
>wls34u1 IEEE 802.11bgn ESSID:"RPIWifiTest-WPA"
> Mode:Managed Frequency:2.412 GHz Access Point: 00:25:22:40:71:C0
> Bit Rate=6.5 Mb/s Tx-Power=20 dBm
> Retry long limit:7
RTS thr=2347 B Fragment thr=2346 B
>
>I can observe the following regarding a simple ping:
>- i'm using the command "ping -i 0.01 192.168.12.1 -s1000"
>- this is my result = around 25% of packets lost
>--- 192.168.12.1 ping statistics ---
>1960 packets transmitted, 1459 received, 25% packet loss, time 28325ms
>rtt min/avg/max/mdev = 4.098/93.715/494.104/77.857 ms, pipe 22

Your retry limit is 7, mine is 30. In my tests that parameter made the biggest difference for transfer speed.

Your Bit Rate is 6.5 Mbps, mine is 36 Mbps. Something is causing you to connect at a very slow rate. Is there an AP nearby running on the same or on an adjacent channel (the channels overlap)?

Here are the results of my ping command; both computers are using RT3072 wireless devices to connect to my router. 192.168.89.50 is using kernel 3.12.3; the originating box is using kernel 3.2.0. The kernel 3.12.3 box gets warnings; it has been up for 24+ days and the network has dropped out only twice. That is much better than with some earlier kernels. The 3.2.0 kernel box was rebooted 2 days ago, but before that it was up for 146 days, the network never dropped out, and it gets no warnings.

su -c 'ping -i 0.01 192.168.89.50 -s 1000 -c 2000'
--- 192.168.89.50 ping statistics ---
2000 packets transmitted, 1994 received, +23 duplicates, 0% packet loss, time 22736ms
rtt min/avg/max/mdev = 1.446/9.843/261.273/21.884 ms, pipe 14

(In reply to bhreach from comment #56)
> Your retry limit is 7, mine is 30. In my tests that parameter made the
> biggest difference for transfer speed.

OK, I now have the same setting here. No big change in the ping results (24% packet loss).

> Your Bit Rate is 6.5 Mbps, mine is 36 Mbps. Something is causing you to
> connect at a very slow rate. Is there an AP nearby running on the same or on
> an adjacent channel (the channels overlap).

When there is traffic, the negotiated rate goes up to 54 Mbps (802.11g). I've got another AP on the same channel at home.
That's the clearest channel in the neighborhood. By the way, I can easily achieve 20 Mbps with no packet loss under Windows 7, using the same RT3070 on the same channel, in the same environment.

> --- 192.168.89.50 ping statistics ---
> 2000 packets transmitted, 1994 received, +23 duplicates, 0% packet loss,
> time 22736ms
> rtt min/avg/max/mdev = 1.446/9.843/261.273/21.884 ms, pipe 14

Happy to see that your configuration gives you a clean connection between your machines. Do you think that I'm now facing a bug related only to RT3070 USB chips?

(In reply to Sebastien Bonnaire from comment #57)
> Do you think that i'm now facing a bug only related to RT3070 USB chips ?

That is a definite possibility, since with the same environment and device Windows gets much better performance.

Another trick you can try that can help with flaky drivers is to force the device to connect at a fixed rate, like this:

iwconfig wls34u1 rate 11M fixed

If that helps, you can experiment with different rates to see what works best.

(In reply to bhreach from comment #58)
> Another trick you can try that can help with flakey drivers is force the
> device to connect at a fixed rate like this:
>
> iwconfig wls34u1 rate 11M fixed
>
> If that helps, you can experiment with different rates to see what works
> best.

Bad luck: setting the rate statically doesn't help. The transfer rate is still below 1 Mbps, with the same messages in dmesg...

I don't think this is a bug solely in the RT3070 USB chipset. I also have this problem and run the RT3072 (Etekcity High Power 802.11 B/N/G 300M USB Wireless 1000mw Wifi Network Adapter). I can't run the same tests as you at the moment, but my symptoms are similar. Updated kernel versions did not help me.
-Dan

I have that same problem on:
Bus 001 Device 004: ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter
-Patryk

In lieu of fixing the kernel rt2800usb drivers (or whatever is the root cause here), I'm thinking of trying to get the RealTek drivers.
It seems this is not a new problem. In the past, the Realtek drivers could be used, but they don't seem to work with recent kernels. This may require some hacking and patching... and time...

Oops... I mean "MediaTek" here, not "RealTek".

(In reply to Dan G from comment #60)
> I don't think this is a bug in solely the RT3070 USB chipset. I also have
> this problem and run the RT3072 (Etekcity High Power 802.11 B/N/G 300M USB
> Wireless 1000mw Wifi Network Adapter). I can't run the same tests as you at
> the moment, but my symptoms are similar. Updated kernel versions did not
> help me.
> -Dan

Let me summarize:
1. AFAICT everybody using the rt2800usb module with a recent kernel (3.9 and later) is getting a massive number of warnings in the kernel log.
2. Other symptoms vary and depend on kernel version.
3. My personal experience is with RT3072 devices (including several of the above-mentioned Etekcity devices). The most extreme case is on one box with the 3.10.17 kernel and later. I would lose my network frequently; the longest it would last was 4-5 days, but it would frequently drop out at least once a day. Using Predictable Network Interface Names makes the problem much worse: dropouts occur numerous times per day. I have the same 3.10.17 kernel running on another box with an RT3072 adapter using Predictable Network Interface Names, and it gets the warnings in the kernel log but the network never drops out and data throughput is good. It appears the problem is USB related. In an online support forum, a user with an RT3072 PCI device using the rt2800pci module with a recent kernel gets no warnings and has no problems with the device. I have upgraded the problem box to a 3.12.3 kernel and, while the warnings remain, the network drops out much less frequently (only twice in 25+ days of uptime). Also, considering that I had the same kernel running on 2 boxes with one having network dropouts and the other not, hardware may be playing a role.
Different motherboards are going to have slightly different USB implementations.
4. The most recent discussion about the RT3070 device is about a slightly different problem. The user with that device is reporting a very slow data transfer rate. Running the same device under Windows 7 in the same hardware environment yields 20 times the data throughput, which implies some kind of problem with the rt2800usb module and the RT3070 device.

Do you want me to open a new bug for point 4 (only)? That is, don't you think point 1 and point 4 may be related to each other?

Sebastien and bhreach: The original post for this bug relates to point 1, which may or may not be the cause of both points 3 and 4. My problem is point 4, and that's with the RT3072. I don't get dropouts. My complaints have been solely throughput related. If we can track down the cause of these log messages and fix the problem, perhaps that will solve both issues. I know the latest kernels have a patch but, if I understood it correctly, it simply "hides" the problem by giving a longer timeout.

We don't have enough information to know if all the reported problems are being caused by the same bug. I have systems running kernel 3.2.0 and kernel 3.6.11 with no problems with the rt2800usb driver (using RT3072-based adapters). The 3.9 kernel is when the warnings started to show, because logging those warnings became the default. Before that you had to be in some kind of 'debugging mode' to see those warnings. What we need to do is test older kernels until we discover when the problem started. Then we can tell the kernel developers that kernel 3.x.y does not have the problem but 3.x.y+1 does. That will point them to the changes that are causing the problem.

After a work-around for bug #66841 I still saw these messages. (They are less frequent and the speed does not degrade as much.) I moved back to 802.11g only. (Got a speed of ~2.4 MBytes/s via wget.)
But I still see these messages when another client is using the same WLAN. (Maybe a different WLAN on the same or a nearby channel is causing the same problem?)

I found these. Check if you can benefit from them:
https://bugzilla.redhat.com/show_bug.cgi?id=913631
http://archlinuxarm.org/forum/viewtopic.php?f=31&t=6598
https://dev.openwrt.org/ticket/13523

[ 7141.824196] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 7141.831034] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping
I have never seen those in my dmesg before. Could they give any clue?

Hi,
Alfa Networks released a new driver (v2.6.1.3 - 2013/11/27) 2 months ago for their rt3070 chip USB wifi adapter (ge-rt3070 / awus036nh).
Direct download link: http://sourceforge.net/projects/alfanetwork/files/Driver/3070_LinuxSTA_2.6.1.3_20121022.tar.bz2/download

I tested it on my Arch Linux with the 3.12.8 kernel and it builds and inserts without any errors (well, except some cp commands to a /tftpboot dir that I commented out). I also placed 2870 first on the chipset line in the makefile, since it builds only the first one.

For now I can only scan the waves, because it has a semi-documented .dat settings file and I don't know what to modify (I tried the stuff in the readme but without success).

Anyway, the scan seemed promising: my router was shown with a signal power of -1 by the new driver, while the old rt2800usb shows it at -11. Fingers crossed, maybe it works...

Can you guys please check it out and see if it works?

Created attachment 123671 [details]
attachment-24043-0.html

Hi,
To OS: I've tested the driver you pointed to (from Alfa Networks); it is the same as the one available on the Ralink (MediaTek) website. Even though this module loads correctly, I was not able to make it work with an ad-hoc network, nor with hostapd...
To bhreach: I've compiled a few kernels to identify when the performance problem appears when using the default rt2800usb module:
- 3.6.11 is working at 8 Mbps (limited by my DSL line for my tests)
- 3.7.1 is the first kernel with low performance (1 Mbps) and a lot of packet loss
Both kernels are from 17th December 2012...

My test environment is:
- I have a regular DSL line, giving up to 10 Mbps when I'm lucky
- my VMware Workstation machine (my laptop) connected to the internet over a wired cable
- Arch Linux (i386) on VMware using an RT3070 USB key
- my Windows phone using speedtest.net over the internet

I do not know how to go deeper. Can anyone give me advice?
(In reply to Sebastien Bonnaire from comment #72)
> To bhreach : I've compiled a few kernels to identify when the performance
> problem appears when using the default rt2800usb module :
> - 3.6.11 is working at 8Mbps (limited by my DSL line for my tests)
> - 3.7.1 is the first kernel with low performance (1Mbps) and a lot of
> packet loss
...
> I do not know how to go deeper. Can anyone give me advices ?

You could try to bisect between 3.6 and 3.7 as described here:
https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html

You have to know how to configure, build and install a kernel. That has to be done at each step of the bisection. It would perhaps be faster to just revert the rt2x00 driver changes between 3.6 and 3.7 to find the bad commit. At each step you only have to build the modules, install them and reload the rt2800usb driver, which is faster than rebuilding the whole kernel. A short tutorial:

# clone the tree
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux

# switch to the v3.7 version
git checkout -b b3.7 v3.7

# copy the old config and set it up for the new kernel build
cp /boot/config-VERSION .config
make oldconfig

# build the kernel and install it
make
sudo make modules_install
sudo make install

# restart the machine, boot with the 3.7 kernel

# list the rt2x00 commits, and revert some of them (e.g. 1/3)
git log --oneline v3.6..HEAD -- drivers/net/wireless/rt2x00/
> bf7e1ab rt2800: validate step value for temperature compensation
> 761ce8c rt2x00: usb: fix reset resume
> cf193f6 rt2x00/rt3352: Fix lnagain assignment to use register 66.
...
git revert bf7e1ab
git revert 761ce8c
git revert cf193f6
...
# then build the modules and install them
make modules
sudo make modules_install

# and reload the driver and see whether it works better or not
sudo modprobe -r rt2800usb
sudo modprobe rt2800usb

# Note: with some .config option (CONFIG_LOCALVERSION_AUTO I think), if you change the vanilla sources the version string will get an additional "+". If this happens, just install the kernel and boot into it; further compilation will not change the version (the "+" just marks that there are local changes to the kernel).

# if the problem persists, revert further; if not, roll back reverted commits (one at a time) with:
git reset --hard HEAD~1

This way you can find the first broken commit, if the problem was caused by an rt2x00 driver change. If you revert all rt2x00 changes between 3.6 and 3.7 and the problem still persists, that means it is caused by other kernel changes, most likely in mac80211. In that case a full bisection will be needed.

If you find reverting the rt2x00 commits hard, I can prepare some patches for that, e.g. 3 patches reverting 1/3, 2/3 and all of the changes. That would help narrow down the issue: e.g. if you find that 1/3 is broken but 2/3 is fine, I will provide further patches to find the bad commit between 1/3 and 2/3. But if you can do the same using git, that is preferred.

(In reply to Sebastien Bonnaire from comment #72)
> - 3.6.11 is working at 8Mbps (limited by my DSL line for my tests)
> - 3.7.1 is the first kernel with low performance (1Mbps) and a lot of packet
> loss
> both kernel are from 17th december 2012...

Actually, you should first check whether 3.6 -> 3.7 is the proper regression window. The same release date does not mean that both -stable kernels include the same fixes, i.e. 3.6.11 can include more fixes than 3.7.1. 3.x.y versions are -stable releases, which include various fixes. In general, please search for the regression in major 3.x releases, i.e. 3.13, 3.12, ..., 3.7, 3.6. Once you have found that 3.7 is the first broken release, please check its latest stable release, i.e. 3.7.7.
If that version has poor performance too, then 3.7 really is the first release with the new bug; otherwise the performance drop in 3.7 was caused by some other bug, which is already known and fixed.

Created attachment 128631 [details]
still with 3.14rc5

Since there was no activity for a while, just adding not much new with an RT5370:
Bus 001 Device 006: ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter

I'm using 3.14-rc5 and see most of the problems described here:
1. All input devices got killed at one point and the network didn't work at all, so I don't know what was up with that; I had to reboot. No logs, sorry.
2. A lot of those messages in dmesg, attached.
3. Unstable connection that gets lost very quickly. It could be that the sender/receiver is just very bad, because it's a dirt cheap one, but it happens at "signal quality" levels way over 40%. ("TTT-Mall Mini 150M USB WiFi Wireless LAN 802.11 n / g / b-Adapter mit Antenne", if any developer is in Germany/UK/Europe(?) and wants to buy one of those really cheap ones from Amazon.)
4. After plugging it in, when NetworkManager tries to use it, there is often a "link not ready" message. I haven't seen that with any other USB wifi adapter.
5. New issue: an occasional null dereference kernel panic. Sorry, I have only the very beginning of it in that dmesg. Maybe later I can work on getting the full log, but it only happens some of the time after plugging that thing in, right while/after it connects to the wifi.

Anything else we can do other than bisecting the kernel? Maybe asking MediaTek directly? (Hey, don't laugh!) Their latest driver is from October 2012, it seems: http://www.mediatek.com/en/downloads/rt8070-rt3070-rt3370-rt3572-rt5370-rt5372-rt5572-usb-usb/ (the form accepts fake data), but it says it's for 2.6 and it doesn't compile anyway (»int« and »kuid_t« incompatibility somewhere).
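To put a number on "a lot of those messages" when comparing kernels, something like the following could be used. It is only a sketch: it assumes the two warning texts appear exactly as in the logs quoted in this report, and the function name `count_rt2800_warnings` is made up for illustration.

```shell
#!/bin/sh
# count_rt2800_warnings: reads kernel log text on stdin and prints how many
# lines contain each of the two rt2800usb warnings quoted in this report.
count_rt2800_warnings() {
    awk '
        /rt2800usb_entry_txstatus_timeout: Warning - TX status timeout/ { timeouts++ }
        /rt2800usb_txdone: Warning - Got TX status for an empty queue/  { empty++ }
        END { printf "timeouts=%d empty_queue=%d\n", timeouts, empty }
    '
}

# Example with two lines taken from this report:
printf '%s\n' \
    '[ 6233.593094] ieee80211 phy0: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 9 in queue 2' \
    '[ 6233.932602] ieee80211 phy0: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping' \
    | count_rt2800_warnings
```

On a live system the input would come from `dmesg | count_rt2800_warnings` (reading the kernel log may need root on some systems). Running it after the same test transfer on each kernel gives two comparable counts instead of an impression.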
Christoph, your problems with the device are clearly different from those reported here earlier.

First of all, you have issues with USB:

[ 691.518258] usb 1-1: new high-speed USB device number 7 using xhci_hcd
[ 691.702296] usb 1-1: config 1 interface 0 altsetting 0 has 7 endpoint descriptors, different from the interface descriptor's value: 5
[ 691.868852] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
[ 691.868910] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 7.
[ 692.028897] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
[ 692.028958] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 7.
[ 692.189059] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
[ 692.213754] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88074ce52040

This could be a problem with the xHCI driver or with the RT5370 firmware. It is worth checking whether updating the firmware helps. You can download it from:
http://rt2x00.serialmonkey.com/pipermail/users_rt2x00.serialmonkey.com/2013-January/005610.html
and replace the file in /lib/firmware/

The second problem is random disconnects:

[ 1777.560067] wlp0s20u1: authenticate with 54:e6:fc:af:47:36
[ 1777.580536] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 1/3)
[ 1777.616609] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 2/3)
[ 1777.645496] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 3/3)
[ 1777.682743] wlp0s20u1: authentication with 54:e6:fc:af:47:36 timed out
[ 1779.341611] wlp0s20u1: authenticate with 54:e6:fc:af:47:36

I have this problem on an RT5390 adapter (PCI version of RT5370) on 3.14-rc.
This is regression caused by some recent commits: commit c8520bcb784df69cf5960308846253814ec45db7 Author: Kevin Lo <kevlo@kevlo.org> Date: Thu Oct 24 13:24:08 2013 +0800 rt2x00: rt2800lib: update RF registers for RT5390 commit eac40d9631a7db43570df859fa8a9922e9623607 Author: Kevin Lo <kevlo@kevlo.org> Date: Mon Oct 21 15:38:31 2013 +0800 rt2x00: rt2800lib: Update BBP register initialization for RT53xx On my case reverting them make issue gone. You can try to revert them too or just try to use 3.13. (In reply to Stanislaw Gruszka from comment #76) > Christoph, your problems with the device are clearly different than reported > here earlier Well, I googled because the log was flooded with these timeout messages about the queues and perhaps low bandwidth (not really sure yet) and found this. But okay, I didn't get "failed to flush" anymore, just the other messages. I have no idea if this is normal, but it seems excessive and this bug report is not closed yet... so... Maybe I wouldn't even have posted, but comment #29 sounded exactly like something I have also seen. > First of all you have issues with USB: > > [ 691.518258] usb 1-1: new high-speed USB device number 7 using xhci_hcd > [ 691.702296] usb 1-1: config 1 interface 0 altsetting 0 has 7 endpoint > descriptors, different from the interface descriptor's value: 5 > [ 691.868852] usb 1-1: reset high-speed USB device number 7 using xhci_hcd > [ 691.868910] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for > slot 7. > [ 692.028897] usb 1-1: reset high-speed USB device number 7 using xhci_hcd > [ 692.028958] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for > slot 7. > [ 692.189059] usb 1-1: reset high-speed USB device number 7 using xhci_hcd > [ 692.213754] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with > disabled ep ffff88074ce52040 > > This could be problem of xHCI driver of RT5370 firmware. It worth to check > if updating F/W helps. 
You can download it from: > http://rt2x00.serialmonkey.com/pipermail/users_rt2x00.serialmonkey.com/2013- > January/005610.html > and replace file in /lib/firmware/ Thanks, but no. [ 5541.684756] usb 1-1: new high-speed USB device number 4 using xhci_hcd [ 5541.868807] usb 1-1: config 1 interface 0 altsetting 0 has 7 endpoint descriptors, different from the interface descriptor's value: 5 [ 5542.035245] usb 1-1: reset high-speed USB device number 4 using xhci_hcd [ 5542.035293] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 3. [ 5542.195270] usb 1-1: reset high-speed USB device number 4 using xhci_hcd [ 5542.195310] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 3. [ 5542.355442] usb 1-1: reset high-speed USB device number 4 using xhci_hcd [ 5542.379962] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800ca8bd440 [ 5542.379970] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800ca8bd400 [ 5542.379973] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800ca8bd480 [ 5542.379976] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800ca8bd4c0 [ 5542.379979] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff8800ca8bd500 [ 5542.380245] ieee80211 phy3: rt2x00_set_rt: Info - RT chipset 5390, rev 0502 detected [ 5542.391079] ieee80211 phy3: rt2x00_set_rf: Info - RF chipset 5370 detected [ 5542.391394] ieee80211 phy3: Selected rate control algorithm 'minstrel_ht' [ 5542.446371] systemd-udevd[6088]: renamed network interface wlan0 to wlp0s20u1 [ 5542.453392] ieee80211 phy3: rt2x00lib_request_firmware: Info - Loading firmware file 'rt2870.bin' [ 5542.453474] ieee80211 phy3: rt2x00lib_request_firmware: Info - Firmware detected - version: 0.33 [ 5542.591653] IPv6: ADDRCONF(NETDEV_UP): wlp0s20u1: link is not ready [ 5544.915746] wlp0s20u1: authenticate with Also, it is now broken and doesn't work at all ("unable 
to enumerate USB device"). I still have others so I'll see whether the firmware caused this, but I'm pretty sure it just broke because it was so cheap. Maybe later I can open a new bug for it if so desired.

> Second problem are random disconnects:
>
> [ 1777.560067] wlp0s20u1: authenticate with 54:e6:fc:af:47:36
> [ 1777.580536] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 1/3)
> [ 1777.616609] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 2/3)
> [ 1777.645496] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 3/3)
> [ 1777.682743] wlp0s20u1: authentication with 54:e6:fc:af:47:36 timed out
> [ 1779.341611] wlp0s20u1: authenticate with 54:e6:fc:af:47:36
>
> I have this problem on RT5390 adapter (PCI version of RT5370) on 3.14-rc.
> This is regression caused by some recent commits:
>
> commit c8520bcb784df69cf5960308846253814ec45db7
> Author: Kevin Lo <kevlo@kevlo.org>
> Date: Thu Oct 24 13:24:08 2013 +0800
>
> rt2x00: rt2800lib: update RF registers for RT5390
>
> commit eac40d9631a7db43570df859fa8a9922e9623607
> Author: Kevin Lo <kevlo@kevlo.org>
> Date: Mon Oct 21 15:38:31 2013 +0800
>
> rt2x00: rt2800lib: Update BBP register initialization for RT53xx
>
> On my case reverting them make issue gone. You can try to revert them too or
> just try to use 3.13.

Okay, so it's a known issue. Thanks.

(In reply to Stanislaw Gruszka from comment #76)
> Christoph, your problems with the device are clearly different than reported
> here earlier
>
> First of all you have issues with USB:
>
> [ 691.518258] usb 1-1: new high-speed USB device number 7 using xhci_hcd
> [ 691.702296] usb 1-1: config 1 interface 0 altsetting 0 has 7 endpoint
> descriptors, different from the interface descriptor's value: 5
> [ 691.868852] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
> [ 691.868910] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for
> slot 7.
> [ 692.028897] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
> [ 692.028958] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for
> slot 7.
> [ 692.189059] usb 1-1: reset high-speed USB device number 7 using xhci_hcd
> [ 692.213754] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with
> disabled ep ffff88074ce52040
>
> This could be problem of xHCI driver of RT5370 firmware. It worth to check
> if updating F/W helps. You can download it from:
> http://rt2x00.serialmonkey.com/pipermail/users_rt2x00.serialmonkey.com/2013-
> January/005610.html
> and replace file in /lib/firmware/
>
> Second problem are random disconnects:
>
> [ 1777.560067] wlp0s20u1: authenticate with 54:e6:fc:af:47:36
> [ 1777.580536] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 1/3)
> [ 1777.616609] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 2/3)
> [ 1777.645496] wlp0s20u1: send auth to 54:e6:fc:af:47:36 (try 3/3)
> [ 1777.682743] wlp0s20u1: authentication with 54:e6:fc:af:47:36 timed out
> [ 1779.341611] wlp0s20u1: authenticate with 54:e6:fc:af:47:36
>
> I have this problem on RT5390 adapter (PCI version of RT5370) on 3.14-rc.
> This is regression caused by some recent commits:
>
> commit c8520bcb784df69cf5960308846253814ec45db7
> Author: Kevin Lo <kevlo@kevlo.org>
> Date: Thu Oct 24 13:24:08 2013 +0800
>
> rt2x00: rt2800lib: update RF registers for RT5390
>
> commit eac40d9631a7db43570df859fa8a9922e9623607
> Author: Kevin Lo <kevlo@kevlo.org>
> Date: Mon Oct 21 15:38:31 2013 +0800
>
> rt2x00: rt2800lib: Update BBP register initialization for RT53xx
>
> On my case reverting them make issue gone. You can try to revert them too or
> just try to use 3.13.

I'm seeing the same issue running 3.13.5-1
Distro: Debian x86_64
Wireless card: Ralink Technology, Corp.
RT5370 Wireless Adapter

While under load I experience latency and packet loss, as well as the following errors in dmesg:

[116854.144208] ieee80211 phy1: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 14 in queue 2
[116854.147687] ieee80211 phy1: rt2800usb_txdone: Warning - Got TX status for an empty queue 0, dropping
[116854.150318] ieee80211 phy1: rt2800usb_txdone: Warning - Got TX status for an empty queue 0, dropping

I tried updating the firmware as suggested, but the issue persists. dmesg attached.

Created attachment 133571 [details]
3.13-dmesg
dmesg of issue in 3.13 with updated firmware
Same. The problem is that bisecting around kernels 3.6-3.7 doesn't work because support for that chipset was only added around kernel 3.10. I tried adding the device IDs to the earlier kernels but that doesn't work.

I am reinstalling Kubuntu from a 13.04 live USB. It has kernel "Linux kubuntu 3.8.0-12-generic #21-Ubuntu" and it doesn't have the issue reported here. I had several kernels on my PC and they all seem to have this issue.

I have been having this problem on various kernels. Yesterday I downgraded my Ubuntu 14.04 to kernel 3.8.0-35-generic. I managed to download a few files but still hit the problem on the 5th file download. I will try kernel 3.8.0-12 sometime later.

On an old Geode GX1 board with an MSI US54SE II attached over USB 1.1, I'm getting lots of these in dmesg during some network testing with kernel 3.17:

Apr 1 14:33:20 debian7 kernel: [ 1784.238928] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
Apr 1 14:33:21 debian7 kernel: [ 1785.234214] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
Apr 1 14:33:22 debian7 kernel: [ 1786.234794] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
Apr 1 14:33:24 debian7 kernel: [ 1788.242542] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
Apr 1 14:33:25 debian7 kernel: [ 1789.254458] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
Apr 1 14:33:26 debian7 kernel: [ 1790.251390] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
Apr 1 14:33:27 debian7 kernel: [ 1791.250907] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
Apr 1 14:33:28 debian7 kernel: [ 1792.254741] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning -
TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:29 debian7 kernel: [ 1793.274510] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:30 debian7 kernel: [ 1794.274497] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:31 debian7 kernel: [ 1795.258867] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:32 debian7 kernel: [ 1796.264083] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:34 debian7 kernel: [ 1798.263352] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:35 debian7 kernel: [ 1799.281393] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:36 debian7 kernel: [ 1800.285902] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:37 debian7 kernel: [ 1801.270160] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:38 debian7 kernel: [ 1802.279546] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset Apr 1 14:33:39 debian7 kernel: [ 1803.295065] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset i don`t get any message with 3.2.63-2+deb7u1 lsusb: Bus 001 Device 005: ID 148f:2573 Ralink Technology, Corp. 
RT2501/RT2573 Wireless Adapter root@debian7:/var/log# lsmod |grep usb rt73usb 21998 0 rt2x00usb 13422 1 rt73usb rt2x00lib 29245 2 rt2x00usb,rt73usb mac80211 167576 2 rt2x00lib,rt2x00usb crc_itu_t 12331 1 rt73usb usb_storage 35245 2 usbhid 31704 0 hid 60188 1 usbhid usbcore 104793 7 ehci_hcd,ohci_hcd,usbhid,usb_storage,rt2x00usb,rt73usb scsi_mod 135586 4 libata,usb_storage,sd_mod,sg usb_common 12338 1 usbcore Still present in Linux version 4.1.11 # dmesg |grep chip [ 14.833984] ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 3070, rev 0201 detected [ 14.875461] ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 0005 detected # dmesg |grep firm [ 30.345772] ieee80211 phy0: rt2x00lib_request_firmware: Info - Loading firmware file 'rt2870.bin' [ 30.371362] ieee80211 phy0: rt2x00lib_request_firmware: Info - Firmware detected - version: 0.29 [ 159.852999] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 160.003533] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 9 in queue 2 [ 282.872600] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 283.656790] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 14 in queue 2 [ 301.356902] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 6 in queue 2 [ 308.045146] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 4 in queue 2 [ 308.071095] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 312.072849] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 313.073225] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 313.506148] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 4 in queue 2 [ 314.073725] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA 
timed out, invoke forced forced reset [ 314.431157] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 3 in queue 2 [ 315.073974] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 330.793982] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 14 in queue 2 [ 331.079726] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 333.080348] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 333.480588] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 9 in queue 2 [ 334.080852] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 337.081754] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 339.968025] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 11 in queue 2 [ 340.030139] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 11 in queue 2 [ 402.070341] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 403.069851] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 404.069224] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 404.257020] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 4 in queue 2 [ 404.410771] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 13 in queue 2 [ 410.065874] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 411.065351] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 452.991032] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 10 in queue 2 [ 
516.969317] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 15 in queue 2
[ 523.743445] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 15 in queue 2
[ 533.058599] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset

I built the kernel from source with the warning code removed from the rt2800 driver; the overall performance is much better and the kernel log is clean. It would be better if these unhelpful warnings were disabled by default in the kernel.

So far, I have been using Ralink cards for injection of data frames, which worked fine. Now I have tried to inject CTS frames; however, as soon as I inject at data rates higher than 18 Mbit, I get those kernel messages and injection stops or becomes very slow. Then I also get kernel messages like these:

[ 2227.009933] ieee80211 phy2: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 2231.009846] ieee80211 phy2: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 2235.009871] ieee80211 phy2: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 2239.009906] ieee80211 phy2: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 2242.993945] ieee80211 phy2: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset

To me it seems this is caused by too many management frames. Maybe this helps to narrow this issue down after all those years. I'm using kernel 4.4.11 on a Raspberry Pi with a Ralink 5572 USB wifi card. I'll do some more experimenting ...

Hi, I am using Ralink USB dongles for an 80211s mesh. I also get kernel messages like these.
[ 303.008219] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 303.918343] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 304.948213] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 305.918211] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 306.920082] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 307.918213] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 317.948208] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 355.008357] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 355.978218] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 357.918208] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 359.008334] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 360.008285] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 367.551832] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 5 in queue 2 [ 368.008228] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 372.193836] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 3 in queue 0 [ 383.918221] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 384.948089] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset [ 
385.918210] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 387.008107] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 388.008332] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 388.918332] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 407.978240] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 408.918209] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset
[ 412.754383] ieee80211 phy0: rt2800usb_txdone: Warning - Data pending for entry 0 in queue 0
[ 419.008214] ieee80211 phy0: rt2x00usb_watchdog_tx_dma: Warning - TX queue 2 DMA timed out, invoke forced forced reset

I applied a patch to comment out these prints, but now I'm facing a real problem: the mesh is queuing skbs, and because of these errors the interface is slow. At some point (after a few hours) skbuff_head_cache ends up allocating all the kernel memory. The OOM killer starts killing all the processes and the system reboots. Need help. Thanks in advance.

The watchdog for USB devices was not correctly implemented and was removed in wireless-drivers-next (scheduled for 4.11):
https://git.kernel.org/cgit/linux/kernel/git/kvalo/wireless-drivers-next.git/commit/?id=480b468625da1f054c487f7168e9a9bdc1bf869b

In general there are changes in the 4.10 and incoming 4.11 kernels that should improve the situation regarding the problems reported in this bz. Please check 4.10 and, if 4.10 does not help, the incoming 4.11, and let me know if there are still problems.

Thanks, I will check the latest kernels (4.10 and 4.11). I am creating an AP and a mesh interface on the same dongle (to create a MAP). The problem occurs faster depending on the number of nodes (boards). I suspect the rt2x00 dongle driver. I want to create a stable 80211s mesh.
Please give me suggestions.

Hi, I took only one change:

--- a/drivers/net/wireless/ralink/rt2x00/rt2800usb.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2800usb.c
@@ -144,7 +144,7 @@ static inline bool rt2800usb_entry_txsta
     if (!test_bit(ENTRY_DATA_STATUS_PENDING, &entry->flags))
         return false;

-    tout = time_after(jiffies, entry->last_action + msecs_to_jiffies(100));
+    tout = time_after(jiffies, entry->last_action + msecs_to_jiffies(500));
     if (unlikely(tout))
         rt2x00_dbg(entry->queue->rt2x00dev,
                    "TX status timeout for entry %d in queue %d\n",

With the above patch there are no warning messages any more, but skbuff_head_cache is still increasing rapidly. Any idea?

The skbuff_head_cache issue looks like a memory leak, probably not related to rt2x00. I suggest using kmemleak to debug the problem.

The change from comment 90 is currently in kernel source.
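For readers unfamiliar with the idiom, the one-line change in comment 90 only widens the window after which a pending TX status counts as timed out. Below is a minimal userspace sketch of that kind of check. It is not driver code: `ticks_after()`, `sketch_msecs_to_ticks()`, `status_timed_out()` and the `SKETCH_HZ` value are hypothetical names and assumptions made for illustration, and the comparison only mirrors the wrap-safe signed-difference idea behind the kernel's `time_after()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick rate for this sketch only; real kernels pick HZ at build time. */
#define SKETCH_HZ 250u

/* Hypothetical stand-in for msecs_to_jiffies(): rounds up to whole ticks. */
static uint32_t sketch_msecs_to_ticks(uint32_t ms)
{
	return (uint32_t)(((uint64_t)ms * SKETCH_HZ + 999u) / 1000u);
}

/*
 * Wrap-safe "a is later than b": the unsigned difference is reinterpreted
 * as signed, so the result stays correct when the counter overflows
 * (relies on the usual two's-complement conversion).
 */
static bool ticks_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* True once more than timeout_ms has passed since last_action. */
static bool status_timed_out(uint32_t now, uint32_t last_action,
			     uint32_t timeout_ms)
{
	return ticks_after(now, last_action + sketch_msecs_to_ticks(timeout_ms));
}
```

With these assumed numbers, an entry whose last action was 60 ticks (240 ms) ago is reported as timed out under a 100 ms limit but not under a 500 ms limit, which is consistent with the warnings disappearing after the change.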
Created attachment 110531 [details] lspci, ver_linux, cpuinfo, lsusb, acpidump etc. With an rt2800usb compatible card (ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter) the connection gets really sluggish on high traffic (moving files, torrent downloading etc). The dmesg output is flooded with the following messages. [ 3668.314850] ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2 [ 3668.314874] ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2 [ 3668.314878] ieee80211 phy4: rt2800usb_entry_txstatus_timeout: Warning - TX status timeout for entry 7 in queue 2 [ 3668.428550] ieee80211 phy4: rt2800usb_txdone: Warning - Got TX status for an empty queue 2, dropping [ 3668.428917] ieee80211 phy4: rt2800usb_txdone: Warning - Got TX status for an ieee80211 phy4: rt2x00queue_flush_queue: Warning - Queue 2 failed to flush Distro: Arch Linux Tested from kernel 3.10.10 to 3.11.4 Tested on WEP,WPA,WPA2 and no encryption wireless networks.