Bug 13638 - rt2870 driver is broken for (some) cards
Status: CLOSED CODE_FIX
Product: Networking
Classification: Unclassified
Component: Wireless
Hardware: All Linux
Importance: P1 high
Assigned To: Greg Kroah-Hartman
URL: http://bbs.archlinux.org/viewtopic.ph...
Depends on:
Blocks: 13070
Reported: 2009-06-27 17:33 UTC by jakob gruber
Modified: 2009-11-26 13:45 UTC (History)

See Also:
Kernel Version: 2.6.30
Tree: Mainline
Regression: No


Attachments
everything.log (4.96 KB, text/plain)
2009-07-29 20:04 UTC, jakob gruber

Description jakob gruber 2009-06-27 17:33:26 UTC
The issue is described here:

http://bbs.archlinux.org/viewtopic.php?id=73964

in the first 2 posts.

Short summary:

"The problem appears if I run
ifconfig ra0 down
the command doesn't complete, produces no output and causes many other problems. Examples: I lose the VT where I ran it. I can't cancel it with Ctrl+C. Not even SysRq is able to recover it. Everything using PAM begins to fail; if I run sudo in a VT, I lose that VT too (this might be related to your mounting problem). X crashes. Wicd stalls and its process can't be shut down any more. The reason for this is that many networking applications "cycle" the interface once before using it, to clear left-over configuration.
I can't even shut down any more unless I use the -n option. The whole system basically goes down the drain."

Affected cards: 

* Belkin F5D8053 N Wireless USB Adapter
* Trendnet USB stick

Fix:

* downgrade to 2.6.29
* or use the v2.1.2.0 drivers from http://www.ralinktech.com.tw/data/drivers/2009_0521_RT2870_Linux_STA_V2.1.2.0.tgz
Comment 1 Malstrond 2009-06-30 20:55:12 UTC
I was able to reproduce this issue, in fact I'm the person that owns the mentioned Trendnet USB Stick. It's a Trendnet TEW-624UB.

lsusb output: http://dpaste.com/hold/61747/
Comment 2 John W. Linville 2009-07-01 17:27:18 UTC
I don't think we track regressions in drivers/staging...
Comment 3 Greg Kroah-Hartman 2009-07-01 17:31:54 UTC
Not really.

Patches to resolve this issue are gladly accepted.

If you can work out what broke, that would be greatly appreciated.

Can you use 'git bisect' to track down the patch that caused the regression?
Comment 4 jakob gruber 2009-07-02 17:18:16 UTC
Using git bisect, I always seemed to end up with a working copy that wouldn't build, and I don't know git well enough to fix that.

So I started with 2.6.29 and manually applied the rt2870 commits until I was able to isolate the problem. I also built a kernel from a 2.6.29 checkout plus all rt2870 commits included in 2.6.30 EXCEPT the one causing this issue, which resulted in a working rt2870sta module.

By "rt2870 commits", I mean all commits with changes in the drivers/staging/rt2870 directory.

The problem is in commit d44ca7af9e79abf4e80514583734cffed1117ee1 (http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d44ca7af9e79abf4e80514583734cffed1117ee1).
Comment 5 Peter Teoh 2009-07-03 00:38:25 UTC
Thank you for the bug report.   I will work on this issue.   Let me dig into the source.
Comment 6 Ph. Marek 2009-07-15 06:30:51 UTC
I have an RT2870 too, and saw the machine nearly completely dead upon inserting it, although the downloaded driver and the one in staging/ just worked.

This is an 07d1:3c09; in my /var/log/messages I've got many of

  ERROR!!! HWM_MAILBOX still hold by MCU. command fail
  Bulk Out MLME Failed, Status=-2!
  Qidx(3), not enough space in MgmtRing, MgmtRingFullCount=..

Then I removed the stick (because X was unresponsive), and after that I got
  Bulk Out MLME Failed, Status=-108!
  Bulk In Failed. Status=-108, BIIdx=0x0, BIRIdx=0x0, actual_length=0x0
  rtusb_disconnect ...
  RTUSB_VendorRequest failed(-19),...

Then *many*
  device disconnected

As X was still dead, and SysRq+K didn't help either, I sync'ed and shut the machine down.



I should mention that I'm currently running a debian kernel; but I'm available for testing with plain kernels, too.
Comment 7 Peter Teoh 2009-07-22 07:01:11 UTC
Sorry, I must apologize: other commitments kept getting in the way of my earlier promise to work on this. More importantly, I lack the hardware for testing, so I have to give up. Sorry about all this.
Comment 8 Rafael J. Wysocki 2009-07-28 20:43:58 UTC
On Tuesday 28 July 2009, Jakob Gruber wrote:
> Hi
> 
> as far as I'm concerned this bug should still be listed as a regression.
> 
> On Sun, 26 Jul 2009 22:45:32 +0200 (CEST)
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.29 and 2.6.30.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.29 and 2.6.30.  Please verify if it still should
> > be listed and let me know (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13638
> > Subject		: rt2870 driver is broken for (some) cards
> > Submitter	: jakob gruber <jakob.gruber@kabelnet.at>
> > Date		: 2009-06-27 17:33 (30 days old)
Comment 9 jakob gruber 2009-07-29 20:04:59 UTC
Created attachment 22527 [details]
everything.log

The relevant part of everything.log.

This was triggered by executing "ifconfig ra0 down" on kernel 2.6.31-rc4, which returned with a kernel bug.

I was returned to a responsive prompt, but executing "ifconfig" again killed tty1. Typing on the keyboard shows up on the screen, but Ctrl-C does nothing. Switching to other ttys works and they are responsive until executing "ifconfig".

From the log, my _guess_ is that this is caused by this part of the patch:

--- a/drivers/staging/rt2870/2870_main_dev.c
+++ b/drivers/staging/rt2870/2870_main_dev.c
@@ -944,69 +944,46 @@ VOID RT28xxThreadTerminate(
 	RTUSBCancelPendingIRPs(pAd);
 
 	// Terminate Threads
-	CHECK_PID_LEGALITY(pObj->TimerQThr_pid)
+	BUG_ON(pObj->TimerQThr_task == NULL);
+	CHECK_PID_LEGALITY(task_pid(pObj->TimerQThr_task))
 	{
 		POS_COOKIE pObj = (POS_COOKIE)pAd->OS_Cookie;
 
-		printk("Terminate the TimerQThr_pid=%d!\n", GET_PID_NUMBER(pObj->TimerQThr_pid));
+		printk(KERN_DEBUG "Terminate the TimerQThr pid=%d!\n",
+			pid_nr(task_pid(pObj->TimerQThr_task)));
 		mb();
 		pAd->TimerFunc_kill = 1;
 		mb();
-		ret = KILL_THREAD_PID(pObj->TimerQThr_pid, SIGTERM, 1);
-		if (ret)
-		{
-			printk(KERN_WARNING "%s: unable to stop TimerQThread, pid=%d, ret=%d!\n",
-					pAd->net_dev->name, GET_PID_NUMBER(pObj->TimerQThr_pid), ret);
-		}
-		else
-		{
-			wait_for_completion(&pAd->TimerQComplete);
-			pObj->TimerQThr_pid = THREAD_PID_INIT_VALUE;
-		}
+		kthread_stop(pObj->TimerQThr_task);
+		pObj->TimerQThr_task = NULL;
 	}
 
-	CHECK_PID_LEGALITY(pObj->MLMEThr_pid)
+	BUG_ON(pObj->MLMEThr_task == NULL);
+	CHECK_PID_LEGALITY(task_pid(pObj->MLMEThr_task))
 	{
-		printk("Terminate the MLMEThr_pid=%d!\n", GET_PID_NUMBER(pObj->MLMEThr_pid));
+		printk(KERN_DEBUG "Terminate the MLMEThr pid=%d!\n",
+			pid_nr(task_pid(pObj->MLMEThr_task)));
 		mb();
 		pAd->mlme_kill = 1;
 		//RT28XX_MLME_HANDLER(pAd);
 		mb();
-		ret = KILL_THREAD_PID(pObj->MLMEThr_pid, SIGTERM, 1);
-		if (ret)
-		{
-			printk (KERN_WARNING "%s: unable to Mlme thread, pid=%d, ret=%d!\n",
-					pAd->net_dev->name, GET_PID_NUMBER(pObj->MLMEThr_pid), ret);
-		}
-		else
-		{
-			//wait_for_completion (&pAd->notify);
-			wait_for_completion (&pAd->mlmeComplete);
-			pObj->MLMEThr_pid = THREAD_PID_INIT_VALUE;
-		}
+		kthread_stop(pObj->MLMEThr_task);
+		pObj->MLMEThr_task = NULL;
 	}
 
-	CHECK_PID_LEGALITY(pObj->RTUSBCmdThr_pid)
+	BUG_ON(pObj->RTUSBCmdThr_task == NULL);
+	CHECK_PID_LEGALITY(task_pid(pObj->RTUSBCmdThr_task))
 	{
-		printk("Terminate the RTUSBCmdThr_pid=%d!\n", GET_PID_NUMBER(pObj->RTUSBCmdThr_pid));
+		printk(KERN_DEBUG "Terminate the RTUSBCmdThr pid=%d!\n",
+			pid_nr(task_pid(pObj->RTUSBCmdThr_task)));
 		mb();
 		NdisAcquireSpinLock(&pAd->CmdQLock);
 		pAd->CmdQ.CmdQState = RT2870_THREAD_STOPED;
 		NdisReleaseSpinLock(&pAd->CmdQLock);
 		mb();
 		//RTUSBCMDUp(pAd);
-		ret = KILL_THREAD_PID(pObj->RTUSBCmdThr_pid, SIGTERM, 1);
-		if (ret)
-		{
-			printk(KERN_WARNING "%s: unable to RTUSBCmd thread, pid=%d, ret=%d!\n",
-					pAd->net_dev->name, GET_PID_NUMBER(pObj->RTUSBCmdThr_pid), ret);
-		}
-		else
-		{
-			//wait_for_completion (&pAd->notify);
-			wait_for_completion (&pAd->CmdQComplete);
-			pObj->RTUSBCmdThr_pid = THREAD_PID_INIT_VALUE;
-	}
+		kthread_stop(pObj->RTUSBCmdThr_task);
+		pObj->RTUSBCmdThr_task = NULL;
 	}

Hope this helps!
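
To illustrate the kind of failure guessed at above, here is a userspace analogy using pthreads and a POSIX semaphore. It is not the driver code, and the thread does not confirm that this is the actual root cause. The assumption is that each worker sleeps on a semaphore and only checks its kill flag after being woken, so whatever stops the thread must also wake it. All names (struct worker, worker_stop() and so on) are invented for the sketch.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>

struct worker {
    pthread_t   thread;
    sem_t       wakeup;   /* posted when there is work, or when stopping */
    atomic_bool kill;     /* plays the role of mlme_kill / TimerFunc_kill */
    atomic_int  handled;  /* number of work items processed */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    for (;;) {
        /* Sleep until posted; retry if interrupted by a signal. */
        while (sem_wait(&w->wakeup) == -1 && errno == EINTR)
            ;
        if (atomic_load(&w->kill))
            break;                      /* asked to stop */
        atomic_fetch_add(&w->handled, 1);
    }
    return NULL;
}

static int worker_start(struct worker *w)
{
    atomic_init(&w->kill, false);
    atomic_init(&w->handled, 0);
    if (sem_init(&w->wakeup, 0, 0) != 0)
        return -1;
    if (pthread_create(&w->thread, NULL, worker_main, w) != 0) {
        sem_destroy(&w->wakeup);
        return -1;
    }
    return 0;
}

/* Queue one unit of work. */
static void worker_kick(struct worker *w)
{
    sem_post(&w->wakeup);
}

static int worker_stop(struct worker *w)
{
    atomic_store(&w->kill, true);
    /* Without this post the worker never leaves sem_wait() and the
     * join below blocks forever, taking the caller down with it. */
    sem_post(&w->wakeup);
    if (pthread_join(w->thread, NULL) != 0)
        return -1;
    sem_destroy(&w->wakeup);
    return 0;
}
```

In this sketch, dropping the sem_post() from worker_stop() reproduces an unkillable "down": the stop request is recorded but the sleeping thread never sees it.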
Comment 10 Bjoern Olausson 2009-08-01 18:54:09 UTC
I don't actually know if this thread is also about the rt2800usb driver for rt2870 USB chips.

I recently got a D-Link DWA-140 and first used the staging driver, which worked, except that it threw an error at me when shutting down and kept the system from shutting down. Switching to the rt2800usb (2.6.31-rc5) driver made things worse.

So far nothing is working with rt2800usb:

[ 2221.733054] usb 1-2: new high speed USB device using ehci_hcd and address 5
[ 2221.866702] usb 1-2: New USB device found, idVendor=07d1, idProduct=3c09
[ 2221.866712] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 2221.866720] usb 1-2: Product: 802.11 n WLAN
[ 2221.866725] usb 1-2: Manufacturer: Ralink
[ 2221.866730] usb 1-2: SerialNumber: 1.0
[ 2221.867235] usb 1-2: configuration #1 chosen from 1 choice
[ 2221.899270] phy1: Selected rate control algorithm 'minstrel'
[ 2222.131880] rt2800usb 1-2:1.0: firmware: requesting rt2870.bin
[ 2222.141916] phy1 -> rt2x00lib_request_firmware: Error - Failed to request Firmware.
[ 2222.355499] rt2800usb 1-2:1.0: firmware: requesting rt2870.bin
[ 2222.365504] phy1 -> rt2x00lib_request_firmware: Error - Failed to request Firmware.

Any ideas?

kind regards
Bjoern Olausson
Comment 11 Ph. Marek 2009-08-01 19:01:29 UTC
I saw that there's a new driver available at http://www.ralinktech.com/ralink/Home/Support/Linux.html, but I have to admit that I haven't tested it yet.

The staging drivers are broken for my USB adapter (DWA-140 = rt2870), too.


Can someone compare the staging driver with the ralink version?
Comment 13 Bjoern Olausson 2009-08-01 22:29:48 UTC
(In reply to comment #12)
> Bug should be fixed with this commit: 
> 
> http://git.kernel.org/?p=linux/kernel/git/gregkh/patches.git;a=blob;f=staging/staging-rt2870-revert-d44ca7-removal-of-kernel_thread-api.patch;h=6482560a2d6de1962effd995e924b2a906b8c1b7;hb=2f6934c1a5b93267d2793111f6565a69a5bbb514
> 
> http://lkml.org/lkml/2009/7/30/271

Applied the patch to 2.6.31-rc5 and the system reboots fine.

Thanks a lot.

kind regards
Bjoern Olausson
Comment 14 Rafael J. Wysocki 2009-08-03 14:44:12 UTC
On Monday 03 August 2009, Mike Galbraith wrote:
> On Sun, 2009-08-02 at 21:09 +0200, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.29 and 2.6.30.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.29 and 2.6.30.  Please verify if it still should
> > be listed and let me know (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13638
> > Subject		: rt2870 driver is broken for (some) cards
> > Submitter	: jakob gruber <jakob.gruber@kabelnet.at>
> > Date		: 2009-06-27 17:33 (37 days old)
> 
> Greg has staging-rt2870-revert-d44ca7-removal-of-kernel_thread-api.patch
> queued to fix this regression, but it hasn't hit mainline yet.
Comment 15 Rafael J. Wysocki 2009-08-10 13:49:58 UTC
On Monday 10 August 2009, Jakob Gruber wrote:
> This is fixed in the latest kernel snapshot, ifconfig down works fine.
> Thanks!
> 
> On Sun,  9 Aug 2009 23:10:25 +0200 (CEST)
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.29 and 2.6.30.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.29 and 2.6.30.  Please verify if it still should
> > be listed and let me know (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13638
> > Subject		: rt2870 driver is broken for (some) cards
> > Submitter	: jakob gruber <jakob.gruber@kabelnet.at>
> > Date		: 2009-06-27 17:33 (44 days old)
Comment 16 Ph. Marek 2009-08-16 18:09:48 UTC
With 2.6.31-rc6+something(git) it works for me too.
