Bug 73821

Summary: [BISECTED] rt2500pci Hardware WLan switch
Product: Drivers
Reporter: Niels (nille0386)
Component: network-wireless
Assignee: drivers_network-wireless (drivers_network-wireless)
Status: RESOLVED CODE_FIX
Severity: normal
CC: alan, stf_xl
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.4.*
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments:
  rt2x00_rfkill_unregister.patch
  rfkill_debug.patch
  Only with the Debugging Patch. Wlan not working
  With Debug Patch and reverted e2bc75f wlan working
  dmesg
  Working WLan
  Bad WLan
  rt2500_gpiocsr_init.patch
  rt2500_regs_init.patch
  rt2500_msleep.patch
  r2500 msleep(1000) error
  rt2x00_delayed_rfkill.patch

Description Niels 2014-04-11 05:26:43 UTC
Since kernel 3.4.*, the hardware switch of the RT2500 PCI WLAN card no longer works. The WLAN is turned off from the start.
Comment 1 Niels 2014-04-21 08:56:45 UTC
I have bisected this problem and found the culprit: e2bc7c5f3cb8756865aa0ab140d2288f61599dda

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e2bc7c5f3cb8756865aa0ab140d2288f61599dda

After reverting this change, the WLAN hardware switch works again.
Comment 2 Stanislaw Gruszka 2014-04-24 08:43:37 UTC
The bad commit looks sane: it starts rfkill polling when we discover the device, instead of when we start the interface (ifconfig wlan0 up). The only mistake I can see so far in that commit is that we do not stop rfkill polling at the corresponding point, but on ifconfig wlan0 down. Hence, after down we can never start rfkill polling again.
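
The lifecycle mismatch described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not rt2x00 code: the class `ToyDevice`, its methods and the `stop_poll_on_down` flag are hypothetical names invented for illustration.

```python
class ToyDevice:
    """Toy model of the polling lifecycle; all names here are hypothetical."""

    def __init__(self, stop_poll_on_down):
        # True models the suspected mistake: polling is stopped on "down"
        # even though it is only ever started on probe.
        self.stop_poll_on_down = stop_poll_on_down
        self.polling = False

    def probe(self):
        # After the bisected commit, polling starts at device discovery.
        self.polling = True

    def interface_up(self):
        # "ifconfig wlan0 up" no longer starts polling.
        pass

    def interface_down(self):
        # "ifconfig wlan0 down" may still stop polling.
        if self.stop_poll_on_down:
            self.polling = False

    def remove(self):
        # Device removal always stops polling.
        self.polling = False


def polling_after_down_up(stop_poll_on_down):
    """Probe the device, cycle the interface once, report the polling state."""
    dev = ToyDevice(stop_poll_on_down)
    dev.probe()
    dev.interface_up()
    dev.interface_down()
    dev.interface_up()
    return dev.polling
```

In this model, stopping the poll on "down" leaves it stopped after the next "up", which is the asymmetry the comment points at. Whether that asymmetry explains the reported regression is a separate question.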
Comment 3 Stanislaw Gruszka 2014-04-24 08:47:21 UTC
Created attachment 133511 [details]
rt2x00_rfkill_unregister.patch

Does this patch fix the problem for you ?
Comment 4 Niels 2014-04-24 12:40:42 UTC
(In reply to Stanislaw Gruszka from comment #3)
> Created attachment 133511 [details]
> rt2x00_rfkill_unregister.patch
> 
> Does this patch fix the problem for you ?

Unfortunately not.
Comment 5 Stanislaw Gruszka 2014-04-24 14:06:02 UTC
Created attachment 133561 [details]
rfkill_debug.patch

Let's try to debug the problem a bit. Please apply this patch, run it with and without the bad commit reverted, and provide the dmesg output as a plain-text attachment in both cases. Thanks.
Comment 6 Niels 2014-04-24 17:11:31 UTC
Created attachment 133651 [details]
Only with the Debugging Patch. Wlan not working
Comment 7 Niels 2014-04-24 17:12:38 UTC
Created attachment 133661 [details]
With Debug Patch and reverted e2bc75f wlan working
Comment 8 Stanislaw Gruszka 2014-04-24 18:22:19 UTC
In the working scenario there are no rfkill messages. Could you wait until the device associates with the wireless network, then toggle the RF_KILL switch (off and on), wait for the device to associate again, and then capture dmesg?
Comment 9 Niels 2014-04-24 18:42:58 UTC
Created attachment 133671 [details]
dmesg

I switched the WLAN on, off and on again, connected to the wireless network and then switched it off again.

NOTE: Your debug patch is compiled in! I checked it twice.
Comment 10 Stanislaw Gruszka 2014-04-29 15:49:10 UTC
Hmm, I think there is something wrong with the dmesg from the working scenario. At the very least there should be messages showing association, similar to those below:

[   59.491437] wlan0: authenticate with 54:e6:fc:98:63:fe
[   59.491961] wlan0: send auth to 54:e6:fc:98:63:fe (try 1/3)
[   59.492606] wlan0: send auth to 54:e6:fc:98:63:fe (try 2/3)
[   59.493882] wlan0: authenticated
[   59.495329] wlan0: associate with 54:e6:fc:98:63:fe (try 1/3)
[   59.499826] wlan0: RX AssocResp from 54:e6:fc:98:63:fe (capab=0x431 status=0 aid=1)
[   59.529431] wlan0: associated

I do not see them in your dmesg. Did you attach the proper dmesg file? Did you run "dmesg > dmesg.txt" after performing all the RF_KILL actions? Also, please always attach dmesg as text/plain type.
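
A quick way to check a saved log for the messages shown above is sketched below. It assumes the log was saved as plain text and that the interface is named like `wlan0`; the function name `association_events` is hypothetical, and the patterns only cover the "authenticated" and "associated" lines quoted in this comment.

```python
import re

# Patterns for the completed-state lines only; "authenticate with ..." and
# "associate with ..." (attempts in progress) deliberately do not match.
ASSOC_PATTERNS = {
    "authenticated": re.compile(r"\bwlan\d+: authenticated\b"),
    "associated": re.compile(r"\bwlan\d+: associated\b"),
}


def association_events(dmesg_text):
    """Return which association milestones appear in the given dmesg text."""
    found = {name: False for name in ASSOC_PATTERNS}
    for line in dmesg_text.splitlines():
        for name, pattern in ASSOC_PATTERNS.items():
            if pattern.search(line):
                found[name] = True
    return found
```

For example, `association_events(open("dmesg.txt").read())` would report whether the capture contains a completed authentication and association. Interfaces with other naming schemes would need an adjusted pattern.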
Comment 11 Niels 2014-04-29 16:17:58 UTC
Created attachment 134231 [details]
Working WLan

I copied the log directly from /var/log, but it looks like that was not complete.
Comment 12 Niels 2014-04-29 16:18:54 UTC
Created attachment 134241 [details]
Bad WLan

This is a new dmesg where wlan is not working
Comment 13 Niels 2014-06-06 11:00:47 UTC
Any news on this? I'm still here.
Comment 14 Stanislaw Gruszka 2014-06-09 08:31:23 UTC
Created attachment 138571 [details]
rt2500_gpiocsr_init.patch

Sorry for the delay. In the bad scenario we always read 0 from the GPIOCSR register, so perhaps this register is not correctly initialized as the input for the RF_KILL switch. Please try this patch. Does it help?
Comment 15 Niels 2014-06-09 13:56:57 UTC
rfkill_debug.patch + rt2x00_rfkill_unregister.patch + rt2500_gpiocsr_init.patch

and 

rfkill_debug.patch + rt2500_gpiocsr_init.patch

Didn't work.
Comment 16 Stanislaw Gruszka 2014-06-10 07:25:50 UTC
Created attachment 138811 [details]
rt2500_regs_init.patch

Perhaps we have to initialize some other chip subsystems to power up the GPIO. Let's try this patch; it can be tested without the other patches.
Comment 17 Niels 2014-06-10 11:07:40 UTC
Still no luck.
Comment 18 Stanislaw Gruszka 2014-06-13 09:42:04 UTC
Created attachment 139601 [details]
rt2500_msleep.patch

OK, next try. Let's add some sleeps; perhaps this is a timing issue, i.e. we have to initialize the input subsystem before we can enable GPIO on RT2500. Additionally, this patch initializes more RT2500 subsystems on probe, before starting rfkill polling.

If this patch does not help, please try increasing the msleep() values.

Other than that I'm running out of ideas, so the solution would be a partial revert of the bad commit (on rt2500pci, leaving it as is on other chips). However, this approach is a bit problematic: when the system boots with RF_KILL enabled (radio disabled), user space can never start the interface, hence never starts rfkill polling, and we will not see rfkill changes. Though that depends on the user-space software: NetworkManager does not bring the interface up if RF_KILL is enabled, but when using wpa_supplicant only this can happen.
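
The trade-off of a per-chip partial revert can be sketched as a small model. This is an illustration under assumptions, not the driver's implementation: the table `DELAYED_RFKILL` and both function names are hypothetical, and "rt2800pci" is used only as an example of another chip. The model also ignores the fact that on rt2500pci the probe-time poll read 0 in the bad scenario.

```python
# Hypothetical per-chip setting: True means "start rfkill polling only when
# the interface is brought up" (the old behaviour), False means "start at
# probe" (the behaviour introduced by the bisected commit).
DELAYED_RFKILL = {"rt2500pci": True}


def poll_start_point(chip):
    """Return where polling would start for a chip in this toy model."""
    return "up" if DELAYED_RFKILL.get(chip, False) else "probe"


def sees_switch_changes(chip, interface_brought_up):
    """Return True if the model would notice RF_KILL switch changes."""
    if poll_start_point(chip) == "probe":
        # Polling runs from discovery, independent of user space.
        return True
    # With delayed polling, nothing is observed until the interface is up.
    return interface_brought_up
```

With delayed polling, `sees_switch_changes("rt2500pci", False)` is False: if user space never brings the interface up, the switch is never polled, which is the drawback described in this comment.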
Comment 19 Niels 2014-06-13 13:17:40 UTC
Created attachment 139631 [details]
r2500 msleep(1000) error

With 500 it's not working, and with 1000 I get this error (see attachment).

And you describe my current situation. I currently have these options:

Live without WLAN, because it's completely blocked and not even a stick works.

Blacklist the device and use a WLAN stick.

Revert the bad commit, but then I have to use my own kernel builds.

Blacklist the module and use ndiswrapper with the XP driver.

Anyhow, thank you for your time and for looking into this.
Comment 20 Stanislaw Gruszka 2014-06-13 13:42:36 UTC
As I wrote before, we will partially revert the bad commit to restore the old behaviour on your device, but keep the new behaviour on other devices. I'll prepare a proper patch soon.
Comment 21 Stanislaw Gruszka 2014-06-13 14:22:46 UTC
Created attachment 139651 [details]
rt2x00_delayed_rfkill.patch

This patch restores the old rfkill behaviour on rt2500pci. Please test whether it works; if so, I'll post it upstream.
Comment 22 Niels 2014-06-13 15:12:01 UTC
It's working.

Thanks for your work and time.
Comment 23 Niels 2014-06-16 09:19:27 UTC
Is there also a chance that this patch gets backported to the latest stable and LTS kernels?
Comment 24 Stanislaw Gruszka 2014-06-16 16:50:20 UTC
The patch was posted with a Cc: stable tag:
http://marc.info/?l=linux-wireless&m=140293723015652&w=2