Bug 43187

Summary: Asynchronous firmware loading introduced in 3.3.1 fails on Realtek rtl8192cfw.bin (RTL8188CE wifi adapter)
Product: Drivers
Reporter: Lucas Treffenstädt (lucas)
Component: network-wireless
Assignee: Larry Finger (Larry.Finger)
Status: CLOSED CODE_FIX
Severity: normal
CC: florian, frostyplanet, Larry.Finger, linville
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.3.1-3.3.4
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments:
  kernel config
  Trial patch for incorrect initialization
  Trial patch for incorrect initialization - #2

Description Lucas Treffenstädt 2012-05-01 12:51:30 UTC
After updating to the 3.3.1 kernel, my wifi card cannot be used anymore because the firmware loading fails, see dmesg:

[    6.599003] Pid: 812, comm: firmware/rtlwif Not tainted 3.3.1-gentoo #3
[    6.599005] Call Trace:
[    6.599013]  [<ffffffff8102ae06>] warn_slowpath_common+0x7e/0x96
[    6.599017]  [<ffffffff8102ae33>] warn_slowpath_null+0x15/0x17
[    6.599021]  [<ffffffff8147320e>] wiphy_register+0x56/0x3dc
[    6.599027]  [<ffffffff810cb8d7>] ? __kmalloc+0xbf/0xcf
[    6.599031]  [<ffffffff8148c269>] ieee80211_register_hw+0x37a/0x5a2
[    6.599035]  [<ffffffff81311230>] rtl_fw_cb+0x145/0x1c0
[    6.599040]  [<ffffffff812d1f6f>] ? request_firmware_nowait+0x19d/0x19d
[    6.599044]  [<ffffffff812d1fbe>] request_firmware_work_func+0x4f/0x6d
[    6.599048]  [<ffffffff810451b7>] kthread+0x86/0x8e
[    6.599053]  [<ffffffff814de7f4>] kernel_thread_helper+0x4/0x10
[    6.599056]  [<ffffffff81045131>] ? kthread_freezable_should_stop+0x3e/0x3e
[    6.599060]  [<ffffffff814de7f0>] ? gs_change+0xb/0xb
[    6.599062] ---[ end trace 61bc4bb270a47404 ]---
[    6.599066] rtlwifi:rtl_fw_cb():<0-0> Can't register mac80211 hw

I'm using firmware from the rtl_92ce_92se_92de_linux_mac80211_0005.1230.2011 package provided by Realtek. The same file works fine in 3.3.0, so I guess this is related to the switch to asynchronous firmware loading in 3.3.1 (see http://www.kernel.org/pub/linux/kernel/v3.0/ChangeLog-3.3.1).

I'm using the gentoo and bfs patches.
Comment 1 Larry Finger 2012-05-01 14:42:13 UTC
The problem is not with the firmware itself. If it could not find the firmware, then ieee80211_register_hw() would not have been called. The failure is in wiphy_register().

Could you please try 3.3.4, or the latest "bleeding-edge" compat-wireless code? There have been a lot of changes in the mac80211 code that could cause the problem. In the meantime, I will try the 3.3.1 code here.
Comment 2 Lucas Treffenstädt 2012-05-01 14:44:02 UTC
This problem persists with kernel 3.3.4
Comment 3 Larry Finger 2012-05-01 16:23:53 UTC
It does not fail for me with 3.3.1!

Are those patches you are using posted anywhere?
Comment 4 Lucas Treffenstädt 2012-05-01 16:37:56 UTC
You can find the latest bfs patch right here: http://ck.kolivas.org/patches/bfs/3.3.0/3.3-sched-bfs-420.patch
the gentoo patchset can be found here: http://dev.gentoo.org/~mpagano/genpatches/
Comment 5 Lucas Treffenstädt 2012-05-01 16:38:56 UTC
Created attachment 73143 [details]
kernel config

This is my kernel configuration, in case it provides necessary information.
Comment 6 Larry Finger 2012-05-01 16:51:44 UTC
Please try 3.3.1 without the bfs patch. That change in the scheduler is pretty invasive, and may either be exposing or causing a bug. I'll try adding it to my 3.3.1 source as well. The gentoo patches should be harmless.
Comment 7 Lucas Treffenstädt 2012-05-01 17:14:08 UTC
I confirmed that the problem persists with 3.3.4, gentoo patchset only.
I will now try 3.3.4 vanilla
Comment 8 Larry Finger 2012-05-01 17:51:33 UTC
Is there some additional information output just above where you started the dmesg output listing? The output seems to come from one of the following:

        if (WARN_ON(wiphy->addresses && !wiphy->n_addresses))
                return -EINVAL;

        if (WARN_ON(wiphy->addresses &&
                    !is_zero_ether_addr(wiphy->perm_addr) &&
                    memcmp(wiphy->perm_addr, wiphy->addresses[0].addr,
                           ETH_ALEN)))
                return -EINVAL;

It would be nice to know which one is generating the warning.
Comment 9 Larry Finger 2012-05-01 17:57:43 UTC
For me, 3.3.1 with the bfs patch works.

Does 'modprobe -rv rtl8192ce' followed by 'modprobe -v rtl8192ce' help? In other words, is it a problem only when booting, or does it also happen after user space is happily running?
Comment 10 Lucas Treffenstädt 2012-05-01 18:16:23 UTC
1. More extensive dmesg:
[   11.518036] Using firmware rtlwifi/rtl8192cfw.bin
[   11.518078] ------------[ cut here ]------------
[   11.518087] WARNING: at net/wireless/core.c:566 wiphy_register+0x56/0x3dc()
[   11.518090] Hardware name: 0221A16
[   11.518092] Modules linked in: rtl8192ce(+) snd_hda_intel(+) snd_hda_codec rtl8192c_common snd_hwdep rtlwifi snd_pcm snd_page_alloc
[   11.518105] Pid: 1116, comm: firmware/rtlwif Not tainted 3.3.4-gentoo #3
[   11.518107] Call Trace:
[   11.518116]  [<ffffffff8102ae26>] warn_slowpath_common+0x7e/0x96
[   11.518121]  [<ffffffff8102ae53>] warn_slowpath_null+0x15/0x17
[   11.518125]  [<ffffffff81457486>] wiphy_register+0x56/0x3dc
[   11.518130]  [<ffffffff810cbb56>] ? __kmalloc+0xbf/0xcf
[   11.518134]  [<ffffffff814704f5>] ieee80211_register_hw+0x37a/0x5a2
[   11.518143]  [<ffffffffa002012c>] rtl_fw_cb+0x145/0x1c0 [rtlwifi]
[   11.518148]  [<ffffffff812d23bc>] ? _request_firmware_prepare+0x9b/0x9b
[   11.518151]  [<ffffffff812d2474>] request_firmware_work_func+0xb8/0xd3
[   11.518156]  [<ffffffff81045433>] kthread+0x86/0x8e
[   11.518161]  [<ffffffff814c0d74>] kernel_thread_helper+0x4/0x10
[   11.518165]  [<ffffffff810453ad>] ? kthread_freezable_should_stop+0x3e/0x3e
[   11.518169]  [<ffffffff814c0d70>] ? gs_change+0xb/0xb
[   11.518172] ---[ end trace d3baf572359b220b ]---
[   11.518176] rtlwifi:rtl_fw_cb():<0-0> Can't register mac80211 hw

2. 3.3.4 vanilla works just fine, so this is definitely due to the gentoo patchset
Comment 11 Larry Finger 2012-05-01 18:26:14 UTC
That explains why I did not see the problem. That warning indicates that no band has been set. In most systems, that is set by CRDA. Are you running that? Perhaps the Gentoo patchset is messing that up. In any case, this is not a Linux kernel bug, and needs to be filed with Gentoo.

I will close this one.
Comment 12 Neptune Ning 2012-05-01 22:53:18 UTC
I am still having this issue with vanilla-sources 3.3.4 :( and have no luck with compat-wireless either. Here is the dmesg output when loading the module:

----------
Compat-wireless backport release: compat-wireless-v3.4-rc3-1
Backport based on linux-stable.git v3.4-rc3
cfg80211: Calling CRDA to update world regulatory domain
rtl8192ce: Using firmware rtlwifi/rtl8192cfw.bin
------------[ cut here ]------------
WARNING: at /home/ning/Downloads/compat-wireless-3.4-rc3-1/net/wireless/core.c:580 wiphy_register+0x577/0x620 [cfg80211]()
Hardware name: 42862EC
Modules linked in: rtl8192ce(O+) rtl8192c_common(O) rtlwifi(O) mac80211(O) cfg80211(O) compat(O) iptable_filter ipt_MASQUERADE iptable_nat nf_nat iptable_raw [last unloaded: compat]
Pid: 3005, comm: firmware/rtlwif Tainted: G        W  O 3.3.4 #6
Call Trace:
 [<ffffffff81061a6b>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffffa00c25b7>] ? wiphy_register+0x577/0x620 [cfg80211]
 [<ffffffff81086251>] ? ttwu_do_wakeup+0x11/0x90
 [<ffffffff8108897b>] ? try_to_wake_up+0xcb/0x280
 [<ffffffffa00f94d1>] ? ieee80211_register_hw+0x2e1/0x6d0 [mac80211]
 [<ffffffff812fae20>] ? _request_firmware+0x2f0/0x2f0
 [<ffffffffa014a905>] ? rtl_fw_cb+0x65/0x100 [rtlwifi]
 [<ffffffff812faec6>] ? request_firmware_work_func+0xa6/0xe0
 [<ffffffff812fae20>] ? _request_firmware+0x2f0/0x2f0
 [<ffffffff8107d67e>] ? kthread+0x9e/0xb0
 [<ffffffff8150cbd4>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff8107d5e0>] ? kthread_freezable_should_stop+0x60/0x60
 [<ffffffff8150cbd0>] ? gs_change+0xb/0xb
---[ end trace 0954e0242588b287 ]---

If any information is needed, please let me know
Comment 13 Larry Finger 2012-05-02 00:01:36 UTC
My source does not show a warning at line 580 of net/wireless/core.c. Please open that file and tell me what that line says.
Comment 14 Neptune Ning 2012-05-02 02:25:56 UTC
compat-wireless-3.4-rc3-1/net/wireless/core.c:580  is the line "WARN_ON(1)"
---------
   if (!have_band) {
        WARN_ON(1);
        return -EINVAL;
   }
--------------
Comment 15 Larry Finger 2012-05-02 03:04:12 UTC
Your system is not setting up the wireless band correctly. At this point, the system does not know whether you are using 2.4 or 5 GHz. For rtl8192ce, only 2.4 GHz makes any sense.

I have heard of another user that has this kind of problem, but no one knows how it happens.

What distro are you using? The problem is somewhere in the user-space code. I will try to duplicate the result, but it certainly does not happen with my openSUSE system running NetworkManager under KDE.
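
For reference, the band check discussed in comments 14 and 15 can be modelled in user space roughly as follows. This is only a simplified sketch, not the kernel source: the `model_*` types and `model_wiphy_register()` are hypothetical stand-ins, and the real wiphy_register() performs many more checks. It only illustrates that registration is refused with -EINVAL when the driver has not filled in any band by the time it is called.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_band { MODEL_BAND_2GHZ, MODEL_BAND_5GHZ, MODEL_NUM_BANDS };

struct model_sband {
	int n_channels;
};

struct model_wiphy {
	struct model_sband *bands[MODEL_NUM_BANDS];
};

/* Returns 0 if at least one band is populated, -EINVAL otherwise. */
int model_wiphy_register(const struct model_wiphy *wiphy)
{
	bool have_band = false;
	int band;

	assert(wiphy != NULL);

	for (band = 0; band < MODEL_NUM_BANDS; band++) {
		if (!wiphy->bands[band])
			continue;	/* the driver did not fill in this band */
		have_band = true;
	}

	if (!have_band)
		return -EINVAL;	/* corresponds to the WARN_ON(1) path quoted above */

	return 0;
}
```

With every entry of `bands` left NULL the function returns -EINVAL, which matches the "Can't register mac80211 hw" symptom; once a 2.4 GHz entry is set, it returns 0.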
Comment 16 Neptune Ning 2012-05-02 03:30:11 UTC
I'm using Gentoo on a ThinkPad x220i-42862EC,
linux-firmware-20120219 from http://git.kernel.org/?p=linux/kernel/git/firmware/linux-firmware.git ,
wpa_supplicant-0.7.3-r,
gnome 3 and networkmanager-0.9.4.0-r2.
I can use my wireless with kernel driver <=3.2.16.
Are there any other user-space programs I should mention?
Comment 17 Neptune Ning 2012-05-02 04:09:34 UTC
I failed to compile a compat-wireless driver < 3.3 against the 3.3.4 kernel source.

The driver compiled from the rtl_92ce_92se_92de_linux_mac80211_0005.1230.2011 source package provided by Realtek currently works for me on kernel 3.3.4.
Here is the dmesg output after I modprobe rtl8192ce. It did show bands other than 2.4 GHz ...
=====================
cfg80211: Calling CRDA to update world regulatory domain
ieee80211 phy0: Selected rate control algorithm 'rtl_rc'
rtlwifi: wireless switch is on
cfg80211: World regulatory domain updated:
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211: Calling CRDA for country: EC
cfg80211: Regulatory domain changed to country: EC
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm)
cfg80211:   (5170000 KHz - 5250000 KHz @ 20000 KHz), (300 mBi, 1700 mBm)
cfg80211:   (5250000 KHz - 5330000 KHz @ 20000 KHz), (300 mBi, 2300 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 20000 KHz), (300 mBi, 3000 mBm)
ADDRCONF(NETDEV_UP): wlan0: link is not ready
wlan0: authenticate with 00:26:f2:ee:1f:0f (try 1)
wlan0: authenticated
wlan0: associate with 00:26:f2:ee:1f:0f (try 1)
wlan0: RX AssocResp from 00:26:f2:ee:1f:0f (capab=0x411 status=0 aid=1)
wlan0: associated
wlan0: moving STA 00:26:f2:ee:1f:0f to state 1
wlan0: moving STA 00:26:f2:ee:1f:0f to state 2
ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
cfg80211: Calling CRDA for country: US
cfg80211: Regulatory domain changed to country: US
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2700 mBm)
cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 1700 mBm)
cfg80211:   (5250000 KHz - 5330000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5490000 KHz - 5600000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5650000 KHz - 5710000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 3000 mBm)
wlan0: moving STA 00:26:f2:ee:1f:0f to state 3
wlan0: no IPv6 routers present
=============
Comment 18 Larry Finger 2012-05-03 16:03:18 UTC
Compat-wireless is designed to backport newer versions of the code to older kernels. When you try to compile a version older than the kernel you are using, it is expected that things will not compile; that combination is not expected to work.

The bands that are shown are those allowed by the regulations; they are not from the driver.

A patch showing a possible race condition that might explain your problem was posted yesterday. I will attach it next for you to try.
Comment 19 Larry Finger 2012-05-03 16:04:30 UTC
Created attachment 73166 [details]
Trial patch for incorrect initialization
Comment 20 Larry Finger 2012-05-03 17:05:26 UTC
Do NOT use that patch. It works under very limited circumstances; however, it may prevent your system from booting. I'll submit a new version ASAP.
Comment 21 Neptune Ning 2012-05-03 18:25:16 UTC
I applied it to compat-wireless-3.4-rc3-1, and the kernel crashed when loading the wifi driver. I also failed to apply it against the kernel 3.3.4 source.
Comment 22 Larry Finger 2012-05-03 18:33:37 UTC
That is why I told you not to use it. The patch is defective.
Comment 23 Larry Finger 2012-05-03 18:42:04 UTC
Created attachment 73167 [details]
Trial patch for incorrect initialization - #2

Please try this one. The current thinking is that there is a racy condition that shows up when the driver is loaded with the firmware cached that causes the initialization to be done in the wrong order.

This one does not fail on bootup. I have no idea if it fixes the other problem as I still have not duplicated it.
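
The suspected ordering problem can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the driver code or the patch itself: `model_hw`, `model_init_core()`, `model_register_hw()` and `model_fw_cb()` are hypothetical names, and the "race" is shown deterministically by calling the steps in two different orders. The assumption is that the firmware callback performs the registration, and that registration fails if the band data has not been initialized yet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the per-device state; not the real rtlwifi structs. */
struct model_hw {
	int band_ready;		/* set once the core init step has run */
	int registered;		/* set once registration has succeeded */
};

/* Stand-in for the core init step that fills in the band data. */
void model_init_core(struct model_hw *hw)
{
	assert(hw != NULL);
	hw->band_ready = 1;
}

/* Stand-in for registration: refuses when no band has been set up. */
int model_register_hw(struct model_hw *hw)
{
	assert(hw != NULL);
	if (!hw->band_ready)
		return -EINVAL;
	hw->registered = 1;
	return 0;
}

/*
 * Stand-in for the asynchronous firmware callback. With a cached firmware
 * image it may run almost immediately, possibly before model_init_core().
 */
int model_fw_cb(const void *fw, struct model_hw *hw)
{
	if (!fw)
		return -ENOENT;	/* no firmware: registration is never attempted */
	return model_register_hw(hw);
}
```

Calling `model_fw_cb()` on a freshly zeroed `model_hw` returns -EINVAL, while calling `model_init_core()` first makes the same callback succeed. That would also be consistent with comment 1: the failure is in registration, not in finding the firmware.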
Comment 24 Yill Din 2012-05-03 19:20:05 UTC
(In reply to comment #23)

Works for me with 3.4.0-rc4 vanilla, Gentoo. The same as previous patch, by the way.

---
[   12.612197] rtl8192ce: Using firmware rtlwifi/rtl8192cfw.bin
[   12.748298] rtlwifi: wireless switch is on
---
Comment 25 Neptune Ning 2012-05-04 01:30:21 UTC
(In reply to comment #23)
> Created an attachment (id=73167) [details]
> Trial patch for incorrect initialization - #2
> 
> Please try this one. The current thinking is that there is a racy condition
> that shows up when the driver is loaded with the firmware cached that causes
> the initialization to be done in the wrong order.
> 
> This one does not fail on bootup. I have no idea if it fixes the other
> problem
> as I still have not duplicated it.

This patch works for compat-wireless-3.4-rc3-1, many thanks!
Comment 26 Larry Finger 2012-05-04 03:08:52 UTC
The patch is being pushed for kernel 3.4, and will be backported to 3.3. Thanks for testing.
Comment 27 Florian Mickler 2012-07-01 09:44:41 UTC
A patch referencing this bug report has been merged in Linux v3.4:

commit 574e02abaf816b582685805f0c1150ca9f1f18ee
Author: Larry Finger <Larry.Finger@lwfinger.net>
Date:   Fri May 4 08:27:43 2012 -0500

    rtlwifi: fix for race condition when firmware is cached
Comment 28 Florian Mickler 2012-07-01 09:47:04 UTC
A patch referencing this bug report has been merged in Linux v3.4:

commit 8011652957995914272f398071b70140639185ce
Merge: 568b445 26a5d3c
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed May 16 13:14:52 2012 -0700
Comment 29 Florian Mickler 2012-08-12 09:33:49 UTC
A patch referencing this bug report has been merged in Linux v3.4:

commit d0cad88d071d59169ac25e5c1e3bee0719a4fccf
Merge: 3ab77bf 6037463
Author: David S. Miller <davem@davemloft.net>
Date:   Wed May 16 01:03:54 2012 -0400