Bug 20822 - Poor wireless scan results with WUSB54GS (BCM4320a)
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: All Linux
Importance: P1 normal
Assignee: Jussi Kivilinna
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-10-20 22:43 UTC by Pitxyoki
Modified: 2010-12-22 15:12 UTC

See Also:
Kernel Version: 2.6.36-rc7
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Poller scanning patch (3.38 KB, patch)
2010-10-22 12:46 UTC, Jussi Kivilinna
Alternative scan mode for BCM4320a (5.97 KB, patch)
2010-10-23 10:06 UTC, Jussi Kivilinna
rndis_wlan: scanning, workaround device returning incorrect bssid-list item count. (2.66 KB, patch)
2010-12-18 14:06 UTC, Jussi Kivilinna
[2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count. (1.71 KB, patch)
2010-12-19 11:07 UTC, Jussi Kivilinna
[v2][2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count. (2.75 KB, patch)
2010-12-20 08:50 UTC, Jussi Kivilinna
Kernel Oops (317.30 KB, image/jpeg)
2010-12-20 16:38 UTC, Pitxyoki
2.6.36.2 config file (117.81 KB, application/octet-stream)
2010-12-20 18:50 UTC, Pitxyoki
2.6.36.2 patched rndis_wlan.ko (32.26 KB, application/octet-stream)
2010-12-20 18:52 UTC, Pitxyoki
[v3][2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count. (3.35 KB, patch)
2010-12-20 21:05 UTC, Jussi Kivilinna
Kernel Oops with third patch (161.50 KB, image/jpeg)
2010-12-20 23:02 UTC, Pitxyoki
[v4][2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count. (3.58 KB, patch)
2010-12-21 07:17 UTC, Jussi Kivilinna
Warning caught under 2.6.36.2+v4-patch (2.90 KB, text/plain)
2010-12-22 11:23 UTC, Pitxyoki

Description Pitxyoki 2010-10-20 22:43:10 UTC
When using iwlist scan, this device shows very few results compared to what would be expected.

If I do a scan right after booting (to ensure there are no caching or other optimizations taking place), I see only two or three SSIDs. If I keep retrying, it can take a long time until my network's beacon is detected. NetworkManager, for example, usually takes 20 to 30 minutes until it finally detects the correct SSID and is then able to establish the connection.

Even without previously detecting the beacon, I can use iwconfig to set up the device and connect to my network (see bug#20152 ). Once it is established, there are no problems with the connection: it's fast and stable, so I can't say that the long time to detect the beacon is related to a long distance or weak signal. When the connection is established, subsequent scans always show my network's SSID (out of a total of three or four SSIDs).

Also, my laptop is usually placed about one meter from this device (it has an Intel Wifi Link 5100 device -- iwlagn driver). After a cold boot it shows 15~20 networks right on the first scan, including my own network. This makes me think that the BCM4320a driver is really not detecting all these beacons well and that it's not just a problem with my setup.

Best regards,
Luís Picciochi
Comment 1 Jussi Kivilinna 2010-10-22 12:45:20 UTC
I have gone through the NDIS specs on scanning, and cannot really see a good way to improve scanning on the device. There is no interface to change scanning parameters, just initiate a scan and get the results. I made one patch to read scanning results in the poller (read results every second) but that doesn't seem to improve scanning on BCM4320b. When testing I noticed that BCM4320b is poorer at scanning than the zd1211-based device I have.

I don't see any way to fix or work around this problem; it's (most likely) a firmware bug. Can you test scanning on another OS or with ndiswrapper to see if the driver makes any difference?

-Jussi

ps. I'll attach the passive/poller scanning patch, although the probability of it working is low.
Comment 2 Jussi Kivilinna 2010-10-22 12:46:04 UTC
Created attachment 34402 [details]
Poller scanning patch
Comment 3 Pitxyoki 2010-10-22 18:48:06 UTC
Another interesting fact that I forgot to mention is that when I reboot the computer, most of the time (but not always) the beacon is detected much faster. It looks like -- though I have no way to prove it -- the adapter detects the network faster when it is hot (and it heats up considerably). If I shut down the computer, wait a while and then turn it on, the device always takes a long time once again.


With your latest patch it still took more than 15 minutes for n-m to detect the beacon. At syslog there were always only two or three networks shown at each reading.

I verified the scan behaviour under Windows XP: it seems even worse than what we have here.
- With Linksys' "Wireless Network Monitor", I couldn't see any scan results. Only after I used the "manual configuration" feature I could see my ESSID there (even after insisting for a while, I couldn't see any other network).
- With Windows' "Wireless Zero Configuration", I saw two or three networks. Coincidentally (?), mine appeared quite quickly.

Although the throughput and connection stability are much better with rndis_wlan than on Windows, I guess that means we're stuck here.



For the record, and for others who may end up in the same situation as me: in order to keep a fast boot and let n-m manage the connection, I edited the networking init script so that the device is configured and dhclient is run in the background during boot.

This makes the device see the beacon and associate with the network. Then n-m just picks/resets/whatevers the connection even before a working gnome session is ready. No more long waits for the beacon to be detected and no need for manually configuring the device to see the beacon every time the computer is started. I suppose it will have to be like this at least until n-m is able to automatically connect to networks before it sees their beacon.

Best regards,
Pitxyoki
Comment 4 Jussi Kivilinna 2010-10-23 10:06:20 UTC
Created attachment 34502 [details]
Alternative scan mode for BCM4320a

One last try...

This patch changes scanning with BCM4320a so that a new scan is launched and the old results are read every second, when the radio is on but the device is not associated.

This mode worked on BCM4320b without problems, but on the other hand didn't yield any more scan results.
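
For readers following the thread, the polling behaviour described above can be illustrated roughly as follows. This is only a sketch in Python, not the attached patch: the `Device` class, `poll_tick`, and all field names are hypothetical stand-ins, and the real driver works with kernel work queues and RNDIS OIDs rather than anything shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical stand-in for the adapter state (not the rndis_wlan structures)."""
    radio_on: bool = True
    associated: bool = False
    pending: list = field(default_factory=list)  # results of the scan in flight
    scans_started: int = 0

    def read_scan_results(self):
        # Hand back whatever the previous scan collected and clear it.
        results, self.pending = self.pending, []
        return results

    def start_scan(self):
        # Stand-in for asking the device to begin a new scan.
        self.scans_started += 1


def poll_tick(dev, seen):
    """One poller pass, assumed to run about once per second.

    Only acts while the radio is on and the device is not associated:
    reads the results of the previous scan, then launches the next one.
    Returns True if a new scan was started.
    """
    if not dev.radio_on or dev.associated:
        return False
    for bssid in dev.read_scan_results():
        seen.add(bssid)
    dev.start_scan()
    return True
```

Accumulating results in `seen` across ticks is an assumption of this sketch; it shows why repeated short scans can surface more networks over time than a single scan does.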
Comment 5 Pitxyoki 2010-10-23 12:07:31 UTC
Ok, this is much better now. Before connecting, now I see five to ten (!) SSIDs. Most of the time I see seven or eight.
It still looks like a matter of luck whether I get my network's beacon sooner or later, but I'd say that this pretty much increases the likelihood of getting connected sooner.

Give me a week or two of normal use to tell if this is definitely better or not.
Comment 6 Jussi Kivilinna 2010-11-02 15:12:13 UTC
So, any results yet?
Comment 7 Pitxyoki 2010-11-06 22:38:52 UTC
Sorry, I was away for the past week. I'm back now.

Yes, the beacon is detected much faster. The fastest case was just 22 seconds, the slowest 2m25 (!). Most of the time, it took from about 1m to about 1m30s. These times are mostly hidden by the boot process, so when I have the desktop available, it usually takes just a few seconds until I get a working connection.

If you don't see any other ways of improving this, I'd say this is already very good.


Thanks,
Pitxyoki
Comment 8 Jussi Kivilinna 2010-12-18 14:06:51 UTC
Created attachment 40672 [details]
rndis_wlan: scanning, workaround device returning incorrect bssid-list item count.

I think I found the real reason for these bad scan results. I started getting the same problems myself too. The device returns too small a count of scan results. Now looking at your dmesgs, I see quite large scan/bssid-list buffers with a small BSSID count. This new patch works around the bug.
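
The idea of not trusting the reported item count can be sketched as below. This is an illustration only, not the patch itself: it assumes a simplified buffer layout (a 4-byte little-endian item count, followed by items that each begin with a 4-byte little-endian length and a 6-byte MAC address), and `parse_bssid_list` is a hypothetical name.

```python
import struct

HEADER_SIZE = 4      # assumed: 32-bit little-endian item count
MIN_ITEM_SIZE = 10   # assumed: 32-bit length field + 6-byte MAC address


def parse_bssid_list(buf):
    """Walk a scan buffer by item length instead of the reported count.

    Returns (reported_count, macs). The loop keeps going until the
    buffer is exhausted, so items beyond the reported count are still
    found, and it stops before reading past the end of the buffer.
    """
    if len(buf) < HEADER_SIZE:
        return 0, []
    (reported,) = struct.unpack_from("<I", buf, 0)
    macs = []
    offset = HEADER_SIZE
    while offset + MIN_ITEM_SIZE <= len(buf):
        (length,) = struct.unpack_from("<I", buf, offset)
        # Reject items that are too short or that overrun the buffer.
        if length < MIN_ITEM_SIZE or offset + length > len(buf):
            break
        mac = buf[offset + 4:offset + 10]
        macs.append(":".join(f"{b:02x}" for b in mac))
        offset += length
    return reported, macs
```

With a buffer that reports one item but actually holds three, this returns all three MAC addresses alongside the reported count of 1, which matches the symptom of large buffers with a small BSSID count.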
Comment 9 Pitxyoki 2010-12-18 22:58:43 UTC
Hi,

What kernel version does this apply to?

Should I remove the patch for the "alternative scan mode", or should I keep both patches?


Regards,
Pitxyoki
Comment 10 Jussi Kivilinna 2010-12-19 11:06:08 UTC
It applies to wireless-testing. I'll post a new one for plain 2.6.36.

The more I think about this, the more I think this new patch fixes both the poor-scanning bug and the WARNING bug, as they both had to do with the bssid-list. So please test this patch without those two previous patches.
Comment 11 Jussi Kivilinna 2010-12-19 11:07:13 UTC
Created attachment 40822 [details]
[2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count.
Comment 12 Pitxyoki 2010-12-20 01:18:11 UTC
This made my kernel crash very badly. This is what I did and saw:

1. Booted the machine
2. Connected the device
3. Looked at network-manager. Two networks there. Not too much...
4. Issued iwlist scan. Two networks appeared. Mine is not there yet.

5. Issued iwlist scan again. A few networks appeared again. As I was going to scroll on this terminal, X suddenly switched to a TTY with a stack trace and with the last line saying something like "the machine must be restarted". The keyboard was unresponsive. I had to do a cold shutdown.

I can't remember exactly what the stack trace contents were, but I saw some rndis_wlan-related stuff there. I'm reluctant to retry this as fsck detected some "lost inodes" on two partitions after restarting to an older kernel. Unfortunately the stack trace didn't get saved to the syslog...
Comment 13 Jussi Kivilinna 2010-12-20 08:02:12 UTC
Comment on attachment 40822 [details]
[2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count.

Causes oops. (last bssid->length read outside buffer?)
Comment 14 Jussi Kivilinna 2010-12-20 08:03:22 UTC
Comment on attachment 40672 [details]
rndis_wlan: scanning, workaround device returning incorrect bssid-list item count.

Causes oops? (last bssid->length read outside buffer?)
Comment 15 Jussi Kivilinna 2010-12-20 08:14:03 UTC
Sorry for that. There are some obvious errors in the bssid-list loop that the patch triggers. And now it appears the reason I don't have crashes is that the device appends an extra 4 zero bytes at the end of the buffer for me.
Comment 16 Jussi Kivilinna 2010-12-20 08:50:09 UTC
Created attachment 40992 [details]
[v2][2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count. 

[v2] Fixed accesses behind buffer.
Comment 17 Jussi Kivilinna 2010-12-20 08:51:15 UTC
Well, there were no zero bytes on my device after all; I misread the debug output. Again, sorry about that.
Comment 18 Pitxyoki 2010-12-20 16:38:07 UTC
Created attachment 41032 [details]
Kernel Oops


Same thing as before: scanning was only giving two or three scan results.

After a while, the attached kernel Oops appeared.
Comment 19 Jussi Kivilinna 2010-12-20 18:13:41 UTC
Hmm. I had 2.6.36.2+patch running for 10 hours doing scanning in a loop. Can you post your kernel config and rndis_wlan.ko module?
Comment 20 Pitxyoki 2010-12-20 18:50:46 UTC
Created attachment 41042 [details]
2.6.36.2 config file
Comment 21 Pitxyoki 2010-12-20 18:52:23 UTC
Created attachment 41052 [details]
2.6.36.2 patched rndis_wlan.ko
Comment 22 Pitxyoki 2010-12-20 18:56:25 UTC
Sure.
I notice there are some "preempt" lines on the Oops. I compiled this kernel with preemption enabled. Can this be somehow related?


MD5 sums of patched .c source file and resulting .ko:

luis@C-5:/usr/src/linux-2.6.36.2/drivers/net/wireless$ md5sum rndis_wlan.{c,ko}
577b01c3b6e3ff8006036bbe0446de55  rndis_wlan.c
0718d8e74eccfede44c286b6198eeb19  rndis_wlan.ko
Comment 23 Jussi Kivilinna 2010-12-20 20:33:18 UTC
I checked rndis_wlan.ko and the point of the crash, and it seemed like a place that shouldn't have crashed with v2.

Then I noticed a not-so-obvious bug in the buffer-resizing part and finally managed to create a crash on my end too.

I'll post v3-patch soon.
Comment 24 Jussi Kivilinna 2010-12-20 21:05:30 UTC
Created attachment 41072 [details]
[v3][2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count.
Comment 25 Pitxyoki 2010-12-20 23:02:04 UTC
Created attachment 41132 [details]
Kernel Oops with third patch


Practically the same behaviour. The Oops looks the same to me too.
This time my network was detected quickly and network manager was already trying to set up the connection... And then, poof.
Comment 26 Jussi Kivilinna 2010-12-21 07:16:49 UTC
Ok, this is the last try... I was pretty sure the previous versions would have fixed the crashing, and I'm not sure about this one.

It has stricter buffer range checks, in case the device returns negative offsets/lengths (about the last thing I could think of that could still cause this).
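
As a rough illustration of what such a range check might look like (a sketch under assumptions, not the v4 patch): the 32-bit length field is reinterpreted as signed so that a "negative" value from the device is rejected, and the remaining space is compared by subtraction rather than by adding offset and length. The name `item_fits` is hypothetical.

```python
import struct


def item_fits(buf_len, offset, raw_length):
    """Return True if an item at `offset` with the given 32-bit length
    field lies entirely inside a buffer of `buf_len` bytes.

    `raw_length` is the unsigned 32-bit value as read from the device.
    """
    # Reinterpret the raw 32-bit value as a signed integer, so that
    # something like 0xFFFFFFF0 is seen as -16 instead of a huge length.
    (length,) = struct.unpack("<i", struct.pack("<I", raw_length & 0xFFFFFFFF))
    if offset < 0 or length <= 0:
        return False
    # Compare against the space that is left, so the check itself
    # cannot be defeated by offset + length wrapping around.
    return offset <= buf_len and length <= buf_len - offset
```

For example, `item_fits(64, 4, 16)` is accepted, while `item_fits(64, 4, 0xFFFFFFF0)` and `item_fits(64, 60, 16)` are both rejected.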
Comment 27 Jussi Kivilinna 2010-12-21 07:17:42 UTC
Created attachment 41152 [details]
[v4][2.6.36] rndis_wlan: scanning, workaround device returning incorrect bssid-list item count.
Comment 28 Pitxyoki 2010-12-21 20:16:59 UTC
Ok. This seems to work. I'm sending this with the module patched with that last patch.
However, scan performance is very bad again. After one hour of leaving the computer turned on, network-manager hadn't discovered my network yet.

After an hour, I started issuing 'iwlist scan's inside a loop (while true ; do iwlist  wlan0 scan; done). The network was detected and I got connected quite quickly after that.
Comment 29 Jussi Kivilinna 2010-12-21 20:29:16 UTC
Ok, thanks. This patch fixes a scanning issue that I have now started having and thought could have been the same issue.

Anyway, I'm glad you tested it through; the first version was already going upstream, but I pulled it back after the crashes you had. The two earlier patches are already there and should get to 2.6.38.
Comment 30 Pitxyoki 2010-12-22 11:23:03 UTC
Created attachment 41292 [details]
Warning caught under 2.6.36.2+v4-patch

With this patch, when not using network-manager, the warning reported at bug #20152 is also present. See attachment.
Comment 31 Jussi Kivilinna 2010-12-22 15:12:45 UTC
Ok, so this newer patch does not help BCM4320a at all. Instead it works around a hardware bug that I have with BCM4320b, although the symptoms were exactly the same. Funny piece of ...hardware we have here ;)

I'm marking this bug as resolved now; the 'alternative scan mode' patch is on its way to 2.6.38 and at least helps a bit.
