Bug 9237

Summary: b43 broadcom driver does not work in ad-hoc mode
Product: Drivers Reporter: Christian Casteyde (casteyde.christian)
Component: network-wireless Assignee: Michael Buesch (mb)
Status: CLOSED OBSOLETE    
Severity: normal CC: alan, oakad
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.24-rc1 Subsystem:
Regression: No Bisected commit-id:
Attachments: crash of zd1201
circular lock warning

Description Christian Casteyde 2007-10-27 09:51:06 UTC
Most recent kernel where this bug did not occur:
N/A (b43 is new to 2.6.24-rc1)

Distribution:
Bluewhite64 12

Hardware Environment:
laptop with broadcom 43xx chip and Wifi USB key with zd1201 chip

Software Environment:
iwconfig

Problem Description:
It is impossible to make a Wifi connection between a zd1201 Wifi USB key and the broadcom card of a laptop using b43.
The Wifi cell is seen from both ends, but the connection cannot be established with or without a key (zd1201 only supports WEP as far as I know).
When I ping the zd from the bcm:
ifconfig on bcm tells there are Tx packets but 0 rx packets
ifconfig on zd tells there are Rx packets (same as tx from bcm), but 0 rx packets.
So it seems the zd1201 driver doesn't send echo reply packets.

Please note that I never managed to get a connection with bcm43xx either, but I managed once to get it with ndiswrapper. However, by that time, I used the zd1201 driver with ieee802.11 lib (not mac80211). I guess the new zd1201 driver is broken on mac80211.

Steps to reproduce:
Comment 1 Christian Casteyde 2007-10-27 09:55:26 UTC
Sorry, in previous comment, the zd driver has Rx packets but 0 ***tx***. It does not send anything.
Comment 2 Christian Casteyde 2007-10-27 10:16:44 UTC
Since it seems to be a zd driver problem, I tried with the kernel 2.6.23.1 and with ieee802.11. It doesn't manage to send packets either.
It used to work (the last time I tried this driver was a year or so ago, and it worked, but I don't remember which kernel I used at that time).
Comment 3 Christian Casteyde 2007-10-27 11:13:45 UTC
Also tried with ieee802.11 under 2.6.24-rc1, it receives packets but does not send any. The driver is not usable anymore.
Comment 4 Johannes Berg 2007-11-13 05:40:52 UTC
Nothing in the zd1201 driver should've changed since it's not a mac80211 driver or anything. Isn't it even full-mac? Can you send to/from the zd1201 from the AP or another networked host rather than b43?
Comment 5 Christian Casteyde 2007-11-13 11:15:12 UTC
No, it simply does not work, even with other devices.
I've also tested with an rt25xx based device, in ad-hoc mode and without encryption, and I didn't manage to get any packet sent or received.
However, the zd1201 driver indeed crashed when I tried to ping it from the rtxx device, as the console photo in the next attachment shows. I already got this crash last time, but now I have my camera... So the behaviour is really the same with both "other ends": zd crashes when it receives something, and cannot send anything.
On the other hand, scanning works, and the devices manage to join the Wifi cell (sometimes, the same essid creates 2 cells, but if I start them in a row they associate).

(note that rt25xx is not bullet proof either, since when I unplugged it it also crashed, see 9368).
Comment 6 Christian Casteyde 2007-11-13 11:19:44 UTC
Created attachment 13534 [details]
crash of zd1201
Comment 7 Christian Casteyde 2007-11-13 11:21:39 UTC
More info: this crash occurred with 2.6.23.1-hrt3.
So this is not the new Wifi stack. And indeed I think zd1201 didn't use any stack the last time I looked at the code.
Comment 8 John W. Linville 2007-11-13 12:17:50 UTC
Please be clear...does the zd1201 device work on 2.6.24-rc2?  If not, what was the latest kernel where it did work?
Comment 9 Christian Casteyde 2007-11-13 12:28:14 UTC
No, it works neither with 2.6.24-rc2 nor with 2.6.23.
I hadn't used that device for a while, and cannot remember what was the last version that used to work. Maybe something like 2.6.16 or 2.6.18, that was more than one year ago. It did work when the driver was added to mainline.
Comment 10 John W. Linville 2007-11-13 12:48:04 UTC
I do not have a zd1201 device, but I recently got reports from a Fedora user that the device is working with a recent rawhide kernel.  That would be either 2.6.23 or 2.6.24-rc1, so the current driver sources work for someone.

Are you certain that the device itself still functions?

If so, is there any way you can be more precise about which kernels worked for you and which do not?
Comment 11 Christian Casteyde 2007-11-13 12:54:32 UTC
At least it can scan APs, create a cell, and connect to an existing one. But I cannot guarantee the device still works. I didn't stress it, so it may be faulty. Can anybody with a zd1201 based device confirm it works?

To answer for the kernel, I would have to bisect and test with older ones.
That will take me some time (not even sure my distro/udev will boot with older kernels in fact). I will try that with a few kernels.

Still, there is the crash...
Comment 12 John W. Linville 2007-11-13 13:13:59 UTC
The crash you reported on 2.6.23.1-hrt3 (whatever that is) looks like something that is already fixed.  If you can recreate that crash on 2.6.24-rc2 I'll be more interested in it.

Your bug reports are very confusing, with lots of cross referencing between them as if they are all the same issue.  Please focus on the exact problem in each bug report individually.  I want to be helpful, but that is difficult if any given problem in a single bug report translates to "it doesn't work" in all of the bug reports.

So I am confused..."it can scan APs, create a cell, and connect to an existing one", but you say it doesn't work?  How do you define "work"?
Comment 13 Christian Casteyde 2007-11-13 13:46:02 UTC
I'm sorry for being confusing (I'm confused myself). The fact is I'm trying to connect with two USB wifi keys, and I have **many** problems simultaneously (that is: 2-3 crashes in b43, one in rt24xx, one in zd): so I report a bug for each, but I tend to link them because I'm not sure they are unrelated. In fact, they can reside in common code and not in drivers, so they may be related.
If you want me not to mention that they happen while doing the same tests with different drivers, I will focus on each one independently.

For the "it doesn't work", I try to give as much as I can, that is what I'm trying to do (basically build a wireless network), what I have (in general, crashes when doing pings, no connectivity, crash at unload, and often all at the same time - sorry for being harsh with the wireless stability).

Now, as far as zd is concerned, I'll try to restate everything:
- it indeed scans, creates cells and connects to the cell (that is: iwconfig tells the if is in the cell, and the cell/AP IDs are the same on both computers), so it does not seem to be broken (ie: what I call **physically** working, sorry for not being precise in last mail);
- however, no IP/ICMP packets are transferred (of course, I removed all iptables rules, and so on), that is, it **logically** doesn't work (ie, the goal of this report, and now I'm quite sure this is specific to zd1201);
- the zd device receives ICMP packets from b43 for instance, and does not send any reply (I check ifconfig TX/RX counts for that);
- after a while (say 10s) of receiving ICMP packets from the other computer, it crashes. The console photo is all I can get from that (at least there is a call stack).
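
The TX/RX check described above can be scripted instead of read off ifconfig by hand. What follows is only a minimal sketch, assuming the kernel exposes per-interface packet counters under /sys/class/net/<iface>/statistics; the function names and the sysfs_root parameter are hypothetical helpers, not part of any driver or tool mentioned in this report.

```python
from pathlib import Path


def read_packet_counters(iface, sysfs_root="/sys/class/net"):
    """Return (rx_packets, tx_packets) for one network interface.

    Assumes the sysfs layout <sysfs_root>/<iface>/statistics/{rx,tx}_packets.
    """
    stats = Path(sysfs_root) / iface / "statistics"
    rx = int((stats / "rx_packets").read_text().strip())
    tx = int((stats / "tx_packets").read_text().strip())
    return rx, tx


def counter_delta(before, after):
    """Return how many packets were received and sent between two samples."""
    return after[0] - before[0], after[1] - before[1]
```

Sampling the counters before and after a ping run and taking the delta gives the same information as comparing two ifconfig outputs. Note that these counters include every packet on the interface, not only ICMP.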

Still for the zd problem, regarding the kernel bisection:
2.6.20 doesn't work but didn't crash (for 2 min, but beware, that doesn't imply that it would never crash). 2.6.18 doesn't compile on my 64 bits system - I won't be able to check it until this WE. I'm still compiling 2.6.19. Just a reboot now and I'll tell the result.
Comment 14 Christian Casteyde 2007-11-13 14:24:54 UTC
Well, I didn't manage to get ICMP replies with 2.6.19 either.
However, I didn't reproduce the crash with 2.6.24-rc2, so at least this thing seems to be fixed.

But I'm definitely missing something here, since I didn't manage to build my network between my laptop (with b43) and my rt25xx key, so the zd crash apart, I may be doing something wrong. I'll investigate that, maybe the zd driver is not so guilty after all, don't waste your time with this for now. I'll post when I'm sure all is OK in my way of doing things there.
Comment 15 Christian Casteyde 2007-11-15 08:27:21 UTC
OK, I've understood why I couldn't manage to get zd1201 working in Ad-Hoc mode with an rt25xx USB key: rt25xx doesn't support ad-hoc mode (as shown in a recent commit in the driver). Moreover, the rt25xx doesn't seem to be usable itself: it doesn't associate in WPA mode with my AP, so I do not trust it. (adding another bug on rt28xx to trace the problems I got).

So I finally cannot answer to #4, since the only other device I have to do tests is b43 wireless card of my laptop (but note b43 works and I trust it now).

I also tried to connect my laptop with ndiswrapper+bcmwl5 in ad-hoc mode with the zd1201 key+linux zd1201 driver (using ndiswrapper to eliminate any b43 driver potential problem in case it wouldn't handle ad-hoc mode, like the rt28xx driver). It failed too (same behaviour: both cards join the cell, but absolutely no IP/ICMP connectivity).

To go further, I'll have to test zd1201 in managed mode with an open/WEP AP to see if the device works physically, and to get an AP configured in ad-hoc mode - if possible - to check ad-hoc mode of zd1201. But for now I will concentrate on rt25xx driver until I can get it associated with my AP, because zd1201 doesn't support WPA now and is a quite old device now.
Comment 16 Christian Casteyde 2007-11-17 06:39:34 UTC
I managed to connect to a D-Link DSL-G604T AP with the zd1201 using WEP, so the zd1201 device and driver indeed work in managed mode. The hardware is definitely **not** broken.

So, either ad-hoc mode is broken in zd1201, or it is broken in all the other drivers I tried (rt28xx is indeed unable to do ad-hoc connections, I suspect b43 not being able to do it either, but I remember that ndiswrapper could connect in ad-hoc mode to the zd1201 before there was any native broadcom driver in the kernel).

I'm unable to say which driver is broken, since I couldn't find any that connects with the zd device. I would be inclined to think the zd1201 driver is OK now, and that it is all other drivers that only support managed mode, if ndiswrapper hadn't failed too.

I also wonder if there is a possibility that the WEP keys I'm using are interpreted differently by the different drivers (ndiswrapper, zd1201 essentially), but it doesn't work with no encryption either.
Comment 17 Christian Casteyde 2007-11-17 06:44:04 UTC
In comment #16, I managed to connect to the AP using a 32 bits kernel on another computer (slackware 32 instead of bluewhite64), and on USB1 port only instead of USB2. I don't think that matters, but that was the environment of this test.
Comment 18 John W. Linville 2007-11-20 08:52:15 UTC
Could you try some older kernels to narrow when the zd1201 ad-hoc support might have been broken?  There have been very few patches to this driver for a long time...
Comment 19 Christian Casteyde 2007-11-24 06:19:55 UTC
OK, I've just managed to connect to the zd1201 key on a 32 bit slackware with kernel 2.6.24-rc3 (2.6.23 and previous crash in the zd driver on RX), using ndiswrapper+2.6.23 on my laptop for the broadcom card, both with WEP and no encryption. The native broadcom drivers do not work on the laptop side.

So, current status is:
a. zd1201 driver works in 2.6.24-rc3 at least in 32 bits mode on USB 1.1 host;
b. I failed to connect in comment #5 to the zd1201 driver on a 64 bits kernel and USB 2.0, with the same laptop (broadcom+ndiswrapper, 64bits). I have to redo this test to be sure that it fails, but it seems zd1201 is not 64 bits compatible or there is a problem with USB 2.0 (the only difference with the computer that works);
c. rtxx, b43 and bcm43xx drivers are obviously not able to connect to ad-hoc networks. Now, rtxx explicitly forbids it, but b43 still pretends to be able to do that, erroneously.

So this bug may be shared between zd1201 (not working in 64bits mode or on USB 2.0 hosts, will confirm that tomorrow) and b43 (not able to connect an ad-hoc network). For b43, the bug is confirmed at least: it can send packets (I see them on the zd1201 interface on the computer where it works), but it does not receive any (stats shows they are sent on the zd1201 end).
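
The reasoning used here (compare the packet counters on both ends to find where traffic stops) can be written down as a small decision procedure. This is only an illustrative sketch: diagnose_link is a hypothetical helper, each argument is assumed to be an (rx, tx) pair of packet-count deltas measured during one ping run, and the counters are assumed to contain little traffic other than the ping itself.

```python
def diagnose_link(pinger_delta, target_delta):
    """Guess where a ping exchange breaks from (rx, tx) packet-count deltas.

    pinger_delta: (rx, tx) delta on the machine running ping.
    target_delta: (rx, tx) delta on the machine being pinged.
    """
    pinger_rx, pinger_tx = pinger_delta
    target_rx, target_tx = target_delta

    if pinger_tx == 0:
        return "pinger sent nothing"
    if target_rx == 0:
        return "requests never reached the target"
    if target_tx == 0:
        return "target received requests but sent no replies"
    if pinger_rx == 0:
        return "replies were sent but the pinger did not receive them"
    return "traffic flows in both directions"
```

With b43 as the pinger and zd1201 as the target, the situation described above (b43 sends, zd1201 receives and replies, b43 receives nothing) falls into the fourth case, which points at the receive path on the b43 side.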

If I manage to connect to zd1201 running it on my 64 bits kernel, this bug would be a b43 only bug indeed. I'll retest that configuration tomorrow.
Comment 20 Christian Casteyde 2007-11-24 14:01:43 UTC
I've retested ad-hoc on zd1201 with the key on USB2 on a 64 bit kernel, it works if the other end is ndiswrapper.

So zd1201 driver is OK, and the problem was simply this:
"ad-hoc mode is broken in b43."

Maybe b43 does not support ad-hoc mode at all, but it nearly does (it seems it's only not receiving packets, indeed it manages to send them).
At least this bug is not a regression, and it may be an evolution if b43 was not designed to work in ad-hoc mode (I do not consider bcm43xx to be interesting to fix for that since b43 is better now).
Comment 21 Christian Casteyde 2007-11-24 14:03:00 UTC
Changing the bug description accordingly (the bug description in comment #1 was finally right, except the faulty drivers were both b43 and rt25xx and not zd1201). 
Comment 22 John W. Linville 2008-07-02 07:15:09 UTC
Are you still having these issues with current kernels?
Comment 23 Christian Casteyde 2008-07-04 13:27:17 UTC
Yes. It still works neither with 2.6.25.4 nor with 2.6.26-rc8.

Moreover, with 2.6.26-rc8 compiled with debug info, I got another circular lock dependency warning, you can see it in the appended dmesg log.

What I did to get this lock problem was simply trying to connect to a "MonRezo2" WEP ad-hoc mode wireless network with the following wrong command:
iwconfig essid MonRezo mode ad-hoc channel 12 key 123456

with an existing "MonRezo" WPA managed mode router in the area (the previous command should have used MonRezo2 instead of MonRezo).
Comment 24 Christian Casteyde 2008-07-04 13:31:46 UTC
Created attachment 16737 [details]
circular lock warning

well, seems there was udev around there when the wireless was up.
I forgot to tell that after the iwconfig command, I issue:

ifconfig eth1 up
to get the link, that was what caused the warning.
Comment 25 Alex Dubov 2009-12-08 13:46:41 UTC
Recently I tried to establish an ad-hoc network between b43 (2.6.31) and atheros (win xp) based cards. I was able to establish the network both in unsecured and wep mode, however, the connection survived for only a couple of minutes at most (machines standing near each other and no disturbance nearby). After some short time, the connection seems to abort, whereupon windows picks a new bssid for the ad-hoc network. I have reason to believe that the problem lies with the b43, even though there are no indications in the system log.
Comment 26 Christian Casteyde 2011-09-01 16:55:44 UTC
As far as I'm concerned, I won't be able to test anything on b43 anymore, since my laptop passed away and the new one doesn't have this kind of hardware.
Also I don't know the status of this bug.