Bug 51221 - [BISECTED]Bluetooth connections not working with 0a5c:201e Broadcom Corp. IBM Integrated Bluetooth IV
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Bluetooth
Hardware: All Linux
Importance: P1 normal
Assignee: linux-bluetooth@vger.kernel.org
URL: https://bugs.launchpad.net/ubuntu/+so...
Keywords:
Depends on:
Blocks:
 
Reported: 2012-12-02 16:49 UTC by Vittorio Gambaletta (VittGam)
Modified: 2013-12-31 00:20 UTC

See Also:
Kernel Version: 3.12
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Dmesg from boot to pand to connection stalling (60.30 KB, text/plain)
2012-12-03 12:37 UTC, Vittorio Gambaletta (VittGam)
btmon with the non-working hardware (2.46 KB, text/plain)
2012-12-03 20:07 UTC, Vittorio Gambaletta (VittGam)
btmon with working hardware (2.46 KB, text/plain)
2012-12-03 20:08 UTC, Vittorio Gambaletta (VittGam)

Description Vittorio Gambaletta (VittGam) 2012-12-02 16:49:08 UTC
Bluetooth connections do not work: the rx channel works, but tx stalls after a few seconds, and the connection then times out and is dropped after approximately one minute.

The device is 0a5c:201e Broadcom Corp. IBM Integrated Bluetooth IV, as seen on lsusb; it uses the generic btusb driver.

This bug has also been reported in the following places, but I could not find any follow-ups from kernel developers, so I'm reporting it on this Bugzilla as well.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/950413
http://permalink.gmane.org/gmane.linux.kernel/1279113
http://www.spinics.net/lists/linux-bluetooth/msg23498.html


I've tested the latest mainline kernel, 3.6.8-030608.201211271040, from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6.8-raring/ . Judging by the timestamp this should be the latest; please tell me if I need to test with another version.

I'm using Ubuntu 12.04 on an IBM ThinkPad X41.


Terminal output demonstrating the bug:

# pand --connect 00:11:22:33:44:55 --nodetach
pand[11002]: Bluetooth PAN daemon version 4.98
pand[11002]: Connecting to 00:11:22:33:44:55
pand[11002]: bnep0 connected
# pand -l
bnep0 55:44:33:22:11:00 PANU // the reversed mac address is another bug... ;)
# ifconfig bnep0
bnep0     Link encap:Ethernet  HWaddr 00:aa:bb:cc:dd:ee
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:736 (736.0 B)  TX bytes:0 (0.0 B)

(after some time)

# ifconfig bnep0
bnep0: error fetching interface information: Device not found
# pand -l
# dmesg
(...)
[ 3145.660165] Bluetooth: hci0 link tx timeout
[ 3145.660173] Bluetooth: hci0 killing stalled connection 00:11:22:33:44:55
Comment 1 Alan 2012-12-03 10:04:45 UTC
Can you attach the full dmesg from booting and reproducing this?
Comment 2 Vittorio Gambaletta (VittGam) 2012-12-03 12:22:03 UTC
I'm currently trying to debug this: it seems that the device does not generate the Number of Completed Packets event (0x13 HCI_EV_NUM_COMP_PKTS), so the hdev->acl_cnt never gets incremented in the hci_num_comp_pkts_evt function and the hci_sched_acl function never sends ACL frames.
As of now my dmesg is full of debug messages; I'm now going to reboot with the stock drivers and attach the dmesg here.
Comment 3 Vittorio Gambaletta (VittGam) 2012-12-03 12:37:11 UTC
Created attachment 88261
Dmesg from boot to pand to connection stalling
Comment 4 Vittorio Gambaletta (VittGam) 2012-12-03 12:39:31 UTC
Actually only 10 ACL frames per connection get sent out, because the acl_cnt gets initialized to acl_pkts once, and acl_pkts is 10 for this card. When it reaches 0, it is never incremented again due to the missing 0x13 event, and the tx packet flow stops.
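
The stall described above can be illustrated with a minimal sketch of credit-based ACL flow control. This is a simplified model, not the kernel code: the struct and the `model_*` function names are hypothetical, and only `acl_cnt`, `acl_pkts`, and the 0x13 event come from the comments above.

```c
/* Simplified, hypothetical model of host-side ACL flow control.
 * It only mimics the acl_cnt / acl_pkts bookkeeping discussed above. */
struct hci_dev_model {
    unsigned int acl_pkts; /* controller buffer count reported at init */
    unsigned int acl_cnt;  /* credits currently available to the host */
};

static void model_init(struct hci_dev_model *hdev, unsigned int acl_pkts)
{
    hdev->acl_pkts = acl_pkts;
    hdev->acl_cnt = acl_pkts; /* initialized once, as noted above */
}

/* Models the arrival of a Number of Completed Packets event (0x13):
 * the controller hands credits back to the host. */
static void model_num_comp_pkts(struct hci_dev_model *hdev,
                                unsigned int completed)
{
    hdev->acl_cnt += completed;
    if (hdev->acl_cnt > hdev->acl_pkts)
        hdev->acl_cnt = hdev->acl_pkts; /* never exceed the buffer count */
}

/* Tries to send `queued` frames and returns how many actually went out.
 * If controller_reports is nonzero, each frame is followed by a
 * completion event for one packet; if zero, no event ever arrives. */
static unsigned int model_send(struct hci_dev_model *hdev,
                               unsigned int queued, int controller_reports)
{
    unsigned int sent = 0;

    while (sent < queued && hdev->acl_cnt > 0) {
        hdev->acl_cnt--; /* one credit consumed per ACL frame */
        sent++;
        if (controller_reports)
            model_num_comp_pkts(hdev, 1);
    }
    return sent;
}
```

With `acl_pkts` set to 10 and no completion events, `model_send` stops after 10 frames however many are queued, which matches the behaviour seen with this card. With events, all queued frames go out.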
Comment 5 Vittorio Gambaletta (VittGam) 2012-12-03 15:40:50 UTC
Two other Broadcom based USB bluetooth dongles I have (0a5c:2101 Broadcom Corp. Bluetooth Controller, and 0a5c:200a Broadcom Corp. BCM2035 Bluetooth dongle) work perfectly with the same btusb driver.
I can rule out a problem with the remote devices (the two USB dongles; I've also tried an iPhone 4 and some Samsung phones, with the same results), because with the 2101 and 200a (the USB sticks) everything works perfectly, while with the 201e (the ThinkPad integrated card) nothing works correctly.
I don't think baseband ACKs from the remote host are getting lost, because rx ACL packets and every other type of packet get through normally, so the antenna and radio should be in good condition.
Comment 6 Gustavo Padovan 2012-12-03 18:32:07 UTC
Can you provide us with the output of the btmon monitor tool? You can find this tool in the bluez sources [1]. Build bluez with "bootstrap-configure && make", then run monitor/btmon, reproduce the issue, and post the btmon output here.

[1] http://git.kernel.org/?p=bluetooth/bluez.git;a=summary
Comment 7 Vittorio Gambaletta (VittGam) 2012-12-03 19:45:12 UTC
I'm going to do that.

In the meantime I installed the IBM/Broadcom Bluetooth Stack in a VirtualBox guest running Windows XP, and the connection works there. If you also need captures of the USB traffic from the Windows guest and/or the Linux host, please ask.
Comment 8 Vittorio Gambaletta (VittGam) 2012-12-03 20:07:59 UTC
Created attachment 88331
btmon with the non-working hardware
Comment 9 Vittorio Gambaletta (VittGam) 2012-12-03 20:08:26 UTC
Created attachment 88341
btmon with working hardware
Comment 10 Vittorio Gambaletta (VittGam) 2012-12-03 20:11:19 UTC
Unfortunately I can't see any differences between the btmon output from the ThinkPad module and the output from a working USB dongle (except the timestamps, the hci ID (hci0/hci1) and the handle (6 versus 11))...
Comment 11 Paul Bolle 2013-11-10 11:10:16 UTC
0) Summary: perhaps this is not (only) a kernel problem, but (also) a bluez problem.

1)  I ran into issues pairing the same bluetooth controller (0a5c:201e), also built into an (outdated) ThinkPad X41. But these issues were only seen when I tried to pair with a bluetooth headset. For some reason that headset aborts the pairing process after a few seconds.

2) Pairing with an (equally outdated) PDA did work. Pairing the troublesome headset with another laptop I have lying around (which uses a different Broadcom bluetooth controller) also works.

3) The current non-working setup is Fedora 18, which is using bluez 4.101, and a locally built v3.12 kernel.

4) Quite a bit of testing with Fedora Live disks showed that the last working Live disk was Fedora 14, running a v2.6.35 kernel and bluez 4.71. The first Live disk with issues was Fedora 15, running a v2.6.38 kernel and bluez 4.87.

5) Downgrading bluez to Fedora 14's 4.71 on my Fedora 18 system seems to be enough to get pairing working again. (Downgrading to 4.71 required manually copying libudev.so.0.9.1 into /usr/lib/. Readers are warned!)

6) I'm not sure I've correctly identified the problem. But my current thinking is that changes in bluez between 4.71 and 4.87 trigger issues in how this Broadcom controller and that headset behave during pairing. Does that make sense?
Comment 12 Paul Bolle 2013-11-10 13:41:22 UTC
(In reply to Paul Bolle from comment #11)
> But my current
> thinking is that changes in bluez between 4.71 and 4.87 trigger issues in
> how this Broadcom controller and that headset behave during pairing.

With rpms (built for Fedora 15) downloaded from koji.fedoraproject.org I've been able to narrow this down to 4.79 and 4.80. To be continued.

(Note that the 4.80 error mode is slightly different. There the headset seems to keep waiting for a pairing request (or whatever) forever.)
Comment 13 Paul Bolle 2013-11-11 10:21:39 UTC
(In reply to Paul Bolle from comment #12)
> With rpms (built for Fedora 15) downloaded from koji.fedoraproject.org I've
> been able to narrow this down to 4.79 and 4.80. To be continued.

I've bisected this down to bluez commit b8486f046aadbe0145519c86a087e880fd1d638f ("Move more hciops specific functionality into hciops").

I'm not sure why things change after this commit. But please note that this commit does move code around that is specific to this Broadcom controller. So this bisect isn't obviously bogus. See this annotation snippet:

+static uint8_t get_inquiry_mode(int index)
+{
[...]
+       if (VER(index).manufacturer == 15) {
[...]
+               if (VER(index).hci_rev == 0x09 &&
+                                       VER(index).lmp_subver == 0x6963)
+                       return 1;
[That is this controller!]
[...]
+       }
[...]
+       return 0;
+}
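
For readers without the bluez tree at hand, the quoted special case can be restated as a self-contained sketch. The struct and function names here are hypothetical stand-ins for whatever backs `VER(index)`. Only the one branch quoted above is modelled, and the elided parts of the original function are deliberately not reproduced.

```c
#include <stdint.h>

/* Hypothetical stand-in for the version data behind VER(index). */
struct ctrl_version {
    uint16_t manufacturer; /* Bluetooth company identifier */
    uint16_t hci_rev;
    uint16_t lmp_subver;
};

/* Mirrors only the branch quoted above: returns 1 for the controller
 * identified there (manufacturer 15, hci_rev 0x09, lmp_subver 0x6963)
 * and 0 for everything else. */
static uint8_t inquiry_mode_for(const struct ctrl_version *v)
{
    if (v->manufacturer == 15 &&
        v->hci_rev == 0x09 && v->lmp_subver == 0x6963)
        return 1;

    return 0;
}
```

Company identifier 15 is assigned to Broadcom, which is why a commit that moves this code is a plausible suspect for a regression specific to this controller.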

> (Note that the 4.80 error mode is slightly different. There the headset
> seems to keep waiting for a pairing request (or whatever) forever.)

Up to 4.81 the headset (apparently) stays in pairing mode. Perhaps it never receives a request. By 4.85 that behavior changes. The headset leaves pairing mode if one tries to pair from the laptop using this controller. I haven't yet bothered to further pinpoint the commit that causes that change in behavior.

Anyhow, it is getting likely that this is not a kernel bug. Should I move this to a bluez specific bug tracker?
Comment 14 Paul Bolle 2013-11-11 10:38:53 UTC
Note that the bluez versions mentioned in the three reports linked in comment #0 are:
- 4.98
- 4.96, 4.98 and 4.99
- 4.99

So, at least, these reports don't contradict my findings.
Comment 15 Paul Bolle 2013-12-29 22:56:09 UTC
For the record: problem still reproducible on Fedora 20 (that currently ships the bluez 5.12-2.fc20 package).
Comment 16 Paul Bolle 2013-12-31 00:20:35 UTC
0) The Kernel Version was bumped to 3.12. That's fine with me, but I should add that I reproduced this on Fedora 20 while running v3.13-rc5.

1) I can't get the last working version of bluez (4.79+) to work on Fedora 20. The Gnome 3 GUI tools require bluez-5. And trying to get pairing working using CLI tools proved beyond me. (I think there were dbus related issues that turned out to be really stubborn.) If someone could talk me through pairing using 4.79-era CLI tools that would be rather nice.

2) For the record: I did set up a Fedora 19 live image, and forced it to use the last working version of bluez (ie, 4.79+). That at least made pairing possible. (No sound though, but I decided not to investigate that issue.)

3) Another note: all code special casing this controller was removed in bluez-5.
