Bug 202515

Summary: Bluetooth LE Extended Connect returning Command Disallowed
Product: Drivers Reporter: Mathias Tillman (master.homer)
Component: BluetoothAssignee: linux-bluetooth (linux-bluetooth)
Severity: high CC: emil.lenngren, marcel, szymon.janc
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: v4.19 Tree: Mainline
Regression: No
Attachments: Non-working btmon log
Working btmon log
Btmon extended working log
Check if ext scanning and ext params are supported independently

Description Mathias Tillman 2019-02-06 00:01:03 UTC
Ever since Linux v4.19 I haven't been able to connect to my mouse via bluetooth LE, using an Intel 9260 (supporting AC Wifi and Bluetooth 5) with a pci-e adaptor.

Basically, the device doesn't pair at all - dmesg just says:
Bluetooth: hci0: request failed to create LE connection: status 0x0c.

I've been able to narrow this down to commit 4d94f95d30c8fbfe86068e9abed110974d697cf5, which introduced extended LE Connect if supported by the card.

To get some more info, I ran hcidump, and the relevant output says:
< HCI Command: Unknown (0x08|0x0043) plen 26
> HCI Event: Command Status (0x0f) plen 4
    Unknown (0x08|0x0043) status 0x0c ncmd 1
    Error: Command Disallowed

On a working system (using v4.18), with the legacy connect command, the hcidump says:
< HCI Command: LE Create Connection (0x08|0x000d) plen 25
    bdaddr DA:B8:DC:90:75:A7 type 1
    interval 96 window 96 initiator_filter 0
    own_bdaddr_type 0 min_interval 6 max_interval 9
    latency 100 supervision_to 600 min_ce 0 max_ce 0
> HCI Event: Command Status (0x0f) plen 4
    LE Create Connection (0x08|0x000d) status 0x00 ncmd 2

I also grabbed the Local commands supported, which says:
< HCI Command: Read Local Supported Commands (0x04|0x0002) plen 0
> HCI Event: Command Complete (0x0e) plen 68
    Read Local Supported Commands (0x04|0x0002) ncmd 1
    status 0x00
    Commands: fffffb03ccffefffff3ffc9ff30fe8fe3ff78fff1c00040061f7ffff7fb80000

As far as I can see, the kernel checks byte 37 of the local supported commands output, to see if it should enable the Extended connect command or not.

I couldn't find any reports at all about this, and I couldn't really find much info on the extended connect command at all. Is there any way of disabling this? Or would it be at all possible to use the legacy connect command when the extended command fails?

Do let me know if you need any more information.
Comment 1 Mathias Tillman 2019-02-06 08:27:23 UTC
I should mention that I'm trying to pair it with an Elecom DEFT Pro trackball mouse. It works perfectly in Windows and in linux when using the legacy Create Command, so something definitely seems to be broken with the Extended Create Command.
Comment 2 Johan Hedberg 2019-02-06 08:27:47 UTC
According to the Bluetooth core spec, HCI_LE_Extended_Create_Connection is octet 37, bit 7 in the supported commands mask. In the supported commands that your controller returns octet 37 is 0xdf which does have bit 7 set, i.e. it's claiming to support the extended version of the command. The way this kind of issues are normally worked around is by introducing a so-called quirk flag that gets set (usually by the HCI driver) when a problematic controller is identified. It might be worth checking first however if there's a newer firmware available for your controller - perhaps the issue has already been fixed.
Comment 3 Mathias Tillman 2019-02-06 09:07:21 UTC
I believe it is using the latest firmware, I haven't been able to find a later one than the one mentioned in the kernel log anyway:
[    6.478466] Bluetooth: hci0: Found device firmware: intel/ibt-18-16-1.sfi
[    8.361085] Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-18-16-1.ddc

And yeah, the controller clearly reports that the Extended Create Connection command is available, so it is a bit odd that it's not working correctly. I suppose the only way to find out why would be to debug the controller, but I have no idea how one would go about doing that.

I haven't found this issue being reported at all, so I don't know if it's just my card, or if all of them have this issue. I know the 9260 is commonly used in laptops, but I couldn't say to what extent. Perhaps there is someone who has this card that could test, and if it is in fact broken, add a quirk as you suggested that would force legacy mode?
Comment 4 Emil Lenngren 2019-02-06 09:42:10 UTC
Command Disallowed does not mean unsupported hci command.

The Command Disallowed error code indicates that the command requested cannot be executed because the Controller is in a state where it cannot process this command at this time. This error shall not be used for command OpCodes where the error code Unknown HCI Command is valid."

It's usually sent if the controller is busy with some operation which can't be combined with the one you're trying to execute. For example, some controllers only support either BLE scanning or BLE initiating at a time (exposed through LE Read Supported States). In that case, it will send Command Disallowed if scanning is already active. It could also be that you already have an active connection attempt outstanding, in that case it will also send Command Disallowed.

You should use the new "btmon" tool which I assume supports decoding LE Extended Connect packets. Also try capture the packets before (ideally all the way from the hci is brought up) so one knows the whole state and can determine if we actually send a command that we are not allowed to in this state.
Comment 5 Mathias Tillman 2019-02-06 11:11:36 UTC
Created attachment 281015 [details]
Non-working btmon log
Comment 6 Mathias Tillman 2019-02-06 11:11:58 UTC
Created attachment 281017 [details]
Working btmon log
Comment 7 Mathias Tillman 2019-02-06 11:19:07 UTC
Thank you for pointing me towards the btmon tool, that does seem to support decoding the LE Extended Connect packets. I've attached two logs, one working and one non-working. In the working one I used a kernel I compiled myself with the use_ext_conn patched out.
I started btmon, then restarted the bluetooth service and then tried to connect the mouse - I'm hoping that's enough to catch the hci initialisation? Seems like it from the log anyway.

I had a look at the logs, and it seems to disable scanning right before it tries to connect, so I'm not sure that's the problem. However, I'm not familiar with how LE works at all, so I probably shouldn't start speculating.

Let me know if you need anything else, and thank you for the help so far.
Comment 8 Emil Lenngren 2019-02-06 11:58:33 UTC
What I can see the host side (Linux system) did nothing wrong. It's not sending the command in an illegal state. Hence, it seems the BT chip is behaving incorrectly. Maybe someone from Intel can check what's going on here?

Just to be sure, is this the firmware you have? https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ae90c3bce108ac623bc0bae44a7b411e4fa42e7c
Comment 9 Mathias Tillman 2019-02-06 12:18:01 UTC
I compared those firmware files against what I had, and while their hashes were different I am unfortunately getting the same results using those. So yes, I am using those firmware files.
Comment 10 Mathias Tillman 2019-02-06 19:36:18 UTC
Do you know if intel have an email address for Bluetooth bugs I can cc? I know they have one for WiFi bugs, but couldn't find one for bt bugs. Just in case they don't check bugzilla.
Comment 11 Mathias Tillman 2019-02-07 12:59:31 UTC
Created attachment 281037 [details]
Btmon extended working log

This is pretty odd, all of a sudden it has decided to start working without me doing anything. Looking at the log, I can see it has started to use LE Set Extended Scan Enable instead of LE Set Scan Enable - and sure enough, the Local Supported Commands now lists LE Set Extended Scan Parameters (Octet 37 - Bit 5).
I haven't done anything to try and fix it - I even reverted to the bt firmware that's included in my distro, and it's still working.

Any idea what might have caused the Extended Scan command to suddenly appear in the supported list?
Comment 12 Mathias Tillman 2019-02-07 22:55:18 UTC
Right, so it went back to not reporting the LE Set Extended Scan Parameters (Octet 37 - Bit 5) in the supported commands list after shutting off my computer and turning it on again after a short while. So while the card does seem to be a bit buggy, should the kernel actually check for the parameters bit when enabling or disabling ext scanning? From what I can tell, the ext scanning parameters and enabling are two different commands, so shouldn't it check just the ext scanning bit when sending the ext scan enable command, and check the ext scan params bit only when sending the ext scan params command?
Comment 13 Mathias Tillman 2019-02-07 23:05:01 UTC
Created attachment 281053 [details]
Check if ext scanning and ext params are supported independently

As a test I created a patch that removes the set ext scanning params check, and instead only checks it when actually sending the appropriate command (which is once in the code).

It's working perfectly now, and it's using the ext scan command, instead of the regular scan command - which is what caused the Command Disallowed error.

Like I said before, I'm not too familiar with how bt le works, but is there any case where the set ext scanning params would be a requirement for ext scanning to work? Of course, working around buggy firmware that sometimes reports a command as supported, and other times not is far from ideal - but seeing as it is working for me now, I can't see any reason why the commands should depend on each other.
Comment 14 Mathias Tillman 2019-02-07 23:48:27 UTC
Sorry, was too quick to reply. As it turns out, the set ext scanning params command *is* required, or it won't find any LE devices properly when scanning - it will connect to already paired devices though. 
As a test I decided to try and remove the set ext scanning params check completely, and just send it even though it's reported as being not supported - oddly enough this works, and scans are again working properly, so it definitely seems like a problem with the controller.
Comment 15 Szymon Janc 2019-02-08 16:16:29 UTC
Extended Advertising command are mutually exclusive with legacy commands and controller will reject ones if others were already issued. So if one did 'hcitool lescan' before kernel issued any EA commands, it would lead to described behavior of rejecting EA commands with 'Command Disallowed' status code.
Comment 16 Mathias Tillman 2019-02-08 17:35:54 UTC
Yeah, that does seem to be the case - it works perfectly fine when it uses the extended le Scan command instead of the legacy one. The problem really is that the controller only sometimes reports that it supports the set le extended scan params command, which causes it to use legacy scan instead. This is most likely due to a bug in the firmware.

However, I am wondering if there realistically are any controllers that only support the le extended scan command and not the params one, since they are both required to work properly.
Comment 17 Mathias Tillman 2019-02-08 21:39:29 UTC
To make a long story short, I've managed to figure it out: the latest firmware actually fixes it, but for some reason it's not actually loaded on to the controller until I shut down the computer and leave it off for around 10 seconds or so. A simple reboot isn't enough for it to load the new firmware.

I verified this by doing a cold boot with the distro firmware, copying over the new firmware then rebooting several times. None of these times was the LE Set Extended Scan Parameters reported as being supported. I then shut down the computer and turned it on again after roughly 10 seconds, and after that the command was reported. I then copied over the distro firmware, rebooted several times and it was still working. Turned off the computer and on again, and it no longer worked.

So aside from that odd firmware loading requiring a cold boot, it seems to be working as expected without a patched kernel.

Thanks for the replies everyone, and sorry for all of the long-winded messages. I will be marking this as resolved now.