Bug 216936 - First attempt to upload firmware for Intel Bluetooth fails (a timing issue?)
Summary: First attempt to upload firmware for Intel Bluetooth fails (a timing issue?)
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Bluetooth (show other bugs)
Hardware: AMD Linux
: P1 normal
Assignee: linux-bluetooth@vger.kernel.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-15 20:35 UTC by Artem S. Tashkinov
Modified: 2024-04-12 21:40 UTC (History)
5 users (show)

See Also:
Kernel Version: 6.1.4
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Artem S. Tashkinov 2023-01-15 20:35:14 UTC
This looks like a new regression for kernel 6.1.6.

Kernel 6.1.4 and previous kernels didn't have the issue.

Hardware: ID 8087:0025 Intel Corp. Wireless-AC 9260 Bluetooth Adapter

From dmesg:

Bluetooth: hci0: FW download error recovery failed (-19)
Bluetooth: hci0: sending frame failed (-19)
Bluetooth: hci0: Failed to read MSFT supported features (-19)

Full log:

Bluetooth: hci0: Found device firmware: intel/ibt-18-16-1.sfi
Bluetooth: hci0: Boot Address: 0x40800
Bluetooth: hci0: Firmware Version: 214-6.22
Bluetooth: hci0: FW download error recovery failed (-19)
Bluetooth: hci0: sending frame failed (-19)
Bluetooth: hci0: Failed to read MSFT supported features (-19)
Bluetooth: hci0: Bootloader revision 0.1 build 42 week 52 2015
Bluetooth: hci0: Device revision is 2
Bluetooth: hci0: Secure boot is enabled
Bluetooth: hci0: OTP lock is enabled
Bluetooth: hci0: API lock is enabled
Bluetooth: hci0: Debug lock is disabled
Bluetooth: hci0: Minimum firmware build 1 week 10 2014
Bluetooth: hci0: Found device firmware: intel/ibt-18-16-1.sfi
Bluetooth: hci0: Boot Address: 0x40800
Bluetooth: hci0: Firmware Version: 214-6.22
Bluetooth: hci0: Waiting for firmware download to complete
Bluetooth: hci0: Firmware loaded in 1111205 usecs
Bluetooth: hci0: Waiting for device to boot
Bluetooth: hci0: Device booted in 14844 usecs
Bluetooth: hci0: Malformed MSFT vendor event: 0x02
Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-18-16-1.ddc
Bluetooth: hci0: Applying Intel DDC parameters completed
Bluetooth: hci0: Firmware revision 0.1 build 214 week 6 2022
Comment 1 Artem S. Tashkinov 2023-01-15 20:38:20 UTC
I see no Bluetooth related changes in 6.1.6 or 6.1.5. This is probably a timing issue but it's still not nice to see.

If there's any additional info I could provide, please let me know.
Comment 2 Artem S. Tashkinov 2023-01-15 22:26:37 UTC
I had these errors with 6.1.4 as well.
Comment 3 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-01-16 12:42:59 UTC
(In reply to Artem S. Tashkinov from comment #2)
> I had these errors with 6.1.4 as well.

But earlier series (e.g. 6.0.y) worked fine?
Comment 4 Mario Limonciello (AMD) 2023-01-17 05:58:57 UTC
> Kernel 6.1.4 and previous kernels didn't have the issue.
> I had these errors with 6.1.4 as well.

Which one is it?

If you can find a definitive kernel it really was working 100% of the time a bisect would be ideal.  But given the ambiguous result you indicated with 6.1.4, I suspect it's not really ever been 100% correct.

Perhaps turning on dynamic debugging for net/bluetooth would be helpful to identify what about the sequence of init causes -ENODEV.  Naively it looks to me that the main reason this can get returned is if a command is issued while the device is actively being unregistered.

So perhaps your userspace is racing?
Comment 5 Artem S. Tashkinov 2023-01-31 03:01:00 UTC
This seemingly happens only after rebooting from Windows 10. It does not happen when cold booting.

I guess if the firmware is already uploaded, then there's an error.

Maybe it happened before, I just didn't reboot after Windows so often.
Comment 6 Artem S. Tashkinov 2023-01-31 03:06:06 UTC
Johannes, could you take a look please?
Comment 7 Johannes Berg 2023-01-31 09:11:25 UTC
No, I don't know anything about Bluetooth.
Comment 8 James Addison (inactive) 2023-04-29 16:35:30 UTC
I'm seeing a similar problem intermittently, since at least 6.1.0 of the kernel.

Hardware: Intel Corporation Wireless 8260 [8086:24f3]

(yes, it reads like the name a wi-fi card, but it does also contain a bluetooth controller)

Here are the oldest relevant logs that I have available:

Mar 23 21:19:13 kernel: Linux version 6.1.0-6-amd64 (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.15-1 (2023-03-05)
...
Mar 23 21:19:14 kernel: usb 1-7: USB disconnect, device number 2
Mar 23 21:19:14 kernel: Bluetooth: hci0: Failed to send firmware data (-19)
Mar 23 21:19:14 kernel: Bluetooth: hci0: sending frame failed (-19)
Mar 23 21:19:14 kernel: Bluetooth: hci0: FW download error recovery failed (-19)


The logs contain 165 occurrences of "hci0: Waiting for firmware download to complete", and 9 occurrences of "hci0: FW download error recovery failed".

I'm not yet sure whether there's a specific series of actions to recreate the failure.
Comment 9 James Addison (inactive) 2023-04-30 08:44:21 UTC
> I'm not yet sure whether there's a specific series of actions to recreate the
> failure.

Maybe a clue: of three recent boots with the 8260, two were "cold" (system booted entirely from poweroff state), and one was "warm" (a system reboot).  I'm not sure if that terminology is correct, but: the "Failed to send firmware data" error occurred during both of the cold boots, and did not appear during the warm boot.

Does any of that seem relevant in your (9260) case, Artem?
Comment 10 Artem S. Tashkinov 2023-05-12 08:58:26 UTC
I can confirm that firmware gets loaded fine on a cold boot and

I get errors after rebooting from Windows.

This is reproducible in 6.2.10.

Bluetooth: hci0: Found device firmware: intel/ibt-18-16-1.sfi
Bluetooth: hci0: Boot Address: 0x40800
Bluetooth: hci0: Firmware Version: 214-6.22
Bluetooth: hci0: FW download error recovery failed (-19)
Bluetooth: hci0: sending frame failed (-19)
Bluetooth: hci0: Failed to read MSFT supported features (-19)
Bluetooth: hci0: Bootloader revision 0.1 build 42 week 52 2015
Bluetooth: hci0: Device revision is 2
Bluetooth: hci0: Secure boot is enabled
Bluetooth: hci0: OTP lock is enabled
Bluetooth: hci0: API lock is enabled
Bluetooth: hci0: Debug lock is disabled
Bluetooth: hci0: Minimum firmware build 1 week 10 2014
Bluetooth: hci0: Found device firmware: intel/ibt-18-16-1.sfi
Bluetooth: hci0: Boot Address: 0x40800
Bluetooth: hci0: Firmware Version: 214-6.22
Bluetooth: hci0: Waiting for firmware download to complete
Bluetooth: hci0: Firmware loaded in 1104392 usecs
Bluetooth: hci0: Waiting for device to boot
Bluetooth: hci0: Device booted in 14640 usecs
Bluetooth: hci0: Malformed MSFT vendor event: 0x02
Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-18-16-1.ddc
Bluetooth: hci0: Applying Intel DDC parameters completed
Bluetooth: hci0: Firmware revision 0.1 build 214 week 6 2022

BTW "Secure boot is enabled" is wrong. I have it disabled.
Comment 11 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-05-13 10:59:54 UTC
So can anyone reliably reproduce this now? I ask, as I fear this might never lead to anything unless we find the change where the problem started to happen using a bisection.
Comment 12 espritlibre 2023-07-11 14:44:08 UTC
i have the same issue

Jul 11 16:06:56 kernel: thinkpad_acpi: rfkill switch tpacpi_bluetooth_sw: radio is unblocked
Jul 11 16:06:57 kernel: Bluetooth: Core ver 2.22
Jul 11 16:06:57 kernel: NET: Registered PF_BLUETOOTH protocol family
Jul 11 16:06:57 kernel: Bluetooth: HCI device and connection manager initialized
Jul 11 16:06:57 kernel: Bluetooth: HCI socket layer initialized
Jul 11 16:06:57 kernel: Bluetooth: L2CAP socket layer initialized
Jul 11 16:06:57 kernel: Bluetooth: SCO socket layer initialized
Jul 11 16:06:57 kernel: Bluetooth: hci0: Device revision is 0
Jul 11 16:06:57 kernel: Bluetooth: hci0: Secure boot is enabled
Jul 11 16:06:57 kernel: Bluetooth: hci0: OTP lock is enabled
Jul 11 16:06:57 kernel: Bluetooth: hci0: API lock is enabled
Jul 11 16:06:57 kernel: Bluetooth: hci0: Debug lock is disabled
Jul 11 16:06:57 kernel: Bluetooth: hci0: Minimum firmware build 1 week 10 2014
Jul 11 16:06:57 kernel: Bluetooth: hci0: Bootloader timestamp 2019.40 buildtype 1 build 38
Jul 11 16:06:57 kernel: Bluetooth: hci0: Found device firmware: intel/ibt-0040-0041.sfi
Jul 11 16:06:57 kernel: Bluetooth: hci0: Boot Address: 0x100800
Jul 11 16:06:57 kernel: Bluetooth: hci0: Firmware Version: 98-13.23
Jul 11 16:06:57 NetworkManager[1454]: <info>  [1689084417.8760] Loaded device plugin: NMBluezManager (/usr/lib/NetworkManager/1.42.6-1/libnm-device-plugin-bluetooth.so)
Jul 11 16:06:58 rfkill[1709]: block set for type bluetooth
Jul 11 16:06:58 systemd[1557]: Reached target Bluetooth.
Jul 11 16:06:58 kernel: Bluetooth: hci0: Failed to send firmware data (-19)
Jul 11 16:06:58 kernel: Bluetooth: hci0: sending frame failed (-19)
Jul 11 16:06:58 kernel: Bluetooth: hci0: FW download error recovery failed (-19)
Jul 11 16:06:58 kernel: Bluetooth: hci0: sending frame failed (-19)
Jul 11 16:06:58 kernel: Bluetooth: hci0: Failed to read MSFT supported features (-19)
Jul 11 16:06:58 systemd[1]: Starting Bluetooth service...
Jul 11 16:06:58 systemd[1557]: Stopped target Bluetooth.
Jul 11 16:06:58 bluetoothd[1842]: Bluetooth daemon 5.68
Jul 11 16:06:58 systemd[1]: Started Bluetooth service.
Jul 11 16:06:58 systemd[1]: Reached target Bluetooth Support.
Jul 11 16:06:58 systemd[1]: Stopped target Bluetooth Support.
Jul 11 16:06:58 kernel: Bluetooth: BNEP (Ethernet Emulation) ver 1.3
Jul 11 16:06:58 kernel: Bluetooth: BNEP filters: protocol multicast
Jul 11 16:06:58 kernel: Bluetooth: BNEP socket layer initialized
Jul 11 16:06:58 bluetoothd[1842]: Bluetooth management interface 1.22 initialized

i'm not exactly sure when this started happening but i can say it's at least since kernel 6.1.x
i still have these errors on linux-hardened 6.1.38 and linux 6.4.2 and for me it doesn't matter if they appear on a cold or warm boot, for example i can have 3 cold boots in a row without erros and on the fourth cold boot these errors appear.
i can still connect my phone via bluetooth and send files to my phone and back, when i have these errors but whenever i have these errors (most of the time) turning bluetooth on most likely results in a kernel panic and i have to hard power off the laptop which is fscking up my drive (sysrq/reisub ain't working)
my wifi card is AX211

i can try to bisect, if someone gives me instructions how to...
Comment 13 Mario Limonciello (AMD) 2023-07-11 14:48:51 UTC
https://www.kernel.org/doc/html/v6.4/admin-guide/bug-bisect.html
Comment 14 espritlibre 2023-07-12 16:34:04 UTC
i made some progress, i found the kernel version which introduced this bug.
good linux-5.19.9.arch1-1
bad linux-6.0.arch1-1

i started the bisect but while building the 2nd kernel i got build errors. i will start from scratch when i find the time again, this already took me a couple hours due to the annoying nature of this bug...
Comment 15 espritlibre 2023-07-12 23:11:50 UTC
sorry! forget what i wrote, 5.19.11 seems to be bad too...

i think this could be a race condition between user space and kernel. because i start my notbeook always in flight mode, in my case it's gnome/wayland. when i turn off flight mode, the bluetooth firmware gets loaded.

if i turn flight mode off and quickly on again, so the firmware download gets interrupted and results in a:
kernel: Bluetooth: hci0: FW download error recovery failed (-19)
most of the time

or:
kernel: Bluetooth: hci0: FW download error recovery failed (-108)

i think on boot, the kernel trys to upload the firmware and then the flight mode kicks in, resulting in the error, sometimes. i need to pay more attention and not use flight mode and see if the error appears again.
Comment 16 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-07-13 14:06:08 UTC
(In reply to espritlibre from comment #14)
> i made some progress, i found the kernel version which introduced this bug.
> good linux-5.19.9.arch1-1

Here you talk about a arch kernel, but...

(In reply to espritlibre from comment #15)
> sorry! forget what i wrote, 5.19.11 seems to be bad too...

...here you do not. If that's also an arch kernel and if arch kernels are really close to vanilla, than you want to bisect between the two (unless 5.19.9 turns out to be broken on retesting).

Sight note: it's unlikely that 6.5-rc1 accidentally fixes this, but might be worth a quick test.
Comment 17 espritlibre 2023-07-29 20:09:18 UTC
sorry for getting back to this late and the confusion i caused, it's a busy time... the kernels i use are arch kernel, according to the ArchWiki they are pretty much vanilla upstream kernel with some patches applied, but for the current kernel i'm running i see no patches in the PKGBUILD.
i have been observing journal in the and can say that even though i'm booting without being in flightmode but wifi turned on, sometimes

> kernel: Bluetooth: hci0: Failed to send firmware data (-19)
> kernel: Bluetooth: hci0: sending frame failed (-19)
> kernel: Bluetooth: hci0: FW download error recovery failed (-19)
> kernel: Bluetooth: hci0: sending frame failed (-19)
> kernel: Bluetooth: hci0: Reading supported features failed (-19)
> kernel: Bluetooth: hci0: Error reading debug features
> kernel: Bluetooth: hci0: sending frame failed (-19)
> kernel: Bluetooth: hci0: Failed to read MSFT supported features (-19)

appears, or:

> kernel: Bluetooth: hci0: Failed to send firmware data (-19)
> kernel: Bluetooth: hci0: FW download error recovery failed (-108)
> kernel: Bluetooth: hci0: sending frame failed (-19)
> kernel: Bluetooth: hci0: sending frame failed (-19)
> kernel: Bluetooth: hci0: Reading supported features failed (-19)
> kernel: Bluetooth: hci0: Error reading debug features
> kernel: Bluetooth: hci0: sending frame failed (-19)
> kernel: Bluetooth: hci0: Failed to read MSFT supported features (-19)

i played with bluetooth and the kernel didn't panic.


the last panic i got was 2 days ago and it went like this: https://0x0.st/H2bz.txt (uploaded for better readability, no javascript needed)

i haven't tried 6.5-rc1, but yesterday i updated to linux-6.4.7.arch1-1. since this bug is so annoying and they happen at random it's hard for me to bisect and i'm not confident in saying kernel X is bad or good. i have set up kdump now, which will automatically dump the crash. i hope this will lead to something useful, i just booted the system 7-8 times but couldn't trigger a panic.
Comment 18 Artem S. Tashkinov 2023-09-20 10:53:32 UTC
This looks to be fixed in 6.5.
Comment 19 James Addison (inactive) 2023-09-24 10:11:03 UTC
Although I experienced slightly different symptoms -- cold boot firmware downloads failed; warm/reboot firmware downloads succeeded -- when using an 8260 card, I can confirm that the problem appears resolved with a Debian 6.5.0 kernel.
Comment 20 Artem S. Tashkinov 2024-04-12 15:37:29 UTC
This is still reproducible in 6.8.5:

dmesg -t | grep -i blue
Bluetooth: Core ver 2.22
NET: Registered PF_BLUETOOTH protocol family
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP socket layer initialized
Bluetooth: SCO socket layer initialized
Bluetooth: hci0: Found device firmware: intel/ibt-18-16-1.sfi
Bluetooth: hci0: Boot Address: 0x40800
Bluetooth: hci0: Firmware Version: 173-5.24
Bluetooth: BNEP (Ethernet Emulation) ver 1.3
Bluetooth: BNEP socket layer initialized
Bluetooth: hci0: FW download error recovery failed (-19)
Bluetooth: hci0: sending frame failed (-19)
Bluetooth: hci0: Reading supported features failed (-19)
Bluetooth: hci0: Error reading debug features
Bluetooth: hci0: HCI LE Coded PHY feature bit is set, but its usage is not supported.
Bluetooth: hci0: sending frame failed (-19)
Bluetooth: hci0: Failed to read MSFT supported features (-19)
Bluetooth: hci0: Bootloader revision 0.1 build 42 week 52 2015
Bluetooth: hci0: Device revision is 2
Bluetooth: hci0: Secure boot is enabled
Bluetooth: hci0: OTP lock is enabled
Bluetooth: hci0: API lock is enabled
Bluetooth: hci0: Debug lock is disabled
Bluetooth: hci0: Minimum firmware build 1 week 10 2014
Bluetooth: hci0: Found device firmware: intel/ibt-18-16-1.sfi
Bluetooth: hci0: Boot Address: 0x40800
Bluetooth: hci0: Firmware Version: 173-5.24
Bluetooth: hci0: Waiting for firmware download to complete
Bluetooth: hci0: Firmware loaded in 1129749 usecs
Bluetooth: hci0: Waiting for device to boot
Bluetooth: hci0: Device booted in 14839 usecs
Bluetooth: hci0: Malformed MSFT vendor event: 0x02
Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-18-16-1.ddc
Bluetooth: hci0: Applying Intel DDC parameters completed
Bluetooth: hci0: Firmware revision 0.1 build 173 week 5 2024
Bluetooth: hci0: HCI LE Coded PHY feature bit is set, but its usage is not supported.
Bluetooth: MGMT ver 1.22
Bluetooth: ISO socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM ver 1.11

No timings, btintel just tries to load the firmware when it's already there (if you're rebooting from Windows).
Comment 21 Wren Turkal 2024-04-12 19:56:38 UTC
I got a similar issue with btqca. I wonder if this is something bigger than a single module.
Comment 22 Artem S. Tashkinov 2024-04-12 20:02:02 UTC
It's not a big issue that's why I kept it close.

I just wonder if it's possible to poll HW and ask it "Hey, do you already have firmware uploaded?" because it surely looks like Linux tries to upload it without much success twice only to fail and realize it's not necessary.

I'm not sure btqca is affected by the same issue and it's not like too many people casually switch between OSes all the time.

At the same time AFAIK ALSA has quite a number of workarounds to deal with the exact same situation: Windows initializes audio HW a certain way and then it starts malfunctioning after rebooting into Linux. It was the case years ago, I've not heard much if any about this recently.

TBO, it more looks like the case of Windows not using cold reboot (and AFAIK by default Linux doesn't do it either) but why would it if e.g. makes booting slower.
Comment 23 Wren Turkal 2024-04-12 21:40:01 UTC
AFAIK, no OS uses a cold reboot for the reboot option. Cold reboot implies turning the machine off. Every arch I know of requires a button presses to turn it back on. Very different then just resetting the computer without powering down.

Is there no way to reset the hardware such that it is in the state it is before the power button is pressed?

Note You need to log in before you can comment on or make changes to this bug.