Created attachment 300051 [details] testing script We have a couple of Lenovo machines which have Bluetooth hci (8087:0029), If we run a script like below: for a in {1..100} do echo $a rfkill block $HCI0_ID sleep 1 rfkill unblock $HCI0_ID sleep 1 done The kernel will print "hci0: Failed to read MSFT supported features (-110)" after several round of rfkill block/unblock, and the bluetooth driver can't work anymore from that moment. Sometimes 'rmmod btusb;modprobe btusb" could make the bluetooth driver work again, sometimes need to reboot. We already updated the kernel and linux-firmware to the latest, but still could reproduce above issue. And we run the same test on other machines with different Bluetooth hci like 8087:0026, they all work well with the testing script. The 8087:0029 hci loads the intel/ibt-20-1-3.sfi, we suspect this is an issue on the bluetooth firmware ibt-20-1-3.sfi, since 8087:0026 shares the same kernel driver with 0029, but only 0029 has this issue. And the testing script is attached.
Hi Hui Wang, What's the kernel version? I tried with the tip of the bluetooth-next(https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git/) and didn't see the issue. Could you capture the btmon log and dmesg log? Regards,
Hi Tedd, Forgot to mention, before running the bttest.sh, need to open the gnmoe bluetooth setting, so the testing steps are: - boot the system - open a terminal and run 'dmesg -w' - run gnome-settings->bluetooth (please see the attached picture) - run bttest.sh
Created attachment 300055 [details] picture of gnome bluetooth setting
Created attachment 300057 [details] dmesg of the kernel bluetooth-next
Created attachment 300059 [details] btmon log
And if the desktop is not gnome, we could replace "gnome-settings->bluetooth" with a command-line "bluetoothctl scan on", so the test steps are: - boot the system - open a terminal and run 'dmesg -w' - open another terminal and run 'bluetoothctl scan on' and wait for 5 seconds - run bttest.sh
(In reply to Hui Wang from comment #6) > And if the desktop is not gnome, we could replace > "gnome-settings->bluetooth" with a command-line "bluetoothctl scan on", so > the test steps are: > - boot the system > - open a terminal and run 'dmesg -w' > - open another terminal and run 'bluetoothctl scan on' and wait for 5 seconds > - run bttest.sh Please ignore this comment, "bluetoothctl scan on" doesn't help reproduce the crashing.
It seems like kernel oops in the dmesg and it may causes the issue. [ 46.401807] ------------[ cut here ]------------ [ 46.401808] kernel BUG at net/core/skbuff.c:113! [ 46.401819] invalid opcode: 0000 [#1] PREEMPT SMP [ 46.401826] CPU: 7 PID: 142 Comm: kworker/u33:0 Not tainted 5.16.0-rc1+ #1 [ 46.401831] Hardware name: LENOVO 20U8S0PQ00/20U8S0PQ00, BIOS R19ET36W (1.20 ) 07/12/2021 [ 46.401835] Workqueue: hci0 hci_rx_work [bluetooth] [ 46.401890] RIP: 0010:skb_panic+0x4c/0x4e [ 46.401899] Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 e8 00 a7 98 e8 46 11 fb ff <0f> 0b 48 8b 55 08 48 c7 c1 60 b1 71 98 e8 a2 ff ff ff 48 c7 c6 b0 [ 46.401903] RSP: 0018:ffffa22980aa3c60 EFLAGS: 00010246 [ 46.401908] RAX: 0000000000000088 RBX: 0000000000000000 RCX: 0000000000000000 [ 46.401911] RDX: 0000000000000000 RSI: ffffffff989a0551 RDI: 00000000ffffffff [ 46.401914] RBP: ffffa22980aa3c80 R08: 0000000000000000 R09: ffffa22980aa3a50 [ 46.401917] R10: 0000000000000001 R11: 0000000000000001 R12: 00000000000000a5 [ 46.401919] R13: 00000000000000a5 R14: ffff9409c4e7c300 R15: 00000000000000aa [ 46.401923] FS: 0000000000000000(0000) GS:ffff940ccefc0000(0000) knlGS:0000000000000000 [ 46.401927] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 46.401930] CR2: 00007fae7c01b398 CR3: 000000040c20f000 CR4: 0000000000350ee0 [ 46.401933] Call Trace: [ 46.401937] <TASK> [ 46.401942] skb_put.cold+0x10/0x10 [ 46.401947] mgmt_device_found+0x3c8/0x4c0 [bluetooth] [ 46.402004] hci_extended_inquiry_result_evt+0x162/0x230 [bluetooth] [ 46.402054] hci_event_packet+0x2cd/0x4b0 [bluetooth] [ 46.402099] ? hci_remote_oob_data_request_evt+0x1f0/0x1f0 [bluetooth] [ 46.402144] hci_rx_work+0x1b2/0x3d0 [bluetooth] [ 46.402190] ? queue_delayed_work_on+0x3d/0x50 [ 46.402198] process_one_work+0x1ec/0x3c0 [ 46.402204] worker_thread+0x4d/0x3d0 [ 46.402210] kthread+0x17a/0x1a0 [ 46.402214] ? process_one_work+0x3c0/0x3c0 [ 46.402218] ? set_kthread_struct+0x40/0x40 [ 46.402223] ret_from_fork+0x22/0x30 [ 46.402232] </TASK> [ 46.402234] Modules linked in: rfcomm bnep nls_iso8859_1 intel_rapl_msr intel_rapl_common snd_ctl_led snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi uvcvideo snd_hda_codec videobuf2_vmalloc snd_hda_core videobuf2_memops btusb snd_hwdep videobuf2_v4l2 btrtl kvm_amd btbcm snd_pcm input_leds btintel videobuf2_common kvm snd_seq_midi btmtk snd_seq_midi_event bluetooth videodev snd_rawmidi joydev iwlmvm ecdh_generic think_lmi thinkpad_acpi mc rapl ecc serio_raw firmware_attributes_class efi_pstore wmi_bmof snd_seq nvram mac80211 platform_profile libarc4 ledtrig_audio snd_seq_device iwlwifi snd_timer snd_rn_pci_acp3x ucsi_acpi ipmi_devintf snd typec_ucsi snd_pci_acp3x cfg80211 k10temp ipmi_msghandler soundcore typec mac_hid sch_fq_codel msr parport_pc ppdev lp parport sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq [ 46.402343] libcrc32c raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log amdgpu iommu_v2 gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect rtsx_pci_sdmmc sysimgblt fb_sys_fops crc32c_intel psmouse cec rc_core drm xhci_pci i2c_piix4 xhci_pci_renesas nvme rtsx_pci r8169 realtek nvme_core wmi video i2c_scmi [ 46.402425] ---[ end trace e9aa209601e9071e ]---
This is like fixed by: https://lore.kernel.org/linux-bluetooth/20211213212650.2067066-1-luiz.dentz@gmail.com/T/#t It was fixup on the original patch and force pushed to bluetooth-next yesterday.
@Tedd and Luiz, Got it, will re-test with the latest bluetooth-next. And we didn't experience the kernel oops with ubuntu kernel 5.13 and 5.14, this kernel oops only happen in the bluetooth-next kernel, so I guess the root cause is not the kernel oops. Anyway, I will try to catch the log without the kernel oops (with the fix of comment 9). Thanks.
Indeed, if kernel oops doesn't happen, this issue can't be reproduced with bluetooth-next kernel anymore. I will bisect kernel to find the fixing commit/commits for this bug, then close this bug. Thanks.
Through bisecting, I found the commit cf75ad8b41d2 ("Bluetooth: hci_sync: Convert MGMT_SET_POWERED" fixed the bug I reported. The problem is the commit has 10+ depending commits, it is not easy to bakcport the fix to 5.14 and 5.15 kernels. @Tedd and Luiz, Do you have plan to backport the fix to stable 5.15 kernel at least?
Since the problem is fixed in the latest kernel. close this bug now.