Bug 218726

Summary: qca6390 bluetooth fails after disabling/re-enabling bluetooth
Product: Drivers Reporter: Wren Turkal (wt)
Component: BluetoothAssignee: linux-bluetooth (linux-bluetooth)
Status: NEW ---    
Severity: normal CC: pmenzel+bugzilla.kernel.org, quic_zijuhu
Priority: P3    
Hardware: All   
OS: Linux   
Kernel Version: 6.9.0-rc4 Subsystem:
Regression: No Bisected commit-id:
Attachments: 6.9.0-rc4 + logging patch - kernel logs during bt disable/enable
6.9.0-rc4+logging after warm boot since boot
commit cef27048e5 + revert 56d074d26c5828773 after warm boot
00567f70051a41 + Zijun's 2 patches + cold_boot + toggle bt
00567f70051a41 + Zijun's 2 patches + cold_boot
00567f70051a41 + Zijun's 2 patches + warm boot after working cold boot

Description Wren Turkal 2024-04-15 08:21:30 UTC
I have a Dell XPS 13 9310 laptop. This laptop has a QCA6390 bluetooth device. The device works on a cold boot. However, it fails whenever disabling and re-enabling bluetooth (e.g. by cycling bluetooth in the GNOME UI).
Comment 1 Wren Turkal 2024-04-15 08:50:13 UTC
Created attachment 306152 [details]
6.9.0-rc4 + logging patch - kernel logs during bt disable/enable

I have applied the patchset at https://patchwork.kernel.org/project/bluetooth/list/?series=844357 on top of 6.9.0-rc4. Here's the kenrnel logs starting just before disabling bluetooth via the GNOME UI.

I pretty consistently got a very similar log to the above, so I only attached the one copy.
Comment 2 Wren Turkal 2024-04-15 09:01:47 UTC
How logs were captured

1) Apply below patch series
https://patchwork.kernel.org/project/bluetooth/list/?series=844357

2) Disable BT

3) Power off

4) Power on

5) enable more logs

after boot:
echo "module hci_uart  +pft, module btqca  +pft" > /sys/kernel/debug/dynamic_debug/control

as kernel args:
dyndbg="module hci_uart  +pft; module btqca  +pft"

6) enable BT

7) then check this issue again. several disable/enable or reboot cycles.


As for step 7, I will investigate other disable/enable reboot cycles if I find differences.
Comment 3 Wren Turkal 2024-04-15 09:27:18 UTC
Created attachment 306153 [details]
6.9.0-rc4+logging after warm boot since boot

The dyndbg didn't seem to add many messages in this one. Maybe I have the wrong kernel parm to enable the dynamic debugging messages?

After a warm boot, the bluetooth won't enable. This log includes everything past trying to enable bluetooth twice in GNOME. After the second attempt to enable bluetooth, I didn't see additional kernel logs.
Comment 4 Wren Turkal 2024-04-15 09:28:55 UTC
> The dyndbg didn't seem to add many messages in this one. Maybe I have the
> wrong kernel parm to enable the dynamic debugging messages?

This message appears to be wrong. I believe the new messages are in the log. I guess it just doesn't try to load the firmware and is much shorter.
Comment 5 Zijun Hu 2024-04-15 09:34:18 UTC
it will not print below logs if the patches are applied successfully.
Apr 12 00:30:05 braindead.localdomain kernel: Bluetooth: hci0: setting up ROME/QCA6390
Comment 6 Zijun Hu 2024-04-15 10:14:52 UTC
have reverted below commit and update a new patchset
Fixes: 56d074d26c58 ("Bluetooth: hci_qca: don't use IS_ERR_OR_NULL() with gpiod_get_optional()")
Comment 8 Wren Turkal 2024-04-15 19:38:49 UTC
In the first log (after cold boot), I only captures shortly before toggling the bluetooth. In the second lot (after warm boot), I enabled them with the kernel args. I definitely see a message from your patch in the second log.

Do you want to capture additional logging?

I am applying and trying your new patchset. I am applying it on to of the logging patches. I will then use the kernel arg and try to capture as much as I can on a cold boot.
Comment 9 Wren Turkal 2024-04-15 19:59:59 UTC
(In reply to Zijun Hu from comment #7)
> the new patchset is
> https://patchwork.kernel.org/project/bluetooth/patch/1713175927-13093-1-git-
> send-email-quic_zijuhu@quicinc.com/

This patchset does not apply to an upstream commit on Linus' tree, and I see context that differs on the files that the patch changes. What is the parent of your commit? It also does not apply to the bluetooth-next repo master branch? What is the parent of this commit?
Comment 10 Wren Turkal 2024-04-15 20:28:42 UTC
So, I tried just reverting the commit (56d074d26c58) you mentioned on the mailing list. That seems to have worked. I am building with that revert. However, you debug messages patch conflicts, so I do not have the extra logging now. It's just mainline + the revert of the commit.

Is this what you were asking me to try? If so, please confirm. If not, what did you actually want me to try?
Comment 11 Wren Turkal 2024-04-16 01:37:41 UTC
Okay, after reverting 56d074d26c58, disable/enable bluetooth appears to work after a cold boot. I haven't tested a warm boot yet.
Comment 12 Wren Turkal 2024-04-16 02:01:52 UTC
Created attachment 306160 [details]
commit cef27048e5 + revert 56d074d26c5828773 after warm boot

Bluetooth does not work after a warm boot.

This log is after a warm boot. Unfortunately, the following kernel arg didn't seem to enable the logs for the hci_uart and btqca args like I thought it would:
dyndbg="module hci_uart +pft; module btqca +pft"

I am trying to figure out how to turn the debug logs on as soon as the modules are loaded.
Comment 13 Zijun Hu 2024-04-16 02:18:57 UTC
you maybe apply below two changes over the tip of bluetooth-next tree, i should have not conflict, then test by following the steps shared

// it have no change compared with the orignial patch
https://patchwork.kernel.org/project/bluetooth/patch/1713095825-4954-2-git-send-email-quic_zijuhu@quicinc.com/

// it has contained the change reverting commit 56d074d26c58  compared with the origninal patch
https://patchwork.kernel.org/project/bluetooth/patch/1713175927-13093-1-git-send-email-quic_zijuhu@quicinc.com/


you maybe firstly apply above two changes, then perform a power off then power on.
then test disable/enable or warm reboot.

please share logs if any cases are not working fine.

thanks
Comment 14 Zijun Hu 2024-04-16 02:28:45 UTC
for issue "Bluetooth does not work after a warm boot.", we need to log of function
qca_serdev_shutdown(). it is called by the shutdown phase of reboot.
Comment 15 Wren Turkal 2024-04-16 07:52:14 UTC
(In reply to Zijun Hu from comment #13)
> you maybe apply below two changes over the tip of bluetooth-next tree, i
> should have not conflict, then test by following the steps shared

That worked. I am currently building a new kernel to test.
Comment 16 Wren Turkal 2024-04-16 07:54:01 UTC
(In reply to Zijun Hu from comment #14)
> for issue "Bluetooth does not work after a warm boot.", we need to log of
> function
> qca_serdev_shutdown(). it is called by the shutdown phase of reboot.

I can capture that.

Does stopping the bluetooth service cause that to happen as well? Would unloaded certain modules cause it to happen?
Comment 17 Zijun Hu 2024-04-16 08:08:29 UTC
it is great.thank you.
not sure if other cases will trigger this feature.

I think this regression issue is related to below two fixes. will sumbmit formal patches to fix it when have spare time.

// it will cause disable/enable failure.
Fixes: 56d074d26c58 ("Bluetooth: hci_qca: don't use IS_ERR_OR_NULL() with gpiod_get_optional()")

// it maybe cause BT enable failure after reboot.
Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
Comment 18 Wren Turkal 2024-04-16 09:24:33 UTC
Okay. so I applied your patches you linked above to the tip of the bluetooth-next repo master branch. That appears to not include all the changes from Linus' master branch. When booting into that kernel, bluetooth works when I login to GNOME. However, bluetooth disable/enable does not work.

I also have the previous kernel that I built that was essentially 6.9.0-rc4 with the revert of 56d074d26c5828773 (git revert ...). Bluetooth works after a cold boot after I login to GNOME. This kernel allows disable/re-enable to work multiple times. However, warm boots do not work.

So now I have some questions:
* Are there fixes in Linus' kernel that are not in the bluetooth-next/master kernel?
* Considering the only functional difference between my revert and your patch is the bottom hunk of the second patch you linked (https://patchwork.kernel.org/project/bluetooth/patch/1713175927-13093-1-git-send-email-quic_zijuhu@quicinc.com/), does that the difference between disable/re-enable working have something to do with the quirk added by that hunk?
Comment 19 Wren Turkal 2024-04-16 09:33:10 UTC
I am trying the bare revert on bluetooth-next/master now to see if not including the quirk hunk allows disable/re-enable like it does on 6.9.0-rc4.
Comment 20 Zijun Hu 2024-04-16 09:41:28 UTC
i think it is not related to linus kernel.
the disable/re-enable working is not related to the quirk added by the hunk.
Comment 21 Wren Turkal 2024-04-16 10:11:21 UTC
On bluetooth-next/master, I simply reverted 56d074d26c5828773 (just like I did on 6.9.0-rc4). This allows disabling/re-enabling bluetooth over and over. I think this indicates there is some kind of problem with that quirk handling hunk.
Comment 22 Wren Turkal 2024-04-16 10:15:21 UTC
BTW, replicate my only change with "git revert 56d074d26c5828773" on top of bluetooth-next or on top of linus' kernel. They both start allowing disabling/re-enabling after a cold boot.

Warm boots still do not work on either kernel.
Comment 23 Wren Turkal 2024-04-16 12:26:18 UTC
Apparently getting that log is not going to be as easy as I thought. I tried to capture it from systemd journal on the next boot, but for some reason the log appears to be incomplete.
Comment 24 Wren Turkal 2024-04-17 06:33:10 UTC
Okay, I think what I am actually seeing is that the timestamps on early kernel logs in systemd's journal have the wrong timestamp. I am not sure why or how to fix.

Anyway, let me try applying your patches and getting a full log from boot to shutdown.
Comment 25 Wren Turkal 2024-04-17 08:18:36 UTC
Created attachment 306171 [details]
00567f70051a41 + Zijun's 2 patches + cold_boot + toggle bt

This is a log from startup to shutdown of bluetooth-next/master + Zijin's patches after a cold boot. I toggled bluetooth, which failed. Then I shutdown.
Comment 26 Wren Turkal 2024-04-17 08:20:10 UTC
Created attachment 306172 [details]
00567f70051a41 + Zijun's 2 patches + cold_boot

This is a log from startup to shutdown of bluetooth-next/master + Zijin's patches after a cold boot. Then I restart (for a warm boot next).
Comment 27 Wren Turkal 2024-04-17 08:22:12 UTC
Created attachment 306173 [details]
00567f70051a41 + Zijun's 2 patches + warm boot after working cold boot

This is a log from startup to shutdown of bluetooth-next/master + Zijin's patches after a warm boot that was restarted from a working cold boot. Bluetooth does not work. Then I shutdown.
Comment 28 Wren Turkal 2024-04-17 08:27:05 UTC
The last three logs I added add a cold boot with failing toggle, a cold boot, and a warm boot.

On the cold boots, bluetooth works if I before I manually toggle it within GNOME. 

I also tried Plasma, but didn't collect logs. Plasma kills bluetooth almost immediately after logging in.

Note that the timestamps appear to have the wrong hour until systemd is signaled. I collected these with the help of journalctl.
Comment 29 Zijun Hu 2024-04-17 08:28:56 UTC
thank you, Wren, for help verification and capturing logs.
will check logs today later.
Comment 30 Wren Turkal 2024-04-17 09:18:22 UTC
Just so you know, I added the following parameters to my kernel boot:
btqca.dyndbg="+pmft" hci_uart.dyndbg="+pmft"

Does this get you all the logging you were looking for?
Comment 31 Wren Turkal 2024-04-17 09:18:48 UTC
(In reply to Zijun Hu from comment #29)
> thank you, Wren, for help verification and capturing logs.
> will check logs today later.

Okay, cool. Thx.
Comment 32 Zijun Hu 2024-04-17 13:57:09 UTC
there are two issues as described below.

1) cold boot -> enable BT -> disable BT -> 2nd Enable BT -> 2nd Disable BT....
BT can't be enabled for 2nd and later enabled operations.

this issue is caused by below commit. i have reverted it and submit a formal below patch
https://patchwork.kernel.org/project/bluetooth/patch/1713354823-17826-1-git-send-email-quic_zijuhu@quicinc.com/

Fixes: 56d074d26c58 ("Bluetooth: hci_qca: don't use IS_ERR_OR_NULL() with gpiod_get_optional()")


2) cold boot (power off then power on) -> enable BT -> warm reboot after disable BT or NOT -> enable is failed to be Enable after warm reboot.

let us explain the 2) issue we are checking as below:
your device does not have H/W reset way since BT reset pin is not configured. so 
qca_serdev_shutdown() needs to send VSC EDL_RESET_REQ to make controller return to initial clean state. the only reason for 2) happens is that EDL_RESET_REQ is not sent successfully during warm reset.
let me check it.

we can get logs for issue 2) by this way.
cold boot -> enable BT -> disable BT -> unload hci_uart.ko -> dmesg kernel log to see if EDL_RESET_REQ was sent successfully.
Comment 33 Zijun Hu 2024-04-17 14:08:50 UTC
(In reply to Wren Turkal from comment #30)
> Just so you know, I added the following parameters to my kernel boot:
> btqca.dyndbg="+pmft" hci_uart.dyndbg="+pmft"
> 
> Does this get you all the logging you were looking for?

yes. the debugging logs are turned on successfully by you.
Comment 34 Zijun Hu 2024-04-17 14:45:45 UTC
Hi Wren,
for logs within comment #26, #27, #28. i don't find abnormal logs.

1)
could you share your drivers/bluetooth/hci_qca.c ?

2) how do you toggle BT? by UI or command line?

3) could you try by following below steps?

apply changes -> cold boot (power off then power on) -> rfkill list(share results) -> hciconfig -> enble BT if it is not enabled -> disable BT -> warm reboot -> rfkill list -> hciconfig ->  enble BT if it is not enabled-> check if BT is able to be enabled succussfully?
Comment 35 Zijun Hu 2024-04-17 15:13:12 UTC
Hi Wren,
sorry, it is my mistake, my 2nd debugging patch don't completely revert the Commit 
56d074d26c58. i have updated that debugging patches.

you ONLY need to apply below to changes over the tip of bluetooth-next tree.

https://patchwork.kernel.org/project/bluetooth/patch/1713095825-4954-2-git-send-email-quic_zijuhu@quicinc.com/

https://patchwork.kernel.org/project/bluetooth/patch/1713366251-22144-1-git-send-email-quic_zijuhu@quicinc.com/

if there are conflict, you maybe git reset beluetooth-next tree into below
commit and apply the above two changes.
e00fc2700a3f Bluetooth: btusb: Fix triggering coredump implementation for QCA
Comment 36 Zijun Hu 2024-04-17 15:13:57 UTC
Hi Wren,
sorry, it is my mistake, my 2nd debugging patch don't completely revert the Commit 
56d074d26c58. i have updated that debugging patches.

you ONLY need to apply below two changes over the tip of bluetooth-next tree.

https://patchwork.kernel.org/project/bluetooth/patch/1713095825-4954-2-git-send-email-quic_zijuhu@quicinc.com/

https://patchwork.kernel.org/project/bluetooth/patch/1713366251-22144-1-git-send-email-quic_zijuhu@quicinc.com/

if there are conflict, you maybe git reset beluetooth-next tree into below
commit and apply the above two changes.
e00fc2700a3f Bluetooth: btusb: Fix triggering coredump implementation for QCA
Comment 37 Wren Turkal 2024-04-18 07:22:10 UTC
It would be hard for you to rock more. This seems to fix everything. Cold boot and warm booting with mutliple disable/re-enable cycles work. I am now not getting any errors in my kernel logs when performing these actions.
Comment 38 Wren Turkal 2024-04-18 07:35:25 UTC
Can I add a Signed-off-by footer to the patch somehow?
Comment 39 Zijun Hu 2024-04-18 07:44:33 UTC
good news. thank you @Wren Turkal again for co-working to debug this issue.
of course. you can add it.
i also add your Signed-off-by for later updated patche serials for this issue.
Comment 40 Wren Turkal 2024-04-18 08:14:13 UTC
I know this is a total newb question. How do I add the footer properly for the purposes of the log? Do you handle it or is there something I need to do?
Comment 41 Zijun Hu 2024-04-18 08:26:14 UTC
(In reply to Wren Turkal from comment #40)
> I know this is a total newb question. How do I add the footer properly for
> the purposes of the log? Do you handle it or is there something I need to do?

they are added by fixes submitter. yes. nothing need to do for  you currently. 
i will submit new patchset and add them for you based on the changes which take effect actually. 

thanks
Comment 42 Wren Turkal 2024-04-18 08:35:19 UTC
Okay. Let me know if you end up needing anything from me. I really appreciate your work on getting this fixed...谢谢
Comment 43 Paul Menzel 2024-04-18 08:46:54 UTC
> i also add your Signed-off-by for later updated patche serials [patch series]
> for this issue.

Shouldn’t

> Tested-by: Wren Turkal <…> # Dell XPS 13 9310 (QCA6390) with X BT mouse

be the correct tag?
Comment 44 Wren Turkal 2024-04-18 08:48:54 UTC
That sounds correct. Thanks for the correction. I really appreciate it, Paul.
Comment 45 Wren Turkal 2024-04-18 09:11:07 UTC
BTW, I ported the revert patch to Linus' latest. I can confirm that it works there as well.
Comment 46 Zijun Hu 2024-04-23 23:52:52 UTC
this issue was discucced at below link before it was reported to bugzilla.

https://lore.kernel.org/linux-bluetooth/ea20bb9b-6b60-47fc-ae42-5eed918ad7b4@quicinc.com/T/#m73d6a71d2f454bb03588c66f3ef7912274d37c6f

as commented at Commented 37.

it is verified by applied apply fix debugging changes over below bluetoot-next tree commit:
e00fc2700a3f Bluetooth: btusb: Fix triggering coredump implementation for QCA

it is verified by the reported device a Dell XPS 13 9310.