Bug 204275
Summary: | bluetoothd consumes 100% cpu on keyboard disconnect | ||
---|---|---|---|
Product: | Drivers | Reporter: | Steven Newbury (steve) |
Component: | Bluetooth | Assignee: | linux-bluetooth (linux-bluetooth) |
Status: | NEW --- | ||
Severity: | normal | CC: | luiz.dentz |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.x + | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
Patch to fix high CPU usage with HID keyboard and other devices
Remove sec_watch, instead attempt immediate connection |
Description
Steven Newbury
2019-07-22 13:36:29 UTC
I've tried adding g_source_remove()s for ctrl_watch and intr_watch before shutting down the io channels but it doesn't make any difference. strace results in continuous repeating: poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN}, {fd=22, events=POLLIN}, {fd=23, events=POLLIN}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN}, {fd=44, events=POLLIN}, {fd=45, events=POLLIN}, {fd=46, events=POLLIN}, {fd=47, events=POLLIN}, {fd=48, events=POLLIN}, {fd=49, events=POLLIN}, {fd=50, events=POLLIN}, {fd=51, events=POLLIN}, {fd=52, events=POLLIN}, {fd=53, events=POLLIN}, {fd=55, events=POLLOUT}], 28, -1) = 1 ([{fd=55, revents=POLLNVAL}]) Ahhh... I was already aware that on first connect the HID input device wasn't created, but I hadn't realised it's part of the same bug. What's happening is that on initial connect: dev->sec_watch = g_io_add_watch(idev->intr_io, G_IO_OUT, encrypt_notify, idev); creates the notify callback. But it never triggers. On second connection the callback gets triggered, but the connect code gets run again so another refcount is added. This means the intr channel never gets closed which means it's stuck on the last event. I flipped the G_IO_OUT to G_IO_IN and it started working on first connect. I don't know why it was G_IO_OUT? Created attachment 283925 [details]
Patch to fix high CPU usage with HID keyboard and other devices
Ive sent a fix upstream: https://lore.kernel.org/linux-bluetooth/20190724110151.4258-1-luiz.dentz@gmail.com/T/#u Let me know if that works. (In reply to Luiz Von Dentz from comment #6) > Ive sent a fix upstream: > > https://lore.kernel.org/linux-bluetooth/20190724110151.4258-1-luiz. > dentz@gmail.com/T/#u > > Let me know if that works. That will prevent the channel getting left dangling, but it doesn't address the issue of the callback not happening on initial connect, which is AFAICT what results in that part of the bug for me. Is it really supposed to be waiting for G_IO_OUT? It seems it only gets triggered for me when a second connection is attempted on the dangling channel, presumably during handshake..? Luiz, did you see (In reply to Luiz Von Dentz from comment #6) > Ive sent a fix upstream: > > https://lore.kernel.org/linux-bluetooth/20190724110151.4258-1-luiz. > dentz@gmail.com/T/#u > > Let me know if that works. Luiz, did you see the patch I attached? Perhaps it's better to remove the old watcher first than not add another if it exists, but it shouldn't happen anyway. The explicit remove should happen to catch the case where the connection is broken before being fully initialised but the callback should get triggered once data starts to flow over the encrypted channel. In my tests, allowing the channel to fully close by removing the sec_watch prevents the keyboard from ever connecting, this is because the callback never gets triggered. I believe the bug is caused by the refcount sec_watch creates not being automatically removed because the callback is left pending, it doesn't get removed on channel shutdown, but has the side-effect of making it work with the channel being already open when the device reconnects because it triggers the callback. What made it work every time is changing sec_watch to wait for G_IO_IN instead of G_IO_OUT, but presumably it is G_IO_OUT for a reason? Im not sure we even need the watch in the first place, the kernel should block any in/out until the connection is encrypted (BT_SK_SUSPEND) so it might be possible to get rid of sec_watch altogether. (In reply to Luiz Von Dentz from comment #9) > Im not sure we even need the watch in the first place, the kernel should > block any in/out until the connection is encrypted (BT_SK_SUSPEND) so it > might be possible to get rid of sec_watch altogether. Going through the history it originally worked that way, I'm not sure why it was changed..? (In reply to Steven Newbury from comment #10) > (In reply to Luiz Von Dentz from comment #9) > > Im not sure we even need the watch in the first place, the kernel should > > block any in/out until the connection is encrypted (BT_SK_SUSPEND) so it > > might be possible to get rid of sec_watch altogether. > > Going through the history it originally worked that way, I'm not sure why it > was changed..? Well lets try to remove it then and see if that fix the problem. (In reply to Luiz Von Dentz from comment #11) > (In reply to Steven Newbury from comment #10) > > (In reply to Luiz Von Dentz from comment #9) > > > Im not sure we even need the watch in the first place, the kernel should > > > block any in/out until the connection is encrypted (BT_SK_SUSPEND) so it > > > might be possible to get rid of sec_watch altogether. > > > > Going through the history it originally worked that way, I'm not sure why > it > > was changed..? > > Well lets try to remove it then and see if that fix the problem. That works. I'll attach a patch on here. Created attachment 284073 [details]
Remove sec_watch, instead attempt immediate connection
(In reply to Steven Newbury from comment #13) > Created attachment 284073 [details] > Remove sec_watch, instead attempt immediate connection Could you please send as a patch to linux-bluetooth? |