Created attachment 307246 [details]
dmesg shortly after booting with the "bad" kernel

In recent kernels my system keeps hanging when trying to resume from suspend. The fans power up, as do my keyboard LEDs, but the displays stay off and pressing Caps Lock does not toggle its LED. This means I can't get a log when the problem occurs.

My motherboard is an MSI MAG X670E TOMAHAWK WIFI with an onboard MediaTek wireless device, and I'm using Arch Linux.

lspci:
0f:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter

lsusb:
0e8d:0616 MediaTek Inc. Wireless_Device

git bisect points to ceac1cb0259de682d78f5c784ef8e0b13022e9d9 as the first bad commit. It's been difficult to pinpoint, because the bug isn't 100% consistent, but this commit has exhibited the bug every time I've tried it, and subsequent revisions nearly always do. I've tested the previous commit, 6dc22ab9f085ae165e4ce89d61fb426f94e8a969, several times, and it has resumed successfully every time.
Created attachment 307247 [details] dmesg shortly after booting, then successfully suspending and resuming with the "good" kernel
I've copied btusb.c and btmtk.c from 6dc22ab9f085 to a checkout of 6.13-rc2 and changed a few lines to make it compatible with some things that have changed since then:

diff --git a/drivers/bluetooth/btmtk.c b/drivers/bluetooth/btmtk.c
index fe3b892f6c6e..9eeddbb7d991 100644
--- a/drivers/bluetooth/btmtk.c
+++ b/drivers/bluetooth/btmtk.c
@@ -6,7 +6,7 @@
 #include <linux/firmware.h>
 #include <linux/usb.h>
 #include <linux/iopoll.h>
-#include <asm/unaligned.h>
+#include <linux/unaligned.h>
 
 #include <net/bluetooth/bluetooth.h>
 #include <net/bluetooth/hci_core.h>
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index 034256c399dd..0e5cc454e2f9 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -17,7 +17,7 @@
 #include <linux/suspend.h>
 #include <linux/gpio/consumer.h>
 #include <linux/debugfs.h>
-#include <asm/unaligned.h>
+#include <linux/unaligned.h>
 
 #include <net/bluetooth/bluetooth.h>
 #include <net/bluetooth/hci_core.h>
@@ -3887,8 +3887,8 @@ static int btusb_probe(struct usb_interface *intf,
 	if (id->driver_info & BTUSB_WIDEBAND_SPEECH)
 		set_bit(HCI_QUIRK_WIDEBAND_SPEECH_SUPPORTED, &hdev->quirks);
 
-	if (id->driver_info & BTUSB_VALID_LE_STATES)
-		set_bit(HCI_QUIRK_VALID_LE_STATES, &hdev->quirks);
+	if (!(id->driver_info & BTUSB_VALID_LE_STATES))
+		set_bit(HCI_QUIRK_BROKEN_LE_STATES, &hdev->quirks);
 
 	if (id->driver_info & BTUSB_DIGIANSWER) {
 		data->cmdreq_type = USB_TYPE_VENDOR;

Some guesswork was involved, but it seems to work for me.

I'd like to try to get to the bottom of the issue so I don't have to keep patching my kernel. Are there any options I could try? I've got plenty of experience with C, but not with the kernel, so if you could give me some guidance such as a summary of what changed in ceac1cb0259d, what code paths are taken during suspend/resume and any code tweaks I can try, it would be much appreciated.
Seems that #219290 is related.

Also discussed on the Fedora boards:
https://discussion.fedoraproject.org/t/system-cannot-wake-up/134199
https://discussion.fedoraproject.org/t/kernel-6-11-3-200-fc40-unable-to-resume-from-suspend-when-bluetooth-enabled/134008
I also bisected this and came to commit d019930b0049fc2648a6b279893d8ad330596e81, which is in the same area.

I also found similar reports:
https://bugzilla.kernel.org/show_bug.cgi?id=219290
https://bugzilla.redhat.com/show_bug.cgi?id=2314036
https://bbs.archlinux.org/viewtopic.php?id=295916
https://bbs.archlinux.org/viewtopic.php?id=299987
https://discussion.fedoraproject.org/t/kernel-6-11-3-200-fc40-unable-to-resume-from-suspend-when-bluetooth-enabled/134008/10
https://discussion.fedoraproject.org/t/system-cannot-wake-up/134199/39

A workaround that works for me and does not require patching the kernel is the following service:

# /etc/systemd/system/bt-fix.service
#
# Author: Bojan Kseneman
# https://discussion.fedoraproject.org/t/kernel-6-11-3-200-fc40-unable-to-resume-from-suspend-when-bluetooth-enabled/134008/17
[Unit]
Description=Disable Bluetooth before going to sleep
Before=sleep.target
StopWhenUnneeded=yes

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/rfkill block bluetooth
ExecStop=/usr/sbin/rfkill unblock bluetooth

[Install]
WantedBy=sleep.target

I also wrote to the linux-bluetooth mailing list, with a stack trace from a kernel oops:
https://lore.kernel.org/linux-bluetooth/073c3b772abe84d480913495eea0c4da73607d6e.camel@croquette.de/T/#u
*** Bug 219290 has been marked as a duplicate of this bug. ***
Quick update: the workaround with rfkill does not seem to work reliably for me. I am left with either not suspending or using an old kernel.
Since Bug 219290 is marked as a duplicate of this bug, I would like to mention that I needed to run "rfkill block wifi" too to fix resume from hibernate, as "rfkill block bluetooth" only fixed resume from suspend.
If you also need to kill wifi, you should replace "rfkill block bluetooth" with "rfkill block all" in the script. That should kill both BT and Wi-Fi.

Anyway, I've installed kernel 6.12.8 today, disabled the service, and cannot reproduce this issue anymore. Can someone else confirm this?
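The substitution described above can be sketched as a small shell helper. This is only an illustration: `switch_unit_to_all` is a made-up name, and the path in the usage comment assumes the unit was installed where the earlier comment placed it.

```shell
#!/bin/sh
# switch_unit_to_all UNIT_FILE  (hypothetical helper, not part of any package)
# Rewrites "rfkill block bluetooth" and "rfkill unblock bluetooth" so that
# both commands target "all" radios instead of only Bluetooth.
switch_unit_to_all() {
    unit=$1
    tmp=$(mktemp) || return 1
    # \1 keeps "rfkill block" or "rfkill unblock"; only the radio type changes.
    sed 's/\(rfkill \(un\)\{0,1\}block\) bluetooth/\1 all/' "$unit" > "$tmp" &&
        cat "$tmp" > "$unit"    # overwrite in place, keeping owner and mode
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Typical use (as root), followed by a reload so systemd sees the change:
#   switch_unit_to_all /etc/systemd/system/bt-fix.service
#   systemctl daemon-reload
```

Both the ExecStart and ExecStop lines need to change together. Otherwise Wi-Fi would be blocked on suspend but never unblocked on resume.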
Thank you Bojan. I now do this to suspend:

sudo rfkill block all && /usr/bin/systemctl suspend

And after waking up:

sudo rfkill unblock all

So far, it worked twice in a row. Let’s see!
Yes, use "all" instead of "bluetooth" in both. However, you don't need to call `/usr/bin/systemctl suspend`, as the service is hooked to sleep.target anyway.
I just tested with 6.12.8 and was able to resume from suspend 3 times in a row. That looks good. However, it took a long time to get the BT devices to work again. From dmesg:

[ 263.253098] PM: suspend exit
[ 263.340683] pci_bus 0000:03: Allocating resources
[ 263.348625] pci_bus 0000:03: Allocating resources
[ 263.365801] r8169 0000:0b:00.0 enp11s0: Link is Down
[ 263.386884] Bluetooth: hci0: HW/SW Version: 0x008a008a, Build Time: 20241106163512
[ 263.395274] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-b00:00: attached PHY driver (mii_bus:phy_addr=r8169-0-b00:00, irq=MAC)
[ 263.575549] r8169 0000:0b:00.0 enp11s0: Link is Down
[ 266.524999] r8169 0000:0b:00.0 enp11s0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 284.169397] Bluetooth: hci0: Device setup in 20437679 usecs
[ 284.169402] Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.
[ 284.444262] Bluetooth: hci0: AOSP extensions version v1.00
[ 284.444270] Bluetooth: hci0: AOSP quality report is supported
[ 284.444433] Bluetooth: MGMT ver 1.23
[ 292.361312] input: Logitech Wireless Mouse MX Master 3 as /devices/virtual/misc/uhid/.../input/input24
[ 292.361454] logitech-hidpp-device input,hidraw4: BLUETOOTH HID v0.15 Keyboard [Logitech Wireless Mouse MX Master 3]
[ 292.387566] logitech-hidpp-device: HID++ 4.5 device connected.

So it takes around 20 seconds to set up hci0, and 8 more seconds to find the mouse (though that may be because it was in sleep mode too). When I resume I see a black screen for a long time, maybe for those 20 seconds.
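The delays quoted above can be read straight off the dmesg timestamps. A minimal sketch, assuming dmesg's default "[seconds.microseconds]" prefix; `resume_gaps` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# resume_gaps: read dmesg-style lines on stdin and print, for every line
# after a "PM: suspend exit" marker, the seconds elapsed since that marker.
resume_gaps() {
    awk '
        {
            ts = $0
            sub(/^\[ */, "", ts)            # drop the leading "[" and padding
            sub(/\].*$/, "", ts)            # drop everything from "]" onward
            msg = $0
            sub(/^\[[^]]*\] */, "", msg)    # message text without the timestamp
        }
        /PM: suspend exit/ { start = ts + 0; seen = 1; next }
        seen { printf "+%.3f s  %s\n", ts - start, msg }
    '
}

# Example: show only the Bluetooth lines relative to the last resume.
#   dmesg | resume_gaps | grep Bluetooth
```

Applied to the log above, the "Device setup" line lands about 20.9 s after "PM: suspend exit" and the mouse's input line about 29.1 s after it, which matches the rough 20 s plus 8 s estimate.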
That's odd, it seems to work normally on MT9222. Resuming all devices takes about 3 s for me, but then again, I don't have a Bluetooth mouse.

[ 1557.845962] PM: resume devices took 3.106 seconds
Seems to be fixed by this commit: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/drivers/bluetooth/btusb.c?id=v6.12.8&id2=v6.12.7
Thanks for the info, this is the commit then:

commit f5c5661f02b5539d88aea8497f8d0835d165e945
Author: Chris Lu <chris.lu@mediatek.com>
Date:   Mon Sep 23 16:47:05 2024 +0800

    Bluetooth: btusb: mediatek: change the conditions for ISO interface

    commit defc33b5541e0a7e45cc2d99d72fbe80a597afc5 upstream.

    Change conditions for Bluetooth driver claiming and releasing usb ISO
    interface for MediaTek ISO data transmission.

    Signed-off-by: Chris Lu <chris.lu@mediatek.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Cc: Fedor Pchelkin <boddah8794@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

One more discussion about this issue:
https://www.reddit.com/r/Fedora/comments/1hv281d/mt7922_no_longer_causes_kernel_panic_on_resume/