Bug 219290 - AMD Ryzen 7900x freeze when resuming from sleep with 6.11
Summary: AMD Ryzen 7900x freeze when resuming from sleep with 6.11
Status: RESOLVED DUPLICATE of bug 219514
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Sleep-Wake
Hardware: AMD Linux
Importance: P3 normal
Assignee: acpi_power-sleep-wake
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-09-19 13:17 UTC by dsgluyu
Modified: 2025-01-04 05:36 UTC
CC List: 23 users

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
here is dmesg from linux-6.11.arch1-1-x86_64 (109.55 KB, text/plain)
2024-09-19 13:17 UTC, dsgluyu
lspci -kvv (36.36 KB, text/plain)
2024-09-19 13:28 UTC, dsgluyu

Description dsgluyu 2024-09-19 13:17:00 UTC
Created attachment 306895 [details]
here is dmesg from linux-6.11.arch1-1-x86_64

I initially discovered this issue while using linux-cachyos, and it can be reproduced when testing with linux-6.11.arch1-1-x86_64 from the archlinux core-testing repository.

When I press the power button, the screen wakes up and displays the last frame from before sleep, but the system does not respond to any input.

The logs stop after "systemd-sleep[4525]: Performing sleep operation 'suspend'..." so I have no idea how to troubleshoot this.

I also disabled sddm and tried resuming from sleep in a tty. After the monitor wakes up, the tty cursor blinks normally and the Num Lock and Caps Lock indicator lights work, but I can't enter any input. Reconnecting the keyboard did not help: the keyboard was not recognized.

My motherboard is MAG B650M MORTAR WIFI. I tried updating the BIOS to the latest version from MSI's official website, but the issue remains the same.
Comment 1 dsgluyu 2024-09-19 13:23:06 UTC
Reverting to 6.10 or using the LTS version doesn't cause this issue, so it seems 6.11 changed something related to wake-up?
Comment 2 dsgluyu 2024-09-19 13:28:19 UTC
Created attachment 306896 [details]
lspci -kvv
Comment 3 Artem S. Tashkinov 2024-09-19 13:47:57 UTC
Please report here instead:

https://gitlab.freedesktop.org/drm/amd/-/issues
Comment 4 dsgluyu 2024-09-19 15:14:09 UTC
(In reply to Artem S. Tashkinov from comment #3)
> Please report here instead:
> 
> https://gitlab.freedesktop.org/drm/amd/-/issues

It doesn't seem to be an AMD DRM issue. I disabled the AMD integrated graphics in the BIOS, and my discrete GPU is an Intel Arc A770.
Comment 5 Julian Weissgerber 2024-09-19 18:53:37 UTC
Also experiencing this on 6.11 with Ryzen 7800X3D on MSI X670E GAMING PLUS WIFI. Board firmware 7E16v181, Agesa 1.2.0.1. On kernel 6.10.x resume works fine.

Nothing conclusive is written to the logs here either.

Sep 18 23:20:53 xxx kernel: PM: suspend entry (deep)
Sep 18 23:20:53 xxx kernel: Filesystems sync: 0.036 seconds
-- Boot 630805d182244f7b82f162e50d41a1da --
Sep 19 09:44:36 xxx kernel: Linux version 6.11.0-cb1.0.fc40.x86_64 (mockbuild@5896c1ab416c4f639c7b1f6c31d7034d) (gcc (GCC) 14.2.1 20240801 (Red Hat 14.2.1-1), GNU ld version 2.41-37.fc40) >

$ uname -a
Linux xxx 6.11.0-cb2.0.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Sep 19 02:35:36 UTC 2024 x86_64 GNU/Linux
Comment 6 dsgluyu 2024-09-20 06:41:13 UTC
I find my issue is caused by the VMware network device so it shouldn’t be reported there.
Comment 7 dsgluyu 2024-09-20 06:56:15 UTC
(In reply to dsgluyu from comment #6)
> I find my issue is caused by the VMware network device so it shouldn’t be
> reported there.

It was a false alarm—this issue can't be consistently reproduced.
Comment 8 dsgluyu 2024-09-20 08:49:53 UTC
I'm trying to bisect, but there's another bug, which was fixed in 6.11.0, that prevents the bisected kernel from booting on my motherboard, making it hard to verify the wake-up issue.

The bug causes the following error:

kernel: BUG: unable to handle page fault for address: ffffffffffffff98
kernel: #PF: supervisor read access in kernel mode
kernel: #PF: error_code(0x0000) - not-present page

Does anyone know when this bug was fixed?
Comment 9 dsgluyu 2024-09-25 14:28:24 UTC
I previously thought the freeze occurred during the wake-up phase, but I've found in the past couple of days that it actually happens during the suspend phase. It seems very likely to be a DRM-related issue.
Comment 10 Markus Strobl 2024-10-13 15:43:50 UTC
Can confirm this issue. It happens every time I try to suspend-to-RAM. 

Kubuntu 24.10 with 6.11 Kernel. Ryzen 5800X3D. NVidia 3070 with driver 560.35.03. Have also tried NVidia drivers 555 and 550 as well as Nouveau and all act the same, i.e. suspend freezes the computer.

My symptoms: When trying to suspend I get a black screen. Sometimes the backlight is on, sometimes off. When the backlight is on, the screen is sometimes all black and sometimes shows a blinking cursor in the upper left corner. Once I had a working mouse pointer on a black screen. Keyboard and mouse stay powered up. Fans keep running. I have to press the reset button to get it to reboot.

Reverting to Kernel 6.10.10 fixes it for me.
Comment 11 Markus Strobl 2024-10-13 15:50:14 UTC
(In reply to Markus Strobl from comment #10)
Reading my comment, I realize it is misleading: the problem occurs when resuming from suspend. The PC enters sleep OK, then freezes with a black screen when resuming. Sorry for being unclear.
Comment 12 dsgluyu 2024-10-13 16:03:04 UTC
I eventually found out it is a Bluetooth- or Wi-Fi-related issue. The problem occurs when both are turned on at the same time, but not when Wi-Fi is on and Bluetooth is off. I haven't tried Wi-Fi off with Bluetooth on. The motherboard carries an RZ616 wireless card, also known as MT7922.
Comment 13 Bernard Cafarelli 2024-10-14 13:52:17 UTC
Thanks for the info, I added a systemd service to stop/start bluetooth around suspend, and system now wakes up properly from suspend again on 6.11 (with that workaround)

Tested on kernel 6.11.3, Asus TUF GAMING X670E-PLUS WIFI, similar card MEDIATEK Corp. MT7922
Comment 14 Markus Strobl 2024-10-14 14:43:34 UTC
Just tried turning BT and WIFI off on my MSI B550 MAX WIFI motherboard and unfortunately kernel 6.11.0 still freezes when resuming.
Comment 15 Adrian 2024-10-14 17:45:39 UTC
I've been plagued by this for a couple weeks too: MSI Mag B650m and 7800x3d. Disabling bluetooth before sleep solved my issue so I guess I need to learn how to write my own systemd services. This is just a little generic bluetooth dongle sat in the front ports of my case.
Comment 16 Markus Strobl 2024-10-14 18:14:35 UTC
Did some more testing. Like I said in comment #11: Disabling bluetooth in BIOS did not fix it. However, when I disable bluetooth in the KDE desktop (and have it enabled in BIOS) resume DOES work. 

Also noticed another bug which is probably a bug in the KDE BT app: After I disabled BT and did a suspend-resume it showed as disabled but it is actually enabled. A second suspend-resume will give the familiar frozen system. If I click enable-disable BT a few times it will eventually sync back up and show the correct status. If I then disable BT I can suspend-resume successfully again.

So yeah, I think we have narrowed it down to a fault in Bluetooth. It is interesting that the system still freezes even if BT is off in BIOS; I have to disable it in the desktop environment.
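
Since the desktop applet can show the wrong Bluetooth status, as described above, it may help to read the kernel's rfkill state directly. The following is a minimal sketch, assuming the standard sysfs rfkill attributes (`type`, `name`, `soft`, `hard`) under `/sys/class/rfkill`; the function name `bluetooth_switches` is made up for illustration and is not taken from any script linked in this bug.

```python
#!/usr/bin/env python3
"""List Bluetooth rfkill switches and whether each one is blocked."""
from pathlib import Path


def bluetooth_switches(root="/sys/class/rfkill"):
    """Return (name, soft_blocked, hard_blocked) for each Bluetooth switch."""
    found = []
    for dev in sorted(Path(root).glob("rfkill*")):
        try:
            # Skip wlan and other radio types; only Bluetooth is of interest.
            if (dev / "type").read_text().strip() != "bluetooth":
                continue
            name = (dev / "name").read_text().strip()
            soft = (dev / "soft").read_text().strip() == "1"
            hard = (dev / "hard").read_text().strip() == "1"
        except OSError:
            continue  # switch disappeared or an attribute is unreadable
        found.append((name, soft, hard))
    return found


if __name__ == "__main__":
    for name, soft, hard in bluetooth_switches():
        print(f"{name}: soft_blocked={soft} hard_blocked={hard}")
```

If a switch reports soft_blocked=False while the applet shows Bluetooth as disabled, the two are out of sync and a suspend attempt would presumably still hit the freeze.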
Comment 17 dsgluyu 2024-10-15 06:03:56 UTC
(In reply to Adrian from comment #15)
> I've been plagued by this for a couple weeks too: MSI Mag B650m and 7800x3d.
> Disabling bluetooth before sleep solved my issue so I guess I need to learn
> how to write my own systemd services. This is just a little generic
> bluetooth dongle sat in the front ports of my case.

So you're using a USB Bluetooth adapter and the motherboard didn't come with one? If that's the case, then it doesn't seem to be a problem with the wireless card. Is there a better place to ask for an actual solution?
Comment 18 dsgluyu 2024-10-15 06:16:19 UTC
(In reply to Markus Strobl from comment #16)
> Did some more testing. Like I said in comment #11: Disabling bluetooth in
> BIOS did not fix it. However, when I disable bluetooth in the KDE desktop
> (and have it enabled in BIOS) resume DOES work. 

I also tried disabling Bluetooth in the BIOS, but I did not see the behavior you describe: the system woke up fine.
Comment 19 dsgluyu 2024-10-16 02:08:51 UTC
If the USB Bluetooth adapter has issues, the problem might be with the USB itself, since even wireless network cards use USB for Bluetooth.
Comment 20 Julian Weissgerber 2024-10-29 06:41:55 UTC
Disabling Bluetooth also makes sleep stable again on MSI X670E GAMING PLUS WIFI with 6.12.0-rc4.
The convenient workaround for me is to do this with a simple script for systemd: https://gitlab.com/-/snippets/3762978
Comment 21 Tim Richardson 2024-11-18 22:01:56 UTC
I concur with these reports on X670E Asus Proart Creator motherboard (that is, onboard bluetooth). Disabling Bluetooth in UEFI allows power-down events (reboot, shutdown) to succeed.

My concern is not suspend/resume. 
I however do want the machine to reboot and shutdown properly, particularly in response to UPS events.


A systemd service that runs rfkill on Bluetooth, ordered with

Before=shutdown.target reboot.target halt.target

also fixes the problem (with bluetooth enabled at UEFI). 

6.11 and 6.12 RCs have the problem.
Comment 22 Michael Husmann 2024-11-19 13:45:12 UTC
Julian Weissgerber's script works perfectly on my system:
MSI B650 GAMING PLUS WIFI (MS-7E26) with 6.11.9-arch1-1
Comment 23 Julian Grinblat 2024-11-24 08:13:01 UTC
The same issue is affecting me on MSI MEG ACE X670E, and disabling Bluetooth restores resume-from-suspend.
Comment 24 kernel-7xes 2024-12-05 17:27:22 UTC
I have the same issue on a Gigabyte B650E AORUS MASTER + AMD Ryzen 9 7950X. The system freezes after the screen contents are displayed on resume. rfkill block bluetooth fixed the freeze on resume from suspend, though I also had to rfkill block wifi to fix the same freeze on resume from hibernate.
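
For anyone who wants to automate this, here is a minimal sketch of a systemd-sleep hook. It relies on the documented behavior that systemd-sleep runs executables in /usr/lib/systemd/system-sleep/ with two arguments: "pre" or "post", followed by the operation ("suspend", "hibernate", and so on). The helper name `rfkill_commands` is hypothetical, and the choice to also block wifi only for hibernate follows the report above rather than any confirmed root cause.

```python
#!/usr/bin/env python3
"""systemd-sleep hook sketch: block radios before sleep, unblock after resume."""
import subprocess
import sys


def rfkill_commands(phase, operation):
    """Return the rfkill invocations for one hook call.

    phase is "pre" or "post"; operation is e.g. "suspend" or "hibernate".
    """
    if phase not in ("pre", "post"):
        return []
    action = "block" if phase == "pre" else "unblock"
    radios = ["bluetooth"]
    # Assumption from the report above: hibernate also needs wifi blocked.
    # This matches "hibernate" and "suspend-then-hibernate".
    if "hibernate" in operation:
        radios.append("wifi")
    return [["rfkill", action, radio] for radio in radios]


def main(argv):
    if len(argv) < 3:
        return 0  # not called by systemd-sleep; do nothing
    for cmd in rfkill_commands(argv[1], argv[2]):
        subprocess.run(cmd, check=False)
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

The file would need to be executable and owned by root. Note that this sketch unblocks the radios unconditionally after resume, so Bluetooth comes back on even if it was off before sleep.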
Comment 25 Guy Kendall 2024-12-21 15:26:54 UTC
Is anyone actively working on a fix for this issue? Suspend has been broken in all the 6.11.x and 6.12.x kernels if you have a Mediatek Bluetooth/Wi-Fi chip. Here is the one I have on my ASUS ProArt Creator 670e motherboard:

Network:
  Device-1: MEDIATEK MT7922 802.11ax PCI Express Wireless Network Adapter
    driver: mt7921e

Symptoms are a hang on resuming from suspend. The light on my power button quits flashing and the fans turn on, indicating that it started to resume, but the system is completely locked up, requiring an unsafe power down or SysRq REISUB. There are no entries in journalctl after the suspend, which makes debugging this difficult.

The issue happens on every resume from suspend. Disabling Bluetooth in the Gnome settings made the issue happen only about 1 in 5 or 10 times instead of every time. Creating a systemd service to rfkill the problem chip seems to prevent it completely, as documented here:

https://github.com/alimert-t/suspend-freeze-fix-for-mt7921e
https://github.com/glexposito/bluetooth-sleep-toggle

There are reports all over the Fedora and Arch forums reporting this issue, here is one thread:
https://discussion.fedoraproject.org/t/kernel-6-11-3-200-fc40-unable-to-resume-from-suspend-when-bluetooth-enabled/134008

So again I ask, is anyone actively working on fixing this?
Comment 26 Janfi 2024-12-23 13:44:26 UTC
Same issue on Thinkpad E14 Gen4, with same adapter/driver :
Network controller: MEDIATEK Corp. MT7921 802.11ax PCI Express Wireless Network Adapter

It's a really annoying regression: it worked fine "out of the box" for 2 years, and for the last several major kernel releases suspend has been completely broken without the systemd workaround.
Comment 27 jackyzy823 2024-12-25 01:44:35 UTC
Hi all. I am affected by this issue too.
Yesterday, I tested Fedora's [kernel-6.13.0-0.rc4.36.fc42](https://bodhi.fedoraproject.org/updates/FEDORA-2024-c03e8afd7a) and it looks like the bug has been fixed and the machine is able to resume from sleep.  I'm not sure if this works for other distributions.



Some tests I've done:

kernel-6.11.0-0.rc0.20240716gitd67978318827.2.fc41  This one is good.
kernel-6.11.0-0.rc0.20240717git51835949dda3.5.fc41  failed to boot.
kernel-6.11.0-0.rc1.20240731gite4fc196f5ba3.18.fc41  failed to boot.
kernel-6.11.0-0.rc1.20240802gitc0ecd6388360.20.fc41  failed to resume from sleep.
kernel-6.11.0-0.rc2.20240806gitb446a2dae984.24.fc41  failed to resume from sleep.
kernel-6.12.6-200.fc41  still failed to resume from sleep.
kernel-6.13.0-0.rc4.36.fc42 this one is good again.
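
The results above already bound the regression. As a rough sketch, the snippet below picks the last good and first bad build from that list and prints a matching git bisect command. It assumes the 12-character strings in the Fedora package names are abbreviated mainline commit hashes; the outcome labels ("good", "no-boot", "bad") and the function name `regression_window` are made up for illustration.

```python
# Results copied from the list above: (build date, commit, outcome).
RESULTS = [
    ("20240716", "d67978318827", "good"),
    ("20240717", "51835949dda3", "no-boot"),
    ("20240731", "e4fc196f5ba3", "no-boot"),
    ("20240802", "c0ecd6388360", "bad"),
    ("20240806", "b446a2dae984", "bad"),
]


def regression_window(results):
    """Return (last good commit, first bad commit) from the results.

    Builds that did not boot are skipped because they say nothing about resume.
    """
    last_good = None
    for _date, commit, outcome in sorted(results):
        if outcome == "good":
            last_good = commit
        elif outcome == "bad" and last_good is not None:
            return last_good, commit
    return last_good, None


if __name__ == "__main__":
    good, bad = regression_window(RESULTS)
    # git bisect start takes the bad revision first, then the good one.
    print(f"git bisect start {bad} {good}")
```

The two builds that failed to boot fall inside this window, so a bisect over it would likely need git bisect skip for commits hit by the unrelated boot failure.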
Comment 28 Alesh 2024-12-26 14:27:36 UTC
6.13.0 rc4 compiled from source still exhibits problems with sleep/resume. 

The first sleep/wake cycle seems to always work correctly, but on subsequent attempts, the sleep seems to fail more often than not (power indicator light is on, display is on but blank, fans spinning). Sometimes the system wakes on its own after the failed sleep attempt.

When the system does correctly sleep, it seems to always wake.
Comment 29 Perroboc 2024-12-30 19:28:38 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=219514 proposes a fix to this issue after bisecting and finding the guilty commit.
Comment 30 jokeyrhyme 2024-12-30 21:47:17 UTC
(In reply to dsgluyu from comment #8)
> I'm trying to bisect, but there's another bug, which was fixed in 6.11.0,
> that prevents the bisected kernel from booting on my motherboard, making it
> hard to verify the wake-up issue.
> 
> The bug causes the following error:
> 
> kernel: BUG: unable to handle page fault for address: ffffffffffffff98
> kernel: #PF: supervisor read access in kernel mode
> kernel: #PF: error_code(0x0000) - not-present page
> 
> Does anyone know when this bug was fixed?

This looks a lot like https://gitlab.freedesktop.org/drm/amd/-/issues/3868 

Do you also have `kernel: RIP: 0010:copy_stream_update_to_stream.isra.0+0x30d/0x740 [amdgpu]` or similar in the call trace?
Comment 31 Tony Houghton 2024-12-31 12:54:22 UTC
(In reply to Perroboc from comment #29)
> https://bugzilla.kernel.org/show_bug.cgi?id=219514 proposes a fix to this
> issue after bisecting and finding the guilty commit.

That isn't a proposed fix, just a hack to patch the old version of the btusb driver into newer kernels. It's not working quite perfectly: it seems to take a long time for BT to start up after booting or resuming. If configuring systemd or whatever to disable bluetooth at suspend works, that seems a far more practical workaround for now.

I don't know if there's a separate issue with AMD, but could those of you who can confirm a Mediatek BT adapter is to blame please subscribe to #219514?
Comment 32 Artem S. Tashkinov 2025-01-04 05:36:43 UTC
Let's not spawn duplicates. It looks to be a BT issue tracked in bug 219514

*** This bug has been marked as a duplicate of bug 219514 ***
