When resuming from hibernation (suspend to disk) I got this error from mt7921e.
[T29172] mt7921e 0000:01:00.0: Message 00020007 (seq 11) timeout
[T29172] mt7921e 0000:01:00.0: PM: dpm_run_callback(): pci_pm_restore+0x0/0x90 returns -110
[T29172] mt7921e 0000:01:00.0: PM: failed to restore async: error -110
[T29172] mt7921e 0000:01:00.0: HW/SW Version: 0x8a108a10, Build Time: 20220311230842a
[T29172]
[T29172] mt7921e 0000:01:00.0: WM Firmware Version: ____010000, Build Time: 20220311230931
Full dmesg:
https://gitlab.freedesktop.org/drm/amd/uploads/4ae31a3d6b9a7db839943c16e06d8704/Ryzen-5650U_6.1.26-with-8cf17c25e_hibernation-wakeup.txt
Came up as part of a different problem:
Ryzen 3500U and 5650U: StandBy and External Monitors broken since >= 6.1
https://gitlab.freedesktop.org/drm/amd/-/issues/2492#note_1894147
Maybe related:
wifi mediatek mt7921e problem after suspend
https://bugzilla.kernel.org/show_bug.cgi?id=215463
Maybe https://patchwork.kernel.org/project/linux-wireless/patch/19f1aae1ab9ea867eb42742fc5b72ed4d7307b0a.1687159671.git.deren.wu@mediatek.com/ helps
(In reply to Mario Limonciello (AMD) from comment #1)
> Maybe
> https://patchwork.kernel.org/project/linux-wireless/patch/19f1aae1ab9ea867eb42742fc5b72ed4d7307b0a.1687159671.git.deren.wu@mediatek.com/
> helps
I've tested Linux-6.1.43 which includes that patch.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=97ccc14d114b1cf3bc16670fa09f74ec7233b643
dmesg shows these errors multiple times. This is the first occurrence.
[ T1314] mt7921e 0000:01:00.0: enabling device (0000 -> 0002)
[ T1314] mt7921e 0000:01:00.0: ASIC revision: 79610010
[ T404] mt7921e 0000:01:00.0: HW/SW Version: 0x8a108a10, Build Time: 20230526130917a
[ T404] mt7921e 0000:01:00.0: WM Firmware Version: ____010000, Build Time: 20230526130958
[ T3390] mt7921e 0000:01:00.0: Message 00020007 (seq 4) timeout
[ T3390] mt7921e 0000:01:00.0: PM: dpm_run_callback(): pci_pm_restore+0x0/0x90 returns -110
[ T3390] mt7921e 0000:01:00.0: PM: failed to restore async: error -110
Later on there are also multiple stack traces involving net/mac80211/rx.c. This is the first occurrence.
[ T1410] ------------[ cut here ]------------
[ T1410] WARNING: CPU: 2 PID: 1410 at /home/myuser/opt/linux-kernel/build.backup-exclude-m461c/build_bisect/worktree/net/mac80211/rx.c:5169 ieee80211_rx_list+0x588/0xc60 [mac80211]
[...]
[ T1410] CPU: 2 PID: 1410 Comm: napi/phy0-8193 Tainted: G W E 6.1.43-v6.1.43 #18
[ T1410] Hardware name: HP HP EliteBook 845 G8 Notebook PC/8895, BIOS T82 Ver. 01.13.01 03/31/2023
[ T1410] RIP: 0010:ieee80211_rx_list+0x588/0xc60 [mac80211]
[ T1410] Code: ff 8b b5 18 05 00 00 85 f6 74 0b a9 00 00 04 00 0f 84 98 00 00 00 48 89 df e8 a4 06 60 ea e9 6d fb ff ff 0f 0b e9 59 fb ff ff <0f> 0b e9 52 fb ff ff 8b 85 78 15 00 00 85 c0 0f 84 10 fe ff ff e9
[ T1410] RSP: 0018:ffffb40a006b7c20 EFLAGS: 00010246
[ T1410] RAX: 000000ff0000ff00 RBX: ffff8eba878cd600 RCX: ffff8eb99a602198
[ T1410] RDX: ffff8eb99a6003a0 RSI: 0000000000000000 RDI: ffff8eb99a6008e0
[ T1410] RBP: ffff8eb99a6008e0 R08: 00000000ffffffa6 R09: ffffb40a006b7d30
[ T1410] R10: 000000000000143c R11: 000000002d1db16d R12: ffff8eb99a602080
[ T1410] R13: 0000000000000000 R14: ffff8eb99a603748 R15: 0000000000000000
[ T1410] FS: 0000000000000000(0000) GS:ffff8ec7ce680000(0000) knlGS:0000000000000000
[ T1410] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ T1410] CR2: 00007f1a789f7020 CR3: 0000000094010000 CR4: 0000000000750ee0
[ T1410] PKRU: 55555554
[ T1410] Call Trace:
[ T1410] <TASK>
[ T1410] ? __warn+0x7d/0x140
[ T1410] ? ieee80211_rx_list+0x588/0xc60 [mac80211]
[ T1410] ? report_bug+0xf8/0x1e0
[ T1410] ? handle_bug+0x44/0x80
[ T1410] ? exc_invalid_op+0x13/0x60
[ T1410] ? asm_exc_invalid_op+0x16/0x20
[ T1410] ? ieee80211_rx_list+0x588/0xc60 [mac80211]
[ T1410] ? swiotlb_tbl_map_single+0x5d6/0x6b0
[ T1410] mt76_rx_complete+0x198/0x2e0 [mt76]
[ T1410] ? swiotlb_map+0x96/0x260
[ T1410] mt76_rx_poll_complete+0x373/0x570 [mt76]
[ T1410] ? mt76_dma_rx_poll+0x25d/0x480 [mt76]
[ T1410] mt76_dma_rx_poll+0x25d/0x480 [mt76]
[ T1410] ? __napi_poll+0x1b0/0x1b0
[ T1410] mt7921_poll_rx+0x4a/0xe0 [mt7921e]
[ T1410] __napi_poll+0x29/0x1b0
[ T1410] ? napi_threaded_poll+0x80/0x100
[ T1410] napi_threaded_poll+0x9d/0x100
[ T1410] kthread+0xd9/0x100
[ T1410] ? kthread_complete_and_exit+0x20/0x20
[ T1410] ret_from_fork+0x22/0x30
[ T1410] </TASK>
[ T1410] ---[ end trace 0000000000000000 ]---
Finally, the system crashes when waking from standby. The crash itself is more likely an amdgpu driver issue, but there's also another stack trace involving mt7921e which I could recover from pstore.
[ T1410] ------------[ cut here ]------------
[ T1410] WARNING: CPU: 9 PID: 1410 at /home/myuser/opt/linux-kernel/build.backup-exclude-m461c/build_bisect/worktree/net/mac80211/rx.c:5169 ieee80211_rx_list+0x588/0xc60 [mac80211]
[...]
[ T1410] CPU: 9 PID: 1410 Comm: napi/phy0-8193 Tainted: G W E 6.1.43-v6.1.43 #18
[ T1410] Hardware name: HP HP EliteBook 845 G8 Notebook PC/8895, BIOS T82 Ver. 01.13.01 03/31/2023
[ T1410] RIP: 0010:ieee80211_rx_list+0x588/0xc60 [mac80211]
[ T1410] Code: ff 8b b5 18 05 00 00 85 f6 74 0b a9 00 00 04 00 0f 84 98 00 00 00 48 89 df e8 a4 06 60 ea e9 6d fb ff ff 0f 0b e9 59 fb ff ff <0f> 0b e9 52 fb ff ff 8b 85 78 15 00 00 85 c0 0f 84 10 fe ff ff e9
[ T1410] RSP: 0018:ffffb40a006b7c20 EFLAGS: 00010246
[ T1410] RAX: 000000ff0000ff00 RBX: ffff8ec16adfca00 RCX: ffff8eb99a602198
[ T1410] RDX: ffff8eb99a6003a0 RSI: 0000000000000000 RDI: ffff8eb99a6008e0
[ T1410] RBP: ffff8eb99a6008e0 R08: 00000000ffffffa7 R09: ffffb40a006b7d30
[ T1410] R10: 000000000000143c R11: 0000000004c72c40 R12: ffff8eb99a602080
[ T1410] R13: 0000000000000000 R14: ffff8eb99a603748 R15: 0000000000000000
[ T1410] FS: 0000000000000000(0000) GS:ffff8ec7ce840000(0000) knlGS:0000000000000000
[ T1410] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ T1410] CR2: 0000000000000000 CR3: 0000000094010000 CR4: 0000000000750ee0
[ T1410] PKRU: 55555554
[ T1410] Call Trace:
[ T1410] <TASK>
[ T1410] ? __warn+0x7d/0x140
[ T1410] ? ieee80211_rx_list+0x588/0xc60 [mac80211]
[ T1410] ? report_bug+0xf8/0x1e0
[ T1410] ? handle_bug+0x44/0x80
[ T1410] ? exc_invalid_op+0x13/0x60
[ T1410] ? asm_exc_invalid_op+0x16/0x20
[ T1410] ? ieee80211_rx_list+0x588/0xc60 [mac80211]
[ T1410] ? swiotlb_tbl_map_single+0x5d6/0x6b0
[ T1410] mt76_rx_complete+0x198/0x2e0 [mt76]
[ T1410] ? swiotlb_map+0x96/0x260
[ T1410] mt76_rx_poll_complete+0x373/0x570 [mt76]
[ T1410] ? mt76_dma_rx_poll+0x25d/0x480 [mt76]
[ T1410] mt76_dma_rx_poll+0x25d/0x480 [mt76]
[ T1410] ? __napi_poll+0x1b0/0x1b0
[ T1410] mt7921_poll_rx+0x4a/0xe0 [mt7921e]
[ T1410] __napi_poll+0x29/0x1b0
[ T1410] ? napi_threaded_poll+0x80/0x100
[ T1410] napi_threaded_poll+0x9d/0x100
[ T1410] kthread+0xd9/0x100
[ T1410] ? kthread_complete_and_exit+0x20/0x20
[ T1410] ret_from_fork+0x22/0x30
[ T1410] </TASK>
[ T1410] ---[ end trace 0000000000000000 ]---
See here for full logs and the mentioned amdgpu driver issue:
https://gitlab.freedesktop.org/drm/amd/-/issues/2492#note_2043652
Created attachment 306000 [details]
dmesg with hibernation and failure to start wifi
I'm getting an ongoing issue with the same symptoms. Coming up from hibernate, wifi is completely dead and similar messages appear in dmesg. Running `rmmod mt7921e` followed by `modprobe mt7921e` fixes it, except that sometimes the `rmmod` refuses to finish and, more importantly, won't respond to a `kill -9` and blocks reboot indefinitely. I'll try to get a `dmesg` recording next time the `rmmod` fails to finish, but I don't know if it shows anything. Attached is the dmesg for a failure coming up out of hibernate.
Created attachment 306040 [details]
dmesg with failure to rmmod
Attaching a dmesg from when I tried to `rmmod` and it timed out and failed. This happens once every several times I do it.
Created attachment 306041 [details]
failure to reboot after rmmod stalls
And a photo of the output that eventually shows up when the rmmod gets stuck and I try rebooting.
I'm having the same issue on my desktop PC as well: Ryzen 5 5600X, MSI X470 Gaming Pro, Radeon RX 6800, and MT7922. I get the exact same error after my device wakes up from suspend. It doesn't happen every time, but most of the time. Running Arch Linux with Linux 6.9.2.
@hurricanepootis
mt7921e is a wireless network driver for the MT7922. But your mainboard doesn't seem to have a wireless network chip like the MT7922.
https://www.msi.com/Motherboard/X470-GAMING-PRO/Specification
Please share hardware details about your wireless network card. Do you use an extra PCI or USB wireless network card?
And can you share a dmesg log after waking from suspend?
sudo dmesg
Thanks!
@Alex Maras
From your dmesg log I guess this is your computer. Please reply if this isn't correct.
[ 0.000000] DMI: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.03 10/17/2023
Framework seems to have official Linux support.
https://knowledgebase.frame.work/en_us/can-i-install-linux-By6nAJ7td
You might ask Framework support if they can help with this problem.
https://framework.kustomer.help/en_us/contact/support-request-ryon9uAuq
I searched the Framework forum.
https://community.frame.work/tag/linux
There are some Linux users having similar problems with the mt7921e driver.
https://community.frame.work/t/framework-13-amd-on-arch-issues-with-wireless-after-resume/44597
https://community.frame.work/t/responded-issues-on-arch-linux-with-rz616-on-framework-13-amd-7040-on-linux-kernel-6-5-7/38404
https://community.frame.work/t/tracking-unstable-and-unreliable-wlan-rz616-mt7922-fw13-amd-diy/40316
Placing new firmware into /lib/firmware/ seems to be an interesting idea discussed there.
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
I didn't read fully through those. Maybe have a look if you can spare some time.
@kolAflash, I am using a generic M.2 Key E Wi-Fi to PCIe x1 adapter, with a cable running to a motherboard USB 2.0 header for the Bluetooth. As far as I am aware, there is no actual chip handling any logic on the adapter; it just breaks out what's needed from the M.2 slot. As a side note, in the past I have used an Intel AX210 in that adapter and had no problems with suspend, and I also briefly used a MediaTek MT7921 but cannot recall whether or not I had problems with it.
Same issue, with Lenovo ARX8 (R9-7945HX). This is the result of `lspci -kvv`:
```
04:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
    Subsystem: Lenovo Device e0c6
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 123
    IOMMU group: 1
    Region 0: Memory at 3ffc02000000 (64-bit, prefetchable) [size=1M]
    Region 2: Memory at d1700000 (64-bit, non-prefetchable) [size=32K]
    Capabilities: [80] Express (v2) Endpoint, IntMsgNum 0
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75W TEE-IO-
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <2us, L1 <8us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x1
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
            10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix-
            EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
            FRS- TPHComp- ExtTPHComp-
            AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
            AtomicOpsCtl: ReqEn-
            IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
            10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
        LnkCap2: Supported Link Speeds: 2.5-5GT/s, Crosslink- Retimer- 2Retimers- DRS-
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete- EqualizationPhase1-
            EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
            Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [e0] MSI: Enable+ Count=1/32 Maskable+ 64bit+
        Address: 00000000fee00000 Data: 0000
        Masking: fffffffe Pending: 00000000
    Capabilities: [f8] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Kernel driver in use: mt7921e
    Kernel modules: mt7921e
```
Adding the kernel parameter `mt7921e.disable_aspm=y` worked around this issue on my machine. I am using Linux 6.9.
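For anyone who loads mt7921e as a module and would rather not edit the kernel command line, the same option can probably be set through modprobe.d instead. The following is only a sketch: the file name `mt7921e-aspm.conf` and both helper function names are made up for illustration, and whether the workaround helps at all differs between machines, as later comments show.

```shell
#!/usr/bin/env sh
# Sketch: persist the disable_aspm workaround as a modprobe option.
# Assumption: mt7921e is built as a module (not built into the kernel).
# The file name mt7921e-aspm.conf is an arbitrary choice for illustration.

write_aspm_conf() {
    # $1: target directory; defaults to the usual modprobe.d location
    conf_dir="${1:-/etc/modprobe.d}"
    printf 'options mt7921e disable_aspm=y\n' > "${conf_dir}/mt7921e-aspm.conf"
}

show_aspm_param() {
    # Prints the current value if the module is loaded and exposes the parameter
    cat /sys/module/mt7921e/parameters/disable_aspm 2>/dev/null ||
        echo "mt7921e not loaded or parameter not available"
}
```

`write_aspm_conf` needs root when writing to the default directory, and the option only takes effect once the module is reloaded or the machine is rebooted. `show_aspm_param` can then be used to check what the running module actually sees.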
Adding 'mt7921e.disable_aspm=y' did not fix my issue, nor did upgrading my bios to AMD AGESA ComboAm4v2PI 1.2.0.A.
I mitigated this issue by removing the module before putting my PC to sleep. I automated this by creating a script with the following:
```
#!/usr/bin/env sh
# systemd-sleep passes "pre" before sleeping and "post" after waking up
case ${1} in
    pre)
        # unload the driver before the system goes to sleep
        rmmod mt7921e
        echo "Removing mt7921e kernel module"
        ;;
    post)
        # load the driver again after resume
        modprobe mt7921e
        echo "Adding mt7921e kernel module"
        ;;
esac
```
And placing it at `/usr/lib/systemd/system-sleep/mt7921e-sleep-fix.sh`. Whenever the system suspends or resumes, systemd executes everything in that folder with either `pre` or `post` as the first argument. It's a simple script tbh; you can read more about how it works on the man page for systemd-sleep.
Created attachment 306537 [details] dmesg failure I faced the same issue on Debian 6.7.12-1. Attached is the error in dmesg. Removing and loading the module `mt7921e` again fixes the issue.
Created attachment 306681 [details]
dmesg with error
I have the same issue with mt7921e on an Asus Vivobook, Linux Mint 21.3 and kernel 6.5.0-45-generic. It happens when waking up from sleep. What is interesting is that it only breaks after sleeping for longer than a few minutes. If I close the lid, wait a few minutes and then open it, it works. When I let it sleep for an hour or more, I see the issues. They also prevent OS shutdown: the screen goes blank on shutdown but the laptop does not shut down completely, and I need to do a hard shutdown by pressing the power button for 10s or more. The workaround with rmmod and modprobe on systemd sleep seems to be working okay.
Is this still reproducible under 6.10.4 when using the latest firmware files?
I am on Kernel 6.10.4 and am now using linux-firmware-git 20240809.59460076 on Arch Linux. I have put my computer to sleep and back a few times now without the script I wrote (see above). I will need to use wifi more through the next few days, and I will also try on firmware 20240703.e94a2a3b (the current linux-firmware on arch) to see if the kernel and/or firmware has potentially fixed the issue.
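When comparing firmware packages it may help to report which firmware build the driver actually loaded, not just the package version. The sketch below pulls the version lines that mt7921e prints (the same format as the logs earlier in this thread) out of kernel log text on stdin. The function name `fw_build_times` is made up for illustration, and the pattern assumes the message format stays as shown above.

```shell
#!/usr/bin/env sh
# Sketch: extract the mt7921e firmware version lines from kernel log text.
# Reads log lines on stdin; the expected format is taken from this thread, e.g.
#   mt7921e 0000:01:00.0: WM Firmware Version: ____010000, Build Time: 20230526130958

fw_build_times() {
    # '#' is used as the sed delimiter because the pattern contains "HW/SW"
    sed -n -E 's#.*mt7921e [0-9a-f:.]+: (HW/SW|WM Firmware) Version: ([^,]+), Build Time: ([0-9a-z]+).*#\1: version=\2 build=\3#p'
}
```

Typical use would be `sudo dmesg | fw_build_times`, run once with each firmware package installed, so the two builds can be quoted side by side in a report.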
I have the same problem in Ubuntu 24.04 (6.8.0-41).
Aspire A715-42G
Aug 23 15:44:16 hacker kernel: mt7921e 0000:04:00.0: Message 00020007 (seq 10) timeout
Aug 23 15:44:16 hacker kernel: mt7921e 0000:04:00.0: PM: dpm_run_callback(): pci_pm_resume+0x0/0x110 returns -110
Aug 23 15:44:16 hacker kernel: mt7921e 0000:04:00.0: PM: failed to resume async: error -110
I installed a new kernel via mainline kernels.
Kernel: 6.10.6-061006-generic
OS: Ubuntu 24.04 LTS x86_64
Aug 24 08:02:39 hacker kernel: mt7921e 0000:04:00.0: Message 00020007 (seq 3) timeout
Aug 24 08:02:39 hacker kernel: mt7921e 0000:04:00.0: PM: dpm_run_callback(): pci_pm_resume returns -110
Aug 24 08:02:39 hacker kernel: mt7921e 0000:04:00.0: PM: failed to resume async: error -110
Conclusion: the new kernel did not fix this.
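The same "failed to resume/restore async" signature shows up in several reports here, with different error codes. A small filter like the following can pick those lines out of a log and name the errno. It is only a sketch: the function names are made up, and it only knows the two codes seen in this thread (on Linux, 110 is ETIMEDOUT and 5 is EIO).

```shell
#!/usr/bin/env sh
# Sketch: find mt7921e resume/restore failures in kernel log text on stdin
# and print device, phase, error code and errno name.

errno_name() {
    # Only the codes that appear in this thread are mapped
    case "$1" in
        -110) echo "ETIMEDOUT" ;;
        -5)   echo "EIO" ;;
        *)    echo "unknown" ;;
    esac
}

scan_resume_errors() {
    # Matches lines such as:
    #   mt7921e 0000:04:00.0: PM: failed to resume async: error -110
    # and the hibernation variant "failed to restore async".
    sed -n -E 's/.*mt7921e ([0-9a-f:.]+): PM: failed to (resume|restore) async: error (-[0-9]+).*/\1 \2 \3/p' |
    while read -r dev phase code; do
        echo "$dev $phase $code $(errno_name "$code")"
    done
}
```

For example, `sudo dmesg | scan_resume_errors` for the current boot, or `journalctl -k -b -1 | scan_resume_errors` for the previous one if the journal is persistent.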
Confirming this bug also affects the ASUS Vivobook 16X K3605ZU with the latest Ubuntu 24.04.1. The messages in the syslog are a bit different, but it is likely the same or a closely related problem. The Ubuntu apport tool uploaded all the logs here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2059744
I have also compiled the same kernel version, 6.8.0-41-generic, from the apt source packages using the usual guide https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel (which required a lot of additional packages as well). It behaves the same, but it is now ready for trying any patches or kernel config changes, if you have any suggestions.
I also tried a newer kernel through mainline: with version 6.9.12-060912-generic it still behaves the same (no wifi after wakeup from suspend). The 6.10.* kernels fail to build the NVIDIA module on this system, even after attempts to fix it, so I could not get such a kernel working to test. I would need some more help to try those. Would you have any suggestions what to try?
To add some more observations. It is not only the wifi, it is also sound. And the failure (as you may have noticed in the logs) is in the
0000:00:1c.0 PCI bridge: Intel Corporation Alder Lake PCH-P PCI Express Root Port #9 (rev 01)
before the attempt to wake up wifi. syslog notices:
pcieport 0000:00:1c.0: broken device, retraining non-functional downstream link at 2.5GT/s
and then
pcieport 0000:00:1c.0: retraining failed
exactly before the first problem with the mt7921:
0000:00:1c.0 PCI bridge: Intel Corporation Alder Lake PCH-P PCI Express Root Port #9 (rev 01)
(the last line should have been) pci 0000:2c:00.0: not ready 1023ms after resume;
A new kernel 6.8.0-45 arrived in Ubuntu 24.04.1 today. But it is even worse than before. After wakeup from suspend, even usb tethering does not work properly, which used to work with v41, and system freezes quite easily. Here is a syslog from the suspending part: https://capek.ii.fmph.uniba.sk/suspending-syslog and from the waking up after suspend: https://capek.ii.fmph.uniba.sk/wakeup-syslog
Pavel, I'm not a developer, maybe you should buy an Intel WiFi card 🤔 I read that some laptops have the ability to change the Wi-Fi card.
(In reply to Vadim from comment #23) > Pavel, I'm not a developer, maybe you should buy an Intel WiFi card 🤔 I > read that some laptops have the ability to change the Wi-Fi card. Thanks, but it is easier to turn off automatic suspend in the settings, and be careful not to leave the PC on unattended, it's not that terrible, but it would be nice to have it fixed, hopefully some expert on PCI will take a look eventually. I do not know how much it is related to the network card, or the PCI bus itself.
A new kernel 6.8.0-47 arrived in Ubuntu 24.04.1 today. Still similar story. https://capek.ii.fmph.uniba.sk/wakeup-syslog-47 Suspend ruins the system. It is annoying to turn off the computer completely all the time when moving somewhere. Having it ON while in the bag is bad, it could overheat, no air to cool... I hope someone will fix this, please.
For distro kernels please report issues to distros and please keep the focus on upstream bugs on upstream kernels.
Thank you Mario for your suggestion. But well, we were just sent here from there. It is likely an active bug in upstream kernel as well. If you or someone else can give clear instructions how to test such upstream kernel on this Ubuntu machine and report its details, I will be glad to do that. As mentioned above, when I tried mainline, 6.10.* failed to build NVIDIA module on this system.
New Ubuntu kernel 6.8.0-48, same result. Kernel crashes, machine does not even reboot after wakeup.
https://capek.ii.fmph.uniba.sk/6.8.0-48-wakeup-from-suspend.txt
2024-11-02T21:48:58.952069+01:00 buchlovice kernel: pcieport 0000:00:1c.0: broken device, retraining non-functional downstream link at 2.5GT/s
2024-11-02T21:48:58.952070+01:00 buchlovice kernel: pcieport 0000:00:1c.0: retraining failed
2024-11-02T21:48:58.952072+01:00 buchlovice kernel: pcieport 0000:00:1c.0: broken device, retraining non-functional downstream link at 2.5GT/s
2024-11-02T21:48:58.952074+01:00 buchlovice kernel: pcieport 0000:00:1c.0: retraining failed
2024-11-02T21:48:58.952075+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 1023ms after resume; waiting
2024-11-02T21:48:58.952077+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 2047ms after resume; waiting
2024-11-02T21:48:58.952079+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 4095ms after resume; waiting
2024-11-02T21:48:58.952081+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 8191ms after resume; waiting
2024-11-02T21:48:58.952083+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 16383ms after resume; waiting
2024-11-02T21:48:58.952085+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 32767ms after resume; waiting
2024-11-02T21:48:58.952088+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 65535ms after resume; giving up
2024-11-02T21:48:58.952089+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: Unable to change power state from D3cold to D0, device inaccessible
New Ubuntu kernel 6.8.0-49, same result. Kernel crashes, machine does not even reboot after wakeup.
2024-11-21T16:27:34.851605+01:00 buchlovice kernel: Freezing user space processes
2024-11-21T16:27:34.851692+01:00 buchlovice kernel: Freezing user space processes completed (elapsed 0.001 seconds)
2024-11-21T16:27:34.851698+01:00 buchlovice kernel: OOM killer disabled.
2024-11-21T16:27:34.851700+01:00 buchlovice kernel: Freezing remaining freezable tasks
2024-11-21T16:27:34.851702+01:00 buchlovice kernel: Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
2024-11-21T16:27:34.851703+01:00 buchlovice kernel: printk: Suspending console(s) (use no_console_suspend to debug)
2024-11-21T16:27:34.851705+01:00 buchlovice kernel: ACPI: EC: interrupt blocked
2024-11-21T16:27:34.851706+01:00 buchlovice kernel: ACPI: EC: interrupt unblocked
2024-11-21T16:27:34.851707+01:00 buchlovice kernel: pcieport 0000:00:1c.0: broken device, retraining non-functional downstream link at 2.5GT/s
2024-11-21T16:27:34.851708+01:00 buchlovice kernel: pcieport 0000:00:1c.0: retraining failed
2024-11-21T16:27:34.851709+01:00 buchlovice kernel: pcieport 0000:00:1c.0: broken device, retraining non-functional downstream link at 2.5GT/s
2024-11-21T16:27:34.851710+01:00 buchlovice kernel: pcieport 0000:00:1c.0: retraining failed
2024-11-21T16:27:34.851711+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 1023ms after resume; waiting
2024-11-21T16:27:34.851712+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 2047ms after resume; waiting
2024-11-21T16:27:34.851714+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 4095ms after resume; waiting
2024-11-21T16:27:34.851715+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 8191ms after resume; waiting
2024-11-21T16:27:34.851716+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 16383ms after resume; waiting
2024-11-21T16:27:34.851717+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 32767ms after resume; waiting
2024-11-21T16:27:34.851718+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: not ready 65535ms after resume; giving up
2024-11-21T16:27:34.851718+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: Unable to change power state from D3cold to D0, device inaccessible
2024-11-21T16:27:34.851720+01:00 buchlovice kernel: pcieport 10000:e0:06.0: can't derive routing for PCI INT A
2024-11-21T16:27:34.851720+01:00 buchlovice kernel: nvme 10000:e1:00.0: PCI INT A: no GSI
2024-11-21T16:27:34.851721+01:00 buchlovice kernel: i915 0000:00:02.0: [drm] GT0: GuC firmware i915/adlp_guc_70.bin version 70.20.0
2024-11-21T16:27:34.851722+01:00 buchlovice kernel: i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
2024-11-21T16:27:34.851723+01:00 buchlovice kernel: i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads
2024-11-21T16:27:34.851724+01:00 buchlovice kernel: i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
2024-11-21T16:27:34.851726+01:00 buchlovice kernel: i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
2024-11-21T16:27:34.851727+01:00 buchlovice kernel: i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
2024-11-21T16:27:34.851728+01:00 buchlovice kernel: nvme nvme0: 16/0/0 default/read/poll queues
2024-11-21T16:27:34.851729+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: driver own failed
2024-11-21T16:27:34.851730+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: PM: dpm_run_callback(): pci_pm_resume+0x0/0x110 returns -5
2024-11-21T16:27:34.851731+01:00 buchlovice kernel: mt7921e 0000:2c:00.0: PM: failed to resume async: error -5
> It is likely an active bug in upstream kernel as well. It very well may be, but we don't know for sure until the upstream kernel has been tested. 6.8 is end of life upstream and won't be picking up any new patches or actively looked at. As I said in comment #26, please keep distro kernel bugs in distro trackers. TBH from the logs that have been shown on distro kernel this "could" be an issue in PCIe core or firmware not even in Mediatek driver. 6.12 is likely to be declared the next LTS kernel, this would be a good place to test.
(In reply to Mario Limonciello (AMD) from comment #30)
> > It is likely an active bug in upstream kernel as well.
>
> It very well may be, but we don't know for sure until the upstream kernel
> has been tested. 6.8 is end of life upstream and won't be picking up any
> new patches or actively looked at.
>
> As I said in comment #26, please keep distro kernel bugs in distro trackers.
> TBH from the logs that have been shown on distro kernel this "could" be an
> issue in PCIe core or firmware not even in Mediatek driver.
>
> 6.12 is likely to be declared the next LTS kernel, this would be a good
> place to test.
Thank you, could you, please, point me to a tutorial that will explain how to create a bootable ISO with some distribution and an upstream kernel? Otherwise I do not see how a regular Linux user should respond to your rant. As I mentioned above, I tried, but did not get far.
https://dl.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/iso/
And https://docs.fedoraproject.org/en-US/fedora/latest/preparing-boot-media/
Many thanks, Mario. I have downloaded the ISO you linked, written it to a bootable medium, started Fedora from the medium, issued Suspend, it suspended fine (power LED went off). Pressing a key started the wakeup process (power LED went on), display immediately showed the latest contents of the screen, but the computer froze. Live medium, so no logs, therefore I shrank my Ubuntu partition by 20GB, installed Fedora on the main disk, booted from the disk, made sure the system is updated, and clicked Suspend. Power LED went off. After a key press, power LED went on, but nothing came up for more than 5 minutes, even the display was dark. journalctl does not contain a single line between the suspend and next boot after hard power OFF and ON. It is saved here: https://capek.ii.fmph.uniba.sk/6.13.0-0.rc0.20241119git158f238aa69d.2.fc42.x86_64-no-wakeup-after-suspend.txt It contains the last two boots - one that ends with suspend and the following one after hard power-off. If there is anything to try or other logs to provide, please, let me know.
At this point I can confidently say that we're looking at different issues from you and the original reporter (@kolAflash). I think it would be best to split up your issue into a "few" bugs to get the attention of the right people for each component where I see a problem. Let me pull a few things from your log to show you what I mean.
> Nov 21 20:09:42 fedora kernel: pcieport 10000:e0:06.0: can't derive routing
> for PCI INT A
> Nov 21 20:09:42 fedora kernel: nvme 10000:e1:00.0: PCI INT A: not connected
There is some problem with what /appears/ to be interrupt handling for your NVME disk. FWIW this might not be crucial. On AMD platforms we had a similar problem with the IOMMU showing this. I dug into it and confirmed it was a false positive and it's sorted on AMD with this (that won't do anything for Intel).
https://github.com/torvalds/linux/commit/0feda94c868d396fac3b3cb14089d2d989a07c72
It would be best to have Intel guys confirm if that's a problem or not.
However...
> Nov 21 19:59:25 fedora kernel: WARNING: CPU: 15 PID: 11 at
> mm/page_alloc.c:4727 __alloc_pages_noprof+0x2ca/0x330
Nov 21 19:59:25 fedora kernel: Modules linked in: nvme nvme_core nvme_auth i915(+) nouveau(+) mxm_wmi drm_ttm_helper gpu_sched drm_gpuvm drm_exec i2c_algo_bit drm_buddy crct10dif_pclmul crc32_pclmul ttm crc32c_intel polyval_clmulni ucsi_acpi polyval_generic hid_multitouch drm_display_helper ghash_clmulni_intel sha512_ssse3 typec_ucsi sha256_ssse3 sha1_ssse3 cec typec vmd i2c_hid_acpi video i2c_hid wmi pinctrl_tigerlake serio_raw fuse
There is a page allocation failure right after this, so NVME HMB /might/ not have gotten set up properly.
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001
> chid:0 type:45 scope:1 part:233
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0:
> fifo:c00000:0000:0000:[(udev-worker)[542]] errored - disabling channel
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: DRM: channel 0 killed!
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001
> chid:8 type:45 scope:1 part:233
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0:
> fifo:c00400:0001:0008:[(udev-worker)[542]] errored - disabling channel
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: DRM: channel 8 killed!
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: gsp:msg fn:103
> len:0x78/0x58 res:0x62 resp:0x62
> Nov 21 18:59:55 fedora kernel: msg: 00000000: 03 00 d0 c1 03 00 d0 c1 00 00
> 1d de 80 00 00 00 ................
> Nov 21 18:59:55 fedora kernel: msg: 00000010: 62 00 00 00 38 00 00 00 00 00
> 00 00 00 00 00 00 b...8...........
> Nov 21 18:59:55 fedora kernel: msg: 00000020: 00 00 00 00 03 00 d0 c1 00 00
> 00 00 00 00 00 00 ................
> Nov 21 18:59:55 fedora kernel: msg: 00000030: 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 ................
> Nov 21 18:59:55 fedora kernel: msg: 00000040: 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 ................
> Nov 21 18:59:55 fedora kernel: msg: 00000050: 00 00 00 00 00 00 00 00
> ........
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: systemd-logind[1123]:
> VMM allocation failed: -22
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001
> chid:0 type:45 scope:1 part:233
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001
> chid:8 type:45 scope:1 part:233
> Nov 21 18:59:55 fedora gnome-shell[2393]: Failed to open gpu
> '/dev/dri/card0': GDBus.Error:org.freedesktop.DBus.Error.InvalidArgs: Invalid
> argument
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: gsp:msg fn:103
> len:0x78/0x58 res:0x62 resp:0x62
> Nov 21 18:59:55 fedora kernel: msg: 00000000: 03 00 d0 c1 03 00 d0 c1 00 00
> 1d de 80 00 00 00 ................
> Nov 21 18:59:55 fedora kernel: msg: 00000010: 62 00 00 00 38 00 00 00 00 00
> 00 00 00 00 00 00 b...8...........
> Nov 21 18:59:55 fedora kernel: msg: 00000020: 00 00 00 00 03 00 d0 c1 00 00
> 00 00 00 00 00 00 ................
> Nov 21 18:59:55 fedora kernel: msg: 00000030: 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 ................
> Nov 21 18:59:55 fedora kernel: msg: 00000040: 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 ................
> Nov 21 18:59:55 fedora kernel: msg: 00000050: 00 00 00 00 00 00 00 00
> ........
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: systemd-logind[1123]:
> VMM allocation failed: -22
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001
> chid:0 type:45 scope:1 part:233
> Nov 21 18:59:55 fedora kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001
> chid:8 type:45 scope:1 part:233
Nouveau seems to be misbehaving here. This should be a bug filed against https://gitlab.freedesktop.org/drm/nouveau/-/issues
> Nov 21 19:03:21 fedora kernel: PM: suspend entry (s2idle)
> Nov 21 19:03:21 fedora kernel: Filesystems sync: 0.014 seconds
> -- Boot a07ba43e15ca4a569cfb9dbcd7512e47 --
Hanging on s2idle needs to be triaged by the Intel s2idle triage script:
https://github.com/intel/S0ixSelftestTool
I suggest you run that and then open up another bug for Intel guys to look at the results.