Created attachment 292739 [details]
dmesg resume fail output with kernel 5.9.0-rc6

I've been having random resume problems since around kernel 5.5, and they persist even up to 5.9-rc6. When this occurs I can still log in via SSH and issue a reboot command, but although SSH disconnects, my computer doesn't reboot and I have to press the reset button.

I have an ASUS Gaming TUF X570 motherboard, R7 3700X CPU, RX 580 GPU, and 16GB of RAM.

The primary error recorded over and over in dmesg is:

[xxxxx.xxxxxx] amdgpu: failed to send message 201 ret is 65535
[xxxxx.xxxxxx] amdgpu: last message was failed ret is 65535

I've included the part of dmesg beginning with the suspend event through the resume failure for kernel 5.9-rc6.
Please attach your full dmesg output from boot. Can you bisect?
Created attachment 292741 [details]
Resume failure, full dmesg output from kernel 5.8.5

The last full dmesg output I have is from kernel 5.8.5, and I've attached it to this response. However, the messages haven't changed since then. Going forward, would you rather I run the current 5.8 (on Arch it's 5.8.12) or the 5.9 release candidates (currently 5.9-rc6) to capture the next event?

I can bisect, but I don't know how to bisect a random issue like this. It's difficult to say how often it happens, but I'd estimate one out of every seven to twelve resumes. I actually tried purposely going through multiple suspend/resume cycles some time ago in hopes of gathering more info for a bug report, but got to 20 cycles with no errors, so I gave up. It seems the issue only occurs if my computer has been suspended for a significant period of time, as it only occurs when my computer has been suspended overnight.

It's also significant to note that I have two identical XFX Radeon RX 580 GTS XXX Edition GPUs, and one is passed through via VFIO at boot.

In any case I'll be happy to assist on this issue in any way I can. I've seen multiple complaints about it online, but saw other bug reports that I assumed were already addressing it or I would have filed a new bug report sooner. I wasn't aware of my error until this morning.
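One way to chase an intermittent failure that seems tied to long suspends is to automate long suspend cycles instead of quick ones. The following is only a minimal sketch, assuming `rtcwake` from util-linux is installed and the script runs as root. The function names `resume_failed` and `suspend_cycles` and the 30-second settle delay are illustrative assumptions, and the matched string is taken from the dmesg excerpt in the first comment.

```sh
#!/bin/sh
# Hypothetical helper: succeeds if the dmesg text on stdin contains the
# SMU failure message quoted in the original report.
resume_failed() {
    grep -q 'failed to send message'
}

# Hypothetical loop: suspend to RAM $1 times for $2 seconds each and stop
# at the first cycle whose dmesg shows the failure. Needs root.
suspend_cycles() {
    cycles=$1
    secs=$2
    i=1
    while [ "$i" -le "$cycles" ]; do
        # rtcwake programs the RTC alarm and suspends; -m mem is suspend-to-RAM
        rtcwake -m mem -s "$secs" || return 2
        sleep 30    # assumed settle time so the driver can log errors
        if dmesg | resume_failed; then
            echo "resume failure detected on cycle $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $cycles cycles"
}
```

For example, `suspend_cycles 10 3600` would attempt ten one-hour suspends. The same check could act as the good/bad decision at each step of a `git bisect`, although with a failure rate near one in ten a kernel would need many clean cycles before it could reasonably be marked good.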
Looks like you attached the wrong file? Can you elaborate on how you use your GPUs? If you take vfio out of the picture, do you still have the issues?
You are correct; the restored 5.8.5 dmesg output doesn't have the full output either, and it's the only other output I can find in my backups. I apologize for my error.

Unfortunately I can't remove my VFIO setup for any extended period of time because I'm working on a project with other musicians that demands I use my Windows 10 VM daily for software that has no Linux alternative. There is other almost-equivalent software that could have been used (which I actually prefer), but the other musicians aren't willing to switch to Linux. In their defense they did all try quite a while ago, but it was just too difficult for them, and their frustration ended up causing anger and contention among our group.

In any case here's my VFIO passthrough setup:

/etc/default/grub boot command line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet loglevel=3 video=efifb:off audit=0 acpi_enforce_resources=lax rd.modules-load=vfio-pci amd_iommu=on iommu= pt"

/etc/modprobe.d/kvm.conf:
options kvm_amd avic=1

/etc/modprobe.d/vfio.conf:
options vfio-pci disable_vga=1
softdep amdgpu pre: vfio-pci
softdep radeon pre: vfio-pci
softdep ahci pre: vfio-pci
softdep xhci_pci pre: vfio-pci
install vfio-pci /usr/local/bin/vfio-pci-override.sh

/usr/local/bin/vfio-pci-override.sh
```
#!/bin/sh
DEVS="0000:0b:00.0 0000:0b:00.1"
if [ ! -z "$(ls -A /sys/class/iommu)" ]; then
    for DEV in $DEVS; do
        echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
    done
fi
modprobe -i vfio-pci
```
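To confirm that the override above took effect, it can help to check which driver each function of the passed-through card is bound to. This is a rough sketch, not part of the reporter's setup: `bound_driver` is a hypothetical helper, and its optional second argument (an alternate sysfs root) exists only so the function can be exercised outside a real system.

```sh
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI address such as 0000:0b:00.0
# $2: sysfs root (optional, defaults to /sys; an assumption for testability)
bound_driver() {
    dev=$1
    root=${2:-/sys}
    link="$root/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The driver entry is a symlink whose last path component is the driver name
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

With the addresses from the script above, `bound_driver 0000:0b:00.0` and `bound_driver 0000:0b:00.1` would be expected to print `vfio-pci` if the override worked, while the host GPU should report `amdgpu`.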
Created attachment 292753 [details] Full dmesg resume fail output for kernel 5.8.12 I suspended my computer during dinner and when I tried to resume it failed. I've attached the full dmesg output to this message. The full Xorg log will follow.
Created attachment 292755 [details] Full Xorg resume fail output for kernel 5.8.12 Here is the Xorg.0.log log output for the resume fail.
This bug still persists with kernel 5.9.0. I didn't attach new logs because the bug output is identical to the 5.8 kernel series.
[ 3399.070651] pcieport 0000:03:02.0: can't change power state from D3hot to D0 (config space inaccessible)
[ 3399.073473] amdgpu 0000:05:00.0: can't change power state from D3hot to D0 (config space inaccessible)
[ 3399.136581] snd_hda_intel 0000:05:00.1: can't change power state from D3hot to D0 (config space inaccessible)

Seems like the card never gets powered back up by the platform on resume.
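The messages quoted above can be pulled out of a long dmesg capture mechanically. A small sketch, assuming the log lines keep the `[timestamp] driver address: message` layout shown here; `stuck_devices` is a hypothetical name:

```sh
#!/bin/sh
# Read dmesg text on stdin and print "<driver> <pci-address>" for every
# device that could not be brought from D3hot back to D0.
# The "." in "can.t" stands in for the apostrophe to keep the quoting simple.
stuck_devices() {
    sed -n 's/^\[[^]]*\] \([^ ]*\) \([0-9a-f:.]*\): can.t change power state from D3hot to D0.*/\1 \2/p'
}
```

Running `dmesg | stuck_devices` after a failed resume lists each affected device on its own line, which makes it easier to see whether a PCIe port upstream of the GPU is affected as well.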
The same type of problem also occurred when I had my old R9-390 and GT 710 GPUs, FX-6300 CPU, and Gigabyte GA-990FXA-UD5 motherboard. However if I put the GT 710 in the primary PCIE slot the resume problem never occurred. I can't be certain it was the exact same problem though, because there were a lot of AMDGPU resume problems and I just assumed it was because the hardware I had was so old. And since my R9 390 AMDGPU support was considered experimental I figured I had to live with the issue. So I really hoped it would go away when I got my two new RX 580 GPUs, R7 3700X CPU, and X570 motherboard, but unfortunately the resume problem still occurs. And I gave away my GT 710 so I can't check to see if it still alleviates the issue.
I'm having the same problem. I'm using Ubuntu 18.04 LTS, and whatever they backported to kernel 5.4.0-51-generic started causing this problem; it goes away in 5.4.0-48-generic (Ubuntu flavors).

I have more information:
- Card is Radeon RX 560 Series (POLARIS11, DRM 3.35.0, 5.4.0-48-generic, LLVM 10.0.1)
- The bug sometimes also triggers when plugging or unplugging an HDMI TV (this may be https://bugzilla.kernel.org/show_bug.cgi?id=204241 ?)
- The keyboard locks up, but I can still log in via SSH
- 'sudo shutdown now' never finishes. The kernel is stuck
- In my case neither dmesg nor xorg.log shows any sign that something went wrong
- Trying to kill X reveals the following:

[ 1571.941734] Call Trace:
[ 1571.941747] __schedule+0x293/0x720
[ 1571.941752] ? __queue_work+0x14c/0x400
[ 1571.941758] schedule+0x33/0xa0
[ 1571.941765] rpm_resume+0x108/0x780
[ 1571.941769] ? __switch_to_asm+0x40/0x70
[ 1571.941776] ? wait_woken+0x80/0x80
[ 1571.941782] __pm_runtime_resume+0x4e/0x80
[ 1571.941939] amdgpu_drm_ioctl+0x39/0x80 [amdgpu]
[ 1571.941944] do_vfs_ioctl+0xa9/0x640
[ 1571.941950] ? __schedule+0x29b/0x720
[ 1571.941954] ksys_ioctl+0x75/0x80
[ 1571.941957] __x64_sys_ioctl+0x1a/0x20
[ 1571.941964] do_syscall_64+0x57/0x190
[ 1571.941968] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1571.941973] RIP: 0033:0x7f746d5a96d7
[ 1571.941982] Code: Bad RIP value.
[ 1571.941985] RSP: 002b:00007fff1ec6a7a8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[ 1571.941990] RAX: ffffffffffffffda RBX: 00007fff1ec6a7e0 RCX: 00007f746d5a96d7
[ 1571.941992] RDX: 00007fff1ec6a7e0 RSI: 00000000c06864a2 RDI: 000000000000000d
[ 1571.941994] RBP: 00007fff1ec6a7e0 R08: 0000000000000000 R09: 0000000000000000
[ 1571.941996] R10: 0000000000000000 R11: 0000000000003246 R12: 00000000c06864a2
[ 1571.941998] R13: 000000000000000d R14: 000055f52f391780 R15: 000055f52f2176a0
[ 1571.942021] INFO: task chrome:shlo0:2563 blocked for more than 120 seconds.
[ 1571.942026] Tainted: G OE 5.4.0-51-generic #56~18.04.1-Ubuntu
[ 1571.942029] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1692.774402] python3:disk$2 D 0 6187 1 0x80004002
[ 1692.774404] Call Trace:
[ 1692.774410] __schedule+0x293/0x720
[ 1692.774414] ? __switch_to_asm+0x40/0x70
[ 1692.774419] schedule+0x33/0xa0
[ 1692.774424] schedule_preempt_disabled+0xe/0x10
[ 1692.774429] __mutex_lock.isra.9+0x26d/0x4e0
[ 1692.774436] __mutex_lock_slowpath+0x13/0x20
[ 1692.774441] ? __mutex_lock_slowpath+0x13/0x20
[ 1692.774446] mutex_lock+0x2f/0x40
[ 1692.774472] drm_release+0x2e/0xd0 [drm]
[ 1692.774476] __fput+0xc6/0x260
[ 1692.774481] ____fput+0xe/0x10
[ 1692.774485] task_work_run+0x9d/0xc0
[ 1692.774491] do_exit+0x382/0xb80
[ 1692.774496] ? mem_cgroup_try_charge+0x75/0x190
[ 1692.774503] do_group_exit+0x43/0xa0
[ 1692.774506] get_signal+0x14f/0x860
[ 1692.774512] do_signal+0x34/0x6d0
[ 1692.774515] ? strlcpy+0x32/0x50
[ 1692.774519] ? __x64_sys_futex+0x13f/0x190
[ 1692.774525] exit_to_usermode_loop+0x90/0x130
[ 1692.774530] do_syscall_64+0x170/0x190
[ 1692.774534] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1692.774536] RIP: 0033:0x7f7f31d789f3
[ 1692.774541] Code: Bad RIP value.
[ 1692.774543] RSP: 002b:00007f7ef49abd10 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[ 1692.774546] RAX: fffffffffffffe00 RBX: 0000000002041e80 RCX: 00007f7f31d789f3
[ 1692.774548] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 0000000002041ea8
[ 1692.774549] RBP: 0000000002041ea4 R08: 0000000000000000 R09: 0000000000000000
[ 1692.774551] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000002041ea8
[ 1692.774553] R13: 0000000000000000 R14: 0000000002041e58 R15: 0000000000000002
[ 1692.774558] INFO: task kworker/4:1:6532 blocked for more than 241 seconds.
[ 1692.774561] Tainted: G OE 5.4.0-51-generic #56~18.04.1-Ubuntu
[ 1692.774563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1692.774566] kworker/4:1 D 0 6532 2 0x80004000
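When a trace like the one above is buried in a long log, the hung-task reports can be summarized first to see which tasks are stuck and for how long. A minimal sketch, assuming the `INFO: task NAME:PID blocked for more than N seconds` wording shown in this comment; `hung_tasks` is a hypothetical name:

```sh
#!/bin/sh
# Read dmesg text on stdin and print "<task-name> <pid> <seconds>" for each
# hung-task report. Task names may themselves contain colons
# (e.g. chrome:shlo0), so the pid is taken as the last colon-separated
# field before " blocked".
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}
```

For instance, `dmesg | hung_tasks` gives a one-line-per-task overview before reading the individual call traces.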
This bug still exists in kernel 5.10-rc1.
Btw, I reported that I was experiencing this too on my RX 560; however, for me it went away with 5.8.15.

I think my problem was unrelated to the one in this ticket, sorry.

Btw, it may be worth writing down whether the GPU requires an extra PCIe power plug, as this may be relevant. My RX 560 requires one (and it is plugged in).
(In reply to dark_sylinc from comment #12)
> Btw I reported that I was experiencing this too in my RX 560, however for me
> it went away with 5.8.15
>
> I think my problem was unrelated to the one in this ticket, sorry.
>
> Btw it may be worth writing down whether the GPU requires an extra PCIE
> power plug, as this may be relevant.
> My RX560 requires one (and is plugged).

I have two XFX Radeon RX 580 GTS XXX cards, one for Linux and one for a KVM Windows VM. Each has a single 8-pin power connector.
Hi all,

I'm not sure if I'm experiencing the same bug, but the outcome and some of the log messages seem to be the same. For me, resuming from suspend is broken with kernel 5.9.0-2 and works when I boot with 5.7.0-1, keeping the rest of my system the same. I'm on Debian sid.

For SEO, in dmesg I see messages like:

nov. 11 17:27:35 [ 202.045603] amdgpu: [powerplay]
nov. 11 17:27:35 failed to send message 146 ret is 0
nov. 11 17:27:35 [ 203.073392] [drm:uvd_v4_2_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
nov. 11 17:27:35 [ 216.242177] amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring uvd test failed (-110)
nov. 11 17:27:35 [ 216.242245] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <uvd_v4_2> failed -110
nov. 11 17:27:35 [ 216.242312] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-110).
nov. 11 17:27:44 [ 224.963014] [drm] Fence fallback timer expired on ring sdma0

My hardware:
- CPU: AMD FX-6200 Six-Core Processor
- VGA: [AMD/ATI] Hawaii PRO [Radeon R9 290/390]
- Motherboard: GA-990FXA-UD3 Firmware F9

Software:
- debian sid
- firmware-amd-graphics: 20200918-1
- libdrm-amdgpu1: 2.4.102-1
- mesa: 20.2.1-1

To reiterate: just by booting into kernel 5.7 instead of 5.9, resume from suspend works, with the above software kept the same.
Created attachment 293629 [details] dmesg output when booting with kernel 5.9, suspending, then resuming
I have the same problem with the Radeon HD 7770.
(In reply to Илья Индиго from comment #17)
> I have the same problem with the Radeon HD 7770.

I have the same problem with the Radeon HD 7770. This also happens with the amdgpu and radeonsi drivers. The system enters S1 mode (although in the BIOS I specified to use only S3) and does not exit it. With my old video card, an 8600GT with nouveau, it entered S3 mode and exited normally.
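Which suspend variant the kernel will actually use can be read from `/sys/power/mem_sleep`, where the selected mode is shown in square brackets (for example `s2idle [deep]`); in the kernel's naming, `shallow` corresponds to S1 standby and `deep` to S3 suspend-to-RAM. A small sketch for extracting the active entry, with `active_mem_sleep` as a hypothetical helper name:

```sh
#!/bin/sh
# Read the contents of /sys/power/mem_sleep on stdin and print the mode
# the kernel has marked as active with square brackets.
active_mem_sleep() {
    sed -n 's/.*\[\([a-z0-9]*\)\].*/\1/p'
}
```

Typical use would be `active_mem_sleep < /sys/power/mem_sleep`. Where the firmware offers it, writing `deep` to that file as root selects S3 for subsequent suspends; whether that changes the behaviour reported here is untested.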
I have the same bug with Vega 3 on Fedora 33, with kernel 5.9 and every newer kernel up to the newest.
I am experiencing what appears to be the same problem. My hardware is a Lenovo Thinkpad 14s with AMD Ryzen 4750u. The notebook quite frequently doesn't come out of suspend. Or rather, it seems to come out of suspend but cannot initialize the graphics hardware, resulting in a black screen:

Mar 05 13:31:23 zapp systemd[1]: Starting Suspend...
Mar 05 13:31:23 zapp systemd-sleep[4072]: Suspending system...
Mar 05 13:31:23 zapp kernel: PM: suspend entry (s2idle)
Mar 05 13:31:23 zapp kernel: Filesystems sync: 0.009 seconds
Mar 05 13:31:38 zapp kernel: rfkill: input handler enabled
Mar 05 13:31:38 zapp kernel: Freezing user space processes ... (elapsed 0.003 seconds) done.
Mar 05 13:31:38 zapp kernel: OOM killer disabled.
Mar 05 13:31:38 zapp kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Mar 05 13:31:38 zapp kernel: [drm] free PSP TMR buffer
Mar 05 13:31:38 zapp kernel: ACPI: EC: interrupt blocked
Mar 05 13:31:38 zapp kernel: ACPI: button: The lid device is not compliant to SW_LID.
Mar 05 13:31:38 zapp kernel: ACPI: EC: interrupt unblocked
Mar 05 13:31:38 zapp kernel: pci 0000:00:00.2: can't derive routing for PCI INT A
Mar 05 13:31:38 zapp kernel: pci 0000:00:00.2: PCI INT A: no GSI
Mar 05 13:31:38 zapp kernel: usb usb2: root hub lost power or was reset
Mar 05 13:31:38 zapp kernel: usb usb3: root hub lost power or was reset
Mar 05 13:31:38 zapp kernel: xhci_hcd 0000:05:00.0: Zeroing 64bit base registers, expecting fault
Mar 05 13:31:38 zapp kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
Mar 05 13:31:38 zapp kernel: [drm] PSP is resuming...
Mar 05 13:31:38 zapp kernel: [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: SMU is resuming...
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: dpm has been disabled
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: SMU is resumed successfully!
Mar 05 13:31:38 zapp kernel: usb 2-2: reset high-speed USB device number 2 using xhci_hcd
Mar 05 13:31:38 zapp kernel: [drm] kiq ring mec 2 pipe 1 q 0
Mar 05 13:31:38 zapp kernel: [drm] DMUB hardware initialized: version=0x00000001
Mar 05 13:31:38 zapp kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Mar 05 13:31:38 zapp kernel: [drm] JPEG decode initialized successfully.
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
Mar 05 13:31:38 zapp kernel: amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx (-110).
Mar 05 13:31:38 zapp kernel: fbcon: Taking over console
Mar 05 13:31:38 zapp kernel: [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).
Mar 05 13:31:38 zapp kernel: [drm] Failed to add display topology, DTM TA is not initialized.

The most relevant part seems to be:

[drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx (-110).

This is on 5.12-rc1+, compiled from master this morning. But I have seen this problem with Ubuntu mainline kernels 5.10.17, 5.10.20 and 5.11.3 as well.
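The numeric codes in these messages are negated Linux errno values: 110 is `ETIMEDOUT` and 22 is `EINVAL`, so `(-110)` means the ring test timed out rather than returning a wrong result. A small sketch that decodes the parenthesized codes found in a log; the helper names are hypothetical and only the two codes seen in this thread are mapped:

```sh
#!/bin/sh
# Map the negated errno values that appear in this thread to their names.
errno_name() {
    case "$1" in
        -110) echo "ETIMEDOUT" ;;
        -22)  echo "EINVAL" ;;
        *)    echo "unknown" ;;
    esac
}

# Read log text on stdin and print each distinct "(-N)" code with its name.
# Codes printed without parentheses (e.g. "failed -110") are not picked up.
decode_errors() {
    grep -o '(-[0-9][0-9]*)' | tr -d '()' | sort -u | while read -r code; do
        printf '%s %s\n' "$code" "$(errno_name "$code")"
    done
}
```

For example, piping the journal excerpt above through `decode_errors` reports the timeout code once, however many lines contain it.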
Unless you have a polaris board please file your own bug.
With 5.12.3, the monitor remains blank after resume. Relevant log:
```
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resuming...
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: amdgpu: dpm has been disabled
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resumed successfully!
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
May 15 20:21:37 fedora kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: PM: failed to resume async: error -110
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: amdgpu: couldn't schedule ib on ring <sdma0>
May 15 20:21:37 fedora kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
May 15 20:21:37 fedora kernel: amdgpu 0000:04:00.0: amdgpu: couldn't schedule ib on ring <sdma0>
May 15 20:21:37 fedora kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
May 15 20:21:47 fedora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=139973, emitted seq=139977
May 15 20:21:47 fedora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1895 thread gnome-shel:cs0 pid 1926
May 15 20:21:47 fedora kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
May 15 20:21:47 fedora kernel: amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
May 15 20:21:47 fedora kernel: amdgpu 0000:04:00.0: amdgpu: MODE2 reset
May 15 20:21:47 fedora kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: amdgpu: RAS: optional ras ta ucode is not available
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: amdgpu: RAP: optional rap ta ucode is not available
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resuming...
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: amdgpu: dpm has been disabled
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resumed successfully!
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
May 15 20:21:48 fedora kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset(2) failed
May 15 20:21:48 fedora kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset end with ret = -110
May 15 20:21:58 fedora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
May 15 20:22:08 fedora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
May 15 20:22:12 fedora kernel: amdgpu 0000:04:00.0: amdgpu: couldn't schedule ib on ring <sdma0>
May 15 20:22:12 fedora kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
May 15 20:22:12 fedora kernel: amdgpu 0000:04:00.0: amdgpu: couldn't schedule ib on ring <sdma0>
```
AMD Ryzen 7 4700U with Radeon Graphics, Lenovo Ideapad 5.
I'm facing exactly the same issue with a Ryzen 7 Vega 10 Graphics integrated GPU. My kernel log is below. The problem began after kernel 5.4; I had to downgrade my kernel to 5.4-lts from the AUR, and it has now been 3 days without any GPU reset event.

Kernel crash log in amdgpu driver:

mai 26 16:39:14 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=26777, emitted seq=26778
mai 26 16:39:14 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset
mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
mai 26
16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1 mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1 mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1 mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1 mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1 mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(1) succeeded! mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480b00 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480b40 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480b20 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480b60 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480b80 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480bc0 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480ba0 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480c00 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480be0 flags=0x0070] mai 26 16:39:15 S145 kernel: amdgpu 
0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b480c40 flags=0x0070] mai 26 16:39:25 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered mai 26 16:39:35 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2117313, emitted seq=2117316 mai 26 16:39:35 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process plasmashell pid 1137 thread plasmashel:cs0 pid 1234 mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin! mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485b40 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485b60 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485b80 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485ba0 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485bc0 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485be0 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485c20 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485c00 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485c40 flags=0x0070] mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x10b485c60 flags=0x0070] mai 26 16:39:36 S145 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset mai 26 16:39:36 S145 
kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume mai 26 16:39:36 S145 kernel: amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110) mai 26 16:39:37 S145 kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110 mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(4) failed mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset end with ret = -110 mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error 
scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22) mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring 
<sdma0>
mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule ib on ring <sdma0>
mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
mai 26 16:39:47 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
I forgot to mention the kernel version I was running when it crashed: it was 5.10.x.

(In reply to Leandro Jacques from comment #23)
> I'm facing exactly the same issue with a Ryzen 7 Vega 10 Graphics integrated
> GPU. I'll put my kernel log below, it began to happen after kernel 5.4, I
> had to downgrade my kernel to 5.4-lts from AUR and it's already 3 days
> without any GPU reset event.
>
> Kernel crash log in amdgpu driver:
>
> mai 26 16:39:14 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
> sdma0 timeout, signaled seq=26777, emitted seq=26778
> mai 26 16:39:14 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> Process information: process pid 0 thread pid 0
> mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
> mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset
> mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset
> succeeded, trying to resume
> mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: RAS: optional ras
> ta ucode is not available
> mai 26 16:39:14 S145 kernel: amdgpu 0000:03:00.0: amdgpu: RAP: optional rap
> ta ucode is not available
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx uses VM
> inv eng 0 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0
> uses VM inv eng 1 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0
> uses VM inv eng 4 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0
> uses VM inv eng 5 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0
> uses VM inv eng 6 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1
> uses VM inv eng 7 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1
> uses VM inv eng 8 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1
> uses VM inv eng 9 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1
> uses VM inv eng 10 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0
> uses VM inv eng 11 on hub 0
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM
> inv eng 0 on hub 1
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_dec uses
> VM inv eng 1 on hub 1
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_enc0 uses
> VM inv eng 4 on hub 1
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_enc1 uses
> VM inv eng 5 on hub 1
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses
> VM inv eng 6 on hub 1
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo
> from shadow start
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo
> from shadow done
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(1)
> succeeded!
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480b00 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480b40 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480b20 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480b60 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480b80 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480bc0 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480ba0 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480c00 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480be0 flags=0x0070]
> mai 26 16:39:15 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b480c40 flags=0x0070]
> mai 26 16:39:25 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
> gfx timeout, but soft recovered
> mai 26 16:39:35 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
> gfx timeout, signaled seq=2117313, emitted seq=2117316
> mai 26 16:39:35 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR*
> Process information: process plasmashell pid 1137 thread plasmashel:cs0 pid
> 1234
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485b40 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485b60 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485b80 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485ba0 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485bc0 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485be0 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485c20 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485c00 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485c40 flags=0x0070]
> mai 26 16:39:35 S145 kernel: amdgpu 0000:03:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x0000 address=0x10b485c60 flags=0x0070]
> mai 26 16:39:36 S145 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset
> mai 26 16:39:36 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset
> succeeded, trying to resume
> mai 26 16:39:36 S145 kernel: amdgpu 0000:03:00.0: amdgpu: RAS: optional ras
> ta ucode is not available
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: RAP: optional rap
> ta ucode is not available
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0:
> [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
> mai 26 16:39:37 S145 kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]]
> *ERROR* resume of IP block <sdma_v4_0> failed -110
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(4) failed
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule
> ib on ring <sdma0>
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset end with
> ret = -110
> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error
> scheduling IBs (-22)
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule
> ib on ring <sdma0>
> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error
> scheduling IBs (-22)
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule
> ib on ring <sdma0>
> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error
> scheduling IBs (-22)
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule
> ib on ring <sdma0>
> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error
> scheduling IBs (-22)
> mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule
> ib on ring <sdma0>
> mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run
[amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > 
scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 
16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 
0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: 
couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > mai 26 16:39:37 S145 kernel: amdgpu 0000:03:00.0: amdgpu: couldn't schedule > ib on ring <sdma0> > mai 26 16:39:37 S145 kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error > scheduling IBs (-22) > [... the same two messages repeat many more times at 16:39:37 ...] > mai 26 16:39:47 S145 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring > gfx timeout, but soft recovered
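A flood of repeated messages like the one quoted above is easier to compare across reports once it is reduced to one line per distinct message with a count. The following is only a minimal sketch: the function name `summarize` is hypothetical, and the prefix pattern assumes journal-style lines of the form "<month> <day> <HH:MM:SS> <host> kernel: <message>", as seen in the quoted log.

```python
import re
from collections import Counter

# Assumed prefix shape (taken from the quoted log, not from any spec):
# "<month> <day> <HH:MM:SS> <host> kernel: "
PREFIX = re.compile(r"^\S+ +\d+ \d\d:\d\d:\d\d \S+ kernel: ")


def summarize(lines):
    """Return (message, count) pairs in first-seen order.

    The timestamp/host prefix is stripped so that identical messages
    logged at different times are counted together.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line:
            counts[PREFIX.sub("", line)] += 1
    # Counter keeps insertion order, so messages appear as first seen.
    return list(counts.items())


if __name__ == "__main__":
    import sys

    # Example: journalctl -k | python3 summarize_log.py
    for message, count in summarize(sys.stdin):
        print(f"{count:6d}x  {message}")
```

Counting per message rather than per consecutive run is deliberate here, because the two amdgpu errors alternate and would never form a run of identical lines.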
I don't have this issue with kernel 5.12.10
Created attachment 297415 [details] Kernel crash log for kernel 5.10.x
Comment on attachment 297415 [details] Kernel crash log for kernel 5.10.x I had to downgrade to kernel 5.4 LTS to get rid of the problems.
Created attachment 297465 [details] amdgpu crash log for kernel 5.4.126 Another problem appeared in kernel 5.4.126, as the attached log shows. Before version 5.4.126 I was running without problems.
Created attachment 297567 [details] Linux Firmware version info I tried downgrading the kernel to 5.4.123, but it didn't help; I had the same issues. So I downgraded linux-firmware to see if the problem disappears. I'm now using linux-firmware 20210315.
(In reply to Leandro Jacques from comment #29) No problems so far. So the problem is with newer firmware versions: the system has been working without any issues since 2021-06-22 17:16:25 UTC with version 20210315.
How do I file a bug against the linux-firmware project for the amdgpu driver? Since the downgrade I haven't experienced any issues.
Can you narrow down which specific firmware file (ce, me, smc, etc.) causes the problem?
(In reply to Alex Deucher from comment #32) Sorry about that, but when I had the problem I wasn't paying attention to the firmware version; I thought it was a kernel problem. I only focused on the linux-firmware package when I saw some people complaining about the same issues I was having but blaming the linux-firmware package. I saw a post saying that the latest linux-firmware that worked well was 20210315, so I downgraded to that version and the problem is gone. Since the downgrade on 2021-06-22, I have locked that package so it is not upgraded anymore. I'll try to upgrade it again and see if the latest version solves the problem too, but for now I can only guarantee that version 20210315 works without issues.
Created attachment 297851 [details] Linux Firmware version info 20210511.7685cf4
Created attachment 297853 [details] Linux Firmware version info 20210511.7685cf4 Firmware version when crashed
Created attachment 297855 [details] Kernel crash log for linux firmware version 20210511.7685cf4 Kernel log when crashed.
(In reply to Alex Deucher from comment #32) As you asked about the firmware version details, I upgraded my linux-firmware package to see if the problem would come back, and it did. So this time I could attach the kernel log for the amdgpu driver and the amdgpu firmware version details as of the crash event to narrow down the issue. For now, I'll return to the older version to make my system stable again.
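To help answer which firmware component changed between the working and the crashing linux-firmware package, the two firmware version dumps can be compared component by component. This is only a rough sketch under stated assumptions: the function names are hypothetical, and the line shape "<NAME> feature version: <n>, firmware version: 0x<hex>" is assumed from what amdgpu typically prints in its debugfs `amdgpu_firmware_info` file (for example under /sys/kernel/debug/dri/0/, where the index may differ per system). Lines that do not match, such as the VBIOS line, are skipped.

```python
import re

# Assumed line shape: "<NAME> feature version: <n>, firmware version: 0x<hex>"
# Anything after the hex value on the same line is ignored.
LINE = re.compile(
    r"^(?P<name>.+?) feature version: (?P<feature>\d+), "
    r"firmware version: (?P<fw>0x[0-9a-fA-F]+)"
)


def parse_firmware_info(text):
    """Map component name -> (feature version, firmware version string)."""
    versions = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            versions[match.group("name")] = (
                int(match.group("feature")),
                match.group("fw").lower(),
            )
    return versions


def changed_components(good_text, bad_text):
    """Return {name: (good_entry, bad_entry)} for components that differ.

    A component missing from one dump shows up with None on that side.
    """
    good = parse_firmware_info(good_text)
    bad = parse_firmware_info(bad_text)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(good.keys() | bad.keys())
        if good.get(name) != bad.get(name)
    }


if __name__ == "__main__":
    import sys

    # Usage: python3 fwdiff.py good_firmware_info.txt bad_firmware_info.txt
    with open(sys.argv[1]) as good_file, open(sys.argv[2]) as bad_file:
        diff = changed_components(good_file.read(), bad_file.read())
    for name, (good_entry, bad_entry) in diff.items():
        print(f"{name}: {good_entry} -> {bad_entry}")
```

A differing version only shows which components changed between the two packages; confirming the culprit would still require swapping the individual firmware files back one at a time.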
(In reply to Leandro Jacques from comment #37) > (In reply to Alex Deucher from comment #32) > As you asked about the firmware version details, I upgraded my > linux-firmware package to see if the problem would come back and it came > back. So, this time, I could attatch the kernel log for the amdgpu driver > and the amdgpu firmware versions details as of the crash event to narrow > down the issue. By now, I'll return to the older version to make my system > stable again. You have a Picasso system. The original bug was about an RX 580. I don't think this is the same issue. Sounds like you are seeing this issue: https://lists.freedesktop.org/archives/amd-gfx/2021-July/066452.html
(In reply to Alex Deucher from comment #38) > > You have a Picasso system. The original bug was about an RX 580. I don't > think this is the same issue. Sounds like you are seeing this issue: > https://lists.freedesktop.org/archives/amd-gfx/2021-July/066452.html No, the error message is exactly the same as in this one: https://bugzilla.kernel.org/show_bug.cgi?id=213391