Bug 213145

Summary: AMDGPU resets, times out and crashes after "*ERROR* Waiting for fences timed out!"
Product: Drivers
Reporter: Tomas Gayoso (tgayoso)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: NEW ---
Severity: high
CC: a, alexdeucher, ayurtsev, braiamp, bugzilla-kernel, chewi, cousinmarc, emlodnaor, fichterfrancis, fmhirtz, grizaster+kernel, halturin, jackburgess124, jonathan.j.rayner, kernel, kernelbugs, mastercatz, meep, michal.przybylowicz, mikk351, nix.sasl, nvaert1986, pmenzel+bugzilla.kernel.org, sgasgar, tuupic, vyoepdeygrbbsivo
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.10.37 until 5.10.42
Subsystem:
Regression: No
Bisected commit-id:
Attachments: lspci output
lsmod output
kernel configuration
5.10.42 dmesg output with crash after reset
patch for mesa 21.1.2

Description Tomas Gayoso 2021-05-19 14:46:42 UTC
Created attachment 296867 [details]
lspci output

The AMDGPU driver crashes at random, corrupting the screen and freezing X, with:


[   60.449781] [drm:0xffffffffc25a7a57] *ERROR* Waiting for fences timed out!
[   60.971941] [drm:0xffffffffc25281ae] *ERROR* ring gfx timeout, signaled seq=3658, emitted seq=3660
[   60.971946] [drm:0xffffffffc25281cb] *ERROR* Process information: process Xorg pid 1192 thread Xorg:cs0 pid 1193
[   60.971952] amdgpu 0000:05:00.0: amdgpu: GPU reset begin!

... (some output omitted for clarity; please see the attached dmesg).

[   61.800343] amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow start
[   61.800346] amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow done
[   61.800348] [drm] Skip scheduling IBs!
[   61.800350] [drm] Skip scheduling IBs!
[   61.800382] [drm] Skip scheduling IBs!
[   61.800398] amdgpu 0000:05:00.0: amdgpu: GPU reset(2) succeeded!
[   61.800566] [drm] Skip scheduling IBs!
[   61.800580] [drm] Skip scheduling IBs!
[   61.800627] [drm] Skip scheduling IBs!
[   61.801012] [drm] Skip scheduling IBs!
[   61.801024] [drm] Skip scheduling IBs!
[   61.801052] [drm] Skip scheduling IBs!
[   61.801062] [drm] Skip scheduling IBs!
[   61.801096] [drm] Skip scheduling IBs!
[   61.801105] [drm] Skip scheduling IBs!
[   61.801137] [drm] Skip scheduling IBs!
[   61.801806] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.808746] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.809392] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.809764] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.810389] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.810866] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.812529] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.813359] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.814770] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   61.816488] [drm:0xffffffffc24219b8] *ERROR* Failed to initialize parser -125!
[   62.541982] ucsi_acpi USBC000:00: PPM init failed (-110)
[   67.004898] amdgpu_cs_ioctl: 1467 callbacks suppressed




Hardware: ASUS TUF506IU laptop (dual GPU Renoir and Nvidia GeForce GTX 1660 Ti Mobile)

Detailed hardware in lspci.txt. 
Detailed modules in lsmod.txt. 
Kernel config attached in Kernel_config.
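
For readers triaging similar logs, the key fields in the excerpt above can be pulled out mechanically. The following is a minimal sketch and not part of the original report: the names `parse_ring_timeout` and `errno_name` are illustrative, and the regular expression is written only against the line format quoted in this bug. The errno table holds the standard Linux values for the two codes that appear in this thread (-110 and -125).

```python
import re

# Linux errno values for the two codes seen in this report.
LINUX_ERRNO = {110: "ETIMEDOUT", 125: "ECANCELED"}

# Matches e.g. "*ERROR* ring gfx timeout, signaled seq=3658, emitted seq=3660"
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return ring name and fence sequence numbers, or None if no match."""
    m = RING_TIMEOUT.search(line)
    if m is None:
        return None
    signaled = int(m.group("signaled"))
    emitted = int(m.group("emitted"))
    return {
        "ring": m.group("ring"),
        "signaled": signaled,
        "emitted": emitted,
        # Fences that were emitted but had not signaled at timeout.
        "pending": emitted - signaled,
    }


def errno_name(code):
    """Map a negative kernel return code such as -125 to its errno name."""
    return LINUX_ERRNO.get(abs(code), "unknown")


line = ("[   60.971941] [drm:0xffffffffc25281ae] *ERROR* ring gfx timeout, "
        "signaled seq=3658, emitted seq=3660")
info = parse_ring_timeout(line)
print(info["ring"], info["pending"], errno_name(-125), errno_name(-110))
```

Read this way, "Failed to initialize parser -125" is a cancelled submission (ECANCELED), and the "(-110)" codes elsewhere in this thread are timeouts (ETIMEDOUT).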
Comment 1 Tomas Gayoso 2021-05-19 14:47:17 UTC
Created attachment 296869 [details]
lsmod  output
Comment 2 Tomas Gayoso 2021-05-19 14:47:48 UTC
Created attachment 296871 [details]
kernel configuration
Comment 3 meep 2021-05-27 13:43:04 UTC
We have some crashes probably related to this at
https://gitlab.freedesktop.org/drm/amd/-/issues/1591  

I have exactly the same kernel log, except that the very first line is missing.

I can trigger this bug reproducibly just by opening a lot of xterm windows if TrueType fonts are enabled.

My machine is brand new, so I don't have a last known working setup or date.
Thinkpad T14s. AMD U4750 (no dedicated GFX).
Arch Linux as updated between 19.5.2021 and 27.5.2021 definitely has this bug.

I tried various kernel downgrades and upgrades from 5.10.1 to 5.12.6; all crashed.
I also tried many different linux-firmware versions.

A complete Arch Linux rollback to 1 March 2021 stopped the crashes.
I chose this date arbitrarily, so a somewhat later date might be fine too.

I am not experienced with the graphics stack or with how its packages and components interconnect.

I am willing to help pinpoint the problem and test fixes.
Comment 4 Tomas Gayoso 2021-06-04 21:37:18 UTC
Created attachment 297161 [details]
5.10.42 dmesg output with crash after reset

The bug is still present on 5.10.42: X locks up and crashes after the driver fails to reset. Attaching dmesg output.
Comment 5 meep 2021-06-05 00:38:04 UTC
https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866

This seems to be exactly the same problem, with the same configuration as another reporter there.
It seems to have been introduced when a check of the drawbuffer size against zero was removed from Mesa.
It has not yet been decided whether removing that check was a bug,
or whether radeonsi or the amdgpu firmware should handle this case.

There are some solutions and quick fixes posted there;
you could try them and report back.
Comment 6 Tomas Gayoso 2021-06-07 19:37:38 UTC
I upgraded my setup to mesa-21.1.2, but I can still trigger the bug at will on a fresh boot by opening a single xterm on i3wm and resizing it while running any fast-scrolling character binary, like cli-visualizer ( https://github.com/dpayne/cli-visualizer ).

The other suggestions in that thread for the amdgpu module settings do nothing for me.

I am willing to patch my kernel and test; I can compile my own.
Comment 7 Tomas Gayoso 2021-06-07 19:41:11 UTC
System config for the record as in the mesa bug: 

root@tufboxen:~# date 
Mon Jun  7 16:40:27 -03 2021
root@tufboxen:~# inxi -GSC -xx
System:    Host: tufboxen.lan Kernel: 5.10.42-TUF x86_64 bits: 64 compiler: gcc v: 2.36.1-slack15 Desktop: i3 4.18.3 dm: XDM 
           Distro: Slackware 14.2 
CPU:       Info: 8-Core model: AMD Ryzen 7 4800H with Radeon Graphics bits: 64 type: MT MCP arch: Zen 2 rev: 1 cache: 
           L1: 512 KiB L2: 4 MiB L3: 8 MiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 92624 
           Speed: 1370 MHz min/max: 1400/2900 MHz boost: enabled Core speeds (MHz): 1: 1370 2: 1290 3: 1402 4: 1497 5: 1388 
           6: 1327 7: 1390 8: 1391 9: 1566 10: 1370 11: 1301 12: 1396 13: 1394 14: 1411 15: 1397 16: 1517 
Graphics:  Device-1: NVIDIA TU116M [GeForce GTX 1660 Ti Mobile] vendor: ASUSTeK driver: nvidia v: 465.31 bus-ID: 01:00.0 
           chip-ID: 10de:2191 
           Device-2: Advanced Micro Devices [AMD/ATI] Renoir vendor: ASUSTeK driver: amdgpu v: kernel bus-ID: 05:00.0 
           chip-ID: 1002:1636 
           Device-3: IMC Networks USB2.0 HD UVC WebCam type: USB driver: uvcvideo bus-ID: 3-4:4 chip-ID: 13d3:56a2 
           Display: server: X.Org 1.20.11 driver: loaded: amdgpu,ati,nvidia unloaded: modesetting,nouveau,nv,vesa 
           alternate: fbdev resolution: 1: 1920x1080~144Hz 2: 2560x1440 s-dpi: 96 
           OpenGL: renderer: AMD RENOIR (DRM 3.40.0 5.10.42-TUF LLVM 12.0.0) v: 4.6 Mesa 21.1.2 direct render: Yes 
root@tufboxen:~#
Comment 8 meep 2021-06-08 16:38:42 UTC
How about reading the comments?
Comment 9 Tomas Gayoso 2021-06-08 17:17:54 UTC
Created attachment 297243 [details]
patch for mesa 21.1.2

Thanks for the irony. 

Recompiling Mesa 21.1.2 with the attached patch fixes the issue for me on kernels 5.10.42 and 5.12.9.

I followed this suggestion from the mesa bug report:

can you try 21.1.2 and change this line:
file: src/mesa/main/draw.c  
function: validate_draw_arrays

change:
if (count < 0 || numInstances < 0)
into:
if (count <= 0 || numInstances <= 0) 

Cheers,
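
The effect of that one-character change can be illustrated outside Mesa. The sketch below is a Python model, not Mesa source: the function names are hypothetical, and it assumes only that the draw is rejected whenever the quoted condition evaluates to true. It lists the inputs on which the two conditions disagree.

```python
def rejected_original(count, num_instances):
    # Models: if (count < 0 || numInstances < 0)
    return count < 0 or num_instances < 0


def rejected_patched(count, num_instances):
    # Models: if (count <= 0 || numInstances <= 0)
    return count <= 0 or num_instances <= 0


# Try every combination of negative, zero and positive inputs.
cases = [(c, n) for c in (-1, 0, 1) for n in (-1, 0, 1)]
differs = [(c, n) for (c, n) in cases
           if rejected_original(c, n) != rejected_patched(c, n)]
print(differs)  # [(0, 0), (0, 1), (1, 0)]
```

Under that assumption the patch only changes behaviour for draws where the count or the instance count is exactly zero, which matches the zero-size case discussed in the mesa issue.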
Comment 10 meep 2021-06-08 17:25:15 UTC
OK, nice :)

Please report back to https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866 with your config and mention that this fix resolved the random crashes.

Thanks :)

(It's not a kernel issue.)
Comment 11 meep 2021-06-08 17:30:39 UTC
If you have an easy and reproducible way to trigger this bug, please tell them.
Comment 12 Eric Wheeler 2022-05-25 02:21:08 UTC
Considering this is a kernel crash, why wouldn't we still consider this a kernel bug?

Just because it can be fixed in userspace doesn't mean we shouldn't address the kernel crash because userspace should not be able to crash the kernel!  

For servers it is considered a security problem when non-root users can crash the machine.
Comment 13 Alex Deucher 2022-05-25 02:30:55 UTC
(In reply to Eric Wheeler from comment #12)
> Considering this is a kernel crash, why wouldn't we still consider this a
> kernel bug?
> 
> Just because it can be fixed in userspace doesn't mean we shouldn't address
> the kernel crash because userspace should not be able to crash the kernel!  
> 
> For servers it is considered a security problem when non-root users can
> crash the machine.

There is no reasonable way for the kernel to handle it. We would need to parse every texture, vertex buffer, command buffer, and shader that the application submits to the GPU to try to verify whether that combination of data, shader code and pipeline state could cause a hang. A more robust solution would be for session managers to monitor the GPU for resets and then restart the user's session so they don't lose their desktop. This is how it works on other OSes.
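
The monitoring idea can be sketched at the log level. This is only an illustration under stated assumptions: it scans kernel log text for the "GPU reset begin!" and "GPU reset(N) succeeded!" messages quoted in this report, the function name `find_gpu_resets` is hypothetical, and a real session manager would want a proper notification mechanism rather than log scraping.

```python
import re

# Message formats as they appear in the logs quoted in this report.
RESET_BEGIN = re.compile(
    r"amdgpu (?P<dev>[0-9a-f:.]+): amdgpu: GPU reset begin!")
RESET_DONE = re.compile(
    r"amdgpu (?P<dev>[0-9a-f:.]+): amdgpu: GPU reset\(\d+\) succeeded!")


def find_gpu_resets(lines):
    """Return one (device, completed) tuple per reset attempt, in order."""
    resets = []
    open_reset = {}  # device -> index in resets of the unfinished attempt
    for line in lines:
        m = RESET_BEGIN.search(line)
        if m:
            open_reset[m.group("dev")] = len(resets)
            resets.append((m.group("dev"), False))
            continue
        m = RESET_DONE.search(line)
        if m and m.group("dev") in open_reset:
            idx = open_reset.pop(m.group("dev"))
            resets[idx] = (m.group("dev"), True)
    return resets


log = [
    "[   60.971952] amdgpu 0000:05:00.0: amdgpu: GPU reset begin!",
    "[   61.800348] [drm] Skip scheduling IBs!",
    "[   61.800398] amdgpu 0000:05:00.0: amdgpu: GPU reset(2) succeeded!",
]
print(find_gpu_resets(log))
```

A caller could treat any returned tuple as a signal to restart the affected session; an entry with `completed` still False would mean the log ended before the driver reported success.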
Comment 14 Eric Wheeler 2022-05-25 04:03:59 UTC
Interesting, that makes sense. Does the kernel have a framework for notifying window managers that the graphics driver has reset? If not, that seems like the next logical step.
Comment 15 Eric Wheeler 2022-05-25 04:07:05 UTC
I found this bug because our card was hanging the entire system: it didn't even respond to ping, and we had to get dmesg output from netconsole. While it would be (more) acceptable for the GPU driver to hang and X to need a restart, it would be nice if the OS remained up.

I'm trying the brand-new 5.18 to see if that helps.

(We were on a vendor-supplied 5.4.x Oracle UEK kernel on OL8 and the 5.13 amdgpu driver from AMD's site built by DKMS, so maybe something newer will help.)
Comment 16 Eric Wheeler 2022-05-25 20:18:32 UTC
FYI: 5.18 seems to be working! I guess the 5.13 driver from AMD needs to be fixed, but that's on them...
Comment 17 Nix\ 2022-05-29 07:40:38 UTC
I have exactly this bug on a Lenovo 81LW (AMD Ryzen 3 3200U with Radeon Vega Mobile Gfx).
It happens when I run the OpenCL Geekbench test or use OpenCL in LibreOffice.
With Vulkan it works fine.

[11824.771725] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[11829.688441] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1319920, emitted seq=1319922
[11829.688735] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1908 thread Xorg:cs0 pid 2276
[11829.689018] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
[11829.787762] [drm] free PSP TMR buffer
[11829.823613] CPU: 0 PID: 40202 Comm: kworker/u32:0 Tainted: G        W  OE     5.18.0-2-MANJARO #1 e81df7241f6a360dc27e43ab195df7d97a8118f5
[11829.823619] Hardware name: LENOVO 81LW/LNVNB161216, BIOS ARCN37WW 05/14/2021
[11829.823621] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[11829.823630] Call Trace:
[11829.823633]  <TASK>
[11829.823634]  dump_stack_lvl+0x48/0x5d
[11829.823640]  amdgpu_do_asic_reset+0x2a/0x470 [amdgpu 3e08061ca61bf3b1e24b4c9d96c6f8977494b21f]
[11829.823954]  amdgpu_device_gpu_recover_imp.cold+0x535/0x8ca [amdgpu 3e08061ca61bf3b1e24b4c9d96c6f8977494b21f]
[11829.824258]  amdgpu_job_timedout+0x18c/0x1c0 [amdgpu 3e08061ca61bf3b1e24b4c9d96c6f8977494b21f]
[11829.824541]  drm_sched_job_timedout+0x73/0x100 [gpu_sched ac691790925fcace6384349a4c749a4fc130c519]
[11829.824548]  process_one_work+0x1c4/0x380
[11829.824552]  worker_thread+0x51/0x380
[11829.824554]  ? rescuer_thread+0x3a0/0x3a0
[11829.824557]  kthread+0xdb/0x110
[11829.824560]  ? kthread_complete_and_exit+0x20/0x20
[11829.824563]  ret_from_fork+0x1f/0x30
[11829.824568]  </TASK>
[11829.824571] amdgpu 0000:03:00.0: amdgpu: MODE2 reset
[11829.824616] amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
[11829.825038] [drm] PCIE GART of 1024M enabled.
[11829.825039] [drm] PTB located at 0x000000F400900000
[11829.825055] [drm] PSP is resuming...
[11829.845091] [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
[11829.919126] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[11829.925670] amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
[11830.673326] [drm] kiq ring mec 2 pipe 1 q 0
[11830.684064] [drm] VCN decode and encode initialized successfully(under SPG Mode).
[11830.684072] amdgpu 0000:03:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[11830.684076] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[11830.684078] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[11830.684079] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[11830.684082] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[11830.684083] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[11830.684085] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[11830.684087] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[11830.684089] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[11830.684091] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[11830.684093] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[11830.684094] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
[11830.684096] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
[11830.684098] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
[11830.684100] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
[11830.688342] [drm:amdgpu_dm_set_vline0_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_vline0_irq_state: crtc is NULL at id :3
[11830.688698] [drm:amdgpu_dm_set_vline0_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_vline0_irq_state: crtc is NULL at id :3
[11830.689009] [drm:amdgpu_dm_set_vline0_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_vline0_irq_state: crtc is NULL at id :3
[11830.689319] [drm:amdgpu_dm_set_crtc_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
[11830.689628] [drm:amdgpu_dm_set_crtc_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
[11830.689933] [drm:amdgpu_dm_set_crtc_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
[11830.690243] [drm:amdgpu_dm_set_pflip_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
[11830.690551] [drm:amdgpu_dm_set_pflip_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
[11830.690860] [drm:amdgpu_dm_set_pflip_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
[11830.691169] [drm:amdgpu_dm_set_pflip_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
[11830.691478] [drm:amdgpu_dm_set_vupdate_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
[11830.691807] [drm:amdgpu_dm_set_vupdate_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
[11830.692112] [drm:amdgpu_dm_set_vupdate_irq_state [amdgpu]] *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
[11830.696428] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
[11830.696430] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
[11830.696432] [drm] Skip scheduling IBs!
[11830.696434] [drm] Skip scheduling IBs!
[11830.696498] amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
[11830.696501] [drm] Skip scheduling IBs!
[11830.696511] [drm] Skip scheduling IBs!
[11830.696518] [drm] Skip scheduling IBs!
[11830.696523] [drm] Skip scheduling IBs!
[11830.696529] [drm] Skip scheduling IBs!
[11830.696535] [drm] Skip scheduling IBs!
[11830.696541] [drm] Skip scheduling IBs!
[11830.696547] [drm] Skip scheduling IBs!
[11830.696554] [drm] Skip scheduling IBs!
[11830.696631] [drm] Skip scheduling IBs!
[11830.696729] [drm] Skip scheduling IBs!
[11830.696748] [drm] Skip scheduling IBs!
[11830.696822] [drm] Skip scheduling IBs!
[11830.696832] [drm] Skip scheduling IBs!
[11830.696850] [drm] Skip scheduling IBs!
[11830.696857] [drm] Skip scheduling IBs!
[11830.696864] [drm] Skip scheduling IBs!
[11830.696870] [drm] Skip scheduling IBs!
[11830.696876] [drm] Skip scheduling IBs!
[11830.696883] [drm] Skip scheduling IBs!
[11830.696898] [drm] Skip scheduling IBs!
[11830.696905] [drm] Skip scheduling IBs!
[11830.696913] [drm] Skip scheduling IBs!
[11830.696930] [drm] Skip scheduling IBs!
[11830.697050] [drm] Skip scheduling IBs!
[11830.697059] [drm] Skip scheduling IBs!
[11830.697098] [drm] Skip scheduling IBs!
[11830.697107] [drm] Skip scheduling IBs!
[11830.697116] [drm] Skip scheduling IBs!
[11830.697125] [drm] Skip scheduling IBs!
[11830.697142] [drm] Skip scheduling IBs!
[11830.699931] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.708071] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.708980] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.710191] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.711089] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.711615] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.712737] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.713297] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.713959] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[11830.714583] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Comment 18 Eric Wheeler 2022-05-31 19:59:37 UTC
Nix, did you try Linux 5.18?  It worked for me...
Comment 19 emlodnaor 2022-06-13 06:58:14 UTC
It seemed to work for me at first, but it just crashed again...
However, for me it happens more randomly, and I haven't found a way to reproduce it...

I've also once had a similar crash in Windows, so I have been suspecting faulty hardware.
Comment 20 Michal Przybylowicz 2022-07-26 20:42:56 UTC
I have the same issue on kernel 5.18.14-xanmod1-x64v2. I have had it for as long as I can remember, almost 6 months now, on different kernels. I have also tried the latest firmware (manually downloaded) and the latest amdgpu; still the same. It happens seemingly at random, but always when I use Vivaldi (based on Chrome).


Jul 26 22:35:48 dagon kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Jul 26 22:35:48 dagon kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=12513753, emitted seq=12513755
Jul 26 22:35:48 dagon kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process vivaldi-bin pid 1540 thread vivaldi-bi:cs0 pid 1564
Jul 26 22:35:48 dagon kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Jul 26 22:35:48 dagon kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 26 22:35:48 dagon kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Jul 26 22:35:48 dagon kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Jul 26 22:35:48 dagon kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Jul 26 22:35:48 dagon kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Jul 26 22:35:49 dagon kernel: [drm] free PSP TMR buffer
Jul 26 22:35:49 dagon kernel: CPU: 2 PID: 131736 Comm: kworker/u32:3 Not tainted 5.18.14-xanmod1-x64v2 #0~git20220723.debb916
Jul 26 22:35:49 dagon kernel: Hardware name: Micro-Star International Co., Ltd. MS-7C80/MAG Z490 TOMAHAWK (MS-7C80), BIOS 1.B0 03/31/2022
Jul 26 22:35:49 dagon kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
Jul 26 22:35:49 dagon kernel: Call Trace:
Jul 26 22:35:49 dagon kernel:  <TASK>
Jul 26 22:35:49 dagon kernel:  dump_stack_lvl+0x44/0x5c
Jul 26 22:35:49 dagon kernel:  amdgpu_do_asic_reset+0x21/0x41b [amdgpu]
Jul 26 22:35:49 dagon kernel:  amdgpu_device_gpu_recover_imp.cold+0x55c/0x8f9 [amdgpu]
Jul 26 22:35:49 dagon kernel:  amdgpu_job_timedout+0x151/0x180 [amdgpu]
Jul 26 22:35:49 dagon kernel:  ? __switch_to_asm+0x42/0x70
Jul 26 22:35:49 dagon kernel:  ? __schedule+0x388/0x1180
Jul 26 22:35:49 dagon kernel:  drm_sched_job_timedout+0x5f/0xf0 [gpu_sched]
Jul 26 22:35:49 dagon kernel:  process_one_work+0x1ea/0x330
Jul 26 22:35:49 dagon kernel:  worker_thread+0x45/0x3b0
Jul 26 22:35:49 dagon kernel:  ? process_one_work+0x330/0x330
Jul 26 22:35:49 dagon kernel:  kthread+0xbb/0xe0
Jul 26 22:35:49 dagon kernel:  ? kthread_complete_and_exit+0x20/0x20
Jul 26 22:35:49 dagon kernel:  ret_from_fork+0x1f/0x30
Jul 26 22:35:49 dagon kernel:  </TASK>
Jul 26 22:35:49 dagon kernel: amdgpu 0000:03:00.0: amdgpu: MODE1 reset
Jul 26 22:35:49 dagon kernel: amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
Jul 26 22:35:49 dagon kernel: amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
Jul 26 22:35:49 dagon kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
Comment 21 nvaert1986 2022-09-12 17:07:07 UTC
I'm experiencing the same issue on 5.19 with mesa. It rarely happens, but when it happens my whole system needs a reboot. I've seen it happening with Firefox and Steam so far.

[drm:0xffffffffc04e61a6] *ERROR* Waiting for fences timed out!
[drm:0xffffffffc04e61a6] *ERROR* Waiting for fences timed out!
[drm:0xffffffffc0465370] *ERROR* Process information: process firefox pid 1918 thread firefox:cs0 pid 2069
amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
amdgpu 0000:03:00.0: [drm:0xffffffffc0321d49] *ERROR* ring kiq_2.1.0 test failed (-110)
[drm:0xffffffffc03b4bfc] *ERROR* KGQ disable failed
[drm:0xffffffffc03b4a60] *ERROR* failed to halt cp gfx
[drm] free PSP TMR buffer
DMAR: DRHD: handling fault status reg 3
 DMAR: [DMA Read NO_PASID] Request device [03:00.0] fault addr 0x77d0541a000 [fault reason 0x04] Access beyond MGAW
 DMAR: DRHD: handling fault status reg 3
 DMAR: [DMA Read NO_PASID] Request device [03:00.0] fault addr 0x77d0541e000 [fault reason 0x04] Access beyond MGAW
CPU: 0 PID: 1028 Comm: kworker/u48:22 Tainted: G           O      5.19.1
 Hardware name: Micro-Star International Co., Ltd. MS-7D31/MPG Z690 EDGE WIFI DDR4 (MS-7D31), BIOS 1.40 05/18/2022
Workqueue: amdgpu-reset-dev 0xffffffffc0242a90
Call Trace:
  <TASK>
0xffffffffa28f2514
0xffffffffc065e673
0xffffffffc065ef99
0xffffffffc04653ca
0xffffffffc0242aeb
0xffffffffa1f30ae8
amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
0xffffffffa1f31048
? 0xffffffffa1f31000
0xffffffffa1f372fa
? 0xffffffffa1f37220
0xffffffffa1e010ef
</TASK>
amdgpu 0000:03:00.0: amdgpu: MODE1 reset
amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
 [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
 [drm] VRAM is lost due to GPU reset!
 [drm] PSP is resuming...

 
 Here it initializes my full GPU, but then throws:
 [drm] Skip scheduling IBs!
and the crash starts over again.
Comment 22 Taras 2022-09-30 15:00:10 UTC
I am experiencing the same issue on 5.19.11 (NixOS 22.11pre411613.7e52b35fe98) with an RX 6800: random freezes when I use the Vivaldi browser.


 vivaldi-stable.desktop[49450]: [49444:49444:0930/100113.311398:ERROR:CONSOLE(0)] "Uncaught (in promise) Error: A listener indicated an asynchronous response by returning true>
 vivaldi-stable.desktop[49450]: [49444:49444:0930/100116.501866:ERROR:CONSOLE(0)] "Uncaught (in promise) Error: A listener indicated an asynchronous response by returning true>
 kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma3 timeout, signaled seq=114786, emitted seq=114788
 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: GPU reset begin!
 kernel: amdgpu 0000:4c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
 kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
 kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
 kernel: [drm] free PSP TMR buffer
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e3bb00 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e22300 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e30c00 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e16000 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e38600 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e2ea00 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e3d000 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e37700 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e32400 flags=0x0010]
 kernel: amdgpu 0000:4c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0038 address=0xf7d00e31c00 flags=0x0010]
 kernel: CPU: 12 PID: 96188 Comm: kworker/u256:1 Tainted: G        W         5.19.11 #1-NixOS
 kernel: Hardware name: Micro-Star International Co., Ltd. MS-7C60/TRX40 PRO WIFI (MS-7C60), BIOS 2.80 05/17/2022
 kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
 kernel: Call Trace:
 kernel:  <TASK>
 kernel:  dump_stack_lvl+0x45/0x5e
 kernel:  amdgpu_do_asic_reset+0x28/0x438 [amdgpu]
 kernel:  amdgpu_device_gpu_recover_imp.cold+0x5ad/0x90a [amdgpu]
 kernel:  amdgpu_job_timedout+0x153/0x190 [amdgpu]
 kernel:  drm_sched_job_timedout+0x76/0x110 [gpu_sched]
 kernel:  process_one_work+0x1e5/0x3b0
 kernel:  worker_thread+0x50/0x3a0
 kernel:  ? rescuer_thread+0x390/0x390
 kernel:  kthread+0xe8/0x110
 kernel:  ? kthread_complete_and_exit+0x20/0x20
 kernel:  ret_from_fork+0x22/0x30
 kernel:  </TASK>
 kernel: amdgpu 0000:4c:00.0: amdgpu: MODE1 reset
 kernel: amdgpu 0000:4c:00.0: amdgpu: GPU mode1 reset
 kernel: amdgpu 0000:4c:00.0: amdgpu: GPU smu mode1 reset
 kernel: amdgpu 0000:4c:00.0: amdgpu: GPU reset succeeded, trying to resume
 kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
 kernel: [drm] VRAM is lost due to GPU reset!
 kernel: [drm] PSP is resuming...
 kernel: [drm] reserve 0xa00000 from 0x83fe000000 for PSP TMR
 kernel: amdgpu 0000:4c:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
 kernel: amdgpu 0000:4c:00.0: amdgpu: SMU is resuming...
 kernel: amdgpu 0000:4c:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5400 (58.84.0)
 kernel: amdgpu 0000:4c:00.0: amdgpu: SMU driver if version not matched
 kernel: amdgpu 0000:4c:00.0: amdgpu: use vbios provided pptable
 kernel: amdgpu 0000:4c:00.0: amdgpu: SMU is resumed successfully!
 kernel: [drm] DMUB hardware initialized: version=0x02020013
 kernel: [drm] kiq ring mec 2 pipe 1 q 0
 kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
 kernel: [drm] JPEG decode initialized successfully.
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring sdma2 uses VM inv eng 14 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring sdma3 uses VM inv eng 15 on hub 0
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring vcn_dec_1 uses VM inv eng 5 on hub 1
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 6 on hub 1
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring vcn_enc_1.1 uses VM inv eng 7 on hub 1
 kernel: amdgpu 0000:4c:00.0: amdgpu: ring jpeg_dec uses VM inv eng 8 on hub 1
 kernel: amdgpu 0000:4c:00.0: amdgpu: recover vram bo from shadow start
 kernel: amdgpu 0000:4c:00.0: amdgpu: recover vram bo from shadow done
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: amdgpu 0000:4c:00.0: amdgpu: GPU reset(1) succeeded!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm] Skip scheduling IBs!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 vivaldi-stable.desktop[49450]: [49657:49664:0930/100759.348288:ERROR:display.cc(286)] Frame latency is negative: -210.699 ms
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[3076]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: amdgpu_cs_query_fence_status failed.
 org.gnome.Totem[67100]: amdgpu: The CS has been cancelled because the context is lost.
 org.gnome.Totem[67100]: amdgpu: The CS has been cancelled because the context is lost.
 org.gnome.Totem[67100]: amdgpu: The CS has been cancelled because the context is lost.
 org.gnome.Totem[67100]: amdgpu: The CS has been cancelled because the context is lost.
 org.gnome.Totem[67100]: amdgpu: The CS has been cancelled because the context is lost.
 org.gnome.Totem[67100]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
 gnome-shell[2555]: amdgpu: The CS has been cancelled because the context is lost.
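For anyone decoding the kernel lines above: the number after "Failed to initialize parser" is a negated Linux errno value, and the same convention applies to codes such as the -110 seen in other logs in this thread. A minimal Python sketch, assuming the generic Linux errno numbering; the table is hand-copied for just these two codes and is not exhaustive, and `decode_kernel_errno` is a hypothetical helper name:

```python
import re

# Assumption: generic Linux errno numbering (asm-generic/errno.h).
# Only the two codes that appear in this thread are listed.
LINUX_ERRNO = {
    110: ("ETIMEDOUT", "Connection timed out"),
    125: ("ECANCELED", "Operation canceled"),
}

# Matches a trailing negative code such as "-125!" or "(-110)".
_CODE_RE = re.compile(r"\(?-(\d+)\)?!?\s*$")


def decode_kernel_errno(line):
    """Return (name, description) for the negative code ending a log line, or None."""
    match = _CODE_RE.search(line)
    if match is None:
        return None
    return LINUX_ERRNO.get(int(match.group(1)))
```

Read this way, -125 (ECANCELED) matches the userspace message "The CS has been cancelled because the context is lost."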
Comment 23 James Le Cuirot 2022-10-09 20:11:53 UTC
Same. I see this quite frequently with my RX 6800 XT, particularly when using Vivaldi, though that may just be a coincidence, since I use it a lot.
Comment 24 rv1sr 2022-10-10 18:24:30 UTC
Do you guys by any chance use KWin?

Had experienced this exact issue on a daily basis (kernel 5.19 + amdgpu), especially while running Firefox or Vivaldi.

After setting the following environment variable in /etc/environment two weeks ago, the issue no longer persists.

KWIN_DRM_NO_DIRECT_SCANOUT=1
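
To double-check that the variable is really present after editing the file, something like the following Python sketch could be used. It assumes simple `KEY=VALUE` lines and skips blank lines and `#` comments; `parse_environment` and `read_variable` are hypothetical helpers, not part of KWin.

```python
def parse_environment(text):
    """Parse KEY=VALUE lines (the format used in /etc/environment) into a dict."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        variables[key.strip()] = value.strip().strip("\"'")
    return variables


def read_variable(path, name):
    """Return the value of `name` from the file at `path`, or None if unset."""
    with open(path, encoding="utf-8") as handle:
        return parse_environment(handle.read()).get(name)
```

For example, `read_variable("/etc/environment", "KWIN_DRM_NO_DIRECT_SCANOUT")` should return the string `"1"` once the line above has been added. The running session only picks it up after logging in again.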
Comment 25 James Le Cuirot 2022-10-12 10:51:43 UTC
I do, yes, under Wayland. I did a system update recently but if the problem reoccurs, I'll try that next. Thanks!
Comment 26 Paul Menzel 2022-10-12 12:01:05 UTC
(In reply to Tomas Gayoso from comment #9)

[…]

> Recompiling Mesa 21.1.2 with the attached patch fixes the issue for me in
> kernels  5.10.42 and 5.12.9.

As the original report is not a Linux kernel issue but a Mesa issue, could you please close this issue?

All other commenters: if you are still experiencing this issue with the latest Linux kernel and Mesa versions, please create a new issue at the Mesa issue tracker, or here if it’s a Linux kernel issue.
Comment 27 nvaert1986 2022-10-30 16:57:19 UTC
(In reply to rv1sr from comment #24)
> Do you guys by any chance use KWin?
> 
> Had experienced this exact issue on a daily basis (kernel 5.19 + amdgpu),
> especially while running Firefox or Vivaldi.
> 
> After setting the following environment variable in /etc/environment two
> weeks ago, the issue no longer persists.
> 
> KWIN_DRM_NO_DIRECT_SCANOUT=1

I tried this, but after a few days Xorg still crashed, unfortunately. The crashes do seem to be less frequent, though.

plasmashell[1280]: amdgpu: amdgpu_cs_query_fence_status failed.
kwin_x11[1255]: amdgpu: amdgpu_cs_query_fence_status failed.
plasmashell[447733]: amdgpu: amdgpu_cs_query_fence_status failed.
plasmashell[447733]: Crash Annotation GraphicsCriticalError: |[0][GFX1-]: GFX: RenderThread detected a device reset in PostUpdate (t=5437.93) [GFX1-]: GFX: RenderThrea>
plasmashell[1280]: amdgpu: The CS has been cancelled because the context is lost.
plasmashell[1280]: amdgpu: amdgpu_cs_query_fence_status failed.
plasmashell[1280]: amdgpu: The CS has been cancelled because the context is lost.
plasmashell[447733]: ATTENTION: default value of option mesa_glthread overridden by environment.
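
To see which processes are affected most, the userspace `amdgpu:` messages can be tallied per process. A rough Python sketch, assuming the `name[pid]: amdgpu: message` layout shown above; `count_amdgpu_messages` is a hypothetical helper:

```python
import re
from collections import Counter

# Assumed layout: "<process>[<pid>]: amdgpu: <message>", optionally preceded
# by a timestamp and host name.
_LINE_RE = re.compile(r"(?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]: amdgpu: (?P<msg>.+?)\s*$")


def count_amdgpu_messages(lines):
    """Count (process, pid, message) triples among userspace amdgpu log lines."""
    counts = Counter()
    for line in lines:
        match = _LINE_RE.search(line)
        if match:
            counts[(match["proc"], int(match["pid"]), match["msg"])] += 1
    return counts
```

Lines that do not contain an `amdgpu:` message directly after the `name[pid]:` prefix, such as the "Crash Annotation" line, are simply skipped.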
Comment 28 fmhirtz 2022-11-12 16:24:21 UTC
I'm seeing what appears to be this on Fedora 37 with an AMD 5700 XT. During normal desktop use under Wayland/GNOME, the system sporadically freezes and crashes every couple of days. It normally resets back to the login screen after some time:

Kernel: 6.0.7-301.fc37.x86_64
Mesa: mesa-*23.0.0-0.3.git74bbeb5.fc37

~~~
Nov 08 02:01:33 workstation kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Nov 08 02:01:33 workstation kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=14613616, emitted seq=14613618
Nov 08 02:01:33 workstation kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox pid 21845 thread firefox:cs0 pid 21922
Nov 08 02:01:33 workstation kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Nov 08 02:01:34 workstation kernel: amdgpu 0000:0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Nov 08 02:01:34 workstation kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Nov 08 02:01:34 workstation kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Nov 08 02:01:34 workstation kernel: [drm] free PSP TMR buffer
Nov 08 02:01:34 workstation kernel: CPU: 19 PID: 871009 Comm: kworker/u64:3 Not tainted 5.19.16-301.fc37.x86_64 #1
Nov 08 02:01:34 workstation kernel: Hardware name: MicroElectronics G464/TUF GAMING X570-PLUS (WI-FI), BIOS 3001 12/04/2020
Nov 08 02:01:34 workstation kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
Nov 08 02:01:34 workstation kernel: Call Trace:
Nov 08 02:01:34 workstation kernel:  <TASK>
Nov 08 02:01:34 workstation kernel:  dump_stack_lvl+0x44/0x5c
Nov 08 02:01:34 workstation kernel:  amdgpu_do_asic_reset+0x26/0x459 [amdgpu]
Nov 08 02:01:34 workstation kernel:  amdgpu_device_gpu_recover_imp.cold+0x59d/0x8cb [amdgpu]
Nov 08 02:01:34 workstation kernel:  amdgpu_job_timedout+0x156/0x190 [amdgpu]
Nov 08 02:01:34 workstation kernel:  ? __switch_to+0x106/0x430
Nov 08 02:01:34 workstation kernel:  drm_sched_job_timedout+0x76/0x110 [gpu_sched]
Nov 08 02:01:34 workstation kernel:  process_one_work+0x1c7/0x380
Nov 08 02:01:34 workstation kernel:  worker_thread+0x4d/0x380
Nov 08 02:01:34 workstation kernel:  ? _raw_spin_lock_irqsave+0x23/0x50
Nov 08 02:01:34 workstation kernel:  ? process_one_work+0x380/0x380
Nov 08 02:01:34 workstation kernel:  kthread+0xe9/0x110
Nov 08 02:01:34 workstation kernel:  ? kthread_complete_and_exit+0x20/0x20
Nov 08 02:01:34 workstation kernel:  ret_from_fork+0x22/0x30
Nov 08 02:01:34 workstation kernel:  </TASK>
Nov 08 02:01:34 workstation kernel: amdgpu 0000:0c:00.0: amdgpu: BACO reset
Nov 08 02:01:37 workstation kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset succeeded, trying to resume
Nov 08 02:01:37 workstation kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
Nov 08 02:01:37 workstation kernel: [drm] VRAM is lost due to GPU reset!
Nov 08 02:01:37 workstation kernel: [drm] PSP is resuming...
Nov 08 02:01:37 workstation kernel: [drm] reserve 0x900000 from 0x81fe600000 for PSP TMR
...
~~~
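
When comparing reports like this one, it can help to pull the ring name, sequence numbers, and offending process out of the log automatically. A Python sketch under the assumption that the messages keep the layout quoted in this thread; `parse_ring_timeouts` is a hypothetical helper, and the `gap` field is only my reading of "emitted minus signaled" as work that was submitted but had not completed:

```python
import re

# Assumed message layouts, taken from the log lines quoted in this thread.
_TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
_PROCESS_RE = re.compile(
    r"Process information: process (?P<process>\S*) pid (?P<pid>\d+) thread"
)


def parse_ring_timeouts(lines):
    """Return one dict per 'ring ... timeout' line, with the process that follows it."""
    events = []
    for line in lines:
        timeout = _TIMEOUT_RE.search(line)
        if timeout:
            signaled = int(timeout["signaled"])
            emitted = int(timeout["emitted"])
            events.append({
                "ring": timeout["ring"],
                "signaled": signaled,
                "emitted": emitted,
                "gap": emitted - signaled,
                "process": None,
                "pid": None,
            })
            continue
        process = _PROCESS_RE.search(line)
        # Attach the process line to the most recent timeout that lacks one.
        if process and events and events[-1]["pid"] is None:
            events[-1]["process"] = process["process"] or None
            events[-1]["pid"] = int(process["pid"])
    return events
```

Fed the log above, this should report a single event on ring `gfx_0.0.0` with a gap of 2, attributed to `firefox` (pid 21845).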
Comment 29 Viktor 2022-11-20 21:57:52 UTC
Same problem on Lenovo Thinkpad T14 Gen3 with Ryzen 7 and Radeon 680M. Spontaneous freezes on kernels 5.17.* and 6.0.*. 
Here is the log:
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=146659, emitted seq=146661
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
Nov 20 22:31:39 calculate kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=986766, emitted seq=986766
Nov 20 22:31:39 calculate kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process X pid 4963 thread X:cs0 pid 5224
Nov 20 22:31:39 calculate kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
Nov 20 22:31:39 calculate kernel: amdgpu 0000:04:00.0: amdgpu: Bailing on TDR for s_job:df2df, as another already in progress
Nov 20 22:31:40 calculate kernel: amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Nov 20 22:31:40 calculate kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Nov 20 22:31:40 calculate kernel: amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Nov 20 22:31:40 calculate kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Nov 20 22:31:40 calculate kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Nov 20 22:31:40 calculate kernel: [drm] free PSP TMR buffer
Nov 20 22:31:40 calculate kernel: amdgpu 0000:04:00.0: amdgpu: MODE2 reset
Nov 20 22:31:40 calculate kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
Nov 20 22:31:40 calculate kernel: [drm] PCIE GART of 512M enabled (table at 0x000000F4008C9000).
Nov 20 22:31:40 calculate kernel: [drm] PSP is resuming...
Nov 20 22:31:40 calculate kernel: [drm] reserve 0xa00000 from 0xf43f400000 for PSP TMR
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: RAS: optional ras ta ucode is not available
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: RAP: optional rap ta ucode is not available
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resuming...
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: SMU is resumed successfully!
Nov 20 22:31:41 calculate kernel: [drm] DMUB hardware initialized: version=0x0400001A
Nov 20 22:31:41 calculate kernel: [drm] kiq ring mec 2 pipe 1 q 0
Nov 20 22:31:41 calculate kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Nov 20 22:31:41 calculate kernel: [drm] JPEG decode initialized successfully.
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow start
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow done
Nov 20 22:31:41 calculate kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset(2) succeeded!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
Nov 20 22:31:41 calculate kernel: [drm] Skip scheduling IBs!
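
Since logs like this end in long runs of the same message, collapsing the repeats before pasting keeps reports readable. A small Python sketch, roughly what `uniq -c` does after stripping the timestamp; it assumes a syslog-style `Mon DD HH:MM:SS host` prefix, and `collapse_repeats` is a hypothetical helper:

```python
import re
from itertools import groupby

# Assumption: lines start with a syslog-style "Mon DD HH:MM:SS host " prefix.
_PREFIX_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} \S+ ")


def collapse_repeats(lines):
    """Collapse runs of identical messages into (message, count) pairs."""
    messages = (_PREFIX_RE.sub("", line.rstrip("\n")) for line in lines)
    return [(message, sum(1 for _ in run)) for message, run in groupby(messages)]
```

Only consecutive duplicates are merged, so the order of events in the log is preserved.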
Comment 30 Elliot 2022-11-26 20:19:32 UTC
(In reply to Viktor from comment #29)
> Same problem on Lenovo Thinkpad T14 Gen3 with Ryzen 7 and Radeon 680M.

I'm getting the same errors on a ThinkPad T14s Gen3 with a Ryzen 6800U (Radeon 680M iGPU), running i3 on Xorg on Arch Linux with no DE. I was previously using GNOME on Wayland and saw more crashes and screen glitches; switching to i3 definitely lowered the glitch rate.

The crash seems to occur after the laptop has been left on for a while (10+ hours).
Comment 31 MasterCATZ 2023-07-01 05:25:57 UTC
6800 XT on Ubuntu.
I might have to roll back to a 5.x kernel. I'm currently on a 6.x kernel and it's happening every hour, pretty much randomly, when I change to another video file for playback or when swapping between browsers on multi-monitor displays.

On 5.15 it was every few weeks.
Comment 33 mikkk 2024-01-04 11:33:42 UTC
I'm also getting the gfx_0.0.0 timeout error on a brand-new Lenovo ThinkPad T14 Gen 4 (AMD Ryzen™ 7 PRO 7840U w/ Radeon™ 780M Graphics × 16) on Debian.
Comment 34 fichterfrancis 2024-02-04 00:04:54 UTC
Hello,

I see this on my mini PC with a Ryzen 5 6600H (six cores):

processor	: 11
vendor_id	: AuthenticAMD
cpu family	: 25
model		: 68
model name	: AMD Ryzen 5 6600H with Radeon Graphics
stepping	: 1
microcode	: 0xa404102
cpu MHz		: 400.000
cache size	: 512 KB
physical id	: 0
siblings	: 12
core id		: 5
cpu cores	: 6
apicid		: 11
initial apicid	: 11
fpu		: yes
fpu_exception	: yes
cpuid level	: 16
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
bugs		: sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips	: 6587.56
TLB size	: 2560 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]


cat /etc/debian_version 
trixie/sid

Feb  3 21:38:50 debser kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F4FFC00000).
Feb  3 21:38:50 debser kernel: [drm] PSP is resuming...
Feb  3 21:38:50 debser kernel: [drm] reserve 0xa00000 from 0xf4fe000000 for PSP TMR
Feb  3 21:38:51 debser kernel: [drm] DMUB hardware initialized: version=0x0400003C
Feb  3 21:38:51 debser kernel: [drm] kiq ring mec 2 pipe 1 q 0
Feb  3 21:38:51 debser kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Feb  3 21:38:51 debser kernel: [drm] JPEG decode initialized successfully.
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:38:51 debser kernel: [drm] Skip scheduling IBs!
Feb  3 21:46:14 debser kernel: ACPI: bus type drm_connector registered
Feb  3 21:46:14 debser kernel: [drm] amdgpu kernel modesetting enabled.

The screen and the system froze at 21:38:51; at that time I was using Firefox ESR.

for info