My laptop fails occasionally during gameplay (about twice a day). First it starts lagging badly, and within a second it freezes completely. I am still able to log in remotely over ssh, but the keyboard, Caps Lock and Alt+F1 do not work. dmesg shows related errors:

[ 7738.264852] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x00008001bfd14000 from client 27
[ 7738.264856] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00701431
[ 7738.264860] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[ 7738.264864] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 7738.264867] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 7738.264871] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 7738.264874] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 7738.264878] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 7738.265691] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:7 pasid:32778, for process PathOfExile_x64 pid 14159 thread PathOfExile_x64 pid 14231)

Full log: https://linux-hardware.org/?probe=7f9336625d
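For anyone comparing the status words across reports, here is a rough sketch of how VM_L2_PROTECTION_FAULT_STATUS could be split into the fields that the driver prints below it. The bit positions are an assumption (a gfx9-style layout); I have only checked them against the values posted in this thread, and the names `FIELDS` and `decode_fault_status` are made up for this example.

```python
# Assumed gfx9-style bit layout of VM_L2_PROTECTION_FAULT_STATUS.
# Each entry is (shift, width). Verified only against the logs in this thread.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),    # the "Faulty UTCL2 client ID" number
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status: int) -> dict:
    """Extract each assumed bit field from the raw status word."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # Status value taken from the dmesg output above.
    for name, value in decode_fault_status(0x00701431).items():
        print(f"{name}: {value:#x}")
```

With the value from the log above, the decoded fields agree with what dmesg printed (PERMISSION_FAULTS 0x3, client ID 0xa, RW 0x0), and the VMID field matches the vmid:7 in the "retry page fault" line.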
Same here. This happens with 5.11 and 5.12 (AFAIK). Strange boxes appear on screen, the system freezes, etc., at random intervals.
It started happening after 5.12.4 or so... The bug has also been "backported" to 5.10.x.
It happens often, always with Firefox, usually when opening graphics-heavy pages under CPU load.
Sometimes the system recovers, sometimes it freezes completely.

Hardware: Lenovo 330S 15ARR
AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx AuthenticAMD GNU/Linux
AMD Radeon(TM) Vega 8 Graphics

Jun 06 21:40:02 [kernel] [11217.699553] amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x0000800119c80000 from client 27
Jun 06 21:40:02 [kernel] [11217.699555] amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x006C0071
Jun 06 21:40:02 [kernel] [11217.699557] amdgpu 0000:04:00.0: amdgpu: _ Faulty UTCL2 client ID: CB (0x0)
Jun 06 21:40:02 [kernel] [11217.699559] amdgpu 0000:04:00.0: amdgpu: _ MORE_FAULTS: 0x1
Jun 06 21:40:02 [kernel] [11217.699560] amdgpu 0000:04:00.0: amdgpu: _ WALKER_ERROR: 0x0
Jun 06 21:40:02 [kernel] [11217.699562] amdgpu 0000:04:00.0: amdgpu: _ PERMISSION_FAULTS: 0x7
Jun 06 21:40:02 [kernel] [11217.699564] amdgpu 0000:04:00.0: amdgpu: _ MAPPING_ERROR: 0x0
Jun 06 21:40:02 [kernel] [11217.699566] amdgpu 0000:04:00.0: amdgpu: _ RW: 0x1
Jun 06 21:40:07 [kernel] [11222.696469] gmc_v9_0_process_interrupt: 5900 callbacks suppressed
Jun 06 21:40:07 [kernel] [11222.696476] amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process GPU Process pid 3227 thread firefox:cs0 pid 3288)
Jun 06 21:40:07 [kernel] [11222.699857] amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x0000800119c80000 from client 27
Jun 06 21:40:07 [kernel] [11222.699858] amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x006C0071
Jun 06 21:40:07 [kernel] [11222.699860] amdgpu 0000:04:00.0: amdgpu: _ Faulty UTCL2 client ID: CB (0x0)
Jun 06 21:40:07 [kernel] [11222.699862] amdgpu 0000:04:00.0: amdgpu: _ MORE_FAULTS: 0x1
Jun 06 21:40:07 [kernel] [11222.699863] amdgpu 0000:04:00.0: amdgpu: _ WALKER_ERROR: 0x0
Jun 06 21:40:07 [kernel] [11222.699865] amdgpu 0000:04:00.0: amdgpu: _ PERMISSION_FAULTS: 0x7
Jun 06 21:40:07 [kernel] [11222.699867] amdgpu 0000:04:00.0: amdgpu: _ MAPPING_ERROR: 0x0
Jun 06 21:40:07 [kernel] [11222.699868] amdgpu 0000:04:00.0: amdgpu: _ RW: 0x1
Jun 06 21:40:07 [kernel] [11222.700717] amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32777, for process GPU Process pid 3227 thread firefox:cs0 pid 3288)
I think I found a solution. This applies to TLP users specifically, but non-TLP users can try it too.

In the `/etc/tlp.conf` file, set this:

RUNTIME_PM_DRIVER_BLACKLIST=""
# originally, it looks like this:
# RUNTIME_PM_DRIVER_BLACKLIST="amdgpu mei_me nouveau pcieport radeon"
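To check what a given tlp.conf currently sets, something like the following could be used. This is only a sketch: it assumes the file consists of simple shell-style KEY="value" lines with `#` comments, and the function name `tlp_list_setting` is made up for this example (it is not part of TLP).

```python
import shlex
from typing import List, Optional


def tlp_list_setting(conf_text: str, key: str) -> Optional[List[str]]:
    """Return the space-separated items of KEY="..." or None if KEY is not set.

    Commented-out lines are ignored. If the key appears more than once,
    the last active assignment is returned.
    """
    result = None
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, raw = line.partition("=")
        if sep and name.strip() == key:
            parts = shlex.split(raw, comments=True)
            result = parts[0].split() if parts else []
    return result


if __name__ == "__main__":
    sample = (
        'RUNTIME_PM_DRIVER_BLACKLIST=""\n'
        "# originally, it looks like this:\n"
        '# RUNTIME_PM_DRIVER_BLACKLIST="amdgpu mei_me nouveau pcieport radeon"\n'
    )
    print(tlp_list_setting(sample, "RUNTIME_PM_DRIVER_BLACKLIST"))
```

On a real system the text would come from reading `/etc/tlp.conf`. An empty list means the setting is present but lists no drivers, while None means only the commented-out default is there.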
(In reply to Arunanshu Biswas from comment #3)
> I think I found a solution. This applies to TLP users specifically, but
> non-TLP users can try that too.
>
> In `/etc/tlp.conf` file, set this to:
>
> RUNTIME_PM_DRIVER_BLACKLIST=""
> # originally, it looks like this:
> # RUNTIME_PM_DRIVER_BLACKLIST="amdgpu mei_me nouveau pcieport radeon"

Or you can simply blacklist it in modprobe.
I have removed ivrs_ioapic[32]=00:14.0 from the boot params, and the problem seems to have vanished (or maybe just for now; I am still keeping an eye out for crashes). This param was required to boot my Lenovo laptop with earlier kernels.

00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
I have the same problem. Everything works fine for some time and then the screen goes black. Music keeps playing, but the keyboard's multimedia keys no longer stop it. Alt+F* does not work either.

Kernel: 5.12.12-zen1-1-zen
CPU: AMD Ryzen 5 3400G (8) @ 3.700GHz
GPU: AMD ATI 07:00.0 Picasso
Arch Linux.

It happens with both the -zen and the regular kernel.
(In reply to Sylvia from comment #5)
> I have removed ivrs_ioapic[32]=00:14.0 from boot params, and it seems to be
> that problem vanished (or just for now.. still keeping an eye for crashes),

Still..

[ 430.943898] amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32775, for process GPU Process pid 2960 thread firefox:cs0 pid 3006)
[ 430.943905] amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x0000800112200000 from client 27
[ 430.943908] amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051
[ 430.943912] amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 430.943914] amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x1
[ 430.943916] amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
[ 430.943918] amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[ 430.943921] amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 430.943923] amdgpu 0000:04:00.0: amdgpu: RW: 0x1
[ 435.948029] gmc_v9_0_process_interrupt: 47127 callbacks suppressed
[ 440.952210] gmc_v9_0_process_interrupt: 47115 callbacks suppressed
[ 445.955904] gmc_v9_0_process_interrupt: 615433 callbacks suppressed
[ 455.963501] gmc_v9_0_process_interrupt: 623975 callbacks suppressed
[ 460.967605] gmc_v9_0_process_interrupt: 622849 callbacks suppressed
[ 465.971907] gmc_v9_0_process_interrupt: 624947 callbacks suppressed
[ 470.976127] gmc_v9_0_process_interrupt: 625531 callbacks suppressed
[ 480.104842] gmc_v9_0_process_interrupt: 107333 callbacks suppressed
[ 485.107666] gmc_v9_0_process_interrupt: 624573 callbacks suppressed
[ 490.111856] gmc_v9_0_process_interrupt: 624584 callbacks suppressed
[ 490.266839] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[ 495.115724] gmc_v9_0_process_interrupt: 626231 callbacks suppressed
[ 500.119513] gmc_v9_0_process_interrupt: 626410 callbacks suppressed
[ 500.506030] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[ 505.123885] gmc_v9_0_process_interrupt: 124455 callbacks suppressed
[ 510.128389] amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32775, for process GPU Process pid 2960 thread firefox:cs0 pid 3006)
[ 510.128391] amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x000080011220f000 from client 27
[ 510.128393] amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051
[ 510.128395] amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 510.128396] amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x1
[ 510.128397] amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x0
[ 510.128399] amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[ 510.128400] amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 510.128402] amdgpu 0000:04:00.0: amdgpu: RW: 0x1
[ 510.743952] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
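Since these logs get long quickly, a small script can help when comparing reports. The sketch below counts "page fault" lines per process and adds up the "callbacks suppressed" numbers from dmesg or journal text. The regular expressions are written against the message format seen in this thread only, and the function name `summarize_faults` is made up for this example.

```python
import re
from collections import Counter
from typing import Tuple

# Patterns matched against the message format posted in this thread.
FAULT_RE = re.compile(
    r"\[(?P<hub>\w+)\] (?:no-)?retry page fault \(.*?"
    r"for process (?P<proc>.*?)\s*pid (?P<pid>\d+) thread"
)
SUPPRESSED_RE = re.compile(r"gmc_v9_0_process_interrupt: (\d+) callbacks suppressed")


def summarize_faults(log_text: str) -> Tuple[Counter, int]:
    """Return (logged faults per (process, pid), total suppressed callbacks)."""
    per_process = Counter()
    for match in FAULT_RE.finditer(log_text):
        per_process[(match.group("proc"), int(match.group("pid")))] += 1
    suppressed = sum(int(n) for n in SUPPRESSED_RE.findall(log_text))
    return per_process, suppressed


if __name__ == "__main__":
    sample = (
        "[ 430.943898] amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault "
        "(src_id:0 ring:0 vmid:6 pasid:32775, for process GPU Process pid 2960 "
        "thread firefox:cs0 pid 3006)\n"
        "[ 435.948029] gmc_v9_0_process_interrupt: 47127 callbacks suppressed\n"
        "[ 440.952210] gmc_v9_0_process_interrupt: 47115 callbacks suppressed\n"
    )
    faults, suppressed = summarize_faults(sample)
    print(faults, suppressed)
```

The per-process count only covers faults that were actually logged, so it is a lower bound. The suppressed total gives a rough idea of how many more interrupts arrived in between.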
Still an issue with 5.13.0 (AMD Ryzen 5 PRO 3500U, Fedora 34, Mesa 21.1.4) Jul 06 03:34:26 kernel: gmc_v9_0_process_interrupt: 116 callbacks suppressed Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a2000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a0000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 
kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a3000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a1000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a4000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: 
amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a5000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a6000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 
0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a8000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110a7000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32770, for process Xorg pid 1574 thread Xorg:cs0 pid 1627) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: in page starting at address 0x00008001110aa000 from IH client 0x1b (UTCL2) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: 
VM_L2_PROTECTION_FAULT_STATUS:0x00641051 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MORE_FAULTS: 0x1 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: WALKER_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: PERMISSION_FAULTS: 0x5 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: MAPPING_ERROR: 0x0 Jul 06 03:34:26 kernel: amdgpu 0000:06:00.0: amdgpu: RW: 0x1 Jul 06 03:34:36 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=413630, emitted seq=413632 Jul 06 03:34:36 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1574 thread Xorg:cs0 pid 1627 Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: amdgpu: GPU reset begin! Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x110440000 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11041fde0 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11041fe00 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x110440000 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11041fe20 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11041fe40 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x110440000 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11041fe60 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11041fe80 flags=0x0070] Jul 06 03:34:36 kernel: amdgpu 
0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x110440000 flags=0x0070] Jul 06 03:34:36 kernel: amd_iommu_report_page_fault: 21 callbacks suppressed Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x11041fea0 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x11041fec0 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x110440000 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x11041fee0 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x11041ff00 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x110440000 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x11041ff20 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x11041ff40 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x110440000 flags=0x0070] Jul 06 03:34:36 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=06:00.0 domain=0x0000 address=0x11041ff60 flags=0x0070] Jul 06 03:34:36 kernel: [drm] free PSP TMR buffer Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: amdgpu: MODE2 reset Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: amdgpu: GPU reset succeeded, trying to resume Jul 06 03:34:36 kernel: [drm] PCIE GART of 1024M enabled. Jul 06 03:34:36 kernel: [drm] PTB located at 0x000000F401FA4000 Jul 06 03:34:36 kernel: [drm] PSP is resuming... 
Jul 06 03:34:36 kernel: [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available Jul 06 03:34:36 kernel: amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available Jul 06 03:34:37 kernel: [drm] kiq ring mec 2 pipe 1 q 0 Jul 06 03:34:37 kernel: [drm] VCN decode and encode initialized successfully(under SPG Mode). Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1 Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: recover vram bo from shadow start Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: recover vram bo 
from shadow done Jul 06 03:34:37 kernel: [drm] Skip scheduling IBs! Jul 06 03:34:37 kernel: amdgpu 0000:06:00.0: amdgpu: GPU reset(2) succeeded! Jul 06 03:34:47 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=413639, emitted seq=413641 Jul 06 03:34:47 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1574 thread Xorg:cs0 pid 1627 Jul 06 03:34:47 kernel: amdgpu 0000:06:00.0: amdgpu: GPU reset begin! Jul 06 03:34:47 kernel: amd_iommu_report_page_fault: 36 callbacks suppressed Jul 06 03:34:47 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1104075a0 flags=0x0070] Jul 06 03:34:47 kernel: amdgpu 0000:06:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x110440000 flags=0x0070] Jul 06 03:34:47 kernel: [drm] free PSP TMR buffer Jul 06 03:34:47 kernel: amdgpu 0000:06:00.0: amdgpu: MODE2 reset Jul 06 03:34:47 kernel: amdgpu 0000:06:00.0: amdgpu: GPU reset succeeded, trying to resume Jul 06 03:34:47 kernel: [drm] PCIE GART of 1024M enabled. Jul 06 03:34:47 kernel: [drm] PTB located at 0x000000F401FA4000 Jul 06 03:34:47 kernel: [drm] PSP is resuming... 
Jul 06 03:34:47 kernel: [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR Jul 06 03:34:47 kernel: amdgpu 0000:06:00.0: amdgpu: RAS: optional ras ta ucode is not available Jul 06 03:34:47 kernel: amdgpu 0000:06:00.0: amdgpu: RAP: optional rap ta ucode is not available Jul 06 03:34:47 kernel: [drm] kiq ring mec 2 pipe 1 q 0 Jul 06 03:34:48 kernel: amdgpu 0000:06:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110) Jul 06 03:34:48 kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110 Jul 06 03:34:48 kernel: amdgpu 0000:06:00.0: amdgpu: GPU reset(4) failed Jul 06 03:34:48 kernel: amdgpu 0000:06:00.0: amdgpu: GPU reset end with ret = -110 Jul 06 03:34:58 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered Jul 06 03:35:08 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Today on AMD 3550H, Arch Linux, linux 5.13.5, linux-firmware 20210315.3568f96. Froze for a few seconds, then recovered, had to restart Firefox though. Log: júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e35000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00401031 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x1 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x3 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e36000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red 
kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e32000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e26000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red 
kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e33000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e0b000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e03000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: 
VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e0a000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e27000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 
0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e02000 from IH client 0x1b (UTCL2) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0) júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0 júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0
Since this issue affects multiple users, can it be considered CONFIRMED? Also, does anyone know of a linux and linux-firmware version combination that is proven to work correctly? I personally haven't had any issues yet on 5.10.17 + 20210315.3568f96.
(In reply to Mthw from comment #10)
> Since this issue is present for multiple users can it be considered
> CONFIRMED? Also, does anyone know of a linux and linux-firmware version that
> is proven to be working correctly? I personally didn't have an issues yet on
> 5.10.17 + 20210315.3568f96.

It seems to work fine with any linux-firmware version older than 20210517.
Using amdgpu.noretry=0 is the best workaround right now. The previous answers (by me) failed at one point or another.
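For anyone trying the workaround: after adding amdgpu.noretry=0 to the kernel command line and rebooting, it is worth confirming that the driver actually picked the value up. Below is a minimal Python sketch for that. It assumes the parameter is exposed under /sys/module/amdgpu/parameters/noretry (the usual sysfs location for module parameters, not something stated in this thread), and the helper names are made up for illustration.

```python
from pathlib import Path

# Assumption: the amdgpu module exposes its "noretry" parameter here.
NORETRY_PATH = Path("/sys/module/amdgpu/parameters/noretry")


def read_noretry(path=NORETRY_PATH):
    """Return the current noretry value as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_noretry(value):
    """Turn the raw value into a short human-readable status line."""
    if value is None:
        return "amdgpu noretry parameter not readable (is the module loaded?)"
    if value == 0:
        return "noretry=0: the workaround from this thread is active"
    return f"noretry={value}: workaround not active, boot with amdgpu.noretry=0"


if __name__ == "__main__":
    print(describe_noretry(read_noretry()))
```

The path is a parameter so the check can also be pointed at a copy of the file, for example one collected from another machine.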
Updating the BIOS to the latest version helped for me.
This one works: LENOVO 81FB/LNVNB161216, BIOS 7WCN38WW 11/04/2019
This one was faulty: LENOVO 81FB/LNVNB161216, BIOS 7WCN36WW 05/10/2019
Maybe the older BIOS is not fully compatible with recent AMD microcode updates?
For me, this error started with kernel 5.14; there were no issues on 5.13.13. Hardware: AMD APU (4800U).
Here is a snippet:
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:24 vmid:0 pasid:0, for process pid 0 thread pid 0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000000000002000 from IH client 0x12 (VMC)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:24 vmid:0 pasid:0, for process pid 0 thread pid 0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000000000003000 from IH client 0x12 (VMC)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000031
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x1
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x3
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:24 vmid:0 pasid:0, for process pid 0 thread pid 0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000000000004000 from IH client 0x12 (VMC)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:24 vmid:0 pasid:0, for process pid 0 thread pid 0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000000000005000 from IH client 0x12 (VMC)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: RW: 0x0
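Since these dumps get long quickly, a small script can help summarise them before attaching a log here. The following is only a sketch: the regular expressions are written against the message shapes pasted in this thread ("[mmhub0] no-retry page fault (...)" followed by "in page starting at address ..."), and the names FAULT_RE, ADDR_RE and parse_faults are made up for illustration. Other kernel versions may word these lines differently.

```python
import re
from collections import Counter

# Matches the fault header line as it appears in the logs in this thread.
# The process/thread names may be empty (e.g. "for process pid 0 thread pid 0").
FAULT_RE = re.compile(
    r"\[(?P<hub>\w+)\] (?P<kind>retry|no-retry) page fault "
    r"\(src_id:(?P<src_id>\d+) ring:(?P<ring>\d+) vmid:(?P<vmid>\d+) "
    r"pasid:(?P<pasid>\d+), for process\s*(?P<process>.*?)\s*pid (?P<pid>\d+) "
    r"thread\s*(?P<thread>.*?)\s*pid (?P<tid>\d+)\)"
)
# Matches the follow-up line carrying the faulting page address.
ADDR_RE = re.compile(r"in page starting at address (0x[0-9a-fA-F]+)")


def parse_faults(log_text):
    """Collect one dict per fault header and attach the address that follows it."""
    faults = []
    for line in log_text.splitlines():
        header = FAULT_RE.search(line)
        if header:
            record = header.groupdict()
            record["address"] = None
            faults.append(record)
            continue
        addr = ADDR_RE.search(line)
        if addr and faults and faults[-1]["address"] is None:
            faults[-1]["address"] = int(addr.group(1), 16)
    return faults


# Sample lines copied from comments in this thread.
SAMPLE = """\
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:24 vmid:0 pasid:0, for process pid 0 thread pid 0)
Nov 02 21:54:07 zeus kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000000000002000 from IH client 0x12 (VMC)
júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process firefox pid 23395 thread firefox:cs0 pid 23464)
júl 27 15:04:40 tuf-red kernel: amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000800120e0a000 from IH client 0x1b (UTCL2)
"""

if __name__ == "__main__":
    faults = parse_faults(SAMPLE)
    # Count faults per (hub, retry kind) pair.
    for key, count in Counter((f["hub"], f["kind"]) for f in faults).items():
        print(key, count)
```

Grouping by hub and retry kind makes it easier to tell the gfxhub0 retry faults (Firefox, games) apart from the mmhub0 no-retry faults that report no process.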
I have the same issue.

System information (inxi)
```
System:
  Host: kraeh.datenwolf.net Kernel: 5.15.11_1 x86_64 bits: 64
  Desktop: awesome 4.3 Distro: void
Machine:
  Type: Laptop System: LENOVO product: 20UES00L00 v: ThinkPad T14 Gen 1
  serial: PF2CQSMV
  Mobo: LENOVO model: 20UES00L00 serial: L1HF06R00VB
  UEFI: LENOVO v: R1BET61W(1.30 ) date: 12/21/2020
CPU:
  Info: 8-Core model: AMD Ryzen 7 PRO 4750U with Radeon Graphics bits: 64
  type: MT MCP cache: L2: 4 MiB
  Speed: 1396 MHz min/max: 1400/1700 MHz
  Core speeds (MHz): 1: 1396 2: 1396 3: 1397 4: 1397 5: 1397 6: 1397 7: 1418
  8: 1397 9: 1398 10: 1397 11: 1397 12: 1397 13: 1387 14: 1396 15: 1415
  16: 1397
Graphics:
  Device-1: AMD Renoir driver: amdgpu v: kernel
  Device-2: Chicony Integrated Camera type: USB driver: uvcvideo
  Display: server: X.Org 1.20.14 driver: loaded: amdgpu,ati
  unloaded: fbdev,modesetting,vesa resolution: 1920x1080~60Hz
  OpenGL: renderer: AMD RENOIR (DRM 3.42.0 5.15.11_1 LLVM 12.0.1)
  v: 4.6 Mesa 21.3.2
Audio:
  Device-1: AMD driver: snd_hda_intel
  Device-2: AMD Raven/Raven2/FireFlight/Renoir Audio Processor
  driver: snd_rn_pci_acp3x
  Device-3: AMD Family 17h HD Audio driver: snd_hda_intel
  Sound Server-1: ALSA v: k5.15.11_1 running: yes
  Sound Server-2: PipeWire v: 0.3.42 running: yes
```

I run into this issue reliably when launching Elite:Dangerous through Steam/Proton. The crash happens during the planet generation shader profiling phase.
The kernel log shows this:
```
[ 732.515287] amdgpu 0000:07:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:16 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 732.515299] amdgpu 0000:07:00.0: amdgpu: in page starting at address 0x0000000000010000 from IH client 0x12 (VMC)
[ 732.515306] amdgpu 0000:07:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000021
[ 732.515309] amdgpu 0000:07:00.0: amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
[ 732.515312] amdgpu 0000:07:00.0: amdgpu: MORE_FAULTS: 0x1
[ 732.515314] amdgpu 0000:07:00.0: amdgpu: WALKER_ERROR: 0x0
[ 732.515316] amdgpu 0000:07:00.0: amdgpu: PERMISSION_FAULTS: 0x2
[ 732.515318] amdgpu 0000:07:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 732.515320] amdgpu 0000:07:00.0: amdgpu: RW: 0x0
[ 732.515322] amdgpu 0000:07:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:24 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 732.515327] amdgpu 0000:07:00.0: amdgpu: in page starting at address 0x0000000000011000 from IH client 0x12 (VMC)
[ 732.515331] amdgpu 0000:07:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 732.515333] amdgpu 0000:07:00.0: amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
[ 732.515335] amdgpu 0000:07:00.0: amdgpu: MORE_FAULTS: 0x0
[ 732.515337] amdgpu 0000:07:00.0: amdgpu: WALKER_ERROR: 0x0
[ 732.515339] amdgpu 0000:07:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 732.515341] amdgpu 0000:07:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 732.515343] amdgpu 0000:07:00.0: amdgpu: RW: 0x0
...
[ 742.526075] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=12215, emitted seq=12218
[ 742.526228] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
[ 742.526361] amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
[ 742.630706] [drm] free PSP TMR buffer
[ 742.658826] amdgpu 0000:07:00.0: amdgpu: MODE2 reset
[ 742.658903] amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 742.659047] [drm] PCIE GART of 1024M enabled.
[ 742.659066] [drm] PTB located at 0x000000F400900000
[ 742.659080] [drm] VRAM is lost due to GPU reset!
[ 742.659470] [drm] PSP is resuming...
[ 742.679525] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
[ 742.762836] amdgpu 0000:07:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 742.771617] amdgpu 0000:07:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 742.771622] amdgpu 0000:07:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 742.771627] amdgpu 0000:07:00.0: amdgpu: SMU is resuming...
[ 742.772490] amdgpu 0000:07:00.0: amdgpu: SMU is resumed successfully!
[ 743.018374] [drm] kiq ring mec 2 pipe 1 q 0
[ 743.221239] amdgpu 0000:07:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
[ 743.221401] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110
[ 743.221590] amdgpu 0000:07:00.0: amdgpu: GPU reset(1) failed
[ 743.221704] amdgpu 0000:07:00.0: amdgpu: GPU reset end with ret = -110
[ 747.518075] gmc_v9_0_process_interrupt: 708172 callbacks suppressed
[ 747.518084] amdgpu 0000:07:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:136 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 747.518093] amdgpu 0000:07:00.0: amdgpu: in page starting at address 0x0000000000014000 from IH client 0x12 (VMC)
[ 747.518100] amdgpu 0000:07:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000111
[ 747.518102] amdgpu 0000:07:00.0: amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
[ 747.518105] amdgpu 0000:07:00.0: amdgpu: MORE_FAULTS: 0x1
[ 747.518106] amdgpu 0000:07:00.0: amdgpu: WALKER_ERROR: 0x0
[ 747.518108] amdgpu 0000:07:00.0: amdgpu: PERMISSION_FAULTS: 0x1
[ 747.518110] amdgpu 0000:07:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 747.518112] amdgpu 0000:07:00.0: amdgpu: RW: 0x0
[ 747.518114] amdgpu 0000:07:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:16 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 747.518118] amdgpu 0000:07:00.0: amdgpu: in page starting at address 0x0000000000010000 from IH client 0x12 (VMC)
[ 747.518122] amdgpu 0000:07:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000111
[ 747.518124] amdgpu 0000:07:00.0: amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
[ 747.518126] amdgpu 0000:07:00.0: amdgpu: MORE_FAULTS: 0x1
[ 747.518128] amdgpu 0000:07:00.0: amdgpu: WALKER_ERROR: 0x0
[ 747.518129] amdgpu 0000:07:00.0: amdgpu: PERMISSION_FAULTS: 0x1
[ 747.518131] amdgpu 0000:07:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 747.518133] amdgpu 0000:07:00.0: amdgpu: RW: 0x0
```
[Sun Jan 9 04:08:13 2022] Linux version 5.15.12-200.fc35.x86_64
[Sun Jan 9 04:08:13 2022] smpboot: CPU0: AMD Ryzen 7 5700G with Radeon Graphics (family: 0x19, model: 0x50, stepping: 0x0)
...
[Sun Jan 9 04:08:13 2022] DMI: To Be Filled By O.E.M. To Be Filled By O.E.M./X300M-STX, BIOS P1.70 07/01/2021
...
[Sun Jan 9 16:43:53 2022] [drm] kiq ring mec 2 pipe 1 q 0
[Sun Jan 9 16:43:53 2022] [drm] DMUB hardware initialized: version=0x0101001C
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:7 pasid:32769, for process Xorg pid 2217 thread Xorg:cs0 pid 2218)
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: in page starting at address 0x0000000000000000 from IH client 0x1b (UTCL2)
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x007C0071
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: Faulty UTCL2 client ID: CB (0x0)
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: MORE_FAULTS: 0x1
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: WALKER_ERROR: 0x0
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: PERMISSION_FAULTS: 0x7
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: MAPPING_ERROR: 0x0
[Sun Jan 9 16:43:53 2022] amdgpu 0000:05:00.0: amdgpu: RW: 0x0
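The MORE_FAULTS / WALKER_ERROR / PERMISSION_FAULTS / MAPPING_ERROR / RW lines are just fields unpacked from the VM_L2_PROTECTION_FAULT_STATUS word, so a truncated log that only shows the status value can still be decoded by hand. Here is a rough Python sketch. The bit positions are an assumption inferred from the values posted in this thread (they reproduce the decoded fields and vmid for 0x00701431, 0x006C0071, 0x007C0071, 0x00000111 and 0x00000031), not copied from the register headers, so treat them as a guess for this GPU generation only.

```python
# Assumed field layout: (name, shift, width). Inferred from the log lines in
# this thread, not taken from the amdgpu register definitions.
FIELDS = (
    ("MORE_FAULTS", 0, 1),
    ("WALKER_ERROR", 1, 3),
    ("PERMISSION_FAULTS", 4, 4),
    ("MAPPING_ERROR", 8, 1),
    ("CID", 9, 9),   # printed as "Faulty UTCL2 client ID" in the log
    ("RW", 18, 1),
    ("VMID", 20, 4),
)


def decode_fault_status(status):
    """Split a VM_L2_PROTECTION_FAULT_STATUS value into its assumed fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


if __name__ == "__main__":
    # Status word from the Ryzen 7 5700G log above.
    for name, value in decode_fault_status(0x007C0071).items():
        print(f"{name}: {value:#x}")
```

An all-zero status, as in many of the mmhub0 entries, decodes to all-zero fields, which matches what the driver prints for them.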
No more crashes on kernel 5.16.