Bug 204987

Summary: fault in amdgpu_dm_atomic_commit_tail on Vega64 with compton and redshift
Product: Drivers
Reporter: Frank Steinborn (f.steinborn)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED OBSOLETE
Severity: high
CC: bhasker, bjo, vindicators
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.3.1
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: Kernel crash AMD GPU at amdgpu_dm_atomic_commit_tail

Description Frank Steinborn 2019-09-24 20:18:12 UTC
drm.debug=0x54 log uploaded here due to attachment size limit: https://nognu.de/p/1569355650
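
For readers unfamiliar with the drm.debug bitmask, here is a minimal sketch of how 0x54 decodes. The bit-to-category table is an assumption taken from the debug categories listed in the kernel's include/drm/drm_print.h around the 5.x series; the helper name decode_drm_debug is made up for illustration.

```python
# Assumed mapping of drm.debug bits to categories (include/drm/drm_print.h, 5.x era).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
    0x40: "STATE",
    0x80: "LEASE",
    0x100: "DP",
}


def decode_drm_debug(mask):
    """Return the category names whose bits are set in mask, lowest bit first."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


print(decode_drm_debug(0x54))  # ['KMS', 'ATOMIC', 'STATE']
```

Under that mapping, 0x54 enables the KMS, ATOMIC and STATE messages, which are the ones most relevant to failed atomic commits.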

This is on 5.3.1 with this patch series applied: https://patchwork.freedesktop.org/series/64505/

It happens between ~5 and ~45 minutes after the system is booted into X. There is no obvious pattern to what triggers it.
Comment 1 Frank Steinborn 2019-09-26 08:22:05 UTC
I can reproduce this reliably by running compton in combination with redshift. As soon as compton is running and redshift starts to shift the screen, the failed commits start to show up.

Running Unigine Heaven while redshift is running but not shifting triggers it too: Heaven resets the gamma set by redshift, and redshift then tries to shift it back.

The bug is not triggered when compton uses xrender as backend instead of GLX, which is somewhat expected I guess.
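
The reproduction described above could be scripted roughly as follows. This is only a sketch under assumptions: it uses redshift's one-shot mode (-O) and reset (-x) to force repeated gamma changes instead of the gradual shifting the daemon does, the temperature, cycle count and pause are arbitrary, and the function names are made up for illustration. It has not been confirmed to trigger the fault.

```python
import subprocess
import time


def build_repro_commands(warm_kelvin=3500):
    """Build the command lines; kept separate so they can be inspected without running."""
    compositor = ["compton", "--backend", "glx"]   # GLX backend, the one that triggers the bug
    shift = ["redshift", "-O", str(warm_kelvin)]   # one-shot: set a warm color temperature
    reset = ["redshift", "-x"]                     # reset the gamma ramps
    return compositor, shift, reset


def run_repro(cycles=10, pause_s=5.0):
    """Start compton, then alternate between shifting and resetting the gamma."""
    compositor, shift, reset = build_repro_commands()
    proc = subprocess.Popen(compositor)
    try:
        for _ in range(cycles):
            subprocess.run(shift, check=False)
            time.sleep(pause_s)
            subprocess.run(reset, check=False)
            time.sleep(pause_s)
    finally:
        # Stop the compositor again even if a step above fails.
        proc.terminate()
        proc.wait()
```

Calling run_repro() from an X session while watching dmesg would show whether the failed commits appear. Changing "glx" to "xrender" in build_repro_commands gives the configuration that did not trigger the bug.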
Comment 2 Frank Steinborn 2019-12-14 00:00:55 UTC
Still happens on 5.4.2.
Comment 3 Bhasker C V 2020-09-07 10:42:48 UTC
I get this error after hibernation and resume. It does not happen on an immediate resume, but if the machine is left overnight and resumed in the morning, I see the amdgpu_dm_atomic_commit_tail fault.
I am unable to load a kexec kernel on AMD Ryzen, so I only have a snapshot of the error message. The system freezes and there is nothing that can be done other than a cold reboot.

Attaching a photo of the crash
Comment 4 Bhasker C V 2020-09-07 10:44:25 UTC
Created attachment 292395
Kernel crash AMD GPU at amdgpu_dm_atomic_commit_tail