Created attachment 257397
dmesg with lockup warning at the end

An amdgpu syscall, called by plasmashell, appears to deadlock at random and freezes X.org completely. Several graphics processes, plasmashell, and X.org are left stuck in D state. Everything else (audio, networking, etc.) continues to operate correctly.

The issue seems to appear more frequently while running games, although I am unable to find any particular pattern to it.

I am running Arch Linux with a custom-compiled linux-zen kernel (with ACS override patches) and ZFS, although as far as I can tell those are not related to the issue. Mesa is 17.1.4 and the GPU is a Radeon RX 480.

The issue has been around for a while and I sadly do not remember when it first occurred, but the entire 4.11.x series is definitely affected and I am fairly sure 4.10.x was as well. The issue is too rare for me to bisect the exact cause.
Please provide the output of "cat /sys/kernel/debug/dri/0/amdgpu_fence_info" when this happens.
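Since the report says networking keeps working during the freeze, the requested file could be captured from an SSH session. The following is only a rough sketch: the function names and the output path are made up for illustration, and reading debugfs normally requires root. It also records which tasks are in D state at that moment.

```shell
#!/bin/sh
# Keep only tasks in uninterruptible sleep (state D) from "PID STAT COMM" lines.
filter_d_state() {
    awk '$2 ~ /^D/'
}

# Write the blocked tasks and the fence info to one file and print its path.
# Both arguments are optional; the defaults are illustrative choices.
collect_hang_info() {
    out="${1:-/tmp/amdgpu-hang.txt}"
    fence="${2:-/sys/kernel/debug/dri/0/amdgpu_fence_info}"
    {
        echo "== blocked tasks =="
        ps -eo pid=,stat=,comm= | filter_d_state
        echo "== fence info =="
        cat "$fence" 2>&1
    } > "$out"
    echo "$out"
}
```

Running `collect_hang_info` as root while the desktop is frozen should leave a single file that can be attached to the report.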
Created attachment 257449
/sys/kernel/debug/dri/0/amdgpu_fence_info after being frozen for a few minutes

The freeze occurred again at random. Attached is the output of /sys/kernel/debug/dri/0/amdgpu_fence_info.
That isn't related to any system call. The problem is simply that the hardware has crashed and some task is trying to push new commands to it, waiting for previous commands to end (which never happens). That is most likely a problem on the user space driver side and not related to the kernel at all. Please open a bug report on FDO for this.
Submitted on freedesktop.org bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101746
Well, I have since encountered a possibly unrelated, reproducible issue that causes the exact same symptoms, and a GPU reset (triggered via debugfs) seems to recover from it correctly. I think this issue really just comes down to GPU resets not yet being issued automatically on the kernel side.
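For anyone wanting to try the same manual reset, a rough sketch follows. The file names are an assumption, not something stated in this report: as far as I know the debugfs entry has been called amdgpu_gpu_reset on kernels of this era and amdgpu_gpu_recover on later ones, and reading it is what triggers the reset. The function name is made up for illustration, and root is required.

```shell
#!/bin/sh
# Trigger a manual GPU reset by reading the debugfs reset file, if present.
# Assumption: the file is named amdgpu_gpu_reset or amdgpu_gpu_recover
# depending on the kernel version; check your own debugfs directory.
trigger_gpu_reset() {
    dri_dir="${1:-/sys/kernel/debug/dri/0}"
    for name in amdgpu_gpu_reset amdgpu_gpu_recover; do
        if [ -e "$dri_dir/$name" ]; then
            echo "reading $dri_dir/$name" >&2
            cat "$dri_dir/$name"
            return $?
        fi
    done
    echo "no reset file found in $dri_dir" >&2
    return 1
}
```

Calling `trigger_gpu_reset` from an SSH session or a text console while the display is frozen is how this would be used. Whether the desktop survives the reset depends on the user space stack.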