Bug 210415 - [amdgpu] constant GPU hangs followed by kernel "BUG" and following kernel oops
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-29 19:12 UTC by David Rubio
Modified: 2020-11-29 19:14 UTC (History)
0 users

See Also:
Kernel Version: 5.9.11
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg output (125.79 KB, text/plain)
2020-11-29 19:12 UTC, David Rubio
lspci -vvv output (21.94 KB, text/plain)
2020-11-29 19:13 UTC, David Rubio
lscpu output (2.28 KB, text/plain)
2020-11-29 19:14 UTC, David Rubio

Description David Rubio 2020-11-29 19:12:08 UTC
Created attachment 293863
dmesg output

I have an RX 480. Since kernel 5.4 (!) I've been getting random GPU hangs every few hours. Since kernel 5.9 the hangs have become more frequent, and each one is now followed by kernel messages like:

Nov 29 15:44:31 reimu kernel: [drm] Bailing on TDR for s_job:34a, as another already in progress
Nov 29 15:44:31 reimu kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020
Nov 29 15:44:31 reimu kernel: #PF: supervisor write access in kernel mode
Nov 29 15:44:31 reimu kernel: #PF: error_code(0x0002) - not-present page

And an Oops right afterwards:
Oops: 0002 [#2] PREEMPT SMP NOPTI
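
For anyone reading the fault lines above, here is a minimal sketch of how the error code 0x0002 maps to the "supervisor write access" and "not-present page" messages. It assumes the standard x86 page-fault error code layout (bit 0 = present, bit 1 = write, bit 2 = user mode) and ignores the higher bits. The function name `decode_pf_error_code` is hypothetical and is not taken from the kernel source.

```python
# Decode the low three bits of an x86 page-fault error code.
# Assumption: standard x86 layout (bit 0 = P, bit 1 = W/R, bit 2 = U/S).
PF_BITS = [
    # (mask, meaning when the bit is set, meaning when the bit is clear)
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor (kernel) mode"),
]


def decode_pf_error_code(code):
    """Return one human-readable description per decoded bit."""
    return [
        set_text if code & mask else clear_text
        for mask, set_text, clear_text in PF_BITS
    ]


# The value reported in the log above.
print(decode_pf_error_code(0x0002))
```

For 0x0002 only the write bit is set, which is consistent with a kernel-mode write to an unmapped address such as the near-NULL 0x20 reported in the BUG line.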

The full dmesg is attached. The kernel is built with the Arch Linux kernel configuration, but a kernel taken directly from kernel.org and built with only the modules I need gives me the same error.

Attached error.
Comment 1 David Rubio 2020-11-29 19:13:17 UTC
This has been happening for a really long time, but the newly appearing kernel Oops and BUG messages made me realize it was necessary to report it.
The exact GPU model is MSI RX 480 GAMING X.
Comment 2 David Rubio 2020-11-29 19:13:43 UTC
Created attachment 293865
lspci -vvv output
Comment 3 David Rubio 2020-11-29 19:14:05 UTC
Created attachment 293867
lscpu output
