Bug 205853

Summary: amdgpu kernel bug: kernel null pointer dereference
Product: Drivers
Reporter: Janpieter Sollie (janpieter.sollie)
Component: Video (DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED DOCUMENTED
Severity: normal
CC: bjo
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.4.2
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg of kernel 5.4.2
             lspci -v output
             kernel config file
             working .config

Description Janpieter Sollie 2019-12-14 06:55:13 UTC
Created attachment 286277
dmesg of kernel 5.4.2

Possibly a duplicate of #204181, but since that report concerns X software and I don't even have an Xorg setup (the issue happens at boot, and the device is only present for OpenCL software), I decided to file a new one:
AMDGPU kernel NULL pointer dereference, address: 0000000000000d71
The device is a Fiji GPU.
lspci output, dmesg and .config are attached.
Comment 1 Janpieter Sollie 2019-12-14 06:56:50 UTC
Created attachment 286279
lspci -v output
Comment 2 Janpieter Sollie 2019-12-14 06:58:24 UTC
Created attachment 286281
kernel config file
Comment 3 Janpieter Sollie 2019-12-14 07:52:33 UTC
The issue seems to be in the DC driver: loading the amdgpu kernel module with the "dc=0" parameter avoids the NULL pointer dereference and reduces the error output to:
========================
[    7.050414] amdgpu 0000:0a:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd (-110).
[    7.151761] [drm:process_one_work] *ERROR* ib ring test failed (-110).
========================
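
For anyone reading the two remaining lines: the trailing (-110) is a negated Linux errno value, and 110 is ETIMEDOUT in the generic errno table, so the UVD IB test timed out rather than failing with a device-specific error. The following is a minimal sketch for pulling those codes out of a dmesg excerpt. The function name decode_errors, the regular expression, and the one-entry errno table are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Assumption: only the errno seen in this report is listed.
# 110 is ETIMEDOUT in the Linux generic errno table.
ERRNO_NAMES = {110: "ETIMEDOUT"}

# The two lines quoted in comment 3.
LOG = """\
[    7.050414] amdgpu 0000:0a:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd (-110).
[    7.151761] [drm:process_one_work] *ERROR* ib ring test failed (-110).
"""

# Hypothetical pattern: timestamp, anything up to "*ERROR*", the message,
# then a negative error code in parentheses followed by a period.
PATTERN = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s.*\*ERROR\*\s(?P<msg>.*?)\s\((?P<code>-\d+)\)\.$"
)


def decode_errors(log):
    """Return (timestamp, message, errno name) for each *ERROR* line."""
    results = []
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match is None:
            continue  # not an *ERROR* line in the expected shape
        code = -int(match.group("code"))  # the kernel logs the negated errno
        name = ERRNO_NAMES.get(code, "unknown")
        results.append((float(match.group("ts")), match.group("msg"), name))
    return results


if __name__ == "__main__":
    for ts, msg, name in decode_errors(LOG):
        print(f"{ts:.6f}  {msg}: {name}")
```

Both lines decode to the same timeout, which fits the second message being the work item reporting the failure of the first.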
Comment 4 Janpieter Sollie 2020-01-02 08:08:22 UTC
Created attachment 286571
working .config

This modified .config file allows the DC driver to get things up and running. However, the IB ring error still exists; I'll have to look further into that, but that's another bug.