Bug 210739

Summary: Regression in 5.10, Oops at amdgpu_connector_dp_detect()
Product: Drivers
Reporter: Kris Karas (bugs-a21)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED CODE_FIX
Severity: normal
CC: alexdeucher, fkrueger, jshand2013
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.10
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: dmesg from where X11 initializes
             possible fix
             journalctl for debugging purposes

Description Kris Karas 2020-12-17 00:21:20 UTC
Created attachment 294169 [details]
dmesg from where X11 initializes

Booting into mainline 5.10.1 oopses in the amdgpu module in the routine amdgpu_connector_dp_detect().  Kernels 5.9.15 and below are not affected.

Full dmesg attached.  Brief snippet of oops:
    BUG: kernel NULL pointer dereference, address: 0000000000000060
    Call Trace:
     amdgpu_connector_dp_detect+0x159/0x320 [amdgpu]
     drm_helper_probe_detect+0x93/0xd0
     drm_helper_probe_single_connector_modes+0x60c/0x7e0
     drm_client_modeset_probe+0x25c/0x13c0
[etc]
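
Each frame in the trace above follows the kernel's usual `symbol+offset/size [module]` layout: the offset of the faulting or return address within the function, the function's total size, and the owning module if the symbol is not built in. As a reading aid only, here is a minimal Python sketch that splits such a line into its parts. It is not part of the original report, and `FRAME_RE` and `parse_frame` are hypothetical names chosen for illustration.

```python
import re

# Matches lines such as:
#   " amdgpu_connector_dp_detect+0x159/0x320 [amdgpu]"
#   "drm_helper_probe_detect+0x93/0xd0"
# An optional leading "? " (the kernel's marker for unreliable frames) is skipped.
FRAME_RE = re.compile(
    r"^\s*(?:\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?\s*$"
)


def parse_frame(line):
    """Return the parts of one call-trace frame, or None if the line is not a frame."""
    match = FRAME_RE.match(line)
    if match is None:
        return None
    return {
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),  # bytes into the function
        "size": int(match.group("size"), 16),      # total function size in bytes
        "module": match.group("module"),           # None for built-in symbols
    }


if __name__ == "__main__":
    trace = [
        "Call Trace:",
        " amdgpu_connector_dp_detect+0x159/0x320 [amdgpu]",
        " drm_helper_probe_detect+0x93/0xd0",
    ]
    for line in trace:
        print(parse_frame(line))
```

For the top frame this yields the symbol `amdgpu_connector_dp_detect` in module `amdgpu`, at offset 0x159 of a 0x320-byte function, which is the location one would look up in the module's disassembly.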

The GPU is the one integrated into the AMD A10-7870K (Radeon R7).
At system start:
    modprobe amdgpu cik_support=1
    modprobe radeon cik_support=0
Console output seems to be OK during system initialization.  But as soon as Xorg starts, the oops occurs, and video output ceases.  System is still responsive via the network.

The oops locks the video hardware spectacularly.  Neither the BIOS nor pressing the motherboard <reset> button has any effect.  Only a power-cycle will get things back to normal.

Comment 1 Alex Deucher 2020-12-17 03:17:24 UTC
Fixed with this patch:
https://patchwork.freedesktop.org/patch/408230/
Which will be landing soon.

Comment 2 Alex Deucher 2020-12-17 03:18:37 UTC
(In reply to Alex Deucher from comment #1)
> Fixed with this patch:
> https://patchwork.freedesktop.org/patch/408230/
> Which will be landing soon.

Never mind, that patch only applies when amdgpu.dc=1.

Comment 3 Alex Deucher 2020-12-17 03:19:28 UTC
Can you bisect?

Comment 4 Kris Karas 2020-12-17 09:18:02 UTC
OK, just finished the bisect.
The errant commit is:
    65bf2cf95d3ade4b56c35b17bb955a64b7f4b019
    "drm/amdgpu: utilize subconnector property for DP through atombios"

I reverted that commit in my 5.10.1 build, and everything works again on my machine.

Comment 5 Alex Deucher 2020-12-17 17:24:54 UTC
Created attachment 294203 [details]
possible fix

I think this patch should fix it.

Comment 6 Frank Kruger 2020-12-17 21:45:44 UTC
(In reply to Alex Deucher from comment #5)
> Created attachment 294203 [details]
> possible fix
> 
> I think this patch should fix it.

Will the fix be part of 5.10.2? Thx.

Comment 7 Kris Karas 2020-12-18 00:18:05 UTC
Tested the patch from comment 5, and I confirm it works as intended.

Thanks, Alex, for the fix!
I'll mark as closed.

Tested-by: Kris Karas <bugs-a17@moonlit-rail.com>

Comment 8 John Shand 2020-12-26 22:08:01 UTC
Created attachment 294349 [details]
journalctl for debugging purposes