Bug 210739 - Regression in 5.10, Oops at amdgpu_connector_dp_detect()
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-12-17 00:21 UTC by Kris Karas
Modified: 2020-12-26 22:08 UTC
CC List: 3 users

See Also:
Kernel Version: 5.10
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg from where X11 initializes (6.08 KB, text/plain)
2020-12-17 00:21 UTC, Kris Karas
possible fix (1.65 KB, patch)
2020-12-17 17:24 UTC, Alex Deucher
journalctl for debugging purposes (2.29 MB, text/plain)
2020-12-26 22:08 UTC, John Shand

Description Kris Karas 2020-12-17 00:21:20 UTC
Created attachment 294169
dmesg from where X11 initializes

Booting mainline 5.10.1 triggers an oops in the amdgpu module, in the routine amdgpu_connector_dp_detect().  Kernels 5.9.15 and earlier are not affected.

Full dmesg attached.  Brief snippet of oops:
    BUG: kernel NULL pointer dereference, address: 0000000000000060
    Call Trace:
     amdgpu_connector_dp_detect+0x159/0x320 [amdgpu]
     drm_helper_probe_detect+0x93/0xd0
     drm_helper_probe_single_connector_modes+0x60c/0x7e0
     drm_client_modeset_probe+0x25c/0x13c0
[etc]
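
Each call-trace entry above has the form symbol+offset/size [module]: the offset is how far into the function the saved address lies, and the size is the function's total length. The small fault address (0x60) is typical of a read through a NULL structure pointer, though the report itself does not say which pointer was NULL. As a minimal sketch, the entry can be split with plain shell parameter expansion; parse_trace_entry is a hypothetical helper name, not part of any kernel tool.

```shell
# Split a call-trace entry of the form "symbol+0xOFF/0xSIZE [module]".
# parse_trace_entry is a hypothetical helper used only for illustration.
parse_trace_entry() {
    entry=$1
    symbol=${entry%%+*}     # text before the first '+'
    rest=${entry#*+}        # e.g. "0x159/0x320 [amdgpu]"
    offset=${rest%%/*}      # e.g. "0x159"
    size=${rest#*/}         # e.g. "0x320 [amdgpu]"
    size=${size%% *}        # drop the trailing "[module]" tag, if any
    # Print the symbol, then offset and size converted to decimal.
    printf '%s %d %d\n' "$symbol" "$offset" "$size"
}

parse_trace_entry "amdgpu_connector_dp_detect+0x159/0x320 [amdgpu]"
```

For the first entry this shows the faulting instruction sitting 345 bytes into an 800-byte function, which is the kind of offset a symbol-aware debugger can map back to a source line when the module was built with debug information.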

The GPU is integrated into the AMD A10-7870K APU (Radeon R7).
At system start:
    modprobe amdgpu cik_support=1
    modprobe radeon cik_support=0
Console output seems to be OK during system initialization.  But as soon as Xorg starts, the oops occurs and video output ceases.  The system is still responsive via the network.

The oops locks the video hardware spectacularly.  Neither the BIOS nor pressing the motherboard <reset> button has any effect.  Only a power-cycle will get things back to normal.
Comment 1 Alex Deucher 2020-12-17 03:17:24 UTC
Fixed with this patch:
https://patchwork.freedesktop.org/patch/408230/
Which will be landing soon.
Comment 2 Alex Deucher 2020-12-17 03:18:37 UTC
(In reply to Alex Deucher from comment #1)
> Fixed with this patch:
> https://patchwork.freedesktop.org/patch/408230/
> Which will be landing soon.

Never mind, that patch only applies when amdgpu.dc=1.
Comment 3 Alex Deucher 2020-12-17 03:19:28 UTC
Can you bisect?
Comment 4 Kris Karas 2020-12-17 09:18:02 UTC
OK, just finished the bisect.
The errant commit is:
    65bf2cf95d3ade4b56c35b17bb955a64b7f4b019
    "drm/amdgpu: utilize subconnector property for DP through atombios"

I reverted that commit in my 5.10.1 tree, and the oops no longer occurs on my machine.
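
For readers who want to repeat this kind of investigation, the following is a rough sketch of the bisect and revert steps using standard git commands. The endpoints v5.9 and v5.10 are assumptions (the report only says that 5.9.15 works and 5.10.1 does not), and run, start_bisect, and revert_culprit are hypothetical helper names. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
# Sketch only: run from the top of a kernel git tree.
# With DRY_RUN=1 the commands are printed rather than executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_bisect() {
    run git bisect start
    run git bisect bad v5.10     # assumed first bad tag
    run git bisect good v5.9     # assumed last good tag
    # After each step: build, boot, start Xorg, then mark the result
    # with "git bisect good" or "git bisect bad" until git names a commit.
}

revert_culprit() {
    # Commit identified by the bisect in this report.
    run git revert 65bf2cf95d3ade4b56c35b17bb955a64b7f4b019
}
```

Because each bad step here hangs the display until a power-cycle, marking results over the network (for example via ssh) is likely the practical way to drive the bisect on this hardware.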
Comment 5 Alex Deucher 2020-12-17 17:24:54 UTC
Created attachment 294203
possible fix

I think this patch should fix it.
Comment 6 Frank Kruger 2020-12-17 21:45:44 UTC
(In reply to Alex Deucher from comment #5)
> Created attachment 294203 [details]
> possible fix
> 
> I think this patch should fix it.

Will the fix be part of 5.10.2? Thx.
Comment 7 Kris Karas 2020-12-18 00:18:05 UTC
Tested the patch from comment 5, and I confirm it works as intended.

Thanks, Alex, for the fix!
I'll mark as closed.

Tested-By: Kris Karas <bugs-a17@moonlit-rail.com>
Comment 8 John Shand 2020-12-26 22:08:01 UTC
Created attachment 294349
journalctl for debugging purposes
