Bug 201497

Summary: [amdgpu]: '*ERROR* No EDID read' is back in 4.19
Product: Drivers
Reporter: Daniel Andersson (engywook)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: NEW
Severity: normal
CC: alexdeucher, andyrtr, chadwik, chriskoch, e.singularitycat, erenoglu, harry.wentland, jacobbrett+kernel.org, jay, kernel, nicholas.kazlauskas, rev, sbernard, sebastian.larsson, spleefer90, starrtennis, the.tbog+kernel, xavierb, yalterz
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.19, 4.20, 5.0-rc1, 5.0-rc3, 5.1-rc3
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: 4.19 dmesg
4.20 didn't fix this
4.20 log
5.0-rc3 drm.debug=0x4
dmesg, amd-staging-drm-next-git (d0333a48e54fdcfdae5e378acd898a680967e939)
5.8 dmesg RX 580

Description Daniel Andersson 2018-10-23 17:23:44 UTC
Hey,

I have this EDID bug again, see https://lists.freedesktop.org/archives/amd-gfx/2018-July/023923.html

018d82e5f02ef3583411bcaa4e00c69786f46f19 seems to have gotten back in through:
# first bad commit: [d98c71dadc2d0debdb80beb5a478baf1e6f98758] Merge drm-upstream/drm-next into drm-misc-next

/ Daniel
Comment 1 Nicholas Kazlauskas 2018-10-24 13:19:04 UTC
The revert was only supposed to apply to 4.18 and below; there should be a series of patches that fixes this for 4.19.

Please post an updated dmesg log for your 4.19 kernel.
Comment 2 Daniel Andersson 2018-10-24 13:41:58 UTC
Created attachment 279133 [details]
4.19 dmesg

dmesg attached
Comment 3 Daniel Andersson 2018-11-30 18:19:58 UTC
Any progress on this issue?
Comment 4 Daniel Andersson 2019-01-04 10:05:50 UTC
Created attachment 280261 [details]
4.20 didn't fix this

Still an issue on 4.20
Comment 5 Ivan Molodetskikh 2019-01-21 09:55:20 UTC
Created attachment 280619 [details]
4.20 log

I'm hitting the same issue with one of my monitors (ASUS VG248QE, G-Sync modded). I tried extracting EDID from kernel 4.18 and forcing it with drm.edid_firmware as you can see in the log, but that doesn't seem to do anything. My GPU is AMD RX 580.
Comment 6 Daniel Andersson 2019-01-23 21:43:48 UTC
Created attachment 280701 [details]
5.0-rc3 drm.debug=0x4

This is still an issue on 5.0-rc3.
Comment 7 Elliot Thomas 2019-01-23 23:49:31 UTC
Same here, RX580 and an ASUS PG278Q monitor, also with G-Sync support.
A different DisplayPort monitor (without G-Sync) worked perfectly.

I've noticed someone else with the same monitor having this issue over on the freedesktop bug tracker: https://bugs.freedesktop.org/show_bug.cgi?id=108806

Interestingly, in more recent kernels, the affected monitor appears to flash rapidly. I still see the EDID read fail in this case. I'll try and grab a dmesg of this at some point.
Comment 8 Elliot Thomas 2019-01-25 23:50:15 UTC
Created attachment 280777 [details]
dmesg, amd-staging-drm-next-git (d0333a48e54fdcfdae5e378acd898a680967e939)
Comment 9 Sebastien Bernard 2019-03-10 15:16:05 UTC
Same here, with an RX480 and an ASUS PG278Q.
My Arch Linux install has been stuck at 4.18.16 since October 2018.
Is there a workaround, or is someone looking at this bug?
It's been broken from 4.19 up to 5.0.
Comment 10 Daniel Andersson 2019-09-02 17:59:16 UTC
Is this going to get fixed?
Comment 11 sebastian.larsson 2019-10-08 04:45:23 UTC
I've been struggling with this issue for quite some time.
Has any progress been made?

ASUS ROG Swift PG278Q
R9 380X
5.4.0-rc1linux-5.4-rc1

[    2.776432] [drm:dc_link_detect [amdgpu]] *ERROR* No EDID read.
Comment 12 Sebastien Bernard 2019-12-16 19:14:25 UTC
I wonder if it's not a bad EDID from the monitor.
Anyway, that's a regression from 4.18.
Comment 13 chriskoch 2020-01-03 19:25:36 UTC
Same here with Acer XB240H and AMD Vega 56: 
https://bugzilla.kernel.org/show_bug.cgi?id=205987

All Kernels >4.15 have this
Comment 14 chriskoch 2020-01-03 19:29:30 UTC
*** Bug 205987 has been marked as a duplicate of this bug. ***
Comment 15 Ivan Molodetskikh 2020-02-24 12:01:59 UTC
I managed to force the kernel to use the EDID extracted via get-edid on kernel 4.18 thanks to https://bugzilla.kernel.org/show_bug.cgi?id=199799#c4 . The trick was to add video=DP-4:e to the kernel parameters, so in my case the complete addition looks like:

drm.edid_firmware=DP-4:edid/ASUS_VG248QE.bin video=DP-4:e

ASUS_VG248QE.bin was extracted with get-edid on kernel 4.18 and placed in /usr/lib/firmware/edid/. With this in place, I can use my monitor fine on the latest kernel (5.5.5).

Without this, I have the exact same issue on 5.5.5.
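
For anyone adapting this to a different connector or file, the two parameters can be sketched as a small helper. This is only an illustration of the format quoted above; the function name `edid_override_params` is made up for this example, and the connector name and firmware file must match your own system.

```python
def edid_override_params(connector: str, firmware_file: str) -> str:
    """Build the kernel command-line addition described in this comment.

    connector     -- DRM connector name, e.g. "DP-4"
    firmware_file -- path relative to the firmware directory,
                     e.g. "edid/ASUS_VG248QE.bin"
    """
    # drm.edid_firmware=<connector>:<file> makes DRM load the EDID from a
    # firmware file for that connector instead of reading it from the monitor.
    # video=<connector>:e forces the connector to be treated as enabled.
    return f"drm.edid_firmware={connector}:{firmware_file} video={connector}:e"


print(edid_override_params("DP-4", "edid/ASUS_VG248QE.bin"))
# prints: drm.edid_firmware=DP-4:edid/ASUS_VG248QE.bin video=DP-4:e
```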
Comment 16 Mike Starr 2020-08-29 14:55:19 UTC
I have traced my personal version of this bug down to my Acer monitor and Logitech gamepad. Both are detected, but only the monitor works. The error spits out toward the end of dmesg about ten times in a row, twice. Thank you.
Comment 17 C0rn3j 2020-10-04 19:09:42 UTC
Created attachment 292811 [details]
5.8 dmesg RX 580

I believe I am hitting the same issue.

I have a DP monitor ACER XB270HAbprz and it is stuck in 640x480 on my RX 580.

This is on Linux 5.8.

On monitor replug this relevant line pops up:
[drm:dc_link_detect_helper [amdgpu]] *ERROR* No EDID read.

This same monitor with the same cable works great on GTX 970 and GTX 1080 Ti under Nvidia 455.23.04.
Comment 18 chriskoch 2020-12-24 13:19:46 UTC
A short update:
In the meantime I am adding the modeline for my monitor manually.
The automated solution is a 10-monitor.conf like in this thread:
https://archived.forum.manjaro.org/t/stuck-at-low-display-resolution/115976
This works up to kernel 5.8. With kernel >= 5.9 my monitor doesn't accept the modeline anymore.
Since this problem was reported 2 years ago and nothing has happened, I am wondering if it will ever be fixed. Let me know how I can follow up or contribute to a fix.
Comment 19 xavier B 2020-12-26 10:32:13 UTC
hi!

I had this issue for a long time. Then I recently got a new HDMI cable and have never had the error since!

(Or at least the symptoms of it: the PC having trouble enabling the display after a warm reboot, and needing a 'really' cold reboot, with the PC turned off for a few minutes, to get it back. I haven't been checking the logs extensively since it now works...)
Comment 20 Jay Tuckey 2021-01-24 05:32:55 UTC
Hi, I'm also running into this issue, using an RX Vega 64 with an Asus ROG display. Let me know if there are any logs I can provide or testing I can do to assist.
Comment 21 Sebastien Bernard 2021-01-24 23:08:45 UTC
The more I think about it,
the more it seems to be related only to this monitor.

I think the 4.19 kernel closed a bug and is now rejecting the EDID reported by this screen.

If someone could validate that this EDID is correct, it would be a great help.
Comment 22 Jay Tuckey 2021-01-25 00:29:45 UTC
@Sebastien that could well be the case. The screen works fine under Windows, but could it be that they are working around bad EDID data?

Is there any way I can validate if the EDID is bad?
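
A rough way to sanity-check a saved EDID dump is sketched below. It only tests the basic structure: each block is 128 bytes, block 0 starts with the fixed 8-byte header, byte 126 of block 0 gives the number of extension blocks, and the bytes of every block must sum to 0 modulo 256. The function name `check_edid` is made up for this example, and passing these checks does not prove the kernel will accept the EDID; a dedicated tool such as edid-decode checks far more.

```python
from typing import List

EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
BLOCK_SIZE = 128


def check_edid(data: bytes) -> List[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    if len(data) == 0:
        return ["EDID is empty"]

    problems = []
    if len(data) % BLOCK_SIZE != 0:
        problems.append(f"length {len(data)} is not a multiple of {BLOCK_SIZE}")
    if len(data) < BLOCK_SIZE:
        return problems

    if data[:8] != EDID_HEADER:
        problems.append("block 0 does not start with the fixed EDID header")

    # Byte 126 of the base block holds the number of extension blocks.
    expected_blocks = 1 + data[126]
    actual_blocks = len(data) // BLOCK_SIZE
    if actual_blocks < expected_blocks:
        problems.append(
            f"expected {expected_blocks} blocks but only {actual_blocks} are present"
        )

    for i in range(actual_blocks):
        block = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        if not any(block):
            problems.append(f"block {i} is all zeroes")
        elif sum(block) % 256 != 0:
            # The last byte of each block is chosen so the block sums to 0 mod 256.
            problems.append(f"block {i} has a bad checksum")
    return problems
```

To use it on a dump, read the file in binary mode and pass the bytes in, for example `check_edid(open("ASUS_VG248QE.bin", "rb").read())` (the file name here is only an example).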
Comment 23 chriskoch 2021-02-19 09:47:30 UTC
Update:
We are now at Kernel 5.11 and this problem is still not fixed. I am stuck at 5.8 + manually entering the EDID on startup.
When 5.8 gets too outdated for me, I will buy a new monitor. I guess we cannot expect any fix after almost 3 years.
Comment 24 Rev 2022-05-19 00:34:00 UTC
Issue is back in 5.17.9 (and 8), AMD 5600G, Mesa 22.2~git2205170600.fffafa~oibaf~f, so it's 4 years now?
Comment 25 Emre 2023-04-08 17:26:20 UTC
I still see this error in 6.2.10-arch1-1. It happens for displays connected through a USB-C hub:

[    5.230549] EDID block 0 is all zeroes
[    5.230552] [drm:dc_link_add_remote_sink [amdgpu]] *ERROR* Bad EDID, status3!

A suspend / resume after the system boots to the login screen solves the issue. But it sometimes comes back when the display is suspended due to idle, or after other system suspend/resume cycles.
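
To see how often these messages recur across suspend/resume cycles, the log can be filtered for the EDID errors quoted in this thread. This is a minimal sketch: the function name `edid_errors` is made up for this example, and the pattern only covers the three message forms that appear in this report, so other EDID-related messages would be missed.

```python
import re

# Message forms taken from the logs quoted in this bug report.
EDID_ERROR = re.compile(r"No EDID read|Bad EDID|EDID block \d+ is all zeroes")


def edid_errors(dmesg_text: str):
    """Return the lines of a dmesg dump that contain an EDID error."""
    return [
        line.strip()
        for line in dmesg_text.splitlines()
        if EDID_ERROR.search(line)
    ]


sample = """\
[    5.230549] EDID block 0 is all zeroes
[    5.230552] [drm:dc_link_add_remote_sink [amdgpu]] *ERROR* Bad EDID, status3!
[    5.300000] some unrelated line
"""
for line in edid_errors(sample):
    print(line)
```

The text would normally come from a saved dmesg file rather than an inline string.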
Comment 26 Rev 2023-04-09 10:24:14 UTC
My display is connected via HDMI, and on 6.2.10 mainline this error comes up every 2 to 3 minutes in the logs. A connection to display suspend seems right to me, because it stops when the display is woken up and starts again on display suspend.
Comment 27 Emre 2023-04-13 09:39:35 UTC
This makes my docked laptop almost unusable :( Once it suspends the displays, there's no coming back. I need to undock and hope it recovers. After undocking, re-docking the USB-C hub does not help; the external displays mostly don't come back. It needs a reboot :(

I wonder if there's a way to change the severity/importance of this bug. I'm worried that nobody looks at it because it's tagged 4.19 and 5.x, while it still exists in the latest version of the kernel. This affects usability big time.
Comment 28 Rev 2023-04-13 09:54:01 UTC
I share that concern. Can the OP create a new report with a recent kernel and updated logs, maybe with a link to this report? That would be wonderful!

By the way, my last comment, that the issue stops when the display is woken up, was wrong. Sometimes a single display suspend is enough for the error to keep repeating in the logs, even if the display stays "on" the whole time afterwards. But this is rather sporadic: sometimes the errors stop, sometimes they continue.