Bug 201067 - [bisected] [4.19-rc2 regression] Display corruption with Vega 64 in 4.19-rc2
Summary: [bisected] [4.19-rc2 regression] Display corruption with Vega 64 in 4.19-rc2
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 high
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-09 17:06 UTC by Nick Sarnie
Modified: 2019-01-02 18:56 UTC
CC List: 7 users

See Also:
Kernel Version: 4.19-rc2
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg (81.33 KB, text/plain)
2018-09-09 17:06 UTC, Nick Sarnie
Patch that reverts the 15% reduction in set dispclk (1.03 KB, patch)
2018-09-10 16:43 UTC, Nicholas Kazlauskas
0001-drm-amd-display-Use-higher-dispclk-value-for-dce120.patch (1.29 KB, patch)
2018-09-11 17:47 UTC, Nicholas Kazlauskas

Description Nick Sarnie 2018-09-09 17:06:44 UTC
Created attachment 278389
dmesg

Hi all,

When using a kernel that includes the commit below, I get visual corruption, but only in the rightmost vertical column of my screen:

Author: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Date:   Mon Jul 23 14:13:23 2018 -0400

    drm/amd/display: Use calculated disp_clk_khz value for dce110


Video: 
https://www.youtube.com/watch?v=HrJUrBWMRXU

A kernel without this commit does not have the issue.

My GPU is a Sapphire Nitro+ Vega 64. I am using Gentoo with Mesa git and KDE Plasma 5.

My main monitor is connected over DP and runs at 2560x1440 at 144 Hz. My second monitor is connected over HDMI and runs at 1920x1080 at 60 Hz. I tried disabling the second monitor in KDE, but the issue still occurs.

It also occurs in fullscreen applications.

I will also attach dmesg.

I can test any patches or answer any questions.

Thanks,
Sarnex
Comment 1 Nicholas Kazlauskas 2018-09-10 16:43:52 UTC
Created attachment 278423
Patch that reverts the 15% reduction in set dispclk

While I can't reproduce your issue under a similar setup, I think I have an idea of what the issue is.

Can you try this patch?
Comment 2 Nick Sarnie 2018-09-10 23:05:17 UTC
Hi Nicholas,

Thank you for the fast response. 

I can confirm the patch fixes the issue.

If you intend to submit this:
Tested-by: Nick Sarnie <sarnex@gentoo.org>

Thanks,
Sarnex
Comment 3 Nicholas Kazlauskas 2018-09-11 17:47:20 UTC
Created attachment 278455
0001-drm-amd-display-Use-higher-dispclk-value-for-dce120.patch

Do you mind testing another patch?

This patch has a narrower impact than the previous one (since it should only target Vega). I imagine that it would also fix your issue but it'd be nice to have verification.
Comment 4 Nick Sarnie 2018-09-11 22:27:20 UTC
Hi Nicholas,

I can also confirm that the second patch fixes the issue. 

Tested-by: Nick Sarnie <sarnex@gentoo.org>

Please let me know if you need anything else.

Thanks,
Sarnex
Comment 5 Dave Johnson 2018-09-12 00:23:14 UTC
This is possibly related to my issue that popped up in 4.18: my Vega 64 works with one of my DisplayPort monitors but not with the DVI one. From what I've seen others report in various searches on my issue, it's something to do with a clock setting where a kHz value is referenced as "10" instead of "1", or something like that. Vague, I know, but I dunno.

Some get black screen, I get one display but not multi, and the user above me has corruption.
Comment 6 Nicholas Kazlauskas 2018-09-14 13:07:58 UTC
I think you're referencing this patch:

https://patchwork.freedesktop.org/patch/238065/

Which should be fixed in 4.18.

Please post a new ticket with a full dmesg log, Xorg log and your distro/desktop environment if you can still reproduce the problem.
Comment 7 Dave Johnson 2018-09-15 16:54:39 UTC
(In reply to Nicholas Kazlauskas from comment #6)
> I think you're referencing this patch:
> 
> https://patchwork.freedesktop.org/patch/238065/
> 
> Which should be fixed in 4.18.
> 
> Please post a new ticket with a full dmesg log, Xorg log and your
> distro/desktop environment if you can still reproduce the problem.

Ok, so yes and no, I guess. I was assuming a single bug, while there are apparently two. The one that causes video corruption and artifacts is in fact fixed for me now, but multi-head support is still broken. It worked a couple of kernels ago. I'm opening a new bug report over at openSUSE, since it seems to be openSUSE-specific for some reason (other live CDs work fine; it looks like it's just something with Tumbleweed).
Comment 8 Dave Johnson 2018-09-15 17:45:18 UTC
Update: for my possibly-separate issue, I can confirm that multi-head works on 4.16.13 and not on 4.19-rc3.

For now I'm staying on 4.16 as it's working perfectly back there.
Comment 9 Benjamin Xiao 2018-11-14 00:23:34 UTC
I get the same visual corruption as well. It only appears when I run the monitor at 144Hz. 120Hz seems fine.
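
For a sense of scale, here is a rough back-of-the-envelope sketch in Python of why the refresh rate might matter. It is not the driver's actual calculation: it ignores blanking intervals, the function names are made up for illustration, and the 115% figure is only borrowed from the "15%" mentioned in the first patch's description. It only shows that 2560x1440 at 144 Hz moves 20% more active pixels per second than at 120 Hz, so a display clock set with less margin would plausibly run short at 144 Hz first.

```python
# Rough sketch only: active pixel rate for one mode at two refresh rates.
# Blanking is ignored, so real pixel clocks are somewhat higher than this.

def active_pixel_rate_khz(width: int, height: int, refresh_hz: int) -> int:
    """Active pixels per second, expressed in kHz (hypothetical helper)."""
    return width * height * refresh_hz // 1000


def scaled_clock_khz(clock_khz: int, percent: int) -> int:
    """Scale a clock value by an integer percentage (hypothetical helper)."""
    return clock_khz * percent // 100


rate_120 = active_pixel_rate_khz(2560, 1440, 120)
rate_144 = active_pixel_rate_khz(2560, 1440, 144)

print(f"120 Hz: {rate_120} kHz")
print(f"144 Hz: {rate_144} kHz")
print(f"144 Hz vs 120 Hz: {rate_144 / rate_120:.2f}x")

# Assumption for illustration: a clock carrying 15% headroom over the 144 Hz rate.
print(f"144 Hz with 15% margin: {scaled_clock_khz(rate_144, 115)} kHz")
```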
Comment 10 Benjamin Xiao 2018-11-14 00:28:49 UTC
(In reply to Nicholas Kazlauskas from comment #3)
> Created attachment 278455
> 0001-drm-amd-display-Use-higher-dispclk-value-for-dce120.patch
> 
> Do you mind testing another patch?
> 
> This patch has a narrower impact than the previous one (since it should only
> target Vega). I imagine that it would also fix your issue but it'd be nice
> to have verification.

Will this patch be backported to 4.19? It seems like right now it's only in 4.20.
Comment 11 Harry Wentland 2018-11-14 15:00:42 UTC
GregKH just added the patch for 4.19-stable.
Comment 12 Dave Johnson 2018-12-04 19:43:08 UTC
This is fixed for me in 4.19-stable
Comment 13 Axel 2018-12-19 18:24:46 UTC
In which version should this bug be fixed? I still have this bug with 4.19.9. Or is this only fixed for Vega? I only have an RX 570.
Comment 14 Nicholas Kazlauskas 2018-12-19 18:31:05 UTC
(In reply to Axel from comment #13)
> In which version should this bug be fixed? I still have this bug with
> 4.19.9. Or is this only fixed for Vega? I only have an RX 570.

This was fixed for both Vega and Polaris, but I think there was another regression that only affected Polaris in that release.

It should be fixed in amd-staging-drm-next. It'll probably make its way into stable at some point.
Comment 15 Axel 2019-01-02 18:56:51 UTC
It is fixed for me with kernel 4.20.0.
