Bug 199101 - AMDGPU Fury X random screen flicker on Linux kernel 4.16rc5
Summary: AMDGPU Fury X random screen flicker on Linux kernel 4.16rc5
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-13 04:30 UTC by Kevin McCormack
Modified: 2019-02-03 18:42 UTC
CC List: 16 users

See Also:
Kernel Version: 4.16rc5
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Output from journalctl -k on 4.16.0-rc6 (84.59 KB, text/plain)
2018-03-20 16:49 UTC, Kevin McCormack
Output from journalctl -k on 4.16.0-rc6-VEGA (97.30 KB, text/plain)
2018-03-25 02:04 UTC, Thomas Crider
attachment-4360-0.html (1.26 KB, text/html)
2018-03-26 18:09 UTC, Kevin McCormack

Description Kevin McCormack 2018-03-13 04:30:05 UTC
I compiled and installed 4.16rc on Arch Linux via the linux-git AUR package today. After rebooting I noticed some random flickering. Every 20-30 seconds or so a single white flash would take up the entire screen. This occurs on the desktop and in full screen OpenGL Dota 2. 

Here's some more system info

OpenGL version string: 3.0 Mesa 17.3.6


CPU hardware:
  x86_64
  AMD Ryzen 7 1800X Eight-Core Processor
    Max Speed: 4100 MHz
    Current Speed: 3600 MHz


Memory:
  16 GB
  Speed: 3200 MT/s



GPU hardware:
  OpenGL renderer string: AMD Radeon (TM) R9 Fury Series (FIJI / DRM 3.23.0 / 4.15.8-1-ARCH, LLVM 5.0.1)
  0b:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series] [1002:7300] (rev c8)


Motherboard:
  ASUSTeK COMPUTER INC.
  CROSSHAIR VI HERO
  BIOS Version: 3008
Comment 1 Michel Dänzer 2018-03-13 09:51:53 UTC
Please attach the corresponding dmesg output.
Comment 2 Kevin McCormack 2018-03-20 16:49:09 UTC
Created attachment 274825 [details]
Output from journalctl -k on 4.16.0-rc6
Comment 3 Kevin McCormack 2018-03-20 16:49:46 UTC
Still happening with rc6
Comment 4 Thomas Crider 2018-03-25 02:03:32 UTC
I'm also getting this with a fresh compile as of today (3-24). It's a screen flicker that occurs every few seconds.  

OpenGL version string: 3.1 Mesa 18.1.0-devel (git-d60eaf7b1f)  

CPU hardware:  
  x86_64  
  AMD Ryzen 7 1700X Eight-Core Processor  
    Current Speed: 3800 MHz  

Memory:  
  16 GB  
  Speed: 3200 MT/s  

GPU hardware:  
  OpenGL renderer string: Radeon RX Vega (VEGA10 / DRM 3.23.0 / 4.16.0-rc6-gd8a5b80568a9, LLVM 6.0.0)  
  0d:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XT [Radeon RX Vega 64] (rev c1)  

Motherboard:  
  ASUSTeK COMPUTER INC.  
  ROG STRIX X370-F GAMING  
  BIOS Version: 3803
Comment 5 Thomas Crider 2018-03-25 02:04:43 UTC
Created attachment 274921 [details]
Output from journalctl -k on 4.16.0-rc6-VEGA
Comment 6 The Linux kernel's regression tracker (Thorsten Leemhuis) 2018-03-26 08:58:42 UTC
(In reply to Kevin McCormack from comment #3)
> Still happening with rc6

Kevin: Is this still happening? And is this working with 4.15 (that's unclear from the initial report)? Just wondering, because I have this issue on the regression reports for 4.16
Comment 7 Kevin McCormack 2018-03-26 18:09:44 UTC
Created attachment 274949 [details]
attachment-4360-0.html

It's not a problem on 4.15. It was still a problem on the last rc I tried.

Comment 8 The Linux kernel's regression tracker (Thorsten Leemhuis) 2018-03-30 12:12:58 UTC
@Michel Dänzer: Any progress with this? It's on the list of regressions for 4.16
Comment 9 Alex Deucher 2018-03-30 13:15:30 UTC
Can you bisect?
Comment 10 Thomas Crider 2018-03-30 13:20:13 UTC
Just FYI, I do not get the flicker on 4.16-rc3; this may help shorten the time it takes to bisect.
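A known-good release candidate narrows the bisect range considerably. The sketch below only builds the `git bisect start` command line and estimates how many build-and-test rounds to expect. The tag names `v4.16-rc3` and `v4.16-rc5` and the path `drivers/gpu/drm/amd` are assumptions for illustration, not something confirmed in this report, and the helper names are hypothetical.

```python
import math
import shlex


def bisect_start_command(good, bad, path=None):
    """Build the `git bisect start <bad> <good> [-- <path>]` argument list."""
    cmd = ["git", "bisect", "start", bad, good]
    if path is not None:
        # Optional: only consider commits that touch this path.
        cmd += ["--", path]
    return cmd


def estimated_steps(n_commits):
    """Rough number of build-and-test rounds: about log2 of the commit count."""
    if n_commits <= 1:
        return 0
    return math.ceil(math.log2(n_commits))


if __name__ == "__main__":
    # Assumed tag names and path; adjust to the tree actually being tested.
    cmd = bisect_start_command("v4.16-rc3", "v4.16-rc5", "drivers/gpu/drm/amd")
    print(shlex.join(cmd))
    # Hypothetical commit count, only to show the order of magnitude.
    print(estimated_steps(1000))
```

After the start command, each round is a kernel build and boot followed by `git bisect good` or `git bisect bad`, so a tighter good/bad range saves whole rebuilds.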
Comment 11 Samuel Grahn 2018-03-30 18:30:11 UTC
I get the same issue on Arch kernels: linux (4.15), linux-mainline, and linux-git (4.16rc6), on my Vega 64, with both the latest mesa-git (18) and stable Mesa (17) from the Arch repos.

It did not seem to appear with fullscreen Vulkan rendering, which points at a GL issue (right?). I can examine further later.

Crosshair VI x370 mobo
Ryzen 1800X
16GB DDR4
STRIX Vega 64
Comment 12 Berillions 2018-04-03 20:29:05 UTC
I confirm that I have this issue too with the final 4.16 kernel, an RX 560, and mesa-git.
The issue does not appear with kernel 4.15.15.
Comment 13 Martin Babutzka 2018-04-04 07:05:02 UTC
I can confirm this issue also exists with the latest (from 2018-04-02) amd-staging-drm-next kernel and with the R9 380 using amdgpu DC.
The issue has also been reported and discussed in my repo:
https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/50
Comment 14 Paweł 2018-04-06 10:58:30 UTC
Since nobody cared I bisected the issue:


>commit 36cc549d59864b7161f0e23d710c1c4d1b9cf022
>Author: Shirish S <shirish.s@amd.com>
>Date:   Wed Feb 28 12:14:58 2018 +0530
>
>    drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)
>    
>    The below commit
>    
>    "drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2"
>    
>    introduces a slight behavioral change to rmfb. Instead of disabling a crtc
>    when the primary plane is disabled, it now preserves it.
>    
>    This change leads to BUG hit while performing atomic commit on amd driver.
>    
>    As a fix this patch ensures that we disable the CRTC's with NULL FB by returning
>    -EINVAL and hence triggering fall back to the old behavior and turning off the
>    crtc in atomic_remove_fb().
>    
>    V2: Added error check for plane_state and removed sanity check for crtc.
>    
>    Signed-off-by: Shirish S <shirish.s@amd.com>
>    Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
>    Reviewed-by: Harry Wentland <harry.wentland@amd.com>
>    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
>:040000 040000 9b8fd67908699d2651daa93fab59b21e7a76b1c6 21bbcb69561e67e5acf63d56344c7ba7ac4146a6 M      drivers

It makes my AMD Radeon RX 480 flicker a lot.
Comment 15 Harry Wentland 2018-04-06 14:15:18 UTC
We reproduced the issue and have someone looking into it.
Comment 16 Sergey Kondakov 2018-04-09 12:43:22 UTC
(In reply to Harry Wentland from comment #15)
> We reproduced the issue and have someone looking into it.

Too bad a fix doesn't seem to be included in 4.16.1. I got this myself on an RX 580 (on an old FX-6100 / PCIe 2.0 system). I wasn't able to revert the commit above, so I resorted to disabling DC with the amdgpu.dc=0 boot option.
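For anyone using the same workaround, it can be handy to confirm that the option actually reached the kernel. A minimal sketch, assuming the option was passed on the kernel command line and that the last occurrence is the one that takes effect; the function names are hypothetical:

```python
def amdgpu_dc_setting(cmdline):
    """Return the value of the last amdgpu.dc=<n> token, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "amdgpu.dc" and sep:
            # Assumption: a later occurrence overrides an earlier one.
            value = val
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path, encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    try:
        setting = amdgpu_dc_setting(read_cmdline())
    except OSError:
        setting = None  # /proc/cmdline not available on this system
    if setting == "0":
        print("DC is disabled via amdgpu.dc=0")
    else:
        print("amdgpu.dc=0 is not set on the kernel command line")
```

This only inspects the boot parameters. It does not tell you whether the option was instead set through a modprobe configuration file.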
Comment 17 Berillions 2018-04-12 15:16:37 UTC
This still exists for me on kernel 4.16.2, but I get less random flickering than on kernels 4.16/4.16.1 ...

RX 560 - 4 GB
Mesa-Git
llvm 5.0.1
Comment 18 Martin Babutzka 2018-04-13 18:00:10 UTC
Okay, the AMD devs reverted the corresponding commit in amd-staging-drm-next (https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=fc0644eddd2d5f77aac44ad2ab5a3edae08d11c2). I rebuilt and tested the kernel binaries (https://github.com/M-Bab/linux-kernel-amdgpu-binaries) and can confirm the issue is fixed for now.
Comment 19 Kevin McCormack 2018-04-20 13:14:54 UTC
Thanks for testing, Martin. Looking at https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.16.3, this doesn't appear to be included in 4.16.3, and Arch just bumped the kernel from 4.15 to 4.16.3, so I'm holding off on updating.

Are we likely to see this included in 4.16.4?
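Checking a stable ChangeLog for a particular fix can be scripted. The sketch below assumes the ChangeLog text follows plain `git log` layout (a `commit <hash>` line, headers, then a subject line indented by four spaces) and that the revert's subject still contains the original commit title. Both are assumptions, and the helper names are hypothetical.

```python
def commit_subjects(changelog_text):
    """Return the first indented, non-empty line after each `commit` header."""
    subjects = []
    want_subject = False
    for line in changelog_text.splitlines():
        if line.startswith("commit "):
            want_subject = True
        elif want_subject and line.startswith("    ") and line.strip():
            subjects.append(line.strip())
            want_subject = False
    return subjects


def subjects_mentioning(changelog_text, needle):
    """Case-insensitive search of commit subjects for `needle`."""
    needle = needle.lower()
    return [s for s in commit_subjects(changelog_text) if needle in s.lower()]


if __name__ == "__main__":
    # Made-up sample entry, for illustration only; real text would come from
    # a downloaded ChangeLog file.
    sample = (
        "commit 0000000000000000000000000000000000000000\n"
        "Author: Example Author <author@example.com>\n"
        "Date:   Thu Jan 1 00:00:00 1970 +0000\n"
        "\n"
        '    Revert "drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)"\n'
        "    \n"
        "    Example body text.\n"
    )
    for subject in subjects_mentioning(sample, "disable CRTCs with NULL FB"):
        print(subject)
```

An empty result means the subject was not found under these assumptions. It does not prove the fix is absent.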
Comment 20 Alex Deucher 2018-04-20 17:08:50 UTC
The revert is cc'ed to stable so it will show up in the 4.16 stable tree as well.
Comment 21 Kevin McCormack 2018-04-21 11:58:09 UTC
Thank you, Alex!
Comment 22 Kevin McCormack 2018-04-23 17:19:12 UTC
Flickering seems to be gone on 4.17rc2!

However, there's a new issue :/
https://bugs.freedesktop.org/show_bug.cgi?id=106194
Comment 23 Cm 2018-04-24 16:57:57 UTC
I experienced the flicker on 4.16.3 on my RX 560 and had to use amdgpu.dc=0 to suppress it (which also disabled audio over HDMI).

The flicker seems to be gone on 4.16.4 for me (with amdgpu.dc=0 removed), and audio over HDMI is working too.
Comment 24 Harry Wentland 2018-04-24 19:02:17 UTC
Kevin, can you mark this as resolved?
Comment 25 Thomas Crider 2018-04-24 19:24:10 UTC
I can also confirm the flicker is gone with 4.17rc2
Comment 26 Kevin McCormack 2018-04-24 19:29:37 UTC
It looks like the revert has made it into 4.16.4:

https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.16.4

Thanks all!
