Bug 205915
Summary: | AMDGPU: Screen flicker after resume from suspend | ||
---|---|---|---|
Product: | Drivers | Reporter: | onil (onil) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | NEW --- | ||
Severity: | normal | CC: | aarad.russell, akamch, alexdeucher, bjo, dbulashevich, erik.labianca, harry.wentland, kernel, labre |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 5.5.0-rc2 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg after resume from suspend
Xorg log on commit 50575eb5b339683ee148189beae5eb53ccb3158b Xorg log of good commit 50575eb5b339683ee148189beae5eb53ccb3158b Kernel log of good commit 50575eb5b339683ee148189beae5eb53ccb3158b Xorg log of bad commit 1ea8751bd28d1ec2b36a56ec6bc1ac28903d09b4 Kernel log of bad commit 1ea8751bd28d1ec2b36a56ec6bc1ac28903d09b4 Call optimize bandwidth again. |
Description
onil
2019-12-19 15:37:46 UTC
Please attach your xorg log (if using X) and dmesg output. Created attachment 286367 [details]
dmesg after resume from suspend
Problem only occurs when dpm_force_performance_level is set to "low" before suspend. if dpm_force_performance_level is "auto" or "high" there is no flickering after resume from suspend. Please disregard comment #3, flickering still occurs when performance level is "auto", after manually switching it to "low" or "high" flickering stops. Confirmed on navi14. Bisected it to 1ea8751bd28d1ec2b36a56ec6bc1ac28903d09b4. Log is indicating exactly nothing. I’ll attach them nonetheless, if you happen to see more than me. It’s just a wild guess, but could it be, that your indicator for a safe lowering of the clock is not always a good indicator? I mean you know your code a lot better than me. A clock, that is raised too high while voltage being too low could however be a reason for the flickering on OpenGL operations, that I’m seeing. Have not seen those on Vulkan or framebuffer though. Anyway, have a look for yourself: commit 1ea8751bd28d1ec2b36a56ec6bc1ac28903d09b4 Author: Noah Abradjian <noah.abradjian@amd.com> Date: Fri Sep 27 16:30:57 2019 -0400 drm/amd/display: Make clk mgr the only dto update point [Why] * Clk Mgr DTO update point did not cover all needed updates, as it included a check for plane_state which does not exist yet when the updater is called on driver startup * This resulted in another update path in the pipe programming sequence, based on a dppclk update flag * However, this alternate path allowed for stray DTO updates, some of which would occur in the wrong order during dppclk lowering and cause underflow [How] * Remove plane_state check and use of plane_res.dpp->inst, getting rid of sequence dependencies (this results in extra dto programming for unused pipes but that doesn't cause issues and is a small cost) * Allow DTOs to be updated even if global clock is equal, to account for edge case exposed by diags tests * Remove update_dpp_dto call in pipe programming sequence (leave update to dppclk_control there, as that update is necessary and shouldn't occur in clk mgr) * Remove call to optimize_bandwidth when committing state, as it is not needed and resulted in sporadic underflows even with other fixes in place Signed-off-by: Noah Abradjian <noah.abradjian@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> .../gpu/drm/amd/display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 14 +++++++++----- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 3 ++- drivers/gpu/drm/amd/display/dc/core/dc.c | 4 ---- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 8 +------- 4 files changed, 12 insertions(+), 17 deletions(-) modified drivers/gpu/drm/amd/display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c @@ -108,11 +108,12 @@ void dcn20_update_clocks_update_dpp_dto(struct clk_mgr_internal *clk_mgr, for (i = 0; i < clk_mgr->base.ctx->dc->res_pool->pipe_count; i++) { int dpp_inst, dppclk_khz; - if (!context->res_ctx.pipe_ctx[i].plane_state) - continue; - - dpp_inst = context->res_ctx.pipe_ctx[i].plane_res.dpp->inst; + /* Loop index will match dpp->inst if resource exists, + * and we want to avoid dependency on dpp object + */ + dpp_inst = i; dppclk_khz = context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz; + clk_mgr->dccg->funcs->update_dpp_dto( clk_mgr->dccg, dpp_inst, dppclk_khz); } @@ -235,6 +236,7 @@ void dcn2_update_clocks(struct clk_mgr *clk_mgr_base, update_dispclk = true; } + if (dc->config.forced_clocks == false || (force_reset && safe_to_lower)) { if (dpp_clock_lowered) { // if clock is being lowered, increase DTO before lowering refclk @@ -244,10 +246,12 @@ void dcn2_update_clocks(struct clk_mgr *clk_mgr_base, // if clock is being raised, increase refclk before lowering DTO if (update_dppclk || update_dispclk) dcn20_update_clocks_update_dentist(clk_mgr); - if (update_dppclk) + // always update dtos unless clock is lowered and not safe to lower + if (new_clocks->dppclk_khz >= dc->current_state->bw_ctx.bw.dcn.clk.dppclk_khz) dcn20_update_clocks_update_dpp_dto(clk_mgr, context); } } + if (update_dispclk && dmcu && dmcu->funcs->is_dmcu_initialized(dmcu)) { /*update dmcu for wait_loop count*/ modified drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c @@ -171,7 +171,8 @@ void rn_update_clocks(struct clk_mgr *clk_mgr_base, // if clock is being raised, increase refclk before lowering DTO if (update_dppclk || update_dispclk) rn_vbios_smu_set_dppclk(clk_mgr, clk_mgr_base->clks.dppclk_khz); - if (update_dppclk) + // always update dtos unless clock is lowered and not safe to lower + if (new_clocks->dppclk_khz >= dc->current_state->bw_ctx.bw.dcn.clk.dppclk_khz) dcn20_update_clocks_update_dpp_dto(clk_mgr, context); } modified drivers/gpu/drm/amd/display/dc/core/dc.c @@ -1241,10 +1241,6 @@ static enum dc_status dc_commit_state_no_check(struct dc *dc, struct dc_state *c dc_enable_stereo(dc, context, dc_streams, context->stream_count); - if (!dc->optimize_seamless_boot) - /* pplib is notified if disp_num changed */ - dc->hwss.optimize_bandwidth(dc, context); - for (i = 0; i < context->stream_count; i++) context->streams[i]->mode_changed = false; modified drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c @@ -1202,15 +1202,9 @@ static void dcn20_update_dchubp_dpp( struct dpp *dpp = pipe_ctx->plane_res.dpp; struct dc_plane_state *plane_state = pipe_ctx->plane_state; - if (pipe_ctx->update_flags.bits.dppclk) { + if (pipe_ctx->update_flags.bits.dppclk) dpp->funcs->dpp_dppclk_control(dpp, false, true); - dc->res_pool->dccg->funcs->update_dpp_dto( - dc->res_pool->dccg, - dpp->inst, - pipe_ctx->plane_res.bw.dppclk_khz); - } - /* TODO: Need input parameter to tell current DCHUB pipe tie to which OTG * VTG is within DCHUBBUB which is commond block share by each pipe HUBP. * VTG is 1:1 mapping with OTG. Each pipe HUBP will select which VTG Created attachment 286981 [details]
Xorg log on commit 50575eb5b339683ee148189beae5eb53ccb3158b
Created attachment 286983 [details]
Xorg log of good commit 50575eb5b339683ee148189beae5eb53ccb3158b
Created attachment 286985 [details]
Kernel log of good commit 50575eb5b339683ee148189beae5eb53ccb3158b
Created attachment 286987 [details]
Xorg log of bad commit 1ea8751bd28d1ec2b36a56ec6bc1ac28903d09b4
Created attachment 286989 [details]
Kernel log of bad commit 1ea8751bd28d1ec2b36a56ec6bc1ac28903d09b4
Reverting 1ea8751bd28d1ec2b36a56ec6bc1ac28903d09b4 on HEAD, currently 5.5-rc7+, seems to fix it. The revert should probably be limited to the 2 hunks in the clock manager files to preserve the underflow avoidance: - if (update_dppclk) + // always update dtos unless clock is lowered and not safe to lower + if (new_clocks->dppclk_khz >= dc->current_state->bw_ctx.bw.dcn.clk.dppclk_khz) For the sake of comleteness: Steps to reproduce: 1. Suspend with Navi and activated display core. 2. Resume. 3. Start glxgears. Actual results: Animation flickers. This happens on any OpenGL accelerated task, like Firefox with hardware acceleration or similar. Expected results: Flicker-free rendering. Created attachment 287449 [details] Call optimize bandwidth again. I pinned the issue down to the four lines, which call optimize_bandwidth. Unfortunately the commit message indicates, that its removal fixed other issues, so it can not be simply put back in. Maybe this one is also vendor specific voltage lowering. Are there some sys nodes, which expose vendor adaptations? I’ve got one from MSI [1] with 130 Watt TDP. [1]: https://us.msi.com/Graphics-card/Radeon-RX-5500-XT-GAMING-X-8G/Specification Does this series help? https://patchwork.freedesktop.org/patch/353367/ See also: https://bugzilla.kernel.org/show_bug.cgi?id=205915 https://bugzilla.kernel.org/show_bug.cgi?id=206393 https://bugzilla.kernel.org/show_bug.cgi?id=206575 That fixed it. Thanks. |