Bug 207589
Summary: | amdgpu not working with kernel 5.6.x | ||
---|---|---|---|
Product: | Drivers | Reporter: | fannullone (bit.gossip) |
Component: | Video(Other) | Assignee: | drivers_video-other |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | alexdeucher, andy.holst, dick |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
URL: | https://github.com/Dunedan/mbp-2016-linux/issues/142 | ||
Kernel Version: | 5.6.x | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
lshw.log
hwinfo.log
dmesg log
Xorg log
built in display working for kernel v5.5.0
built in display NOT working for compiled kernel v5.6.0
dmesg.kernel.v5.6.x_commit_fb95aae6e67c4e319a24b3eea32032d4246a5335.log
dmesg.kernel.v5.5.0+.commit.8815a94f27d2f30fe1216ce10c7da0f6ae69ca0f.bad.log
Xorg.0.8815a94f27d2f30fe1216ce10c7da0f6ae69ca0f.log
Dmesg and Xorg logs after handful times bisecting the git kernel commits
2001-drm-amd-display-Force-link_rate-as-LINK_RATE_RBR2-fo.patch |
Description
fannullone
2020-05-05 20:42:56 UTC
Please attach your dmesg output and xorg log (if using X). Can you bisect?

Created attachment 288983 [details]
lshw.log
lshw log from my MacBookPro13,3 with kernel 5.7 rc3
Created attachment 288985 [details]
hwinfo.log
hwinfo from my MacBookPro13,3 on kernel 5.7 rc3
Created attachment 288987 [details]
dmesg log
kernel 5.7 rc3
Created attachment 288989 [details]
Xorg log
kernel 5.7 rc3
Thanks for your reply Alex! I've bisected the problem to:

b9f1246df179522bc28fda50b720553c845863db is the first bad commit
commit b9f1246df179522bc28fda50b720553c845863db
Author: Noah Abradjian <noah.abradjian@amd.com>
Date: Fri Nov 22 16:07:24 2019 -0500

drm/amd/display: Collapse resource arrays when pipe is disabled

[Why]
Currently, pipe resources are assigned to an index that matches the pipe position. However, if pipe 1 or 2 is disabled, there will be a gap in the arrays which causes a crash when iterating based on pipe_count.

[How]
Fix resource construct to assign resources to minimum available array index.

Signed-off-by: Noah Abradjian <noah.abradjian@amd.com>
Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

.../gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b9f1246df179522bc28fda50b720553c845863db

I doubt that commit is the culprit. It changes a file that is not even used on your asic. Can you attach your dmesg output from a kernel where it is working as well?

Created attachment 289023 [details]
built in display working for kernel v5.5.0
Created attachment 289025 [details]
built in display NOT working for compiled kernel v5.6.0
> I doubt that commit is the culprit It changes a file that is not even used
> on your asic.
I was afraid of that; I found out that bisecting isn't that easy. I might have been hunting ghosts: I marked booting with a black screen as "bad" and booting with a working framebuffer as "good".
I've now used linux-stable for bisecting; I can use another git repository or other ranges if you'd like.
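
The risk described above, where a single doubtful good/bad verdict sends the bisect to an unrelated commit, can be illustrated with a small Python sketch. This is a simplified model and not how git itself is implemented: it assumes a linear history, and the commit names c0..c9, the `first_bad` helper and the two verdict functions are made up for illustration.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history in which the regression is introduced in c6.
history = [f"c{i}" for i in range(10)]


def truth(commit: str) -> bool:
    """Correct verdict: everything from c6 onwards is bad."""
    return int(commit[1:]) >= 6


def misjudged(commit: str) -> bool:
    """Same as truth(), except c4 is wrongly judged bad (e.g. an unrelated black screen)."""
    return commit == "c4" or truth(commit)


print(first_bad(history, truth))      # c6
print(first_bad(history, misjudged))  # c4
```

With one wrong verdict the search converges on c4, a commit that has nothing to do with the regression, and no later step can correct it.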
After bisecting a few times (more steps to go) I found that commit fb95aae6e67c4e319a24b3eea32032d4246a5335 (v5.6.0-rc1) is working for the built in display.

Created attachment 289033 [details]
dmesg.kernel.v5.6.x_commit_fb95aae6e67c4e319a24b3eea32032d4246a5335.log
Created attachment 289037 [details]
dmesg.kernel.v5.5.0+.commit.8815a94f27d2f30fe1216ce10c7da0f6ae69ca0f.bad.log
Commit 8815a94f27d2f30fe1216ce10c7da0f6ae69ca0f causes a black built-in display; however, the display still seems to be identified according to the Xorg log.
Created attachment 289039 [details]
Xorg.0.8815a94f27d2f30fe1216ce10c7da0f6ae69ca0f.log
I don't know how, but my bisect was off by one ... This commit is reported to cause our problems:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.6.y&id=4a8ca46bae8a
See: https://github.com/aunali1/linux-mbp-arch/blob/master/2001-drm-amd-display-Force-link_rate-as-LINK_RATE_RBR2.patch

Created attachment 289065 [details]
Dmesg and Xorg logs after handful times bisecting the git kernel commits

After bisecting the kernel git commits a handful of times, with v5.5.0 set as the good commit and v5.6.0 as the bad commit, git says that the first bad commit is:

------------------------------------------------------------------------------
f33a8770cdda79031a22241eaaac4eaf66e304fb is the first bad commit
commit f33a8770cdda79031a22241eaaac4eaf66e304fb
Author: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Date: Fri Dec 6 12:43:30 2019 -0500

drm/amdgpu: Add task barrier to XGMI hive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 664928e1bc23a51cd51d7dc2d40be9af9b332beb 071a99250da92b900dd06f3d7183d0df4abe6653 M drivers
------------------------------------------------------------------------------

There were two git commits that I couldn't compile; I treated the first one as good and the second one as bad. I hope the dmesg and Xorg logs from the kernels built at the bisected commits can give some clue about what is going wrong with the built-in display on the MacBookPro13,3. The two git commits that I couldn't compile are also included in the tar archive.

I bisected the commits a second time between good v5.5.0 and bad v5.6.0, and this time I counted every commit I couldn't compile as bad.
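
Guessing a verdict for commits that cannot be compiled, as in both bisects above, can shift the reported first bad commit. git offers `git bisect skip` for such commits; when skipped commits sit next to the boundary, git reports a range of candidates instead of a single commit. The Python sketch below models that behaviour on a linear history. It is an illustration only, not git's actual algorithm, and the commit names, the `bisect_with_skip` helper and the verdict functions are hypothetical.

```python
from typing import Callable, List, Sequence


def bisect_with_skip(commits: Sequence[str],
                     verdict: Callable[[str], str]) -> List[str]:
    """Return the commits that may be the first bad one.

    verdict() returns "good", "bad" or "skip". commits[0] is assumed good
    and commits[-1] is assumed bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    skipped = set()
    while True:
        untested = [i for i in range(lo + 1, hi) if i not in skipped]
        if not untested:
            break
        middle = (lo + hi) / 2
        # Test the untested commit closest to the middle of the range.
        mid = min(untested, key=lambda i: abs(i - middle))
        result = verdict(commits[mid])
        if result == "skip":
            skipped.add(mid)
        elif result == "bad":
            hi = mid
        else:
            lo = mid
    # Everything after the last good commit, up to the first tested bad one.
    return list(commits[lo + 1:hi + 1])


# Hypothetical history: regression introduced in c6; c5 and c6 do not build.
history = [f"c{i}" for i in range(10)]
unbuildable = {"c5", "c6"}


def honest(commit: str) -> str:
    """Skip commits that do not build, judge the rest correctly."""
    if commit in unbuildable:
        return "skip"
    return "bad" if int(commit[1:]) >= 6 else "good"


def guessed_bad(commit: str) -> str:
    """Count every commit that does not build as bad."""
    return "bad" if commit in unbuildable else honest(commit)


print(bisect_with_skip(history, honest))       # ['c5', 'c6', 'c7']
print(bisect_with_skip(history, guessed_bad))  # ['c5']
```

In this model, skipping leaves three suspects that include the real culprit c6, while counting unbuildable commits as bad names c5, which is not the commit that introduced the regression.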
The first bad commit I got this time is:

Author: Roman Li <roman.li@amd.com>
Date: Fri Nov 22 10:58:10 2019 -0500

drm/amd/display: Default max bpc to 16 for eDP

[Why]
Some 10bit eDP panels don't lightup after we cap bpc to 8.

[How]
Set default max_bpc to 16 for edp connector type.

Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 f23cf38da6c12011608bdcdda5cf6e0f63628254 f108cc1e2e108ce444439a1d05ac1a0b0a228562 M drivers

Right, the commit hash is 4a8ca46bae8affba063aabac85a0b1401ba810a3 for the first bad commit.

Created attachment 290453 [details]
2001-drm-amd-display-Force-link_rate-as-LINK_RATE_RBR2-fo.patch
Aun-Ali Zaidi created this patch and this seems to fix the issue on my system.
There is already a similar fix upstream:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dec9de2ada523b344eb2428abfedf9d6cd0a0029
Does that patch fix the issue?

I can confirm that the patch https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dec9de2ada523b344eb2428abfedf9d6cd0a0029, included in kernel version 5.7.10, makes the built-in display work again on the MBP model 13,3.

I can confirm that my MBP 13,3 is running again with amdgpu after upgrading to 5.7.12-200.fc32.x86_64. I can also use the built-in screen together with an external monitor just fine. Thanks everybody for the great job!

Actually, commit 639e0db2d70fb84833d96e782cc4a01825e03b13, included in v5.8, seems to be the one fixing the issue, not https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dec9de2ada523b344eb2428abfedf9d6cd0a0029 as suggested. This bug can be closed.

Indeed, the bug can be closed since the built-in display is working again for kernel version 5.7.8+.

Working great, thanks to everybody who contributed to the fix!
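
For anyone checking whether a given machine already runs a kernel at or past one of the versions named above, a release string such as 5.7.12-200.fc32.x86_64 can be compared numerically. The Python sketch below is only an illustration: `kernel_tuple` and `at_least` are hypothetical helpers, the minimum version is left as a parameter because the comments above name different versions (5.7.8+, 5.7.10, v5.8), and release-candidate suffixes such as -rc3 are ignored.

```python
import re
from typing import Tuple


def kernel_tuple(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a kernel release string.

    A missing patch level counts as 0. Anything after the numeric prefix
    (distribution suffix, -rcN) is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def at_least(release: str, minimum: str) -> bool:
    """True if release is numerically greater than or equal to minimum."""
    return kernel_tuple(release) >= kernel_tuple(minimum)


print(at_least("5.7.12-200.fc32.x86_64", "5.7.10"))  # True
print(at_least("5.6.0", "5.7.10"))                   # False
```

On a running Linux system the release string can be obtained with `platform.release()` from the Python standard library, which returns the same value as `uname -r`.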