Bug 211965

Summary: RX-480 Runs 40 degF hotter
Product: Platform Specific/Hardware Reporter: Dave (dcbdbisvet)
Component: x86-64 Assignee: platform_x86_64 (platform_x86_64)
Status: NEW ---    
Severity: normal    
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 5.11.* Subsystem:
Regression: No Bisected commit-id:

Description Dave 2021-02-26 19:32:54 UTC
AMD TR Gen 1, 16-core, RX-480, 64 GB RAM, ASRock Taichi MB.

The RX-480 runs about 40 degF hotter under 5.11.* kernels than under 5.10.* kernels with the same workload.
What was ~98-103 degF is now ~138-142 degF.
Comment 1 Dave 2021-02-26 23:03:38 UTC
Clarification: kernel 5.11.* causes the extra heat. The GPU does not overheat; it just runs hotter and draws more current than it does under the same workloads on 5.10.*.
Comment 2 Dave 2021-03-08 18:58:06 UTC
Still the same on kernel 5.11.4. After rolling back to the LTS kernel, the GPU temperature goes back down to normal.
Comment 3 Dave 2021-03-13 20:30:10 UTC
Found the reason. It's not a kernel bug at all; rather, it is a change in the new AMDGPU code in 5.11.*.

Under 5.10.* kernels the fan defaults to a slightly higher speed than under 5.11.* kernels.

"sudo watch -n 0.5 cat /sys/kernel/debug/dri/0/amdgpu_pm_info" shows no difference at all in the card's various metrics between 5.10 and 5.11.

The fan speed difference was discovered by booting repeatedly into the two kernel families: the GPU fan does NOT start at the same speed under both. The 5.11 kernel's default GPU fan speed is slower than 5.10's.
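
For anyone who wants to repeat the comparison between boots, here is a minimal sketch that records the fan state from the amdgpu hwmon files. It is not what I actually ran. It assumes the card is card0 and that the driver exposes pwm1, pwm1_enable, fan1_input and temp1_input under /sys/class/drm/card0/device/hwmon/hwmon*. The helper names (read_int, millic_to_degf, fan_snapshot) are made up for this example.

```python
import glob
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysfs file, or None if it is missing or unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def millic_to_degf(millic):
    """Convert hwmon millidegrees Celsius to degrees Fahrenheit."""
    return millic / 1000 * 9 / 5 + 32


def fan_snapshot(hwmon_dir):
    """Collect the fan and temperature readings from one hwmon directory."""
    temp = read_int(f"{hwmon_dir}/temp1_input")  # millidegrees C
    return {
        "pwm": read_int(f"{hwmon_dir}/pwm1"),                # duty cycle, 0-255
        "pwm_enable": read_int(f"{hwmon_dir}/pwm1_enable"),  # 1 = manual, 2 = automatic
        "fan_rpm": read_int(f"{hwmon_dir}/fan1_input"),
        "temp_degf": None if temp is None else millic_to_degf(temp),
    }


if __name__ == "__main__":
    # Assumption: the RX-480 is card0; adjust the index on multi-GPU systems.
    for hwmon_dir in glob.glob("/sys/class/drm/card0/device/hwmon/hwmon*"):
        print(hwmon_dir, fan_snapshot(hwmon_dir))
```

Run it once right after booting each kernel, at idle and then under the same workload, and compare the pwm and fan_rpm values.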

I installed a Python daemon to control the fan speed, configured the curve for my system, and heat is no longer a concern.
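
To illustrate the idea, below is a rough sketch of a fan-curve loop. It is not the daemon I installed. The curve points are hypothetical and need tuning per card, the function names are invented for the example, and it assumes the same amdgpu hwmon files (temp1_input in millidegrees C, pwm1 as a 0-255 duty cycle, pwm1_enable with 1 = manual and 2 = automatic). Writing these files requires root.

```python
import time
from pathlib import Path

# Hypothetical curve: (temperature in degC, PWM duty 0-255), sorted by temperature.
CURVE = [(40, 64), (55, 110), (70, 180), (80, 255)]


def pwm_for_temp(temp_c, curve=CURVE):
    """Linearly interpolate a PWM duty from the curve, clamped at both ends."""
    if temp_c <= curve[0][0]:
        return curve[0][1]
    for (t0, p0), (t1, p1) in zip(curve, curve[1:]):
        if temp_c <= t1:
            return round(p0 + (p1 - p0) * (temp_c - t0) / (t1 - t0))
    return curve[-1][1]


def apply_once(hwmon_dir, curve=CURVE):
    """Read the GPU temperature, write the matching PWM duty, and return it."""
    hwmon = Path(hwmon_dir)
    temp_c = int((hwmon / "temp1_input").read_text()) / 1000  # millidegrees C
    duty = pwm_for_temp(temp_c, curve)
    (hwmon / "pwm1_enable").write_text("1")  # 1 = manual fan control
    (hwmon / "pwm1").write_text(str(duty))
    return duty


def run_forever(hwmon_dir, interval_s=2.0, curve=CURVE):
    """Apply the curve periodically; restore automatic control on exit."""
    try:
        while True:
            apply_once(hwmon_dir, curve)
            time.sleep(interval_s)
    finally:
        Path(hwmon_dir, "pwm1_enable").write_text("2")  # 2 = automatic
```

To use it, call run_forever() as root with the hwmon directory of the card, for example the single match of /sys/class/drm/card0/device/hwmon/hwmon* (assuming the GPU is card0). The finally block hands the fan back to the driver if the loop is interrupted.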


FYI,


Dave