Bug 207693

Summary: amdgpu: RX 5500 XT boost frequency out of spec
Product: Drivers
Component: Video (DRI - non Intel)
Assignee: drivers_video-dri
Reporter: Jan Ziak (0xe2.0x9a.0x9b)
Status: NEW
Severity: normal
Priority: P1
CC: alexdeucher
Hardware: All
OS: Linux
Kernel Version: 5.6.12
Regression: No

Description Jan Ziak 2020-05-11 20:34:30 UTC
Hello.

A Navi GPU (model: MSI RX 5500 XT Mech OC 8G) installed in my machine defaults to a maximum frequency of 1885 MHz in Linux. However, in Windows 10 the GPU's maximum observed operating frequency is 1845 MHz, which matches the boost frequency stated on the manufacturer's website.

$ cat pp_od_clk_voltage 
OD_VDDC_CURVE:
0: 500MHz @ 730mV
1: 1192MHz @ 774mV
2: 1885MHz @ 1112mV

Sometimes, the printed voltage is 1107mV.

https://www.msi.com/Graphics-card/Radeon-RX-5500-XT-MECH-8G-OC

Windows uses 1845MHz@1112mV. If Linux is running the GPU at 1885 MHz, shouldn't it also increase the voltage in order to decrease the probability of hardware errors?
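
For reference, if the Windows operating point is the intended one, here is a sketch of how it could be pinned via the overdrive interface (untested; assumes card0 and that overdrive is enabled via the amdgpu.ppfeaturemask module parameter):

# cd /sys/class/drm/card0/device
# echo "vc 2 1845 1112" > pp_od_clk_voltage    # move curve point 2 to 1845MHz @ 1112mV
# echo "s 1 1845" > pp_od_clk_voltage          # cap the maximum sclk at 1845MHz
# echo "c" > pp_od_clk_voltage                 # commit the new table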
Comment 1 Alex Deucher 2020-05-11 21:08:57 UTC
The vbios defines the clock frequencies and nominal voltages, not the driver.  The voltage is changed dynamically at runtime based on frequency and power and individual board leakage so you will see slight variations at runtime depending on the board.
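One way to watch this at runtime (assuming card0; the hwmon number varies per system) is the amdgpu hwmon voltage sensor, which reports the measured vddgfx in millivolts:

$ cat /sys/class/drm/card0/device/hwmon/hwmon*/in0_input

The sensors tool from lm-sensors reads the same value.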
Comment 2 Jan Ziak 2020-05-12 13:26:32 UTC
(In reply to Alex Deucher from comment #1)
> The vbios defines the clock frequencies and nominal voltages, not the
> driver.  The voltage is changed dynamically at runtime based on frequency
> and power and individual board leakage so you will see slight variations at
> runtime depending on the board.

particlefire from the Vulkan demos (https://github.com/SaschaWillems/Vulkan) is an app with relatively high power consumption (higher than the Aida64 GPU stability test). On my machine and display it runs at about 1000 FPS in a maximized window. I let it run for about 20 minutes, during which I manipulated the GPU's fan speed.

According to /usr/bin/sensors, the GPU's junction/hotspot critical temperature is 99°C, so I lowered the fan speed to less than 1000 RPM in order to reach higher temperatures. Even when the hotspot temperature was 105°C (6°C above critical) and the GPU edge temperature was 86°C, the FPS of particlefire was unaffected (still about 1000).
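
(For the record, I forced the fan speed manually through the hwmon interface; roughly, assuming card0 and hwmon0:)

# cd /sys/class/drm/card0/device/hwmon/hwmon0
# echo 1 > pwm1_enable    # 1 = manual fan control (2 = automatic)
# echo 64 > pwm1          # duty cycle in the range 0-255

fan1_input in the same directory reports the resulting RPM.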

radeontop (https://github.com/clbr/radeontop) showed 1885MHz the entire time during testing.

In summary, I am unable to confirm your claim that the GPU is self-adjusting its voltage or frequency in Linux.

If you know an alternative approach (other than the one described above) to verify that the GPU is dynamically changing voltage and frequency in Linux in response to temperature and power consumption, please let me know.
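
The most direct readout I have found so far is polling the debugfs power info while the load is running (as root; assumes dri/0 and a mounted debugfs):

# watch -n1 "grep -E 'SCLK|VDDGFX|Temperature' /sys/kernel/debug/dri/0/amdgpu_pm_info"

pp_dpm_sclk in sysfs also marks the currently selected clock level with an asterisk, which should change if the GPU is actually switching levels.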