Bug 205977

Summary: [amdgpu] Vega10 dpm defaults cause card to overvolt and overboost
Product: Drivers
Reporter: stefanspr94
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: NEW
Severity: high
CC: haro41
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.4.6-gentoo
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Testcase: Manually disabling ACG

Description stefanspr94 2019-12-26 18:37:37 UTC
On a Vega 64 with completely stock settings, boost clocks of 1800 MHz and above can be observed when running Minecraft with a shader mod, Outward (via DXVK), or any OpenCL workload, sometimes leading to driver crashes. With the overdrive settings left unchanged, AVFS also overvolts the card to match (I've seen up to 1.35 V Vcore). This can potentially damage the hardware.

When AVFS is disabled by setting ppfeaturemask=fffd7fff and a custom pp_table is used, the max voltage of 1.2 V is respected, but the core clocks still exceed their limit as before.
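As a quick sanity check on the mask value above, here is a minimal sketch that lists which bit positions fffd7fff leaves unset in a 32-bit mask. The helper name cleared_bits is hypothetical. Which feature each bit position controls depends on the amdgpu headers of the kernel version in use, so that mapping is deliberately not reproduced here.

```python
def cleared_bits(mask: int, width: int = 32) -> list[int]:
    """Return the positions (0 = least significant) of bits that are 0 in mask."""
    return [bit for bit in range(width) if not (mask >> bit) & 1]


# The value from the report, written as a hexadecimal literal.
REPORTED_MASK = 0xFFFD7FFF

print(cleared_bits(REPORTED_MASK))  # prints [15, 17]
```

Comparing those positions against the feature-mask definitions of the running kernel would confirm whether the value disables the features it is meant to.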

There is a workaround, though: with the custom pp_table loaded, echoing "profile_peak" or "high" to "power_dpm_force_performance_level" leads to both the max clocks AND voltages being respected, and this persists even after setting it back to "auto". Maybe this points to where to look for the bug.
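The workaround sequence can be sketched roughly as follows. This is only an illustration under assumptions: it assumes the Vega is exposed as card0 under /sys/class/drm (the index may differ per system), that the script runs as root, and that the custom pp_table has already been loaded. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: the Vega card is card0; adjust the index for your system.
DEVICE_DIR = Path("/sys/class/drm/card0/device")
LEVEL_FILE = "power_dpm_force_performance_level"


def set_performance_level(level: str, device_dir: Path = DEVICE_DIR) -> str:
    """Write a level such as "high", "profile_peak" or "auto" and read it back."""
    path = device_dir / LEVEL_FILE
    path.write_text(level + "\n")
    return path.read_text().strip()


def apply_workaround(device_dir: Path = DEVICE_DIR) -> str:
    """Force the highest level once, then hand control back to "auto"."""
    set_performance_level("high", device_dir)
    return set_performance_level("auto", device_dir)
```

Calling apply_workaround() once after boot mirrors the manual echo steps described above; it does not address the underlying bug.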

This is also an issue under Windows, so hopefully resolving this bug on Linux will also lead to a fix on that platform (I assume there is some code sharing, considering I get the same behavior on both). This behavior has been present since the launch of Vega.
Comment 1 stefanspr94 2019-12-31 16:19:38 UTC
Created attachment 286549
Testcase: Manually disabling ACG

Manually disabling ACG leads to the max clocks being respected at all times, but this only shows that there is an issue with ACG; it is not a proper bugfix.