Bug 201539 - AMDGPU R9 390 automatic fan speed control in Linux 4.19/4.20/5.0
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-dri
Reported: 2018-10-27 12:32 UTC by Jan Ziak (http://atom-symbol.net)
Modified: 2020-05-21 15:31 UTC
CC List: 13 users

See Also:
Kernel Version: 4.19, 4.20, 5.0
Tree: Mainline
Regression: No


Attachments
pwm1-4.18.16 (1.97 KB, text/plain)
2018-10-27 12:34 UTC, Jan Ziak (http://atom-symbol.net)
pwm1-4.19.0 (5.48 KB, text/plain)
2018-10-27 12:35 UTC, Jan Ziak (http://atom-symbol.net)

Description Jan Ziak (http://atom-symbol.net) 2018-10-27 12:32:41 UTC
GPU: R9 390
Kernel module: amdgpu
Application (example): Rise of the Tomb Raider benchmark

In Linux 4.18 the fan speed of the GPU gradually adapts to GPU load. Maximum pwm is 155 during the benchmark.

In Linux 4.19 the overall fan speed is lower (quieter than 4.18, with a maximum of 112), but the fan spikes to full speed (pwm=255) for 1-2 seconds, which is extremely loud.

The 4.18 and 4.19 kernels are using the same firmware.
Comment 1 Jan Ziak (http://atom-symbol.net) 2018-10-27 12:34:51 UTC
Created attachment 279205 [details]
pwm1-4.18.16

while true; do cat /sys/class/drm/card0/device/hwmon/hwmon2/pwm1; sleep 0.1s; done
Comment 2 Jan Ziak (http://atom-symbol.net) 2018-10-27 12:35:17 UTC
Created attachment 279207 [details]
pwm1-4.19.0

while true; do cat /sys/class/drm/card0/device/hwmon/hwmon2/pwm1; sleep 0.1s; done
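
The hwmon index in this path (hwmon2 here) is not stable across kernels or machines; later comments in this report use hwmon1 and hwmon3. A minimal sketch of the same polling loop follows, resolving the directory with a glob and prefixing each sample with a timestamp. The function names and the card0 path in the usage line are assumptions for illustration, not part of the original attachments.

```shell
#!/bin/bash
# Sketch: log pwm1 samples with timestamps.
# Assumption: the card exposes one hwmon directory under <device>/hwmon.

# Print the first hwmon directory found under the given device directory.
find_hwmon_dir() {
  local dev_dir=$1 d
  for d in "$dev_dir"/hwmon/hwmon*; do
    if [ -d "$d" ]; then
      printf '%s\n' "$d"
      return 0
    fi
  done
  return 1
}

# Sample pwm1 COUNT times, INTERVAL seconds apart.
# Each output line is "<epoch seconds> <pwm value>".
log_pwm() {
  local hwmon_dir=$1 count=$2 interval=$3 i
  for ((i = 0; i < count; i++)); do
    printf '%s %s\n' "$(date +%s.%N)" "$(cat "$hwmon_dir/pwm1")"
    sleep "$interval"
  done
}
```

Possible usage: log_pwm "$(find_hwmon_dir /sys/class/drm/card0/device)" 600 0.1 > pwm1.log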
Comment 3 Jan Ziak (http://atom-symbol.net) 2018-10-27 12:45:37 UTC
4.18.16:
$ cat /sys/class/drm/card0/device/hwmon/hwmon2/pwm1_enable
2

4.19.0:
$ cat /sys/class/drm/card0/device/hwmon/hwmon2/pwm1_enable
1
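
For readers comparing the two values: as far as the amdgpu hwmon documentation describes it, pwm1_enable uses 0 for no fan speed control, 1 for manual control through pwm1, and 2 for automatic control. A small sketch that decodes the value is shown below; the function name is hypothetical.

```shell
# Sketch: translate a pwm1_enable value into a readable name.
# Assumption: the 0/1/2 meanings follow the amdgpu hwmon documentation.
describe_fan_mode() {
  case "$1" in
    0) echo "none" ;;
    1) echo "manual" ;;
    2) echo "automatic" ;;
    *) echo "unknown ($1)" ;;
  esac
}
```

Possible usage: describe_fan_mode "$(cat /sys/class/drm/card0/device/hwmon/hwmon2/pwm1_enable)"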
Comment 4 Sergey 2018-12-08 07:59:01 UTC
I have exactly the same problem with my R9 290X. I tested on the 4.19.6 kernel. The problem is still present and has not been solved. I have to use a 4.18.x kernel so as not to damage my graphics card.
Comment 5 Tony Chaveiro 2018-12-21 01:30:43 UTC
I experience the same issue.

4.18-20 kernel on Manjaro (amdgpu + OpenCL taken from amdgpu pro, on 3 x R9 390): everything works fine. Fan RPM increases with load (tested with the stak-xmr miner).

4.19-8-2 kernel: fan RPM is always at the minimum setting. Once one of the GPUs hits 95C, sudden bursts of 100% RPM occur (under 1s duration). The application has to be suspended or hardware damage will occur.

I checked pwm1_enable, which displays "1". Changing it to "2" has no effect. pwmconfig drives the fans to 100% during configuration but is unable to provide control.

pwm1 reads different values (from 86 to 127), but the fans don't appear to change speed at all.
Comment 6 Tony Chaveiro 2018-12-21 02:16:14 UTC
kernel parameters: amdgpu.cik_support=1 amdgpu.dc=0
Comment 7 Alex Deucher 2018-12-21 15:06:16 UTC
Does setting amdgpu.dpm=1 on 4.19 help?  The default dpm implementation changed between 4.18 and 4.19.
Comment 8 Tony Chaveiro 2018-12-21 15:56:09 UTC
(In reply to Alex Deucher from comment #7)
> Does setting amdgpu.dpm=1 on 4.19 help?  The default dpm implementation
> changed between 4.18 and 4.19.

amdgpu.dpm=1 produces black screen (unable to boot) in both 4.18 and 4.19. Using amdgpu.dc=1 produces error and freeze on boot
Comment 9 Jan Ziak (http://atom-symbol.net) 2018-12-24 11:55:44 UTC
Linux 4.20 behaves the same as Linux 4.19.
Comment 10 danglingpointerexception@gmail.com 2018-12-31 09:06:06 UTC
This bug is likely related.  Mine's a R9-290X...

https://bugs.freedesktop.org/show_bug.cgi?id=108781

I'm stuck on kernel 4.18.20
Comment 11 danglingpointerexception@gmail.com 2019-01-05 13:41:54 UTC
I found a solution for my problem and detailed it in the previous link.
My R9-290X now fully functions on 4.20.0
Comment 12 Rudolf Kastl 2019-03-09 02:44:32 UTC
Comment 10 and 11 are not related to the reported fan control issue.

I am still seeing the same problem on a r290x with amdgpu driver (and the correct firmware available in the initramfs) on 5.1.0-0.rc0.git4.2
Comment 13 Alex Smith 2019-04-17 10:37:11 UTC
I am also able to reproduce the fan control issue on an R9 290X with Fedora's kernel 5.0.3-200.fc29.x86_64.

The fans on the card do not spin up until the temperature reaches 95C, at which point they jump straight to 100%.

Same as comment 8, amdgpu.dpm=1 just gives me a blank screen at boot.
Comment 14 Steffen Klee 2019-06-06 11:18:21 UTC
Reproducible on an R9 390 with kernel 5.1.6-arch1-1-ARCH using amdgpu driver.
Setting amdgpu.dpm=1 or changing amdgpu.dc does not have any effect.

Also, sensors does not report the current fan RPM:
amdgpu-pci-1d00
Adapter: PCI adapter
vddgfx:       +1.00 V  
fan1:             N/A  (min =    0 RPM, max =    0 RPM)
temp1:        +59.0°C  (crit = +104000.0°C, hyst = -273.1°C)
power1:       47.24 W  (cap = 230.00 W)
Comment 15 danglingpointerexception@gmail.com 2019-06-06 11:44:17 UTC
@Rudolf - Have you tried my solution in the link I provided above?  I'm on 5.1.6 mainline and have no issues whatsoever with R9-290X

@Alex Smith - I've got a liquid-cooled card so don't know if my solution solves your fan problem but try following my steps in the link.

@Steffen - Try my steps in the link mate, it may solve your problem.  Alex Deucher himself gave me the tip on fixing it.
Comment 16 Steffen Klee 2019-06-06 12:18:17 UTC
(In reply to danglingpointerexception@gmail.com from comment #15)
> @Steffen - Try my steps in the link mate, it may solve your problem.  Alex
> Deucher himself gave me the tip on fixing it.

As far as I understand, you had issues with firmware loading. However, firmware is loading fine on my end:
[drm] Found UVD firmware Version: 1.64 Family ID: 9
[drm] Found VCE firmware Version: 50.10 Binary ID: 2
Comment 17 artheg 2019-06-07 01:41:39 UTC
I'm having the very same issues with my R9 290x as well.
Arch Linux, 5.1.7-arch1-1-ARCH.

(In reply to danglingpointerexception@gmail.com from comment #15)
I've tried your solution, but unfortunately it didn't work for me.
Comment 18 Sean Birkholz 2019-06-18 17:42:56 UTC
I am not a kernel developer and haven't done much programming as of late, so I am not really in a position to actually test this hypothesis.  However - from the bit of research I've done trying to figure this problem out for myself I believe the following explains the overheating and burst of fan speed instead of proper cooling behavior.

Here is my sensors bit from kernel 4.18.x - I have the R9-290.

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:           N/A  
fan1:           0 RPM
temp1:        +65.0°C  (crit = +120.0°C, hyst = +90.0°C)

Take note that this displays the proper critical and hysteresis values for my card.  If you look at the post on comment 14 which is how sensors display the crit/hyst value for kernels beyond 4.18.x you notice the critical value is about 19x the temperature of the surface of the sun and the hyst value is absolute zero.  These values are hard coded into kernel source code in some file, forgive me as I do not recall where I saw the code snippet.  But I strongly believe that correcting the values in the file or changing it to detect proper crit/hyst values based on card will correct this issue.  I simply do not have the means to do this, nor do I know how to submit kernel bug fixes and hope someone with more experience could give it a shot and see if the resulting kernel functions properly.
Comment 19 Sean Birkholz 2019-06-24 02:18:44 UTC
I've done a bit of digging and I've managed to get a proper hysteresis value to appear in a 5.1.14 kernel built from source.

I now have this output from sensors:

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +1.00 V  
fan1:             N/A  (min =    0 RPM, max =    0 RPM)
temp1:        +66.0°C  (crit = +104000.0°C, hyst = +90.0°C)
power1:       29.02 W  (cap = 208.00 W)

I don't know why proper values are not set automatically because I've found the correct values in tons of source files but none of the #defines appear to be used?  And much of the source doesn't appear to differ between 5.1.14 and 4.18.x

I modified (kernel src)/drivers/gpu/drm/amd/powerplay/inc/pp_thermal.h and changed the values of -273150 to 90000.  This corrects the hysteresis value but I'm still searching for where the critical temp value is actually set.

I *think* fixing these values may fix the fan problem, because why would a fan spin up if it's nowhere near the critical or hysteresis values?  No need.  Except when the critical value is 19x the temp of the sun, the card gets so hot it protects itself by maxing the fans for a short burst.  That is my theory anyway, hope to be able to test it soon but no promises.
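
Some background on the numbers in the sensors readings quoted in this report: hwmon sysfs files hold temperatures in millidegrees Celsius, and sensors divides them by 1000 for display. A displayed crit of +104000.0°C therefore corresponds to a raw value of 104000000, and a hyst of -273.1°C to -273150. The sketch below flags such values. The 0 to 150 °C plausibility window and both function names are assumptions chosen for illustration.

```shell
# Sketch: sanity-check a hwmon temperature threshold given in millidegrees C.
# Assumption: anything outside 0..150 degrees C is treated as implausible.
is_plausible_millideg() {
  local v=$1
  [ "$v" -ge 0 ] && [ "$v" -le 150000 ]
}

# Format millidegrees as degrees with one (truncated) decimal place.
millideg_to_c() {
  local v=$1 sign=""
  if [ "$v" -lt 0 ]; then
    sign="-"
    v=$(( -v ))
  fi
  printf '%s%d.%d\n' "$sign" $(( v / 1000 )) $(( (v % 1000) / 100 ))
}
```

Possible usage: is_plausible_millideg "$(cat /sys/class/drm/card0/device/hwmon/hwmon2/temp1_crit)" || echo "suspicious crit value" (the hwmon index is machine specific).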
Comment 20 artheg 2019-10-05 13:19:08 UTC
To anyone who's still struggling with this, perhaps this would be of help:
I'm using this script (https://github.com/grmat/amdgpu-fancontrol) as a service, with these params: 

TEMPS=( 65000 75000 80000 90000 )
PWMS=(      0   190   200   255 )

Perhaps someone could tweak this a little bit better, but this works for me.
My gpu still sounds like an airplane when I'm running a benchmark like Unigine-Heaven, but at least fans are spinning now.
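
When tweaking a curve like the one above, it can help to check it before restarting the service. This is a minimal sketch, assuming the same conventions as the TEMPS and PWMS arrays (temperatures in millidegrees C in ascending order, PWM values from 0 to 255, equal lengths); the function name is hypothetical and not part of the linked script.

```shell
# Sketch: check that a fan curve is well formed.
# Arguments: space-separated temperatures, space-separated PWM values.
validate_curve() {
  local -a temps pwms
  local i
  read -ra temps <<< "$1"
  read -ra pwms <<< "$2"
  if [ "${#temps[@]}" -ne "${#pwms[@]}" ]; then
    echo "length mismatch"
    return 1
  fi
  for i in "${!temps[@]}"; do
    if [ "${pwms[i]}" -lt 0 ] || [ "${pwms[i]}" -gt 255 ]; then
      echo "pwm out of range: ${pwms[i]}"
      return 1
    fi
    if [ "$i" -gt 0 ] && [ "${temps[i]}" -le "${temps[i-1]}" ]; then
      echo "temps not ascending at index $i"
      return 1
    fi
  done
  echo "ok"
}
```

Possible usage: validate_curve "${TEMPS[*]}" "${PWMS[*]}"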
Comment 21 MasterCATZ 2019-11-03 04:08:50 UTC
Having similar issues with 5.3.8-050308-generic 

when it starts happening this is being spammed in dmesg

amdgpu: [powerplay] 
                failed to send message 282 ret is 254


I lose write access to 
/sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable

It is stuck on 2 and runs the fans very low (about 20%), causing the GPU to reach thermal meltdown at 96 deg, at which point the fan does short blips of 100%.
My BIOS was even modded to have a minimum fan speed of 50%, and even this is being overwritten.

/sys/class/drm/card1/device/hwmon/hwmon1/pwm1
also cannot be adjusted.



GRUB_CMDLINE_LINUX_DEFAULT="amdgpu.ppfeaturemask=0xffffffff amdgpu.dc=1 amdgpu.gpu_recovery=1 amdgpu.cik_support=1 amdgpu.dpm=1 radeon.cik_support=0"


If I reboot, it works for a little while, allowing me to change GPU speeds and fan speeds. Then I lose fan speed control and cannot get it back off auto, which seems to be set up with fan speeds that are way too low.


 GL_RENDERER:   AMD Radeon R9 200 Series (HAWAII, DRM 3.33.0, 5.3.8-050308-generic, LLVM 9.0.0)
    GL_VERSION:    4.5 (Compatibility Profile) Mesa 19.3.0-devel (git-ff6e148 2019-10-29 bionic-oibaf-ppa)


If I disable amdgpu.dpm I can control the fans, but then I cannot use automatic GPU speeds and cannot set my speeds manually either.


my only guess is the firmware being loaded by kernel is the place containing the info for fan speeds ?
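
When comparing setups like this one, it may help to list exactly which amdgpu options were passed at boot. The sketch below extracts them from a kernel command line string; the function name is hypothetical.

```shell
# Sketch: list the amdgpu.* options present on a kernel command line.
# Uses the first argument if given, otherwise reads /proc/cmdline.
amdgpu_cmdline_opts() {
  local cmdline word
  local -a words
  if [ "$#" -gt 0 ]; then
    cmdline=$1
  else
    cmdline=$(cat /proc/cmdline)
  fi
  read -ra words <<< "$cmdline"
  for word in "${words[@]}"; do
    case "$word" in
      amdgpu.*) printf '%s\n' "${word#amdgpu.}" ;;
    esac
  done
}
```

The values the loaded module actually uses can usually be read back from files under /sys/module/amdgpu/parameters/, for example /sys/module/amdgpu/parameters/dpm, assuming the module is loaded and exposes that parameter.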
Comment 22 MasterCATZ 2019-11-03 04:16:43 UTC

(In reply to Sean Birkholz from comment #19)
> I've done a bit of digging and I've managed to get a proper hysteresis value
> to appear in a 5.1.14 kernel built from source.


> I modified (kernel src)/drivers/gpu/drm/amd/powerplay/inc/pp_thermal.h and
> changed the values of -273150 to 90000.  This corrects the hysteresis value
> but I'm still searching for where the critical temp value is actually set.


I think you hit the nail on the head 

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +0.90 V  
fan1:             N/A  (min =    0 RPM, max =    0 RPM)
edge:         +50.0°C  (crit = +104000.0°C, hyst = -273.1°C)
power1:       11.03 W  (cap = 208.00 W)
Comment 23 MasterCATZ 2019-11-03 04:29:35 UTC
the numbers used in the 
linux/drivers/gpu/drm/amd/powerplay
are correct as they are the values the bios uses 
but Linux is reading/using the values differently ...
Comment 24 MasterCATZ 2019-11-03 04:37:50 UTC
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

guess one of them should be able to find the issue
Comment 25 MasterCATZ 2019-11-05 23:47:54 UTC
5.4.0-050400rc6-generic

24 hours and I still have fan control 

99% chance I have just jinxed my self now
Comment 26 MasterCATZ 2019-11-06 09:35:20 UTC
around 12 hrs later lost fan control again
Comment 27 MasterCATZ 2019-11-16 00:42:35 UTC
pwmconfig 
seems to be the only thing that allows me to get manual mode back on 
I wonder if this is the actual program giving grief
Comment 28 MasterCATZ 2019-11-16 00:57:34 UTC
hmm maybe not 
it lets me briefly access manual 


Found the following PWM controls:
   hwmon1/pwm1           current value: 68
hwmon1/pwm1 is currently setup for automatic speed control.
In general, automatic mode is preferred over manual mode, as
it is more efficient and it reacts faster. Are you sure that
you want to setup this output for manual control? (n) y
hwmon1/pwm1_enable stuck to 2
Manual control mode not supported, skipping hwmon1/pwm1.

wish I knew what the heck keeps locking pwm1_enable to auto @ low speeds :@
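
The "stuck to 2" check that pwmconfig reports can be reproduced by hand: write the requested mode and read it back. A minimal sketch follows, with a hypothetical function name; it makes no claim about why the driver refuses the change.

```shell
# Sketch: request a fan mode and report whether the driver kept it.
# Arguments: path to a pwm1_enable file, requested mode (0, 1 or 2).
try_set_fan_mode() {
  local file=$1 want=$2 got
  if ! { echo "$want" > "$file"; } 2>/dev/null; then
    echo "write rejected"
    return 1
  fi
  got=$(cat "$file")
  if [ "$got" != "$want" ]; then
    echo "stuck at $got"
    return 1
  fi
  echo "mode is now $got"
}
```

Possible usage (as root): try_set_fan_mode /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable 1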
Comment 29 MasterCATZ 2019-11-16 12:07:35 UTC
from what I can work out the only difference between the kernel versions 
was they added extra thermal readings to support the newer cards with thermal junction sensors 


{-273150,  99000},
{ 120000, 120000},

has been in there since Jan 2018 ... 

It looks like it's reading the max temp settings from the BIOS. 
I will confirm this tomorrow, when I will flash a custom BIOS. 

/torvalds/linux/blob/master/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h


/* The temperature, in 0.01 centigrades, below which we just run at a minimal PWM. */


so maybe it is thinking it can do 1000C ? 


anyhow as I don't want to run an altered bios as that would force fan 100% on boot  , what I decided to do was rip out all of AMD's new thermal code ...
Comment 30 MasterCATZ 2019-11-16 13:36:01 UTC
found them hard coded here for the R9 290 hawaii / Sea Islands chip sets 

so that will be a dirty way to get it to go 100% throttle sooner I'll set mine to 85000 and see how it goes , hopefully the rest follows  


linux/drivers/gpu/drm/amd/amdgpu/ci_dpm.c


if (adev->asic_type == CHIP_HAWAII) {
		pi->thermal_temp_setting.temperature_low = 94500;
		pi->thermal_temp_setting.temperature_high = 95000;
		pi->thermal_temp_setting.temperature_shutdown = 104000;
	} else {
		pi->thermal_temp_setting.temperature_low = 99500;
		pi->thermal_temp_setting.temperature_high = 100000;
		pi->thermal_temp_setting.temperature_shutdown = 104000;
	}
Comment 31 MasterCATZ 2019-11-16 14:15:58 UTC
oh I love it they know the drivers file is crap 

anyhow it looks like the real issue is in the GPU driver 

Fan speeds, temps, everything is in there. Of course, this would not be an issue if 
pwm1_enable was NOT STUCK ON AUTO 


#if 0
		/* XXX: need to figure out how to handle this properly */
		tmp = RREG32_SMC(ixCG_THERMAL_CTRL);
		tmp &= DPM_EVENT_SRC_MASK;
		tmp |= DPM_EVENT_SRC(dpm_event_src);
		WREG32_SMC(ixCG_THERMAL_CTRL, tmp);
#endif
Comment 32 MasterCATZ 2019-11-16 22:35:07 UTC
apparently I was looking through kernel 4.7 code on my pc and not master
linux/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
looks like the new file name 

as they relocated ci_dpm.c to 
/home/aio/Programs/linux/drivers/gpu/drm/radeon/ci_dpm.c
Comment 33 MasterCATZ 2019-11-17 00:43:11 UTC
another plan 

drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c 

hwmgr->dpm_level = AMD_DPM_FORCED_LEVEL_AUTO;
	hwmgr_init_default_caps(hwmgr);
	hwmgr_set_user_specify_caps(hwmgr);
	hwmgr->fan_ctrl_is_in_default_mode = true;

Change it to false to disable auto .. not like it's going to be any worse for us. 

Then the GPU's thermal system will run and you can actually run the fans manually, 
but I am unsure whether this will stop the automatic core speed power save features as well.
Comment 34 MasterCATZ 2019-11-17 07:46:02 UTC
success 

drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c 

hwmgr->fan_ctrl_is_in_default_mode = false;

it will now boot up in manual mode


finally I have fan control "AMD_DPM_FORCED_LEVEL_AUTO"  
I am wondering just how "FORCED" that "AUTO" is meant to be ....

However, once you put it back to "2" (auto) it takes control again and will overwrite your "0" (card control) and "1" (manual) settings.

echo 2 >  /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable 
don't do it unless you want to reboot with a hot GPU :P 



Also, the crit temp for "Sea Islands" cards like my R9 290 is definitely being retrieved from 
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 

thermal_temp_setting.temperature_shutdown


if (hwmgr->chip_id  == CHIP_HAWAII) {
		data->thermal_temp_setting.temperature_low = 74500;
		data->thermal_temp_setting.temperature_high = 80000;
		data->thermal_temp_setting.temperature_shutdown = 98000;




And the fans still spin slowly regardless of how low I set it, so something's broken. It looks like I will be doing a custom kernel on every update for a while now to disable auto fan control.

And for some reason AMD devs feel 120 deg is NORMAL for a GPU and that users want quiet fans ... I give up ...
Comment 35 michael 2019-12-05 17:00:31 UTC
I discovered a workaround that works for my R9-290 and Debian 5.3.0 kernel:

  root@joyola:~# echo "2" >>/sys/class/drm/card0/device/hwmon/hwmon3/pwm1_enable 
  root@joyola:~# echo "0" >>/sys/class/drm/card0/device/hwmon/hwmon3/pwm1_enable 

pwm1_enable will still be 2 afterwards, but (after spinning the fans at max for a bit) automatic fan control works for me. I also have to do the same pwm1_enable prodding after resuming from suspend.

(If it matters, I boot with radeon.cik_support=0 amdgpu.cik_support=1 radeon.si_support=0 amdgpu.si_support=1 amdgpu.dc_log=1 amdgpu.gpu_recovery=1)

I still have the same brokenness as reported in comment 14 though.
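
Since the prodding has to be repeated after resume, it could be automated with a systemd sleep hook: systemd runs executables placed in /usr/lib/systemd/system-sleep/ with "pre" or "post" as the first argument. The following is only a sketch under those assumptions. The function name is hypothetical, and the default path reuses hwmon3 from the commands above, which will differ on other machines.

```shell
#!/bin/bash
# Sketch of a systemd sleep hook that repeats the workaround after resume.
# Assumption: installed as an executable file under /usr/lib/systemd/system-sleep/.
# Assumption: the path below matches the machine; the hwmon index varies.
PWM_ENABLE=${PWM_ENABLE:-/sys/class/drm/card0/device/hwmon/hwmon3/pwm1_enable}

# Write 2 and then 0, the same sequence as the manual workaround.
prod_fan_control() {
  local file=$1
  echo 2 > "$file" && echo 0 > "$file"
}

# systemd passes "pre" before sleeping and "post" after waking up.
if [ "${1:-}" = "post" ]; then
  prod_fan_control "$PWM_ENABLE"
fi
```

Whether the workaround itself helps on a given card is a separate question; the hook only saves typing the two commands by hand.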
Comment 36 MasterCATZ 2019-12-06 02:58:05 UTC
after having good fan control for a few weeks
 5.4.2-050402-generic is now having a melt down back to trying to run the cards @
 (crit = +104000.0°C, hyst = -273.1°C)

What has me stumped is this: it seems to go to auto when it hits a high temp (~70) and then starts dropping the fan speed. I can exit a game and set a high fan speed, and it will sit there at 60 deg for a good 20 mins at ~60%. When I go back into the game it hits 70, and the fan speed keeps dropping until it's at 20% and blipping to 100% at 95 deg.

I am very close to going back to liquid cooling, or connecting the fan to a manual speed controller. (If someone knows of a way I can keep the fan connected for driver control and monitoring, with a manual device override for PWM, I am all ears. Would it be safe for me to just use a thermostat to send voltage to the fan, i.e. 2x input power sources?)



My guess is that base or asic is what it's reading now. I'm about to hack away at those modules and try again. 

/home/aio/Programs/linux/drivers/gpu/drm/i915/oa/i915_oa_tgl.c
/home/aio/Programs/linux/drivers/gpu/drm/amd/include/asic_reg/vce/vce_4_0_default.h
/home/aio/Programs/linux/drivers/gpu/drm/nouveau/nvkm/engine/ce/gf100.c
/home/aio/Programs/linux/drivers/gpu/drm/nouveau/nvkm/engine/ce/gt215.c
/home/aio/Programs/linux/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
Comment 37 MasterCATZ 2019-12-06 03:04:09 UTC
well its neither of those modules 
I should have looked at the files after I scanned for files containing 104000


I can not even force run the cards in performance mode anymore with 100% fan speed stuck on 

if i could just find the setting to tell amdgpu / hwmon / powerplay what temp I call hot this would be solved
Comment 38 muncrief 2019-12-06 03:22:25 UTC
(In reply to MasterCATZ from comment #37)
> well its neither of those modules 
> I should have looked at the files after I scanned for files containing 104000
> 
> 
> I can not even force run the cards in performance mode anymore with 100% fan
> speed stuck on 
> 
> if i could just find the setting to tell amdgpu / hwmon / powerplay what
> temp I call hot this would be solved

Here is a slightly modified version of a fan control script, along with the service to run it, from the Arch Wiki. I don't know what distribution you use but hopefully this will at least get you started. Unfortunately it doesn't seem like the kernel devs are interested in fixing this, so after a long time I just had to use this kludgey solution.

1. Create a file with the following contents named "amdgpu-fancontrol" in "/usr/local/bin" and make it executable.

--------------- Start amdgpu-fancontrol ---------------

#!/bin/bash

HYSTERESIS=6000   # in mK
SLEEP_INTERVAL=1  # in s
DEBUG=true

# set temps (in degrees C * 1000) and corresponding pwm values in ascending order and with the same amount of values
TEMPS=( 40000  50000  65000 75000 80000 90000 )
PWMS=(      0  100     140   190   200   255 )

# hwmon paths, hardcoded for one amdgpu card, adjust as needed
HWMON=$(ls /sys/class/drm/card0/device/hwmon)
FILE_PWM=$(echo /sys/class/drm/card0/device/hwmon/$HWMON/pwm1)
FILE_FANMODE=$(echo /sys/class/drm/card0/device/hwmon/$HWMON/pwm1_enable)
FILE_TEMP=$(echo /sys/class/drm/card0/device/hwmon/$HWMON/temp1_input)
# might want to use this later
#FILE_TEMP_CRIT=$(echo /sys/class/hwmon/hwmon?/temp1_crit_hyst)
[[ -f "$FILE_PWM" && -f "$FILE_FANMODE" && -f "$FILE_TEMP" ]] || { echo "invalid hwmon files" ; exit 1; }

# load configuration file if present
[ -f /etc/amdgpu-fancontrol.cfg ] && . /etc/amdgpu-fancontrol.cfg

# check if amount of temps and pwm values match
if [ "${#TEMPS[@]}" -ne "${#PWMS[@]}" ]
then
  echo "Amount of temperature and pwm values does not match"
  exit 1
fi

# checking for privileges
if [ $UID -ne 0 ]
then
  echo "Writing to sysfs requires privileges, relaunch as root"
  exit 1
fi

function debug {
  if $DEBUG; then
    echo "$1"
  fi
}

# set fan mode to max(0), manual(1) or auto(2)
function set_fanmode {
  echo "setting fan mode to $1"
  echo "$1" > "$FILE_FANMODE"
}

function set_pwm {
  NEW_PWM=$1
  OLD_PWM=$(cat $FILE_PWM)

  echo "current pwm: $OLD_PWM, requested to set pwm to $NEW_PWM"
  debug "current pwm: $OLD_PWM, requested to set pwm to $NEW_PWM"
  if [ $(cat ${FILE_FANMODE}) -ne 1 ]
  then
    echo "Fanmode not set to manual."
    set_fanmode 1
  fi

  if [ "$NEW_PWM" -gt "$OLD_PWM" ] || [ -z "$TEMP_AT_LAST_PWM_CHANGE" ] || [ $(($(cat $FILE_TEMP) + HYSTERESIS)) -le "$TEMP_AT_LAST_PWM_CHANGE" ]; then
    $DEBUG || echo "current temp: $TEMP"
    echo "temp at last change was $TEMP_AT_LAST_PWM_CHANGE"
    echo "changing pwm to $NEW_PWM"
    echo "$NEW_PWM" > "$FILE_PWM"
    TEMP_AT_LAST_PWM_CHANGE=$(cat $FILE_TEMP)
  else
    debug "not changing pwm, we just did at $TEMP_AT_LAST_PWM_CHANGE, next change when below $((TEMP_AT_LAST_PWM_CHANGE - HYSTERESIS))"
  fi
}

function interpolate_pwm {
  i=0
  TEMP=$(cat $FILE_TEMP)

  debug "current temp: $TEMP"

  if [[ $TEMP -le ${TEMPS[0]} ]]; then
    # below first point in list, set to min speed
    set_pwm "${PWMS[i]}"
    return
  fi

  for i in "${!TEMPS[@]}"; do
    if [[ $i -eq $((${#TEMPS[@]}-1)) ]]; then
      # hit last point in list, set to max speed
      set_pwm "${PWMS[i]}"
      return
    elif [[ $TEMP -gt ${TEMPS[$i]} ]]; then
      continue
    fi

    # interpolate linearly
    LOWERTEMP=${TEMPS[i-1]}
    HIGHERTEMP=${TEMPS[i]}
    LOWERPWM=${PWMS[i-1]}
    HIGHERPWM=${PWMS[i]}
    PWM=$(echo "( ( $TEMP - $LOWERTEMP ) * ( $HIGHERPWM - $LOWERPWM ) / ( $HIGHERTEMP - $LOWERTEMP ) ) + $LOWERPWM" | bc)
    debug "interpolated pwm value for temperature $TEMP is: $PWM"
    set_pwm "$PWM"
    return
  done
}

function reset_on_fail {
  echo "exiting, resetting fan to auto control..."
  set_fanmode 2
  exit 1
}

# always try to reset fans on exit
trap "reset_on_fail" SIGINT SIGTERM

function run_daemon {
  while :; do
    interpolate_pwm
    debug
    sleep $SLEEP_INTERVAL
  done
}

# set fan control to manual
set_fanmode 1

# finally start the loop
run_daemon

--------------- End amdgpu-fancontrol ---------------


2. Create a file with the following contents named "amdgpu-fancontrol.service" in /etc/systemd/system.

--------------- Start amdgpu-fancontrol.service ---------------

[Unit]
Description=amdgpu-fancontrol

[Service]
Type=simple
ExecStart=/usr/local/bin/amdgpu-fancontrol

[Install]
WantedBy=multi-user.target

--------------- End amdgpu-fancontrol.service ---------------

3. Here's how to enable, start, and get the status of the fan control service:

sudo systemctl enable amdgpu-fancontrol
sudo systemctl start amdgpu-fancontrol
sudo systemctl status amdgpu-fancontrol
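
As a worked example of the "interpolate linearly" step in the script above, the same formula can be written with bash integer arithmetic instead of bc. This is a sketch for checking curve values by hand; the function name and argument order are assumptions and not part of the script.

```shell
# Sketch: integer-only linear interpolation between two curve points.
# Arguments: temp, lower temp, higher temp, lower pwm, higher pwm.
interpolate() {
  local t=$1 lt=$2 ht=$3 lp=$4 hp=$5
  echo $(( (t - lt) * (hp - lp) / (ht - lt) + lp ))
}
```

With the default curve in the script, a temperature of 62000 falls between the points (50000, 100) and (65000, 140), so interpolate 62000 50000 65000 100 140 prints 132.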
Comment 39 MasterCATZ 2019-12-06 04:30:52 UTC
will not work , /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable 
is locked to Auto 




[28455.094113] manual fan speed control should be enabled first
[28473.077182] manual fan speed control should be enabled first
[28480.086754] manual fan speed control should be enabled first
[28498.073701] manual fan speed control should be enabled first
[28499.095753] manual fan speed control should be enabled first
[28512.086404] manual fan speed control should be enabled first
[28525.077255] manual fan speed control should be enabled first
[28529.080955] manual fan speed control should be enabled first
[28530.070058] manual fan speed control should be enabled first
[28839.107591] manual fan speed control should be enabled first
[28840.099633] manual fan speed control should be enabled first
[28842.083214] manual fan speed control should be enabled first
[28890.089742] manual fan speed control should be enabled first
[28896.099884] manual fan speed control should be enabled first
[28902.081972] manual fan speed control should be enabled first
[28909.093220] manual fan speed control should be enabled first
[28927.107978] manual fan speed control should be enabled first
[28950.085450] manual fan speed control should be enabled first
[28979.116690] manual fan speed control should be enabled first
[28982.086568] manual fan speed control should be enabled first
[29004.103327] manual fan speed control should be enabled first
[29040.104962] manual fan speed control should be enabled first
[29066.095979] manual fan speed control should be enabled first
[29077.113080] manual fan speed control should be enabled first
[29086.091060] manual fan speed control should be enabled first
[29096.113497] manual fan speed control should be enabled first
[29111.123447] manual fan speed control should be enabled first
[29123.117578] manual fan speed control should be enabled first
[29126.092675] manual fan speed control should be enabled first
[29148.109806] manual fan speed control should be enabled first
[29155.119475] manual fan speed control should be enabled first
[29168.111159] manual fan speed control should be enabled first
[29170.094539] manual fan speed control should be enabled first
[29187.119961] manual fan speed control should be enabled first
[29196.113113] manual fan speed control should be enabled first
[29199.119590] manual fan speed control should be enabled first
[29211.126157] manual fan speed control should be enabled first
[29214.098257] manual fan speed control should be enabled first
[29217.107755] manual fan speed control should be enabled first
[29229.115177] manual fan speed control should be enabled first
[29242.097319] manual fan speed control should be enabled first
[29325.114063] manual fan speed control should be enabled first
[29333.108686] manual fan speed control should be enabled first
[29449.116469] manual fan speed control should be enabled first
[29455.132518] manual fan speed control should be enabled first
[29471.129284] manual fan speed control should be enabled first
[29480.121633] manual fan speed control should be enabled first
[29640.125839] manual fan speed control should be enabled first
[29981.128248] manual fan speed control should be enabled first
[30199.151363] manual fan speed control should be enabled first
[30204.143080] manual fan speed control should be enabled first
[30211.154484] manual fan speed control should be enabled first
[30226.128368] manual fan speed control should be enabled first
[30228.145612] manual fan speed control should be enabled first
[30236.144778] manual fan speed control should be enabled first
[30243.149198] manual fan speed control should be enabled first
[30245.134568] manual fan speed control should be enabled first
[30248.140668] manual fan speed control should be enabled first
[30362.126900] manual fan speed control should be enabled first
[30909.144940] manual fan speed control should be enabled first
[30910.137533] manual fan speed control should be enabled first
[30920.163730] manual fan speed control should be enabled first
[30931.161975] manual fan speed control should be enabled first
[30932.158340] manual fan speed control should be enabled first
[30933.147783] manual fan speed control should be enabled first
[30944.159956] manual fan speed control should be enabled first
[30958.138767] manual fan speed control should be enabled first
[30977.151665] manual fan speed control should be enabled first
[30996.157518] manual fan speed control should be enabled first
[31025.147100] manual fan speed control should be enabled first
[31029.149391] manual fan speed control should be enabled first
[31030.148760] manual fan speed control should be enabled first


And the echo 0 >  /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable 
trick to get 100% only works in the low power state; as soon as core speeds go up, fan speeds drop ...
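
A long run of identical kernel messages like the one above can be condensed before posting. The sketch below strips the timestamp and counts each distinct message; the function name is hypothetical, and it assumes the default dmesg format with a leading [seconds.microseconds] stamp.

```shell
# Sketch: count how often each distinct kernel message appears.
# Reads log text on stdin, e.g. from dmesg or a saved file.
summarize_kernel_log() {
  sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' | sort | uniq -c | sort -rn
}
```

Possible usage: dmesg | summarize_kernel_log | head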
Comment 40 MasterCATZ 2019-12-06 04:32:46 UTC
aio@aio:~$ sudo pwmconfig
[sudo] password for aio: 
# pwmconfig revision $Revision$ ($Date$)
This program will search your sensors for pulse width modulation (pwm)
controls, and test each one to see if it controls a fan on
your motherboard. Note that many motherboards do not have pwm
circuitry installed, even if your sensor chip supports pwm.

We will attempt to briefly stop each fan using the pwm controls.
The program will attempt to restore each fan to full speed
after testing. However, it is ** very important ** that you
physically verify that the fans have been to full speed
after the program has completed.

Found the following devices:
   hwmon0 is acpitz
   hwmon1 is amdgpu
   hwmon2 is coretemp
   hwmon3 is it8620
   hwmon4 is it8792

Found the following PWM controls:
   hwmon1/pwm1           current value: 104
hwmon1/pwm1 is currently setup for automatic speed control.
In general, automatic mode is preferred over manual mode, as
it is more efficient and it reacts faster. Are you sure that
you want to setup this output for manual control? (n) y
hwmon1/pwm1_enable stuck to 2
Manual control mode not supported, skipping hwmon1/pwm1.
Comment 41 MasterCATZ 2019-12-06 04:45:45 UTC
aio@aio:/usr/local/bin$ sudo systemctl status amdgpu-fancontrol
● amdgpu-fancontrol.service - amdgpu-fancontrol
   Loaded: loaded (/etc/systemd/system/amdgpu-fancontrol.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-12-06 14:45:07 AEST; 3s ago
 Main PID: 23922 (amdgpu-fancontr)
    Tasks: 2 (limit: 4915)
   Memory: 3.3M
   CGroup: /system.slice/amdgpu-fancontrol.service
           ├─23922 /bin/bash /usr/local/bin/amdgpu-fancontrol
           └─23979 sleep 1

Dec 06 14:45:08 aio amdgpu-fancontrol[23922]: changing pwm to 175
Dec 06 14:45:08 aio amdgpu-fancontrol[23922]: /usr/local/bin/amdgpu-fancontrol: line 65: echo: write error: Invalid argument
Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: current temp: 62000
Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: interpolated pwm value for temperature 62000 is: 175
Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: current pwm: 104, requested to set pwm to 175
Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: Fanmode not set to manual.
Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: setting fan mode to 1
Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: temp at last change was 62000
Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: changing pwm to 175
Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: /usr/local/bin/amdgpu-fancontrol: line 65: echo: write error: Invalid argument
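
The "Invalid argument" write error appears consistent with the kernel message "manual fan speed control should be enabled first" quoted in comment 39: the script logs "Fanmode not set to manual" on every pass, so the write to pwm1 seems to happen while pwm1_enable is not 1. A minimal sketch of a guarded write that makes this visible is given below; the function name is hypothetical and not part of the script.

```shell
# Sketch: write a PWM value only when the fan is in manual mode (pwm1_enable = 1).
# Arguments: hwmon directory, PWM value (0-255).
set_pwm_checked() {
  local dir=$1 pwm=$2 mode
  mode=$(cat "$dir/pwm1_enable")
  if [ "$mode" != "1" ]; then
    echo "pwm1_enable is $mode, not 1; skipping write" >&2
    return 1
  fi
  echo "$pwm" > "$dir/pwm1"
}
```

Possible usage (as root): set_pwm_checked /sys/class/drm/card1/device/hwmon/hwmon1 175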
Comment 42 muncrief 2019-12-06 05:01:57 UTC
(In reply to MasterCATZ from comment #41)
> aio@aio:/usr/local/bin$ sudo systemctl status amdgpu-fancontrol
> ● amdgpu-fancontrol.service - amdgpu-fancontrol
>    Loaded: loaded (/etc/systemd/system/amdgpu-fancontrol.service; enabled;
> vendor preset: enabled)
>    Active: active (running) since Fri 2019-12-06 14:45:07 AEST; 3s ago
>  Main PID: 23922 (amdgpu-fancontr)
>     Tasks: 2 (limit: 4915)
>    Memory: 3.3M
>    CGroup: /system.slice/amdgpu-fancontrol.service
>            ├─23922 /bin/bash /usr/local/bin/amdgpu-fancontrol
>            └─23979 sleep 1
> 
> Dec 06 14:45:08 aio amdgpu-fancontrol[23922]: changing pwm to 175
> Dec 06 14:45:08 aio amdgpu-fancontrol[23922]:
> /usr/local/bin/amdgpu-fancontrol: line 65: echo: write error: Invalid
> argument
> Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: current temp: 62000
> Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: interpolated pwm value for
> temperature 62000 is: 175
> Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: current pwm: 104, requested to
> set pwm to 175
> Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: Fanmode not set to manual.
> Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: setting fan mode to 1
> Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: temp at last change was 62000
> Dec 06 14:45:09 aio amdgpu-fancontrol[23922]: changing pwm to 175
> Dec 06 14:45:09 aio amdgpu-fancontrol[23922]:
> /usr/local/bin/amdgpu-fancontrol: line 65: echo: write error: Invalid
> argument

I was about to call it a day when I got your email notifications. The line it's talking about is:

echo "$NEW_PWM" > "$FILE_PWM"

So it looks like the "$FILE_PWM" variable is not valid. Remember, you have to change the variables under the comment "hwmon paths, hardcoded for one amdgpu card, adjust as needed" to whatever your system requires. To debug the variables I would execute the 4 lines that set HWMON, FILE_PWM, FILE_FANMODE, and FILE_TEMP from terminal and see where things are going wrong. I have to go now but I'll try to help you more tomorrow if you're still having problems. But once you have those variables set correctly the script should work. Here's what the service status output looks like on my system:

   Loaded: loaded (/etc/systemd/system/amdgpu-fancontrol.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-12-05 18:16:27 PST; 2h 28min ago
 Main PID: 836 (amdgpu-fancontr)
    Tasks: 2 (limit: 4915)
   Memory: 7.7M
   CGroup: /system.slice/amdgpu-fancontrol.service
           ├─  836 /bin/bash /usr/local/bin/amdgpu-fancontrol
           └─14235 sleep 1

Dec 05 20:44:46 Entropod amdgpu-fancontrol[836]: changing pwm to 80
Dec 05 20:44:47 Entropod amdgpu-fancontrol[836]: current temp: 49000
Dec 05 20:44:47 Entropod amdgpu-fancontrol[836]: interpolated pwm value for temperature 49000 is: 90
Dec 05 20:44:47 Entropod amdgpu-fancontrol[836]: current pwm: 76, requested to set pwm to 90
Dec 05 20:44:47 Entropod amdgpu-fancontrol[836]: temp at last change was 48000
Dec 05 20:44:47 Entropod amdgpu-fancontrol[836]: changing pwm to 90
Dec 05 20:44:48 Entropod amdgpu-fancontrol[836]: current temp: 48000
Dec 05 20:44:48 Entropod amdgpu-fancontrol[836]: interpolated pwm value for temperature 48000 is: 80
Dec 05 20:44:48 Entropod amdgpu-fancontrol[836]: current pwm: 86, requested to set pwm to 80
Dec 05 20:44:48 Entropod amdgpu-fancontrol[836]: not changing pwm, we just did at 49000, next change when below 43000
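
One way to carry out the variable check suggested above is to test each sysfs file the script depends on before starting the service. This is only a sketch: check_fan_paths is a hypothetical helper, not part of amdgpu-fancontrol, and the directory you pass in is whatever your own system uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of amdgpu-fancontrol): report whether the
# three sysfs files the fan script uses exist and are readable.
check_fan_paths() {
  local dir=$1 name status=0
  for name in pwm1 pwm1_enable temp1_input; do
    if [ ! -e "$dir/$name" ]; then
      echo "missing: $dir/$name"
      status=1
    elif [ ! -r "$dir/$name" ]; then
      echo "unreadable: $dir/$name"
      status=1
    else
      echo "ok: $dir/$name = $(cat "$dir/$name")"
    fi
  done
  return "$status"
}
```

For example, check_fan_paths /sys/class/drm/card0/device/hwmon/hwmon1 prints one line per file and returns non-zero if any of them is missing.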
Comment 43 MasterCATZ 2019-12-06 05:52:49 UTC
The file is correct, and you can tell because it's reading the temp: "current pwm: 76".

The error occurs because NOTHING is being allowed to edit pwm1_enable. It is stuck on auto, so nothing can manually change pwm1.

But if there is an error in my adjustments, let me know.


# hwmon paths, hardcoded for one amdgpu card, adjust as needed
HWMON=$(ls /sys/class/drm/card1/device/hwmon/hwmon1)
FILE_PWM=$(echo /sys/class/drm/card1/device/hwmon/hwmon1/pwm1)
FILE_FANMODE=$(echo /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable)
FILE_TEMP=$(echo /sys/class/drm/card1/device/hwmon/hwmon1/temp1_input)
Comment 44 MasterCATZ 2019-12-06 05:56:50 UTC
aio@aio:~$ ls /sys/class/drm/card1/device/hwmon/hwmon1
device       freq1_input  name            pwm1         temp1_crit_hyst
fan1_enable  freq1_label  power           pwm1_enable  temp1_input
fan1_input   freq2_input  power1_average  pwm1_max     temp1_label
fan1_max     freq2_label  power1_cap      pwm1_min     uevent
fan1_min     in0_input    power1_cap_max  subsystem
fan1_target  in0_label    power1_cap_min  temp1_crit
aio@aio:~$ cat  /sys/class/drm/card1/device/hwmon/hwmon1/pwm1
68
aio@aio:~$ cat  /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable
2
aio@aio:~$ cat  /sys/class/drm/card1/device/hwmon/hwmon1/temp1_input
54000
aio@aio:~$
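
For reference, the amdgpu hwmon documentation describes pwm1_enable as 0 (no fan speed control), 1 (manual control through pwm1) and 2 (automatic control), so the 2 above means the fan is in automatic mode. A small sketch that decodes the value (describe_fan_mode is a hypothetical name):

```shell
# Hypothetical helper: translate a pwm1_enable value into the meaning
# documented for the amdgpu hwmon interface.
describe_fan_mode() {
  case "$1" in
    0) echo "no fan speed control" ;;
    1) echo "manual" ;;
    2) echo "automatic" ;;
    *) echo "unknown ($1)" ;;
  esac
}
```

It could be called as describe_fan_mode "$(cat /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable)".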
Comment 45 muncrief 2019-12-07 19:48:10 UTC
(In reply to MasterCATZ from comment #43)
> the file is correct .. and you can tell that because its reading the temp
> "current pwm: 76"
> 
> error is because NOTHING is being allowed to edit pwm1_enable it is stuck on
> auto so nothing can manually change pwm1
> 
> 
> 
> but if their is an error in my adjustments let me know 
> 
> 
> # hwmon paths, hardcoded for one amdgpu card, adjust as needed
> HWMON=$(ls /sys/class/drm/card1/device/hwmon/hwmon1)
> FILE_PWM=$(echo /sys/class/drm/card1/device/hwmon/hwmon1/pwm1)
> FILE_FANMODE=$(echo /sys/class/drm/card1/device/hwmon/hwmon1/pwm1_enable)
> FILE_TEMP=$(echo /sys/class/drm/card1/device/hwmon/hwmon1/temp1_input)

Your variables are set wrong. If your GPU is card1 they should be:

HWMON=$(ls /sys/class/drm/card1/device/hwmon)
FILE_PWM=$(echo /sys/class/drm/card1/device/hwmon/$HWMON/pwm1)
FILE_FANMODE=$(echo /sys/class/drm/card1/device/hwmon/$HWMON/pwm1_enable)
FILE_TEMP=$(echo /sys/class/drm/card1/device/hwmon/$HWMON/temp1_input)


The "HWMON" variable is there to determine which actual hardware monitor is being used because it can change whenever you boot. One time it could be hwmon1, the next time hwmon3, etc. So you can't hard-code it as you're doing. You have to use the $HWMON variable to set FILE_PWM, FILE_FANMODE, and FILE_TEMP.
Comment 46 Jan Ziak (http://atom-symbol.net) 2019-12-07 20:05:22 UTC
There is also the possibility to use question marks in the path:

/sys/class/drm/card?/device/hwmon/hwmon?
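
Note that "?" matches exactly one character, so hwmon? would miss hwmon10 and above. A minimal sketch that uses hwmon* instead and insists on exactly one match (find_hwmon_dir is a hypothetical helper name, and the card directory is passed in by the caller):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the single hwmon directory below a card,
# or fail if there is none or more than one.
find_hwmon_dir() {
  local card=$1 d found="" count=0
  for d in "$card"/device/hwmon/hwmon*; do
    # An unmatched glob stays literal, so verify it is a real directory.
    if [ -d "$d" ]; then
      found=$d
      count=$((count + 1))
    fi
  done
  if [ "$count" -ne 1 ]; then
    echo "expected one hwmon directory under $card, found $count" >&2
    return 1
  fi
  echo "$found"
}
```

For example: HWMON_DIR=$(find_hwmon_dir /sys/class/drm/card0) || exit 1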
Comment 47 muncrief 2019-12-07 20:20:08 UTC
(In reply to Jan Ziak (http://atom-symbol.net) from comment #46)
> There is also the possibility to use question marks in the path:
> 
> /sys/class/drm/card?/device/hwmon/hwmon?

Thank you for mentioning that. If you only have one GPU that will indeed work. I have multiple GPUs, one Nvidia and one AMD, so I have to hard-code the card.
Comment 48 Jan Ziak (http://atom-symbol.net) 2019-12-07 20:46:34 UTC
(In reply to muncrief from comment #47)
> (In reply to Jan Ziak (http://atom-symbol.net) from comment #46)
> > There is also the possibility to use question marks in the path:
> > 
> > /sys/class/drm/card?/device/hwmon/hwmon?
> 
> Thank you for mentioning that. If you only have one GPU that will indeed
> work. I have multiple GPUs, one Nvidia and one AMD, so I have to hard-code
> the card.

Maybe you can use the PCI ID of the device:

FOUND=false
for CARD in /sys/class/drm/card?; do
  DEVICE="$(cat "$CARD/device/device")"
  if [[ "${DEVICE,,}" == 0x67b1 ]]; then
    FOUND=true
    break
  fi
done
$FOUND || exit 1
HWMON=$CARD/device/hwmon/hwmon?
echo $HWMON
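
One caveat with the snippet above: the shell does not expand globs in a variable assignment, so HWMON holds the literal pattern and only the unquoted echo expands it. A sketch of the same PCI ID idea that returns an already expanded path is shown here. The function names and the optional second argument (the DRM root, defaulting to /sys/class/drm) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the cardN directory whose PCI device ID
# matches $1 (case-insensitive). $2 optionally overrides the DRM root.
find_card_by_pci_id() {
  local want root=${2:-/sys/class/drm} card id
  want=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
  for card in "$root"/card?; do
    [ -r "$card/device/device" ] || continue
    id=$(tr 'A-Z' 'a-z' < "$card/device/device")
    if [ "$id" = "$want" ]; then
      echo "$card"
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: print the first hwmon directory of the matching card.
hwmon_for_pci_id() {
  local card d
  card=$(find_card_by_pci_id "$1" "${2:-/sys/class/drm}") || return 1
  for d in "$card"/device/hwmon/hwmon*; do
    if [ -d "$d" ]; then
      echo "$d"
      return 0
    fi
  done
  return 1
}
```

For example: HWMON=$(hwmon_for_pci_id 0x67b1) || exit 1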
Comment 49 muncrief 2019-12-07 21:04:07 UTC
(In reply to Jan Ziak (http://atom-symbol.net) from comment #48)
> (In reply to muncrief from comment #47)
> > (In reply to Jan Ziak (http://atom-symbol.net) from comment #46)
> > > There is also the possibility to use question marks in the path:
> > > 
> > > /sys/class/drm/card?/device/hwmon/hwmon?
> > 
> > Thank you for mentioning that. If you only have one GPU that will indeed
> > work. I have multiple GPUs, one Nvidia and one AMD, so I have to hard-code
> > the card.
> 
> Maybe you can use the PCI ID of the device:
> 
> FOUND=false
> for CARD in /sys/class/drm/card?; do
>   DEVICE="$(cat "$CARD/device/device")"
>   if [[ "${DEVICE,,}" == 0x67b1 ]]; then
>     FOUND=true
>     break
>   fi
> done
> $FOUND || exit 1
> HWMON=$CARD/device/hwmon/hwmon?
> echo $HWMON

Well, my system works great the way it is and I don't really have time to do any further debugging or redesign. I'm just trying to help MasterCATZ get things going. However that's another great way to determine where a specific card is, thank you for the multiple great suggestions!

It's great to see so many people trying to help, we need more of that in Linux, especially with Arch and its derivative distros. It's very irritating and frustrating when I see experienced users simply tell others to "read the wiki", or expect them to use Linux for two years before they can have a usable installation.

In fact that kind of old, outdated, and downright mean attitude is one of the reasons Linux still has such a low share of the desktop market. So whenever I see someone who needs help I try to make it as easy as I can for them, and have even been insulted numerous times by the cruel people who are angered that I don't just tell others to get a PhD or something :)
Comment 50 MasterCATZ 2019-12-09 00:33:12 UTC
Thanks for the correction; I was unsure whether $HWMON knew to go to hwmon1.

It works until the GPU hits 70 deg. Then something forces "pwm1_enable" back to auto and starts ramping the fan speed down until it's at 20% @ 90+ deg, blipping to 100% @ 95 deg.

For now all I can do is run a custom BIOS with 800 memory speed and 850 core, and keep toggling between standard and performance mode to reset the fan speed to 100%, redoing that every time it drops back below 40%. I also set /sys/class/drm/card1/device/hwmon/hwmon1/power1_cap to under 140 W so the GPU does not cook.

So unless it's "Radeon Profile" doing something that gets me locked out, I have no idea.

Its fan profile should be:
over 70 deg 1:1 ratio 
under 60 deg 50%
under 50 deg 10%
under 40 deg 5%
under 20 deg 0%

Is there any way to find out what is accessing pwm1_enable?
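
A rough way to narrow this down is to poll pwm1_enable and log every change together with the temperature. This is only a sketch under assumptions: watch_fan_mode is a hypothetical helper, and it shows when the mode flips, not which process wrote it. If the mode still flips with every userspace fan tool stopped, that would point at the driver or firmware rather than at a program.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sample pwm1_enable a fixed number of times and
# print a line whenever its value differs from the previous sample.
watch_fan_mode() {
  local mode_file=$1 temp_file=$2 samples=$3 interval=${4:-0.2}
  local prev="" mode i=0
  while [ "$i" -lt "$samples" ]; do
    mode=$(cat "$mode_file")
    if [ "$mode" != "$prev" ]; then
      printf '%s pwm1_enable: %s -> %s (temp %s)\n' \
        "$(date +%T)" "${prev:-?}" "$mode" "$(cat "$temp_file")"
      prev=$mode
    fi
    sleep "$interval"
    i=$((i + 1))
  done
}
```

For example, watch_fan_mode "$FILE_FANMODE" "$FILE_TEMP" 3000 samples the files for about ten minutes at the default interval.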
Comment 51 MasterCATZ 2019-12-09 00:55:47 UTC
current temp: 61000
interpolated pwm value for temperature 61000 is: 170
current pwm: 165, requested to set pwm to 170
current pwm: 165, requested to set pwm to 170
temp at last change was 61000
changing pwm to 170

current temp: 71000
current pwm: 255, requested to set pwm to 255
current pwm: 255, requested to set pwm to 255
not changing pwm, we just did at 71000, next change when below 66000


current temp: 73000
current pwm: 68, requested to set pwm to 255
current pwm: 68, requested to set pwm to 255
Fanmode not set to manual.
setting fan mode to 1
temp at last change was 73000
changing pwm to 255
/usr/local/bin/amdgpu-fancontrol: line 65: echo: write error: Invalid argument




current temp: 87000
current pwm: 124, requested to set pwm to 255
current pwm: 124, requested to set pwm to 255
Fanmode not set to manual.
setting fan mode to 1
temp at last change was 87000
changing pwm to 255
/usr/local/bin/amdgpu-fancontrol: line 65: echo: write error: Invalid argument
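
The log lines above reflect two mechanisms: a PWM value interpolated from the temperature, and a hysteresis rule ("next change when below 66000") that delays slowing the fan down. A minimal sketch of both is given here. The curve points are made-up values, the function names are hypothetical, and the real script's logic may differ in detail.

```shell
#!/usr/bin/env bash
# Hypothetical curve points (millidegrees Celsius -> PWM 0..255).
# These are illustrative values, not taken from any system in this thread.
TEMPS=(40000 60000 80000)
PWMS=(50 150 255)

# Linear interpolation between neighbouring curve points, integer math only.
interpolate_pwm() {
  local temp=$1 i last=$(( ${#TEMPS[@]} - 1 ))
  if (( temp <= TEMPS[0] )); then echo "${PWMS[0]}"; return; fi
  if (( temp >= TEMPS[last] )); then echo "${PWMS[last]}"; return; fi
  for (( i = 1; i <= last; i++ )); do
    if (( temp <= TEMPS[i] )); then
      local t0=${TEMPS[i-1]} t1=${TEMPS[i]} p0=${PWMS[i-1]} p1=${PWMS[i]}
      echo $(( p0 + (temp - t0) * (p1 - p0) / (t1 - t0) ))
      return
    fi
  done
}

# Returns 0 (true) when a new PWM value should be written: always when the
# temperature has not fallen, otherwise only once it has dropped more than
# $3 below the temperature recorded at the last change.
should_change() {
  local temp=$1 last_temp=$2 hyst=$3
  (( temp >= last_temp || temp < last_temp - hyst ))
}
```

With a hysteresis of 5000, a last change at 71000 means should_change stays false for any lower temperature until it falls below 66000, which matches the "next change when below 66000" line in the log.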
Comment 52 muncrief 2019-12-09 21:26:54 UTC
(In reply to MasterCATZ from comment #51)
> current temp: 61000
> interpolated pwm value for temperature 61000 is: 170
> current pwm: 165, requested to set pwm to 170
> current pwm: 165, requested to set pwm to 170
> temp at last change was 61000
> changing pwm to 170
> 
> current temp: 71000
> current pwm: 255, requested to set pwm to 255
> current pwm: 255, requested to set pwm to 255
> not changing pwm, we just did at 71000, next change when below 66000
> 
> 
> current temp: 73000
> current pwm: 68, requested to set pwm to 255
> current pwm: 68, requested to set pwm to 255
> Fanmode not set to manual.
> setting fan mode to 1
> temp at last change was 73000
> changing pwm to 255
> /usr/local/bin/amdgpu-fancontrol: line 65: echo: write error: Invalid
> argument
> 
> 
> 
> 
> current temp: 87000
> current pwm: 124, requested to set pwm to 255
> current pwm: 124, requested to set pwm to 255
> Fanmode not set to manual.
> setting fan mode to 1
> temp at last change was 87000
> changing pwm to 255
> /usr/local/bin/amdgpu-fancontrol: line 65: echo: write error: Invalid
> argument

Well, that's certainly quite bizarre. I wish I could think of something else but I'm stumped. I've never experienced that problem on my system, and I don't know why yours isn't allowing the write. Is it possible there was some error in copying the script? It seems unlikely but that's all I can come up with at this point. If you have somewhere I can send my actual script and service files I'd be happy to send them to you. Otherwise I'm just out of ideas.
Comment 53 MasterCATZ 2019-12-10 01:03:57 UTC
It's been like this since the mid kernel 4.x releases. I just wish I knew what's locking that file: even root has no permission to write it, and it seems to activate @ 70 deg, which will be reached even if I run the fan at 100% unless I underclock.

amdgpu-pro just turns the PC into a paperweight, so I don't use that. The radeon drivers suck for any gaming. amdgpu / mesa are what I need to use, and it's been like this since powerplay was introduced.

Ubuntu 18.04, and I just upgraded it to 19.10: same issues.
Currently using 5.4.2-050402-generic.

GRUB_CMDLINE_LINUX_DEFAULT="amdgpu.ppfeaturemask=0xfffd7fff amdgpu.ppfeaturemask=0xffffffff amdgpu.dc=1 amdgpu.cik_support=1 radeon.cik_support=0"

The feature masks seem to make no difference. Maybe I should re-add

radeon.si_support=0 amdgpu.si_support=1

since Radeon Profile is showing that radeonsi is in use? But I thought the R9 290 was Sea Islands, i.e. amdgpu.cik_support=1?
Comment 54 michael 2020-05-21 15:31:24 UTC
It seems that since kernel 5.6 (or at least Debian's version thereof), I no longer need to fiddle with /sys/class/drm/card0/device/hwmon/hwmon3/pwm1_enable. The default value (1) seems to do the right thing now. Progress!

Mind you, lmsensors is still unable to report fan speed, and gives nonsensical values for crit/hyst temperatures. I have a feeling that further improvements to power management may be possible too.

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +1.00 V  
fan1:             N/A  (min =    0 RPM, max =    0 RPM)
edge:         +73.0°C  (crit = +104000.0°C, hyst = -273.1°C)
power1:       58.21 W  (cap = 208.00 W)
