Bug 119211
Summary: | amdgpu disables fan by default | ||
---|---|---|---|
Product: | Drivers | Reporter: | Stas Sergeev (stsp2) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | NEW --- | ||
Severity: | normal | CC: | alexdeucher, JimiJames.Bove |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 6.7.7 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | dmesg, Xorg log | ||
Description
Stas Sergeev
2016-05-29 22:09:54 UTC
You should produce some GPU load before the fan activates. Even besides the fact that GPU was so hot I couldn't even touch it?

Essentially, when the driver initializes, it puts 0 to /sys/class/hwmon/hwmon0/pwm1. And unless you set up fancontrol (which is a major pita), this 0 remains there, no matter how you load your GPU. It should put some other value there, like 50 or more. On my system 50 is the minimum value needed to get the GPU fan rotating.

Is this bug still happening? With my R9 Fury on amdgpu, cat /sys/class/hwmon/hwmon0/pwm1 (well, in my case, it's hwmon2 because I have another card) returns 35 on idle, not 0, but the fans are not running. Even when I'm running a AAA game, my card doesn't even reach 40°C, so my cooling system is too good for me to be able to actually see if the fans turn on when they should. When I first started my computer up, pwm1 was giving me 56, but it went down to 35 before I could finish opening my case and has stayed there no matter what I do. When the card is bound to vfio-pci instead of amdgpu (for a virtual machine), the fan is on all the time, even though the card's low idle temperatures must be similar.

I managed to check the fans while pwm1 was giving values like 68, 61, and 56, and they were not turned on. I don't know if that means anything, because the card was still <40 degrees and definitely not too hot to touch.

> Is this bug still happening?
For me it is happening like hell. And because the fancontrol service also doesn't work on my PC (I've filed other reports about it), the problems are very real.

> I managed to check the fans while pwm1 was giving values like
You can write the values there, too. In fact, I wonder who changes them for you. Do you have fancontrol set up and running?
$ systemctl status fancontrol

I do not have fancontrol set up or running (it's inactive on my system). I don't know anything about fancontrol at all. I'm running Arch Linux, so I'm pretty much only running services that I know about.
I tried writing values myself with echo, like 'echo 50 > /sys/class/hwmon/hwmon0/pwm1', but that didn't affect it at all. Is that not how you're supposed to change it?

By default the hw controls the fan based on temperature, etc. Not all cards have a fan control. If you do, then the following standard HWMON pwm attributes should be available:
* pwm1_enable: Current fan management mode (MANUAL or AUTO)
* pwm1: Current PWM value (power percentage)
* pwm1_min: The minimum PWM speed allowed
* pwm1_max: The maximum PWM speed allowed (bypassed when hitting Fan_boost)

The fan can be driven in different modes:
* 1: The fan can be driven in manual (use pwm1 to change the speed);
* 2: The fan is driven automatically depending on the temperature.

See:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c#n644

(In reply to Alex Deucher from comment #8)
> By default the hw controls the fan based on temperature, etc.
For me not.

> Not all cards have a fan control. If you do, then the following standard
> HWMON pwm attributes should be available:
>
> * pwm1_enable: Current fan management mode (MANUAL or AUTO)
I have always '1' there. Trying to write 0 or 2 there still leaves 1. It simply doesn't change.

> * pwm1: Current PWM value (power percentage)
Always 0, unless manually written.

> The fan can be driven in different modes:
>
> * 1: The fan can be driven in manual (use pwm1 to change the speed);
> * 2: The fan is driven automatically depending on the temperature.
What should I write to pwm1_enable? '2'? It doesn't change.

Please attach your dmesg output and xorg log (if running X).

Created attachment 228111 [details]
dmesg
Created attachment 228121 [details]
Xorg log
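
For anyone comparing systems in this thread, the four pwm attributes listed in comment #8 can be dumped in one go. The following is only a minimal sketch: the function name show_fan_state is made up for illustration, the hwmon directory is passed as an argument because the index (hwmon0, hwmon2, ...) differs between machines, and the mode labels simply repeat the 1 = manual / 2 = automatic meaning given in comment #8.

```shell
#!/bin/sh
# show_fan_state DIR  (hypothetical helper, not part of any existing tool)
# Prints the fan-related hwmon attributes found in DIR, one per line.
show_fan_state() {
    dir=$1
    for attr in pwm1_enable pwm1 pwm1_min pwm1_max; do
        if [ -r "$dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$dir/$attr")"
        else
            # Not all cards expose fan control, so an attribute may be missing.
            printf '%s=(not available)\n' "$attr"
        fi
    done
    # Mode meanings as listed in comment #8: 1 = manual, 2 = automatic.
    case "$(cat "$dir/pwm1_enable" 2>/dev/null)" in
        1) echo "mode: manual" ;;
        2) echo "mode: automatic" ;;
        *) echo "mode: unknown" ;;
    esac
}
```

It would be called as, for example, show_fan_state /sys/class/hwmon/hwmon0 (adjust the index to match your card). Reading these files should not need root.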
That's interesting. My pwm1_enable returns 1, and trying to change it to 0 or 2 does nothing, but my pwm1 value does indeed change on its own, and I've never seen it be 0. It sounds like I don't have this bug but do have some other less major one? If pwm1 is changing on its own, can I trust that the fan will turn on if my card ever gets too hot? I should mention, my pwm1_min is 0 and pwm1_max is 255.

> I should mention, my pwm1_min is 0 and pwm1_max is 255.
Same here.
IMHO pwm1_min should contain the value that
keeps the fan rotating at a minimal safe speed.
Putting 0 there makes it entirely useless.
Not necessarily. Less fans means less power usage means money saved, and as we can see with my computer, you can keep the card cool without its own fans. I have 4 case fans that are plugged directly into power and so are less competent at knowing when to turn off. Something needs to be done about your fan not activating when it should, though.

> Not necessarily. Less fans means less power usage means money saved
You can set up fancontrol or put 0 into pwm1 manually
to stop the fan. But putting 0 into pwm1_min is IMHO
quite useless; it might as well not exist at all.
But if it contained the minimum _safe_ value, then
that could well be used.
Currently fancontrol has to "evaluate" the minimal
safe value by hand. It lowers the pwm1 value and watches
for the fan to stop by checking the value of
fan1_input, if that exists. And it doesn't exist for
amdgpu, so you need to do such a probe by hand.
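
The manual probe described above could look roughly like the sketch below. It is only an illustration under stated assumptions: probe_min_pwm and its parameters are hypothetical names, the human at the keyboard stands in for the missing fan1_input by answering y/n, writing to pwm1 normally needs root, and on the reporter's system writes to pwm1_enable reportedly did not stick, so this may not work everywhere. Stopping a fan on a hot GPU is risky, so try it only at idle.

```shell
#!/bin/sh
# probe_min_pwm DIR START STEP [SETTLE_SECONDS]   (hypothetical helper)
# Lowers pwm1 from START in steps of STEP and asks after each step whether
# the fan is still spinning. Prints the last value that was confirmed.
probe_min_pwm() {
    dir=$1; value=$2; step=$3; settle=${4:-5}
    # If the fan is not spinning even at START, START is returned unchanged.
    last_good=$value
    echo 1 > "$dir/pwm1_enable"          # 1 = manual mode (per comment #8)
    while [ "$value" -ge 0 ]; do
        echo "$value" > "$dir/pwm1"
        sleep "$settle"                  # give the fan time to react
        printf 'pwm1=%s: is the fan still spinning? [y/n] ' "$value" >&2
        read -r answer || break          # stop on end of input
        [ "$answer" = y ] || break       # anything but "y" ends the probe
        last_good=$value
        value=$((value - step))
    done
    echo "$last_good" > "$dir/pwm1"      # restore the last confirmed value
    echo "$last_good"
}
```

For example, probe_min_pwm /sys/class/hwmon/hwmon0 100 10 would start at 100 and step down by 10, pausing 5 seconds at each step before asking.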
I figured out something very interesting regarding this bug. Writing 2 to pwm1_enable causes the fan to rotate for about 10 seconds. Note that the old value of pwm1_enable is also 2, so it doesn't change, but the mere fact of writing has an effect!

And this is not all! Now if you periodically READ from pwm1, then the fan doesn't stop! I can do:
while :; do cat /sys/class/hwmon/hwmon0/pwm1; sleep 1; done
And with this, the fan keeps rotating forever! But if you stop that script for something like 10 seconds, then the fan stops and pwm1 reads always return 0. You need to start again by writing 2 to pwm1_enable (even if there is already 2!), and quickly start reading from pwm1, and you have your fan finally rotating. :) A bit of a hand-written fancontrol script. :)

Alex Deucher, can you make any sense out of that?
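
The two steps of that workaround (re-write 2 to pwm1_enable, then keep reading pwm1) can be combined into one small function. This is a sketch of the workaround exactly as reported, not a fix, and there is no claim that it behaves the same on other cards. The name fan_keepalive and the optional interval and count arguments are made up for illustration; writing pwm1_enable normally needs root.

```shell
#!/bin/sh
# fan_keepalive DIR [INTERVAL_SECONDS] [COUNT]   (hypothetical helper)
# Re-writes 2 to pwm1_enable, then reads pwm1 every INTERVAL seconds.
# COUNT limits the number of reads; 0 (the default) means loop forever.
fan_keepalive() {
    dir=$1; interval=${2:-1}; count=${3:-0}
    # Write 2 even if the file already reads 2: per the report above,
    # the write itself is what starts the fan.
    echo 2 > "$dir/pwm1_enable"
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        cat "$dir/pwm1"                  # the periodic read keeps the fan on
        sleep "$interval"
        i=$((i + 1))
    done
}
```

Running fan_keepalive /sys/class/hwmon/hwmon0 reproduces the one-second loop from the comment; interrupt it with Ctrl-C. Per the report, a gap of about 10 seconds between reads lets the fan stop again.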