Bug 210683
| Summary: | Nasty amdgpu powersave regression Navi14 | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | siyia (eutychios23) |
| Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
| Status: | RESOLVED OBSOLETE | | |
| Severity: | high | CC: | alexdeucher, david.18.19.21, m4ng4n, mweires |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 5.10 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Attachments: | dmesg outputs for 5.9.14 and 5.10.1; git bisect for Navi 21 | | |
Description
siyia
2020-12-14 13:08:57 UTC
Just tested kernel 5.10 stable, and idle powersave is still broken on the GPU compared to kernel 5.9. vddgfx also fluctuates between 6.00mv-700mv at idle, compared to a steady 6.00 mv in kernel 5.9.
Please attach your dmesg output. Can you bisect?
Created attachment 294239
dmesg outputs for 5.9.14 and 5.10.1.
I can reproduce the issue on the RX 6800 (Navi 21 XL).
I use radeontop to inspect the memory and GPU clocks of my GPU.
When using Linux 5.9.14:
- In both KDE Plasma and tty2, Memory Clock hovers at around 100MHz.
- GPU Power reported by lm_sensors is around 5-7W.
When using Linux 5.10.1:
- In tty2, Memory Clock hovers at around 100MHz and GPU Power reported by lm_sensors is around 5-7W.
- In KDE Plasma, Memory Clock is usually around 1GHz (100%), although it can be down to ~470MHz, and GPU Power reported by lm_sensors is around 30W.
- Disconnecting one of my two monitors does not change the memory clock.
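The readings above can also be taken without radeontop or lm_sensors. The following is a minimal sketch, assuming the card is `card0`, that amdgpu exposes `pp_dpm_mclk` with the active level marked by `*`, and that the hwmon directory provides `power1_average` in microwatts (some kernels or ASICs may expose a different file). The function names and the `DEVICE_DIR` path are illustrative and not taken from the report.

```python
import re
from pathlib import Path

# Assumed sysfs location; the card index differs between systems.
DEVICE_DIR = Path("/sys/class/drm/card0/device")


def active_clock_mhz(dpm_text):
    """Return the clock in MHz of the level marked with '*', or None."""
    for line in dpm_text.splitlines():
        # Lines are expected to look like "0: 96Mhz *" (active) or "1: 456Mhz".
        match = re.match(r"\s*\d+:\s*(\d+)\s*mhz\s*\*", line, re.IGNORECASE)
        if match:
            return int(match.group(1))
    return None


def microwatts_to_watts(raw):
    """Convert a hwmon power reading (microwatts, as text) to watts."""
    return int(raw.strip()) / 1_000_000


def read_idle_state(device_dir=DEVICE_DIR):
    """Read (memory clock in MHz, power in W) from sysfs; either may be None."""
    mclk = active_clock_mhz((device_dir / "pp_dpm_mclk").read_text())
    power_files = sorted(device_dir.glob("hwmon/hwmon*/power1_average"))
    power = microwatts_to_watts(power_files[0].read_text()) if power_files else None
    return mclk, power
```

Calling `read_idle_state()` on an idle desktop under each kernel gives one pair of numbers to compare per boot, which is easier to log than reading them off an interactive tool.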
I am trying to bisect the commit, but many revisions either give a blank screen or fail to load the amdgpu module. (I suspect I am not building the kernel properly.)
Tested linux-firmware versions: 20201120.bc9cd0b, 20201218.646f159
Created attachment 294289
git bisect for Navi 21.
Not sure if I am doing this properly, but I performed a git bisect between 5.9 and 5.10 on drivers/gpu/drm/amd.
Note that none of the supposedly "good" commits actually fixed the issue. I only marked them as "good" because those commits either cannot modeset to my monitor's resolution, or the kernel fails to write certain registers to my GPU, which causes my display to lose signal and go to standby. So technically the "first bad commit" is just the first commit where I can boot into SDDM/KDE and also reproduce the issue.
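Commits that cannot be tested do not have to be marked "good": `git bisect skip` exists for that case, and a script driven by `git bisect run` signals the same thing with exit code 125. The following is a minimal sketch of that classification, assuming the idle power has already been measured by hand or by a script. The function name and the 20 W threshold are hypothetical; the threshold is simply a value between the roughly 5-7 W and 30 W readings reported above.

```python
# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_exit_code(display_ok, idle_power_watts, threshold_watts=20.0):
    """Classify one kernel build for `git bisect run`.

    display_ok: False if the build gave a blank screen or amdgpu did not load.
    idle_power_watts: measured idle GPU power, or None if it could not be read.
    threshold_watts: hypothetical cut-off between "idles properly" and "does not".
    """
    if not display_ok or idle_power_watts is None:
        # Untestable revision: let bisect pick a neighbouring commit instead.
        return SKIP
    return BAD if idle_power_watts > threshold_watts else GOOD
```

Skipping untestable revisions keeps them from narrowing the range in the wrong direction, although bisect may then end with a set of candidate commits rather than a single one.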
Please let me know if there is anything else I can help with. I have a spare Vega 64 (Vega 10) card lying around, but it has its own memory clock issues as far as I remember =/
This seems to affect APUs as well. I can reproduce the issue on Raven (3500U).
I can confirm this with an RX 5700 XT. GPU power and temperatures, with fans running at 1100 rpm:
- With kernel 5.9.14 at idle: 10-11 W, temperatures around 32-35 °C.
- With kernel 5.10.1 and 5.10.2 at idle: 35 W, temperatures around 42-45 °C.
I've found a workaround here: https://gitlab.freedesktop.org/drm/amd/-/issues/1407 (read the last 2 comments).
I have done some additional testing. I am running an up-to-date openSUSE Tumbleweed with KDE Plasma. My monitor supports refresh rates of 60, 100, 120 and 144 Hz. For some reason unknown to me, it was set to 60 Hz after the last update. With this setting the card idled at 35 W instead of the usual 11 W. After I set it back to 144 Hz, idle power came back down to 11 W, and the memory/GPU frequencies also dropped back to where they used to be and should be. Setting it back to 60 Hz brings the idle power back up to 35 W. This is reproducible at any time. At 100 and 120 Hz the graphics card also idles at 11 W. It seems, at least on my system, that this bug only affects me when the monitor is set to a 60 Hz refresh rate.