Bug 213411 - Recent kernel builds seem to interfere with AMD dedicated GPU performance on Acer Nitro 5 AN515-42 laptop
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-06-11 10:04 UTC by za.open.source
Modified: 2021-06-13 16:25 UTC
0 users

See Also:
Kernel Version: Any kernel greater than 5.3
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg output (99.08 KB, text/plain)
2021-06-11 13:33 UTC, za.open.source
glxinfo output (59.38 KB, text/plain)
2021-06-11 14:00 UTC, za.open.source
glxinfo output (with DRI_PRIME=1) (59.32 KB, text/plain)
2021-06-11 14:03 UTC, za.open.source

Description za.open.source 2021-06-11 10:04:14 UTC
Hello, I've recently begun testing openSUSE Tumbleweed on my Acer Nitro 5 AN515-42 laptop. I've noticed that the default kernel has some problems regarding the hybrid Raven Ridge/RX560x setup, whether I use the SUSE version of the kernel or an entirely vanilla build of it.

No matter what 3D load is run, the memory clock of the GPU hardware seems to be locked at its lowest state, resulting in extremely poor performance.
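
One way to confirm the locked memory clock is to read amdgpu's `pp_dpm_mclk` sysfs file while a 3D load is running. The following is a minimal sketch, assuming the file lists one state per line in the form `0: 300Mhz *`, with `*` marking the active state. The helper names and the sample clock values are illustrative and are not taken from this report.

```python
import glob
import re

# Matches lines such as "0: 300Mhz *"; a trailing "*" marks the active state.
_STATE_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*MHz\s*(\*)?\s*$", re.IGNORECASE)


def parse_dpm_states(text):
    """Return (states, active_index); states is a list of (index, MHz)."""
    states = []
    active = None
    for line in text.splitlines():
        match = _STATE_RE.match(line)
        if not match:
            continue
        index, mhz = int(match.group(1)), int(match.group(2))
        states.append((index, mhz))
        if match.group(3):
            active = index
    return states, active


def stuck_at_lowest(text):
    """True if several states exist and the active one has the lowest clock."""
    states, active = parse_dpm_states(text)
    if len(states) < 2 or active is None:
        return False
    lowest_index = min(states, key=lambda state: state[1])[0]
    return active == lowest_index


if __name__ == "__main__":
    # Assumed sysfs location; only cards driven by amdgpu expose this file.
    for path in sorted(glob.glob("/sys/class/drm/card[0-9]*/device/pp_dpm_mclk")):
        try:
            with open(path) as handle:
                content = handle.read()
        except OSError as error:
            print(f"{path}: unreadable ({error})")
            continue
        print(f"{path}: stuck at lowest state = {stuck_at_lowest(content)}")
```

Running this once at idle and once under load (with `DRI_PRIME=1` for the dedicated GPU) should show whether the active state ever leaves the lowest entry.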

This happens regardless of DRI_PRIME setting.

I made a similar bug report on openSUSE's bugzilla here: https://bugzilla.opensuse.org/show_bug.cgi?id=1186944

Interestingly, this did NOT happen when I tested other distributions on this hardware, such as Fedora, Ubuntu, and even openSUSE Leap with its older kernel version (5.3 at the time of this report.)

The problem also disappears if I use a custom version of the kernel, such as Xanmod, or if I force Tumbleweed to boot with Leap's kernel, which I still have installed on this machine.

I was told that this is most likely an upstream bug, and that my experience differed on other distributions because those distros may carry downstream patches or userspace modifications that work around the issue.

I wanted to file a report here so that it has a chance of being fixed upstream and those downstream workarounds are no longer needed. Unfortunately, I don't really know *what* distributions like Fedora or Ubuntu are doing to get around this issue with the hardware.
Comment 1 za.open.source 2021-06-11 10:07:15 UTC
Additionally, dmesg and journalctl logs don't seem to contain anything useful in relation to this.
Comment 2 za.open.source 2021-06-11 13:33:00 UTC
Created attachment 297329
dmesg output

I'm attaching my dmesg output just in case, though I didn't seem to find anything useful during my initial examination of it.
Comment 3 za.open.source 2021-06-11 14:00:01 UTC
Created attachment 297333
glxinfo output
Comment 4 za.open.source 2021-06-11 14:03:05 UTC
Created attachment 297337
glxinfo output (with DRI_PRIME=1)
Comment 5 za.open.source 2021-06-12 19:32:17 UTC
Update: I narrowed this down a little. In the original report I said it happens regardless of the DRI_PRIME setting, but it seems to occur only when DRI_PRIME=1 is used to launch 3D loads on the dedicated RX560x.
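
To double-check which GPU a given DRI_PRIME value actually selects, the renderer line from `glxinfo -B` can be compared across settings. This is a rough sketch, assuming `glxinfo` is installed and prints a line beginning with `OpenGL renderer string:`. The function names are hypothetical.

```python
import os
import shutil
import subprocess


def renderer_string(glxinfo_text):
    """Extract the value of the 'OpenGL renderer string' line, or None."""
    for line in glxinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "OpenGL renderer string":
            return value.strip()
    return None


def query_renderer(dri_prime):
    """Run glxinfo -B with DRI_PRIME set and return the renderer, or None."""
    env = dict(os.environ, DRI_PRIME=str(dri_prime))
    try:
        result = subprocess.run(
            ["glxinfo", "-B"], env=env, capture_output=True,
            text=True, check=False, timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return renderer_string(result.stdout)


if __name__ == "__main__":
    if shutil.which("glxinfo") is None:
        print("glxinfo not found; skipping")
    else:
        for value in (0, 1):
            print(f"DRI_PRIME={value}: {query_renderer(value)}")
```

If both settings report the same renderer, the 3D load is not reaching the dedicated GPU at all, which would be a different problem from the clock issue described here.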
Comment 6 za.open.source 2021-06-13 16:19:39 UTC
Update 2: A (bad) workaround for this seems to be suspending the laptop while a 3D load is running. When I wake it from suspend, the dedicated GPU seems to be running at the proper speed and 3D applications perform normally.

It's fairly similar to this bug: https://bugzilla.kernel.org/show_bug.cgi?id=206309

However, considering my positive experience with downstream kernels from other distros, I highly doubt my power supply is faulty.
Comment 7 za.open.source 2021-06-13 16:25:13 UTC
Curiously, while running dmesg -w and doing the suspend trick, I received this:

[  681.123922] amdgpu: dpm has been enabled

I grepped the entire dmesg and found that this message ONLY appeared after waking from suspend. Is it possible that, on modern kernels, amdgpu's dynamic power management component isn't activating at boot on this hardware, and only activates after resuming from suspend?
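
The grep described above can be reproduced on a saved log with a short script. This is a minimal sketch, assuming the log uses the default `[ seconds.micros ]` timestamp prefix and that the message text is exactly `amdgpu: dpm has been enabled` as quoted above. The function name and the file argument are illustrative.

```python
import re
import sys

# Default dmesg prefix, e.g. "[  681.123922] message text".
_LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
_MESSAGE = "amdgpu: dpm has been enabled"


def find_dpm_enabled(dmesg_text):
    """Return the timestamps (seconds since boot) of each DPM-enabled message."""
    hits = []
    for line in dmesg_text.splitlines():
        match = _LINE_RE.match(line)
        if match and _MESSAGE in match.group(2):
            hits.append(float(match.group(1)))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: dmesg > dmesg.txt && python3 this_script.py dmesg.txt
    with open(sys.argv[1], errors="replace") as handle:
        timestamps = find_dpm_enabled(handle.read())
    if timestamps:
        print("dpm enabled at:", ", ".join(f"{t:.6f}s" for t in timestamps))
    else:
        print("no 'dpm has been enabled' message found")
```

An empty result on a log captured right after boot, followed by a hit in a log captured after resume, would match the behaviour described in this comment.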
