Bug 218689

Summary: AMD_Pstate_EPP Ryzen 7000 issues. Freezing and static sound
Product: Power Management
Reporter: Christopher Bii (kbii.chris)
Component: cpufreq
Assignee: linux-pm (linux-pm)
Status: NEW
Severity: normal
CC: mario.limonciello
Priority: P3
Hardware: AMD
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Christopher Bii 2024-04-06 20:49:02 UTC
Hardware:

AMD Ryzen 9 7900x
Gigabyte b650 aero g
Corsair Dominator Platinum 2x16 6000MHz

Kernel tested on:

6.8.2 Arch
6.8.1 Arch
6.6.9 Debian

Issue:

There has been a static noise coming from my CPU, as well as frequent freezes. This is nearly constant, and the sound makes the computer impossible to use for long.

Remedy:

I have found the issue to disappear completely while using the passive driver mode provided by amd_pstate. The scaling governor must be set to schedutil. This completely mitigates the issue.
kernel flags: amd_pstate=passive cpufreq.default_governor=schedutil
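
For anyone trying to reproduce this, a minimal Python sketch for confirming which driver, governor, and amd_pstate mode are actually in effect after booting with those flags. The function name is hypothetical, and the sysfs file names are assumed from the usual cpufreq layout; a file that is missing on a given kernel is reported as None.

```python
from pathlib import Path


def read_cpufreq_state(sysfs_root="/sys/devices/system/cpu", cpu=0):
    """Return the cpufreq driver, governor, and amd_pstate mode as a dict."""
    root = Path(sysfs_root)
    files = {
        # Per-CPU cpufreq policy attributes.
        "scaling_driver": root / f"cpu{cpu}" / "cpufreq" / "scaling_driver",
        "scaling_governor": root / f"cpu{cpu}" / "cpufreq" / "scaling_governor",
        # Global amd_pstate mode; assumed to exist only on newer kernels.
        "amd_pstate_status": root / "amd_pstate" / "status",
    }
    state = {}
    for name, path in files.items():
        try:
            state[name] = path.read_text().strip()
        except OSError:
            # Attribute not present (or not readable) on this system.
            state[name] = None
    return state


if __name__ == "__main__":
    for key, value in read_cpufreq_state().items():
        print(f"{key}: {value}")
```

Running it once under each configuration and attaching both outputs would make it clear which mode each observation belongs to.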

Suspicions:

I suspect the power management implementation used by amd_pstate_epp is causing this issue. There is an async parameter inside /sys/devices/system/cpu/power that is always set to false whenever amd_pstate_epp is used. My intuition tells me this is what holds up the system at times while changing C-states or something similar.
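
A small Python sketch for recording that parameter under each driver mode, so the two values can be compared side by side. The path is the one given in this report; the function name is hypothetical, and the exact string stored in the file may differ from "false" depending on the kernel.

```python
from pathlib import Path


def read_async_pm(device_dir="/sys/devices/system/cpu"):
    """Return the raw contents of <device_dir>/power/async, or None if absent."""
    path = Path(device_dir) / "power" / "async"
    try:
        return path.read_text().strip()
    except OSError:
        # The attribute does not exist or cannot be read on this system.
        return None


if __name__ == "__main__":
    print("power/async:", read_async_pm())
```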

Extra:

I have seen a lot of talk surrounding the fTPM and issues around it, but I cannot relate these two observations to each other, which is why I am opening a new issue. If this is not the right way to go about it, my bad; this is my first bug report.
Comment 1 Artem S. Tashkinov 2024-04-07 11:21:01 UTC
This is extremely unlikely to be caused by the CPU driver.

Please try resetting your BIOS settings and not using any XMP memory profiles first.
Comment 2 Christopher Bii 2024-04-07 18:11:03 UTC
(In reply to Artem S. Tashkinov from comment #1)
> This is extremely unlikely to be caused by the CPU driver.
> 
> Please try resetting your BIOS settings and not using any XMP memory
> profiles first.

A few things I have done:

- Reset BIOS to defaults
- Updated BIOS
- Downgraded BIOS
- Disabled XMP profile
- Set/unset typical idle usage
- Tweaked nearly every setting in the BIOS over 2 weeks
- Replaced motherboard
- Replaced ram

Kernel params passed:

no_rcucbs=0-$numcores
idle=nomwait
iommu=soft
acpi=off

Absolutely nothing worked.
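
To confirm that parameters like these actually reached the running kernel, the boot command line can be read back from /proc/cmdline. The following is a minimal Python sketch under that assumption; the function name is hypothetical, and bare flags without a value are mapped to None.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into a {parameter: value} dict."""
    params = {}
    # shlex.split keeps quoted values such as name="a b" together.
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        # Flags without '=' (e.g. "quiet") get None as their value.
        params[key] = value if sep else None
    return params


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        active = parse_cmdline(f.read())
    for name in ("amd_pstate", "cpufreq.default_governor", "idle", "iommu", "acpi"):
        print(f"{name} = {active.get(name)}")
```

A parameter that prints None here was either not passed or was passed as a bare flag, which helps rule out bootloader configuration mistakes.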

I had tried testing around different governors but had never tried using the passive amd_pstate mode. The one thing that caught my attention was the asynchronous power management parameter being off. This to me seems like the most plausible cause of the stuttering.

The one thing I am absolutely certain of, though: changing the cpufreq driver and governor 100% makes a difference. The exact cause needs to be investigated further.
Comment 3 Artem S. Tashkinov 2024-04-07 19:07:45 UTC
amd-pstate has been the default CPU driver for quite some time now in many distros and your report is the first of this kind. And since most Linux users use AMD CPUs it becomes even more bizarre.

I still suspect something is wrong with your setup hardware-wise; I just don't understand what exactly.

Other options I can think of:

1. CPU dies can sometimes be faulty as well. This is extremely unlikely but does happen.
2. Thermal paste/CPU cooler issues.
3. PSU issues.

I'm quite sure replacing the CPU is not on your mind yet, but I'd at least make sure that the thermal paste is applied properly (and that you've not forgotten to remove the plastic from your cooler) and that the PSU CPU power connector is attached properly.

And then, if you're keen to check the thermal paste application, please remove the CPU from the motherboard socket and check that all the pins are intact.

Again, I'm 99.999% sure it's a hardware issue.

I'll CC Mario because he knows a lot more about AMD hardware, so probably he could add something.
Comment 4 Artem S. Tashkinov 2024-04-07 19:12:47 UTC
If I were you, I'd definitely post your issue here:

https://www.reddit.com/r/AMDHelp/

and here

https://community.amd.com/t5/support-forums/ct-p/supprtforums
Comment 5 Christopher Bii 2024-04-07 19:40:22 UTC
(In reply to Artem S. Tashkinov from comment #4)
> If I were you, I'd definitely post your issue here:
> 
> https://www.reddit.com/r/AMDHelp/
> 
> and here
> 
> https://community.amd.com/t5/support-forums/ct-p/supprtforums

I attempted to post on both early on, but to no avail. The CPU, CPU cooler, NVMe drive and the rest were all removed and put back. There is absolutely nothing wrong with them.

Even if there were a hardware issue, which I think is ruled out at this point, it would not explain how the issue completely goes away when the driver/governor is changed.

The CPU gets perfect benchmark scores. Another observation I had was that under full load, with what I suspect to be the faulty driver, there is no issue.

Disabling C6 in the BIOS does not fix the issue. I will try the other driver with the _PC register disabled to see if that fixes the issue. This was something I had also tried, but I cannot fully remember whether the issue persisted or not.

The CPU benchmarks perfectly, outperforming the average on Geekbench (I am cheating since I am on the superior Linux). I believe I tested the issue on Windows also; it was definitely the same until I upgraded the drivers and all, after which the issue nearly completely disappeared.

I am yet to try on 6.8.4, I will try the different drivers and see whether there is any difference.
Comment 6 Christopher Bii 2024-04-07 19:41:22 UTC
if there were* to be a hardware issue^^^
Comment 7 Artem S. Tashkinov 2024-04-08 09:16:44 UTC
> if there were* to be a hardware issue^^^

AFAIK you're the _only_ person using Linux who experiences such issues. So far there have been _zero_ reports about complications related to the amd-pstate driver.

It may not work at all but it doesn't and cannot lead to freezes or sounds.