Bug 217720 - System powers down when running particular programs
Status: RESOLVED INVALID
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Processor
Hardware: AMD Linux
Importance: P3 high
Assignee: acpi_power-processor
URL: https://github.com/SchrodingerZhu/pag...
Keywords:
Depends on:
Blocks:
 
Reported: 2023-07-27 17:24 UTC by Schrodinger Yifan ZHU
Modified: 2023-07-29 11:45 UTC

See Also:
Kernel Version: 6.4.3-1-cachyos
Subsystem:
Regression: No
Bisected commit-id:



Description Schrodinger Yifan ZHU 2023-07-27 17:24:06 UTC
I have a server with an AMD EPYC 7773X 64-Core Processor (ucode: 20230625.ee91452d-5) on a Supermicro H12SSL-i motherboard.

I mistakenly enabled the ACPI-based driver instead of amd-pstate. When I run a particular program (the source code is at https://github.com/SchrodingerZhu/paguroidea/tree/acpi-cpupower-bug; I invoke `cargo bench`), the whole system powers down immediately: the video output is cut and the power LED goes off. The BMC seems to stay alive, as the fan keeps running, but I cannot get any log from the SEL. I reproduced this under both the schedutil and performance governors; I have not tested the others.

I have now switched to amd_pstate=guided, and the problem is gone, so I think this is a problem with the ACPI driver.
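
For anyone trying to reproduce this, it may help to confirm which frequency-scaling driver and governor are actually active before running the benchmark. The following is only a minimal sketch: it assumes the standard Linux cpufreq sysfs files `scaling_driver` and `scaling_governor` under `cpu0`, and the helper names `read_cpufreq_setting` and `describe_cpufreq` are hypothetical, not part of any tool mentioned in this report.

```python
from pathlib import Path

# Assumed location: the standard Linux cpufreq sysfs directory for cpu0,
# used here as a representative core.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_cpufreq_setting(name, base=CPUFREQ_DIR):
    """Return the stripped contents of base/name, or None if it cannot be read."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        # The file is missing or unreadable (e.g. no cpufreq driver is loaded).
        return None


def describe_cpufreq(base=CPUFREQ_DIR):
    """Collect the active scaling driver and governor into a dict."""
    return {
        "driver": read_cpufreq_setting("scaling_driver", base),
        "governor": read_cpufreq_setting("scaling_governor", base),
    }


if __name__ == "__main__":
    for key, value in describe_cpufreq().items():
        print(f"{key}: {value if value is not None else 'unavailable'}")
```

Running this once under each configuration (the ACPI-based driver and amd_pstate=guided) would record which driver and governor were in effect for each attempt.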
Comment 1 Artem S. Tashkinov 2023-07-29 11:45:53 UTC
The ACPI driver is extremely unlikely to cause this.

This looks more like a HW error, either in your CPU or motherboard.
