Bug 217976
| Summary: | cpuinfo_max_freq incorrect reading for i9-13900HX CPU | | |
|---|---|---|---|
| Product: | Platform Specific/Hardware | Reporter: | Evert Vorster (evorster) |
| Component: | x86-64 | Assignee: | platform_x86_64 (platform_x86_64) |
| Status: | NEEDINFO --- | | |
| Severity: | normal | CC: | alexbelm48, buurman.sven |
| Priority: | P3 | | |
| Hardware: | Intel | | |
| OS: | Linux | | |
| Kernel Version: | | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | Geekbench Comparison; The Dmidecode dump of the SMBIOS; dsdt.dat; Dsl_files zipped; Filtered dmesg for ACPI Events with osi=Linux; decoded dsl files for acpi_osi=Linux | | |
Description
Evert Vorster
2023-10-04 09:24:35 UTC
Run e.g. geekbench5 under Linux and Windows ( https://www.geekbench.com/legacy/ ) and compare the results. If they are similar (the Linux version runs slightly faster) then it's all fine.
Created attachment 305188 [details]
Geekbench Comparison
Hi there!
I was filled with a bit of trepidation going into this test.
On Linux the performance is pretty much half of what it is on Windows, as you can see from the results of the two tests in the attached file.
On the flip side, there is a fair bit of performance waiting for me if we manage to get this sorted.
Created attachment 305189 [details]
The Dmidecode dump of the SMBIOS
Just in case it helps the troubleshooting effort, here is the Dmidecode of the SMBIOS of this laptop.
I had to change the Manufacturer to TUXEDO to get around drivers for my keyboard not working... as they are made by Tuxedo for the Gemini Gen 2, which has identical hardware to this laptop, and their driver checks this field in the SMBIOS.
What I see that is odd is that the Max CPU speed listed here is 8500MHz, which is outside of what this CPU can do.
However, I managed to find a Dmidecode of a TUXEDO Gemini Gen 2 laptop online, and it also has the same value for Max CPU Speed. So, it may be that all the Gemini Gen 2's out there are hobbled on Linux, which is pretty terrible for a supposedly Linux shop.
Just as an aside, this Sager NP8875E is a rebranded Clevo PD70SNE-G, with a slightly updated BIOS. I have been unable to find a dmidecode for a Clevo PD705SNE-G so I don't know what the CPU max speed is set to there.
Lastly, the BIOS does not let me fiddle with this value, and the SMBIOS editor that Sager kindly provided me has no documentation beyond changing the manufacturer field. I am a little scared of bricking this device, so I don't want to just enter any old value to see what it does.
Digging a little further, that Max CPU Speed that's in the SMBIOS is for the system itself. It basically means: The fastest CPU supported by the system is 8500MHz ... in my case. So, a red herring.

How does the kernel figure out the max speed of a CPU anyways? The value reported in /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq is clearly wrong. Why is Windows able to get the correct ranges for this CPU? Is it really so wrong to expect the min and max speeds reported by the kernel to match the minimum and maximum speed specified by the manufacturer?

I was able to get a similar score on the geekbench for Linux by following the instructions on this page: https://www.thinkwiki.org/wiki/Problem_with_CPU_frequency_scaling
Then I set cpupower's config file to specify the frequency ranges for the PC, and re-started the service.

There are a few serious concerns with this approach, though. The bottom end of the range was not modified, and so the operating range was from 0.8GHz to 5.4GHz. CPU temperatures were all over the place, flashing up to 100 degrees C, and it looks like the cpu scalers were not really managing this process well. If I read the document correctly, this approach is tantamount to turning off all the safeties and just going for it. I'm not a big fan, and the point of this ticket is not to make my computer run faster; I want the GHz values reported by the kernel to be correct.

One more update: It is possible to set the proper min and max frequencies with cpupower once we start disrespecting the BIOS limits.
So, this set of commands runs the laptop in performance mode, makes a lot of noise and gets very respectable scores on geekbench:
echo 1 | sudo tee /sys/module/processor/parameters/ignore_ppc
sudo cpupower -c 0-15 frequency-set -d 2.2GHz -u 5.4GHz
sudo cpupower -c 16-31 frequency-set -d 1.6GHz -u 3.9GHz
sudo cpupower -c all frequency-set -g performance
sudo cpupower -c all set --perf-bias 0
results: https://browser.geekbench.com/v6/cpu/2935716
-----------------
For cool, quiet and powersaving mode, and still getting about 80% of the performance of the above, run the following commands:
echo 1 | sudo tee /sys/module/processor/parameters/ignore_ppc
sudo cpupower -c 0-15 frequency-set -d 2.2GHz -u 5.4GHz
sudo cpupower -c 16-31 frequency-set -d 1.6GHz -u 3.9GHz
sudo cpupower -c all frequency-set -g powersave
sudo cpupower -c all set --perf-bias 15
results: https://browser.geekbench.com/v6/cpu/2935361
-----------------
The frequencies listed here are for the i9-13900HX; you should look up the ranges for your CPU before setting them. As the safeties are off, you might break your hardware by overheating it with the wrong frequencies. As these CPUs are generally in USD 3000+ laptops, be careful.

Anyways, there is a config file for cpupower, namely /etc/default/cpupower
Unfortunately I could not find any documentation on how to set per-cpu frequencies, so it's not much use for this CPU, and I am unable to set a default policy before the user logs on. As a workaround, this is OK, I guess.

If you know how to set per-cpu frequencies in the cpupower config file, please let me know. Also, if you have any extra information on all the settings that one is able to set in this file, please leave a link... it would be highly appreciated.

Is anyone looking into why the i9-13900HX minimum and maximum frequencies are so very incorrect in Linux? Am I the only one that this is happening to?
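The two command sets above differ only in governor and perf-bias, so they can be generated from one place. The following is a minimal sketch, not something posted in this thread: the function name `build_cmds` is made up for illustration, and the core ranges and frequencies are the i9-13900HX values quoted above (an assumption for any other CPU). It only prints the commands; nothing is applied.

```shell
# build_cmds GOVERNOR PERF_BIAS
# Print the cpupower commands for a hybrid CPU, one per line.
# Hypothetical helper; ranges below are the i9-13900HX values from this
# thread and must be looked up again for any other CPU.
build_cmds() {
  governor=$1
  bias=$2
  # Each entry is "cpu-list:min:max" (format invented for this sketch).
  for group in "0-15:2.2GHz:5.4GHz" "16-31:1.6GHz:3.9GHz"; do
    cpus=${group%%:*}    # text before the first colon
    rest=${group#*:}     # text after the first colon
    min=${rest%%:*}
    max=${rest#*:}
    echo "cpupower -c $cpus frequency-set -d $min -u $max"
  done
  echo "cpupower -c all frequency-set -g $governor"
  echo "cpupower -c all set --perf-bias $bias"
}
```

Calling `build_cmds performance 0` or `build_cmds powersave 15` prints the corresponding set so it can be reviewed before being run as root. Writing 1 to ignore_ppc is deliberately left out of the sketch, since that step is what removes the firmware limit.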
Just had a kernel update, and the issue is still present:
[evert@Evert scripts]$ ./CheckCpu.bash
Automated test for i9-13900HX CPU
Intel states: Performance CPU cores 0-7 is clocked at 2.2GHz base to 5.4GHz with turbo on and Hyperthreading with two threads each. Efficiency cores 16-31 are clocked from 1.6GHz base to 3.9GHz
uname -a:
Linux Evert.Laptop 6.5.6-arch2-1 #1 SMP PREEMPT_DYNAMIC Sat, 07 Oct 2023 08:14:55 +0000 x86_64 GNU/Linux
/sys/devices/system/cpu/cpu*/cpufreq/base_frequency | sort | uniq
1600000
2200000
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq | sort | uniq
1600000
2200000
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_min_freq | sort | uniq
800000

I downloaded and installed Puppy Linux on a USB drive to test whether these wrong reported frequencies are due to a packaging bug in Arch. As it turns out, the maximum frequencies reported by that kernel are exactly the same as the base frequencies, and match the frequencies reported by Arch exactly.

Had another kernel update, and the issue is still present:
[evert@Evert scripts]$ ./CheckCpu.bash
Automated test for i9-13900HX CPU
Intel states: i9-13900HX has 8 performance cores with hyperthreading, and 16 Efficiency cores. Performance cores 0-15 is clocked at 2.2GHz base to 5.4GHz. Efficiency cores 16-31 are clocked from 1.6GHz base to 3.9GHz
uname -a:
Linux Evert.Laptop 6.5.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Oct 2023 21:10:21 +0000 x86_64 GNU/Linux
/sys/devices/system/cpu/cpu*/cpufreq/base_frequency | sort | uniq
1600000
2200000
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq | sort | uniq
1600000
2200000
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_min_freq | sort | uniq
800000
[evert@Evert scripts]$

More added information: when disabling the intel_pstate driver, there is now a /sys/devices/system/cpu/cpufreq/policy0/bios_limit file that is not present when the intel_pstate driver is active.
[evert@Evert policy0]$ cat /sys/devices/system/cpu/cpufreq/policy0/bios_limit
2400000
On a mobile system there is no place where you can set this value in the BIOS, and why would it be set to that value in the first place? The mystery deepens.

I'm coming to your rescue! I have the exact same problem on my HP Omen ck2000nf, same CPU and same frequency scaling problems! Did you browse through the efivars? I believe there's a ton of things that can be modified manually, although you cannot alter them right from the UEFI integrated setup utility. I'll try your workaround on my side and see if this works for me as well! Will keep you updated.

Hi there! Yes, it's still an issue for me, and I have just made a startup script that disables the BIOS protections and sets the CPU ranges to what it says on the box. I had a look at the various files in efivars, but honestly they do not make sense to me. Let me know how things are going for you, and especially if you find a way of not needing a workaround!

Alright, sounds good to me! I also use ArchLinux on my side, and knowing that you have a different brand of laptop makes me think that this is not just a UEFI firmware issue, although this shouldn't be overlooked either (is your firmware copyrighted by AMI as well?) We also should probably look at how /sys/devices/system/cpu/cpufreq/policy0/bios_limit gets this value in the kernel source code; maybe we could find some answers there!

My BIOS is made by Insyde (H2O). Unfortunately Sager, which supplied the laptop, is not a Linux shop, and so the support techs' hands are really tied. They have been as helpful as they can, though. I have been trying to get my hands on a BIOS made by TUXEDO for this laptop, as it has identical hardware. However, looking at results on Geekbench for the Gemini Gen 2 laptop I suspect that they are running into the same issue, and I find it surprising that no one has complained yet.
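The CheckCpu.bash script itself is not posted in this thread, but the checks it prints (distinct values of base_frequency, cpuinfo_max_freq and cpuinfo_min_freq, plus the bios_limit file mentioned above) can be approximated as follows. This is a hedged sketch: the function names `uniq_attr` and `report` are invented, and the optional root argument exists only so the functions can be pointed at a copy of the sysfs tree.

```shell
# uniq_attr ATTR [SYSFS_CPU_ROOT]
# Print the distinct values of one cpufreq attribute across all CPUs.
# Hypothetical helper; the default root is the real sysfs location.
uniq_attr() {
  attr=$1
  root=${2:-/sys/devices/system/cpu}
  # Attributes that do not exist for the active driver are silently skipped.
  cat "$root"/cpu[0-9]*/cpufreq/"$attr" 2>/dev/null | sort -n | uniq
}

# report [SYSFS_CPU_ROOT]
# One summary line per attribute discussed in this thread.
report() {
  for attr in base_frequency cpuinfo_min_freq cpuinfo_max_freq bios_limit; do
    values=$(uniq_attr "$attr" "$1" | paste -s -d ' ' -)
    echo "$attr: ${values:-not present}"
  done
}
```

Running `report` with no argument reads the live system. On the machine described here it would be expected to show the same values for base_frequency and cpuinfo_max_freq, which is the symptom this bug is about.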
For your info, here is the script that I run on startup to put my CPU in a "normal" state:
```
[evert@Evert ~]$ cat /usr/bin/CPU_Fix
#!/bin/bash
echo 1 | tee /sys/module/processor/parameters/ignore_ppc
#cpupower -c 0-15 frequency-set -d 2.2GHz -u 5.4GHz
#cpupower -c 16-31 frequency-set -d 1.6GHz -u 3.9GHz
sleep 60
cpupower -c 0-15 frequency-set -d 0.8GHz -u 4.8GHz
cpupower -c 16-31 frequency-set -d 0.8GHz -u 3.9GHz
cpupower -c all frequency-set -g powersave
cpupower -c all set --perf-bias 15
#echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
```
As you can see, I have a few different options there that I can still tune. I then made a systemd service that runs this script on startup. For some reason the 60 seconds of sleep is necessary, as something else also tries to set the CPU frequencies on startup. I think it may be the Tuxedo system control.

I don't know enough about coding to be able to trace where the kernel gets or applies this BIOS limit.

Anyways, here is a little more on the startup service that might help you to create your own:
```
[evert@Evert ~]$ systemctl status CPU_Fix.service
○ CPU_Fix.service - Unlock BIOS settings and set proper frequency range for CPU.
     Loaded: loaded (/etc/systemd/system/CPU_Fix.service; enabled; preset: disabled)
     Active: inactive (dead) since Thu 2023-11-16 16:01:35 CAT; 2h 25min ago
   Duration: 1min 44ms
    Process: 833 ExecStart=/usr/bin/CPU_Fix (code=exited, status=0/SUCCESS)
   Main PID: 833 (code=exited, status=0/SUCCESS)
        CPU: 29ms

Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 23
Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 24
Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 25
Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 26
Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 27
Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 28
Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 29
Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 30
Nov 16 16:01:35 Evert.Laptop CPU_Fix[3221]: Setting cpu: 31
Nov 16 16:01:35 Evert.Laptop systemd[1]: CPU_Fix.service: Deactivated successfully.
[evert@Evert ~]$ cat /etc/systemd/system/CPU_Fix.service
[Unit]
Description=Unlock BIOS settings and set proper frequency range for CPU.

[Service]
ExecStart=/usr/bin/CPU_Fix

[Install]
WantedBy=multi-user.target
[evert@Evert ~]$
```

Seems that my situation is worse than yours -- I've added intel_pstate=disable to my kernel parameters, but your set of commands doesn't affect my CPU frequency scaling whatsoever. I'll try intel_pstate=nohwp next and see if this can help instead.

No cigar for me, this doesn't work either.

Do you have cpupower installed? What are your symptoms? For me the key was:
echo 1 | tee /sys/module/processor/parameters/ignore_ppc
That enabled me to set the CPU frequencies with cpupower. I also have cpupower-gui installed to monitor the actual frequencies applied to the CPU. Installation of that is a little tricky, as the normal install scripts don't work.

It just doesn't work at all, whether I check through i7z or silver.urih.com after running your first set of commands. The CPU is still underperforming under Linux.
Writing to ignore_ppc did not produce any error, same as for cpupower, but this didn't affect the performance.

Interesting. What do you mean by underperforming? On my system, I could see that there never was a frequency assigned that's higher than 2.2GHz, until I turned off the safeties and manually set the frequency ranges with that CPU_Fix script. Of course, the system still reports the old maximum cpu speed in cpuinfo:
```
Automated test for i9-13900HX CPU
Intel states: i9-13900HX has 8 performance cores with hyperthreading, and 16 Efficiency cores. Performance cores 0-15 is clocked at 2.2GHz base to 5.4GHz. Efficiency cores 16-31 are clocked from 1.6GHz base to 3.9GHz
uname -a:
Linux Evert.Laptop 6.6.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 08 Nov 2023 16:05:38 +0000 x86_64 GNU/Linux
/sys/devices/system/cpu/cpu*/cpufreq/base_frequency | sort | uniq
1600000
2200000
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq | sort | uniq
1600000
2200000
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_min_freq | sort | uniq
800000
[evert@Evert Old_Scripts]$
```
But in cpupower-gui I can see a different frequency range, and when I load the CPU I can see that the proper frequencies are set.

I do benchmarks with silver.urih.com. Under Windows (PE), I get ~P92000. Under ArchLinux, I get ~P50000. In other words, almost double the performance on the NT kernel. I have made a full post on the ArchLinux BBS, which you can read here: https://bbs.archlinux.org/viewtopic.php?id=290345

In addition, trying out your commands to check the min and max CPU freq reported by the kernel gives me completely different values from what you've reported:
$ cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_min_freq | sort | uniq
800000
$ cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq | sort | uniq
2401000
What's worse is that I don't have that base_frequency file present in that sysfs, so no output at all for this. I'm doing this with intel_pstate=disable though, so maybe that's why.
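Since the two machines in this thread expose different files depending on whether intel_pstate is active, it helps to record which cpufreq driver is in charge alongside the values. A minimal sketch follows; `describe_policy` is a made-up name, and the policy directory argument is an assumption added so the function can be tried on a copy. The `scaling_driver` attribute is the standard cpufreq sysfs file naming the active driver.

```shell
# describe_policy [POLICY_DIR]
# Show the active cpufreq driver and whether the driver-specific files
# mentioned in this thread exist. Hypothetical helper for illustration.
describe_policy() {
  dir=${1:-/sys/devices/system/cpu/cpufreq/policy0}
  if [ -r "$dir/scaling_driver" ]; then
    echo "driver: $(cat "$dir/scaling_driver")"
  else
    echo "driver: unknown"
  fi
  # In this thread base_frequency was seen with intel_pstate active and
  # bios_limit only with it disabled.
  for f in base_frequency bios_limit; do
    if [ -e "$dir/$f" ]; then
      echo "$f: $(cat "$dir/$f")"
    else
      echo "$f: absent"
    fi
  done
}
```

Posting the output of `describe_policy` together with the frequency values makes reports from different machines easier to compare.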
I forgot to mention that I use Google Chrome to go to that website, latest version as of writing this (119.0.6045.123).

raplcap is telling me my CPU is not supported:
```
[evert@Evert ~]$ sudo rapl-configure-msr
[ERROR] [raplcap-msr] CPU not supported: Family=6, Model=B7
Failed to initialize: Operation not supported
```

Same goes for me -- I also wanted to try out s-gui, but I couldn't find the package/software online... do you have a specific link or repo that I could take a look at?

I believe my CPU is power throttled -- turbostat reports under load that the CPU consumes only <50W...:
# turbostat --quiet --interval 1 --cpu 0-31 --show "PkgWatt","Busy%","Core","CoreTmp"
Core  Busy%  CoreTmp  PkgWatt
-     97.22  58       48.24
0     70.10  58       47.80
0     97.42
4     84.38  54
4     98.21
8     92.53  56
8     98.65
12    95.34  54
12    98.42
16    95.77  58
16    98.97
20    98.50  53
20    98.67
24    97.70  56
24    97.35
28    95.34  53
28    98.93
32    99.72  55
33    99.71  55
34    99.74  55
35    99.72  55
36    99.70  57
37    99.74  57
38    99.73  57
39    99.55  57
40    99.76  55
41    99.74  55
42    99.75  55
43    99.74  55
44    99.52  55
45    99.72  55
46    99.54  55
47    99.72  55
I'll try to search in that direction to see if there's any way of bypassing something that keeps my CPU from going past that barrier.
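The package power that turbostat prints can be cross-checked from the kernel's powercap interface by sampling the RAPL energy counter twice. The sketch below is an illustration only: the function names are invented, the zone path `/sys/class/powercap/intel-rapl:0` is assumed to be the CPU package on this kind of system, and reading `energy_uj` may require root.

```shell
# pkg_watts E1_UJ E2_UJ SECONDS [MAX_RANGE_UJ]
# Average power in watts between two energy readings in microjoules.
# MAX_RANGE_UJ is used only when the counter wrapped between readings.
pkg_watts() {
  awk -v e1="$1" -v e2="$2" -v dt="$3" -v max="${4:-0}" 'BEGIN {
    d = e2 - e1
    if (d < 0) d += max          # counter wrapped around
    printf "%.2f\n", d / dt / 1000000
  }'
}

# sample_pkg_watts [SECONDS] [ZONE_DIR]
# Read the counter, wait, read again, and print the average power.
sample_pkg_watts() {
  secs=${1:-1}
  zone=${2:-/sys/class/powercap/intel-rapl:0}   # assumed package zone
  e1=$(cat "$zone/energy_uj") || return 1
  sleep "$secs"
  e2=$(cat "$zone/energy_uj") || return 1
  pkg_watts "$e1" "$e2" "$secs" "$(cat "$zone/max_energy_range_uj")"
}
```

Running `sudo sh -c '. ./file; sample_pkg_watts 1'` while a stress test is active should give a figure comparable to the PkgWatt column above, assuming both tools read the same package counter.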
Ah, my mistake, the software is called s-tui, and there is a package for it in Arch.

50W is a ridiculously low limit for a CPU like the i9-13900HX. From Intel: https://ark.intel.com/content/www/us/en/ark/products/232171/intel-core-i9-13900hx-processor-36m-cache-up-to-5-40-ghz.html
It specifies the CPU alone runs at 55W base, and goes up to 157W when it's all in. I have seen power draws of up to 215W on my system. Of course, this is with my workaround in place, which does not work for everybody, unfortunately.

When I switch off my workaround, my power draw on the cpu tops out at 37W. sudo s-tui shows a total power draw of 80W for the system. You have to run s-tui in superuser mode to see the power graph.

Using turbostat to get values: Without the workaround in place my system draws about 6.5W base. When I assign the proper frequencies, it draws 8W in powersave mode, and 12W in performance mode with no load. In performance mode, and stress testing, it uses an instantaneous 141.8W, and then rapidly backs off to about 120W constant power draw. Stress testing with 1 thread, with the system in performance mode, the laptop will hold a constant 5.4GHz on one core, and draw 40W.

I made an interesting find in turbostat, after reading this question on Ask Ubuntu: https://askubuntu.com/questions/1226254/set-max-tdp-of-intel-h-series-cpu
# turbostat
...
cpu0: MSR_RAPL_POWER_UNIT: 0x000a0e03 (0.125000 Watts, 0.000061 Joules, 0.000977 sec.)
--> cpu0: MSR_PKG_POWER_INFO: 0x000001b8 (55 W TDP, RAPL 0 - 0 W, 0.000000 sec.)
cpu0: MSR_PKG_POWER_LIMIT: 0x42857800df8578 (UNlocked)
...
Most notably on the MSR_PKG_POWER_INFO register -- it reports bogus TDP and min/max ranges...

raplcap did not support my CPU, but that is now fixed, and if you are running Arch you can install raplcap-git and test out the fixes for yourself before the next version hits.
For my system, the values are:
```
[evert@Evert raplcap-git]$ sudo rapl-configure-msr
enabled: true
clamped: false
locked: false
watts_long: 135.000000000000
seconds_long: 96.000000000000
watts_short: 162.000000000000
seconds_short: 0.002441406250
locked_peak: false
watts_peak: 315.000000000000
joules: 23251.590698242188
joules_max: 262144.000000000000
```
```
[evert@Evert raplcap-git]$ sudo rapl-configure-powercap
enabled: true
watts_long: 135.000000000000
seconds_long: 95.944704000000
watts_short: 162.000000000000
seconds_short: 0.002440000000
watts_peak: 315.000000000000
joules: 23480.298593000000
joules_max: 262143.328849999991
[evert@Evert raplcap-git]$
```
On my system it shows very reasonable power caps, and still does not explain why my CPU is getting throttled if I do not
```
echo 1 | tee /sys/module/processor/parameters/ignore_ppc
```
I have also done a little test with and without enabling ignore_ppc, and there is no difference between before and after throwing this switch.

I have very low values reported by those two tools:
# rapl-configure-msr
enabled: true
clamped: false
locked: false
watts_long: 130.000000000000
seconds_long: 56.000000000000
watts_short: 130.000000000000
seconds_short: 0.002441406250
locked_peak: false
watts_peak: 200.000000000000
joules: 2094.659851074219
joules_max: 262144.000000000000
# rapl-configure-powercap
enabled: true
watts_long: 130.000000000000
seconds_long: 55.967744000000
watts_short: 130.000000000000
seconds_short: 0.002440000000
watts_peak: 200.000000000000
joules: 2208.615012000000
joules_max: 262143.328849999991
Still have no idea how to go beyond 50W...

Looking at your values, it is not the rapl that is limiting your wattage. While they are more conservative than my laptop's settings, they definitely should allow you to draw more than 50W.
How many watts can your power brick deliver? Is your laptop battery fully charged?
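For reference, the raw MSR values quoted earlier in the thread can be decoded by hand, which shows where the "0.125000 Watts" and "55 W TDP" figures come from. This is a sketch under assumptions: the function names are invented, and the bit positions (power unit in bits 3:0, energy unit in bits 12:8, time unit in bits 19:16 of MSR_RAPL_POWER_UNIT; TDP in bits 14:0 of MSR_PKG_POWER_INFO) are as I understand them from Intel's documentation. The results do match what turbostat printed.

```shell
# unit_from_field N
# Print 1 / 2^N with six decimals, the way turbostat shows RAPL units.
unit_from_field() {
  awk -v n="$1" 'BEGIN { printf "%.6f\n", 1 / (2 ^ n) }'
}

# decode_power_unit MSR_RAPL_POWER_UNIT
# Print the power, energy and time units encoded in the register.
decode_power_unit() {
  msr=$1
  echo "power_W: $(unit_from_field $(( $msr & 0xf )))"
  echo "energy_J: $(unit_from_field $(( ($msr >> 8) & 0x1f )))"
  echo "time_s: $(unit_from_field $(( ($msr >> 16) & 0xf )))"
}

# tdp_watts MSR_PKG_POWER_INFO MSR_RAPL_POWER_UNIT
# TDP is the raw count in bits 14:0 multiplied by the power unit.
tdp_watts() {
  raw=$(( $1 & 0x7fff ))
  bits=$(( $2 & 0xf ))
  awk -v r="$raw" -v b="$bits" 'BEGIN { printf "%.1f\n", r / (2 ^ b) }'
}
```

With the values from this thread, `decode_power_unit 0x000a0e03` gives the 0.125000 W, 0.000061 J and 0.000977 s units, and `tdp_watts 0x000001b8 0x000a0e03` gives 55.0, the base power Intel lists for this CPU.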
So, let's go back to /sys/module/processor/parameters/ignore_ppc
What is the output of: cat /sys/module/processor/parameters/ignore_ppc ?

My laptop has a 330W power brick, and is almost always plugged into AC. cat /sys/module/processor/parameters/ignore_ppc initially reports 0, but accepts 1. I did put it to 1 with echo 1 > /sys/module/processor/parameters/ignore_ppc, but this didn't change anything -- the TDP reading stays below 50W under load. I also observed that CPU-Z under Windows PE reports 55W as well. Under load, HWMonitor reports 75W TDP. I'll try a Windows-to-Go installation to see if I can hit even higher values.

I tried a lot of things regarding kernel parameters:
- acpi=off
- acpi=ht
- intel_pstate=nohwp
- intel_pstate=passive
- intel_pstate=disable
- acpi_osi='Windows 2022'
- acpi_osi=! acpi_osi='Windows 2022'
- acpi_osi='Linux'
- nvidia.modprobe=0 nvidia_drm.modeset=0 (I thought maybe my GPU sucked all of the power, so that's why I tried that)
I also tried a lot of tools, reading and writing to MSR registers, writing to sysfs files... and even then I still cannot bloody get past that artificial 50W barrier!!

Did you confirm that cat /sys/module/processor/parameters/ignore_ppc reported 1 after you echoed 1 into it? If it does not say 1, then the CPU frequency settings won't work. I have it set as a kernel parameter these days: processor.ignore_ppc=1
These are my only kernel parameters: processor.ignore_ppc=1 nvidia_drm.modeset=0
Anyways, after all that, are you able to set CPU frequencies and confirm that they are set? I use cpupower-gui to confirm that the ranges are set properly.

Yes, the value written in /sys/module/processor/parameters/ignore_ppc stays at one after writing to it. I however found the source of my problem: using Intel's XTU utility on Windows 10, it seems that my CPU is throttled in terms of Current/EDP limit. I observe pretty much the same power limit on it, just like ArchLinux. Only Windows PE is not affected.
I believe this requires configuring the CPU ratings through the Omen Gaming Hub, as the firmware has no option in the setup utility to configure this directly. I'll try to upgrade to Windows 11 as this is the original OS that came with the laptop. The installation is done on an external HDD, so I still have my ArchLinux installation as well.

I was right -- the Omen Gaming Hub actively controls the max TDP -- it completely bypasses Intel's XTU utility and does its tuning by itself by artificially limiting the processor. I'll try to set the maximum performance and see if I still keep those settings after rebooting back to ArchLinux.

Of course it won't keep them after a reboot... This is a nightmare. In other words, the Omen Gaming Hub controls the TDP. But since we're on Linux, it defaults to the lowest settings. Just great.

I feel defeated by this. I tried a *LOT* of things, took me a whole day but EVEN THEN I still didn't manage to figure out a solution to this. I am so disappointed at HP for making us dependent on specific software that can't be used outside of Windows. I even wanted to downgrade my UEFI firmware, but I can't even do that since the updater program doesn't allow it. I was also unable to do that manually by setting up a separate USB storage medium that contains the files to perform the downgrade; it just complains that there's no "BIOS image signature"... I feel really exhausted right now. :-(

I was able to reach up to 100W on Linux once, but I have no idea why & how. I did fiddle with /sys/devices/system/cpu/intel_pstate/min_perf_pct and /sys/devices/system/cpu/intel_pstate/max_perf_pct. However, this shows that it IS possible. I'll try to find my reproduction steps again and post them here if I manage to do it.

I'm sure you will get to the bottom of this, you have amazing persistence. Whenever you are running Linux on the latest and greatest hardware, you volunteered to be some kind of pioneer.
While it is a lot more work, it is also a learning experience. If you feel like taking some revenge, bash HP for crippling their hardware! Should you manage to get your laptop to perform normally, please put a description of what you did here or somewhere on the web so that people with similar hardware issues can have a less ugly time of it.
Saying all that, it still does not explain why my laptop needs to have its ppc disabled to perform normally.

> Saying all that, it still does not explain why my laptop needs to have its
> ppc disabled to perform normally.

I'd say to try the following things:
Have you looked at dmesg? Any ACPI/Intel modules errors logged in it?
- sudo dmesg | grep -iE '(intel|acpi|power)'
Have you looked at this bug report?
- https://bugzilla.kernel.org/show_bug.cgi?id=23412
- It seems that this "bug" is affecting people since kernel 2.6.30, regardless of whether this is AMD or Intel. It is apparently caused by interpreting the _PPC object which defines a range of states from n to the maximum state supported by your CPU.
- It could be good to disassemble your ACPI tables and check if there is a flow that explicitly defines _PPC to a non-zero value.
- Install acpica and dump all tables:
- sudo acpidump > acpi.dat
- acpixtract acpi.dat
- iasl -d *.dat
- This will give you .dsl files IIRC (not at home right now), do a grep -R '_PPC' to see where the object is set/referenced. You can give me your dump if you have difficulties figuring this out!
- Maybe you could spoof the ACPI OSI string?
- I know for a fact that some BIOSes purposely disable features when the Linux kernel identifies itself.
- Add `acpi_osi="Windows 2019"` to your kernel parameters to trick the SSDT code.
- If this doesn't work, you can also try `acpi_osi=! acpi_osi="Windows 2019"`. This one will make sure to strip out every possible _OSI vendor string, which in turn will prevent execution of code that could break things if the first option doesn't work.
- You may take a look at https://forum.manjaro.org/t/how-to-choose-the-proper-acpi-kernel-argument/1405 Let me know if you need extra help -- I'll also try this evening to pass directly processor.ignore_ppc=1 to my kernel parameters instead of just echoing it directly after starting up. Hi there! Thanks for the extra info. For posterity, here is the output of the filtered dmesg: ``` [sudo] password for evert: [ 0.000000] Command line: quiet delayacct loglevel=3 rd.systemd.show_status=auto nvidia_drm.modeset=1 processor.ignore_ppc=1 root=PARTLABEL="Sbr_Root" rw initrd=\intel-ucode.img initrd=\initramfs-linux.img [ 0.000000] BIOS-e820: [mem 0x00000000309df000-0x000000003398efff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x000000003398f000-0x0000000033afefff] ACPI data [ 0.000000] efi: ACPI=0x33afe000 ACPI 2.0=0x33afe014 TPMFinalLog=0x33866000 SMBIOS=0x2efb9000 MEMATTR=0x27f63018 ESRT=0x27f66418 INITRD=0x27b5ef18 RNG=0x33a2d018 TPMEventLog=0x27b51018 [ 0.010529] ACPI: Early table checksum verification disabled [ 0.010533] ACPI: RSDP 0x0000000033AFE014 000024 (v02 INSYDE) [ 0.010539] ACPI: XSDT 0x0000000033A41188 00011C (v01 INSYDE ADL 00000002 01000013) [ 0.010547] ACPI: FACP 0x0000000033AD3000 000114 (v06 INSYDE ADL 00000002 ACPI 00040000) [ 0.010554] ACPI: DSDT 0x0000000033A48000 08770B (v02 INSYDE ADL 00000002 ACPI 00040000) [ 0.010559] ACPI: FACS 0x000000003383E000 000040 [ 0.010562] ACPI: UEFI 0x000000003398E000 000236 (v01 INSYDE H2O BIOS 00000001 ACPI 00040000) [ 0.010566] ACPI: SSDT 0x0000000033AF9000 0034B3 (v02 DptfTb DptfTabl 00001000 INTL 20200717) [ 0.010570] ACPI: SSDT 0x0000000033AF3000 005D34 (v02 CpuRef CpuSsdt 00003000 INTL 20200717) [ 0.010574] ACPI: SSDT 0x0000000033AF0000 002767 (v02 SaSsdt SaSsdt 00003000 INTL 20200717) [ 0.010578] ACPI: SSDT 0x0000000033AEC000 00328B (v02 INTEL IgfxSsdt 00003000 INTL 20200717) [ 0.010582] ACPI: SSDT 0x0000000033AEB000 00077B (v02 INSYDE Tpm2Tabl 00001000 INTL 20200717) [ 0.010585] ACPI: TPM2 0x0000000033AEA000 00004C 
(v04 INSYDE ADL 00000002 ACPI 00040000) [ 0.010589] ACPI: SSDT 0x0000000033AE6000 002B8F (v02 INTEL DTbtSsdt 00001000 INTL 20200717) [ 0.010593] ACPI: SSDT 0x0000000033AE4000 0016A6 (v02 INSYDE UsbCTabl 00001000 INTL 20200717) [ 0.010596] ACPI: WSMT 0x0000000033AE3000 000028 (v01 INSYDE ADL 00000002 ACPI 00040000) [ 0.010601] ACPI: SSDT 0x0000000033AE1000 0015FD (v02 INSYDE PtidDevc 00001000 INTL 20200717) [ 0.010604] ACPI: SSDT 0x0000000033AD7000 009104 (v02 INSYDE TbtTypeC 00000000 INTL 20200717) [ 0.010607] ACPI: DBGP 0x0000000033AD6000 000034 (v01 INSYDE ADL 00000002 ACPI 00040000) [ 0.010610] ACPI: DBG2 0x0000000033AD5000 000054 (v00 INSYDE ADL 00000002 ACPI 00040000) [ 0.010613] ACPI: NHLT 0x0000000033AD4000 00002D (v00 INSYDE ADL 00000002 ACPI 00040000) [ 0.010616] ACPI: HPET 0x0000000033AD2000 000038 (v01 INSYDE ADL 00000002 ACPI 00040000) [ 0.010618] ACPI: APIC 0x0000000033AD1000 0001DC (v05 INSYDE ADL 00000002 ACPI 00040000) [ 0.010621] ACPI: MCFG 0x0000000033AD0000 00003C (v01 INSYDE ADL 00000002 ACPI 00040000) [ 0.010624] ACPI: SSDT 0x0000000033A42000 0054CA (v02 INSYDE ADL 00000002 01000013) [ 0.010628] ACPI: DMAR 0x0000000033AFD000 000088 (v02 INTEL ICL 00000002 ACPI 00040000) [ 0.010631] ACPI: SSDT 0x0000000033A3D000 00306C (v01 NvdRef NvdTabl 00001000 INTL 20200717) [ 0.010634] ACPI: SSDT 0x0000000033A3C000 000244 (v01 NvdRef NvdExt 00001000 INTL 20200717) [ 0.010637] ACPI: SSDT 0x0000000033A3B000 000581 (v01 NvdRef NvdDds 00001000 INTL 20200717) [ 0.010640] ACPI: SSDT 0x0000000033A3A000 000FD8 (v02 INTEL xh_rplsb 00000000 INTL 20200717) [ 0.010643] ACPI: SSDT 0x0000000033A36000 003AEA (v02 SocGpe SocGpe 00003000 INTL 20200717) [ 0.010646] ACPI: SSDT 0x0000000033A32000 0039DA (v02 SocCmn SocCmn 00003000 INTL 20200717) [ 0.010649] ACPI: SSDT 0x0000000033A31000 0000F8 (v02 INSYDE PcdTabl 00001000 INTL 20200717) [ 0.010652] ACPI: FPDT 0x0000000033A30000 000044 (v01 INSYDE ADL 00000002 ACPI 00040000) [ 0.010655] ACPI: BGRT 0x0000000033A2F000 000038 (v01 
INSYDE H2O BIOS 00000001 ACPI 00040000) [ 0.010658] ACPI: PHAT 0x0000000033A2E000 000A83 (v01 INSYDE ADL 00000005 ACPI 00040000) [ 0.010661] ACPI: Reserving FACP table memory at [mem 0x33ad3000-0x33ad3113] [ 0.010663] ACPI: Reserving DSDT table memory at [mem 0x33a48000-0x33acf70a] [ 0.010664] ACPI: Reserving FACS table memory at [mem 0x3383e000-0x3383e03f] [ 0.010665] ACPI: Reserving UEFI table memory at [mem 0x3398e000-0x3398e235] [ 0.010665] ACPI: Reserving SSDT table memory at [mem 0x33af9000-0x33afc4b2] [ 0.010666] ACPI: Reserving SSDT table memory at [mem 0x33af3000-0x33af8d33] [ 0.010667] ACPI: Reserving SSDT table memory at [mem 0x33af0000-0x33af2766] [ 0.010668] ACPI: Reserving SSDT table memory at [mem 0x33aec000-0x33aef28a] [ 0.010669] ACPI: Reserving SSDT table memory at [mem 0x33aeb000-0x33aeb77a] [ 0.010669] ACPI: Reserving TPM2 table memory at [mem 0x33aea000-0x33aea04b] [ 0.010670] ACPI: Reserving SSDT table memory at [mem 0x33ae6000-0x33ae8b8e] [ 0.010671] ACPI: Reserving SSDT table memory at [mem 0x33ae4000-0x33ae56a5] [ 0.010672] ACPI: Reserving WSMT table memory at [mem 0x33ae3000-0x33ae3027] [ 0.010673] ACPI: Reserving SSDT table memory at [mem 0x33ae1000-0x33ae25fc] [ 0.010673] ACPI: Reserving SSDT table memory at [mem 0x33ad7000-0x33ae0103] [ 0.010674] ACPI: Reserving DBGP table memory at [mem 0x33ad6000-0x33ad6033] [ 0.010675] ACPI: Reserving DBG2 table memory at [mem 0x33ad5000-0x33ad5053] [ 0.010676] ACPI: Reserving NHLT table memory at [mem 0x33ad4000-0x33ad402c] [ 0.010677] ACPI: Reserving HPET table memory at [mem 0x33ad2000-0x33ad2037] [ 0.010678] ACPI: Reserving APIC table memory at [mem 0x33ad1000-0x33ad11db] [ 0.010678] ACPI: Reserving MCFG table memory at [mem 0x33ad0000-0x33ad003b] [ 0.010679] ACPI: Reserving SSDT table memory at [mem 0x33a42000-0x33a474c9] [ 0.010680] ACPI: Reserving DMAR table memory at [mem 0x33afd000-0x33afd087] [ 0.010681] ACPI: Reserving SSDT table memory at [mem 0x33a3d000-0x33a4006b] [ 0.010682] ACPI: 
Reserving SSDT table memory at [mem 0x33a3c000-0x33a3c243] [ 0.010682] ACPI: Reserving SSDT table memory at [mem 0x33a3b000-0x33a3b580] [ 0.010683] ACPI: Reserving SSDT table memory at [mem 0x33a3a000-0x33a3afd7] [ 0.010684] ACPI: Reserving SSDT table memory at [mem 0x33a36000-0x33a39ae9] [ 0.010685] ACPI: Reserving SSDT table memory at [mem 0x33a32000-0x33a359d9] [ 0.010686] ACPI: Reserving SSDT table memory at [mem 0x33a31000-0x33a310f7] [ 0.010686] ACPI: Reserving FPDT table memory at [mem 0x33a30000-0x33a30043] [ 0.010687] ACPI: Reserving BGRT table memory at [mem 0x33a2f000-0x33a2f037] [ 0.010688] ACPI: Reserving PHAT table memory at [mem 0x33a2e000-0x33a2ea82] [ 0.100616] Reserving Intel graphics memory at [mem 0x3c800000-0x407fffff] [ 0.101345] ACPI: PM-Timer IO Port: 0x1808 [ 0.101355] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) [ 0.101358] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1]) [ 0.101359] ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1]) [ 0.101359] ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1]) [ 0.101360] ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1]) [ 0.101361] ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1]) [ 0.101362] ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1]) [ 0.101362] ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1]) [ 0.101363] ACPI: LAPIC_NMI (acpi_id[0x09] high edge lint[0x1]) [ 0.101364] ACPI: LAPIC_NMI (acpi_id[0x0a] high edge lint[0x1]) [ 0.101364] ACPI: LAPIC_NMI (acpi_id[0x0b] high edge lint[0x1]) [ 0.101365] ACPI: LAPIC_NMI (acpi_id[0x0c] high edge lint[0x1]) [ 0.101366] ACPI: LAPIC_NMI (acpi_id[0x0d] high edge lint[0x1]) [ 0.101366] ACPI: LAPIC_NMI (acpi_id[0x0e] high edge lint[0x1]) [ 0.101367] ACPI: LAPIC_NMI (acpi_id[0x0f] high edge lint[0x1]) [ 0.101368] ACPI: LAPIC_NMI (acpi_id[0x10] high edge lint[0x1]) [ 0.101369] ACPI: LAPIC_NMI (acpi_id[0x11] high edge lint[0x1]) [ 0.101369] ACPI: LAPIC_NMI (acpi_id[0x12] high edge lint[0x1]) [ 0.101370] ACPI: LAPIC_NMI (acpi_id[0x13] high edge 
lint[0x1]) [ 0.101371] ACPI: LAPIC_NMI (acpi_id[0x14] high edge lint[0x1]) [ 0.101371] ACPI: LAPIC_NMI (acpi_id[0x15] high edge lint[0x1]) [ 0.101372] ACPI: LAPIC_NMI (acpi_id[0x16] high edge lint[0x1]) [ 0.101373] ACPI: LAPIC_NMI (acpi_id[0x17] high edge lint[0x1]) [ 0.101374] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) [ 0.101418] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) [ 0.101420] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) [ 0.101425] ACPI: Using ACPI (MADT) for SMP configuration information [ 0.101426] ACPI: HPET id: 0x8086a201 base: 0xfed00000 [ 0.109738] Kernel command line: quiet delayacct loglevel=3 rd.systemd.show_status=auto nvidia_drm.modeset=1 processor.ignore_ppc=1 root=PARTLABEL="Sbr_Root" rw initrd=\intel-ucode.img initrd=\initramfs-linux.img [ 0.252736] ACPI: Core revision 20230628 [ 0.263565] smpboot: CPU0: 13th Gen Intel(R) Core(TM) i9-13900HX (family: 0x6, model: 0xb7, stepping: 0x1) [ 0.263565] Performance Events: XSAVE Architectural LBR, PEBS fmt4+-baseline, AnyThread deprecated, Alderlake Hybrid events, 32-deep LBR, full-width counters, Intel PMU driver. 
[ 0.322163] ACPI: PM: Registering ACPI NVS region [mem 0x309df000-0x3398efff] (50003968 bytes) [ 0.325330] thermal_sys: Registered thermal governor 'power_allocator' [ 0.325330] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 [ 0.333851] ACPI: Added _OSI(Module Device) [ 0.333853] ACPI: Added _OSI(Processor Device) [ 0.333855] ACPI: Added _OSI(3.0 _SCP Extensions) [ 0.333856] ACPI: Added _OSI(Processor Aggregator Device) [ 0.531746] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.MHBR], AE_NOT_FOUND (20230628/psargs-330) [ 0.531758] ACPI: Ignoring error and continuing table load [ 0.531793] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PTID.PBAR], AE_NOT_FOUND (20230628/dsfield-500) [ 0.544507] ACPI: 18 ACPI AML tables successfully acquired and loaded [ 0.629343] ACPI: Dynamic OEM Table Load: [ 0.629361] ACPI: SSDT 0xFFFF888101B80C00 000394 (v02 PmRef Cpu0Cst 00003001 INTL 20200717) [ 0.631346] ACPI: Dynamic OEM Table Load: [ 0.631357] ACPI: SSDT 0xFFFF888103C77800 00051E (v02 PmRef Cpu0Ist 00003000 INTL 20200717) [ 0.633369] ACPI: Dynamic OEM Table Load: [ 0.633378] ACPI: SSDT 0xFFFF888103C0EC00 0001AB (v02 PmRef Cpu0Psd 00003000 INTL 20200717) [ 0.635245] ACPI: Dynamic OEM Table Load: [ 0.635254] ACPI: SSDT 0xFFFF888103C71800 0004B5 (v02 PmRef Cpu0Hwp 00003000 INTL 20200717) [ 0.637920] ACPI: Dynamic OEM Table Load: [ 0.637939] ACPI: SSDT 0xFFFF888101B66000 001BAF (v02 PmRef ApIst 00003000 INTL 20200717) [ 0.641255] ACPI: Dynamic OEM Table Load: [ 0.641269] ACPI: SSDT 0xFFFF888101B62000 001038 (v02 PmRef ApHwp 00003000 INTL 20200717) [ 0.644159] ACPI: Dynamic OEM Table Load: [ 0.644173] ACPI: SSDT 0xFFFF888103C78000 001349 (v02 PmRef ApPsd 00003000 INTL 20200717) [ 0.647138] ACPI: Dynamic OEM Table Load: [ 0.647151] ACPI: SSDT 0xFFFF888101B8B000 000FBB (v02 PmRef ApCst 00003000 INTL 20200717) [ 0.675768] ACPI: _OSC evaluated successfully for all CPUs [ 0.675881] ACPI: EC: EC started [ 0.675882] ACPI: EC: interrupt blocked [ 0.689063] 
ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62 [ 0.689066] ACPI: \_SB_.PC00.LPCB.EC__: Boot DSDT EC used to handle transactions [ 0.689069] ACPI: Interpreter enabled [ 0.689204] ACPI: PM: (supports S0 S3 S4 S5) [ 0.689206] ACPI: Using IOAPIC for interrupt routing [ 0.694764] PCI: MMCONFIG at [mem 0xc0000000-0xcfffffff] reserved as ACPI motherboard resource [ 0.694780] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug [ 0.696834] ACPI: Enabled 10 GPEs in block 00 to 7F [ 0.699215] ACPI: \_SB_.PC00.PEG1.PG00: New power resource [ 0.708450] ACPI: \_SB_.PC00.XHCI.RHUB.HS14.BTRT: New power resource [ 0.721739] ACPI: \_SB_.PC00.CNVW.WRST: New power resource [ 0.743405] ACPI: \_SB_.PC00.RP25.PXP_: New power resource [ 0.761678] ACPI: \PIN_: New power resource [ 0.762282] ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-fe]) [ 0.762292] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] [ 0.766869] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME AER PCIeCapability LTR DPC] [ 0.792499] acpiphp: Slot [1] registered [ 0.819462] ACPI: EC: interrupt unblocked [ 0.819464] ACPI: EC: event unblocked [ 0.819481] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62 [ 0.819483] ACPI: EC: GPE=0x6e [ 0.819485] ACPI: \_SB_.PC00.LPCB.EC__: Boot DSDT EC initialization complete [ 0.819487] ACPI: \_SB_.PC00.LPCB.EC__: EC: Used to handle transactions and events [ 0.820542] ACPI: bus type USB registered [ 0.824593] PCI: Using ACPI for IRQ routing [ 0.850592] pnp: PnP ACPI init [ 0.856128] pnp: PnP ACPI: found 8 devices [ 0.862390] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns [ 0.875691] pci 0000:01:00.1: extending delay after power-on from D3hot to 20 msec [ 0.875725] pci 0000:01:00.1: D0 power state depends on 0000:01:00.0 [ 0.876511] DMAR: Intel-IOMMU force enabled due to platform opt in [ 0.879047] DMAR: Intel(R) Virtualization Technology for Directed I/O [ 
0.907858] ACPI: \_SB_.PR00: Found 3 idle states [ 0.910815] ACPI: AC: AC Adapter [AC] (on-line) [ 0.910897] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0 [ 0.911311] ACPI: button: Power Button [PWRB] [ 0.911449] ACPI: button: Sleep Button [SLPB] [ 0.911594] ACPI: button: Lid Switch [LID0] [ 0.911630] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3 [ 0.911736] ACPI: button: Power Button [PWRF] [ 0.924184] ACPI: thermal: Thermal Zone [TZ0] (25 C) [ 0.927413] ACPI: battery: Slot [BAT0] (battery present) [ 0.928364] hpet_acpi_add: no address or irqs in _CRS [ 0.943469] ACPI: bus type drm_connector registered [ 0.951875] intel_pstate: Intel P-state driver initializing [ 0.955886] intel_pstate: HWP enabled [ 0.962989] intel_pmc_core INT33A1:00: initialized [ 1.533795] usb: port power management may be unreliable [ 1.901085] BTRFS info (device nvme0n1p2): using crc32c (crc32c-intel) checksum algorithm [ 3.913595] input: Intel HID events as /devices/platform/INTC1051:00/input/input6 [ 3.961085] intel-lpss 0000:00:15.0: enabling device (0004 -> 0006) [ 3.962353] idma64 idma64.0: Found Intel integrated DMA 64-bit [ 3.964022] clevo_acpi: interface initialized [ 4.014701] ACPI: bus type thunderbolt registered [ 4.017308] intel_rapl_msr: PL4 support detected. 
[ 4.017348] intel_rapl_common: Found RAPL domain package [ 4.017352] intel_rapl_common: Found RAPL domain core [ 4.017354] intel_rapl_common: Found RAPL domain uncore [ 4.017356] intel_rapl_common: Found RAPL domain psys [ 4.100732] iTCO_wdt iTCO_wdt: Found a Intel PCH TCO device (Version=6, TCOBASE=0x0400) [ 4.118443] Intel(R) Wireless WiFi driver for Linux [ 4.133083] intel_rapl_common: Found RAPL domain package [ 4.160685] intel-lpss 0000:00:15.1: enabling device (0004 -> 0006) [ 4.161066] idma64 idma64.1: Found Intel integrated DMA 64-bit [ 4.177806] intel-lpss 0000:00:15.2: enabling device (0004 -> 0006) [ 4.178329] idma64 idma64.2: Found Intel integrated DMA 64-bit [ 4.551199] BTRFS info (device nvme0n1p5): using crc32c (crc32c-intel) checksum algorithm [ 4.552765] BTRFS info (device nvme1n1p1): using crc32c (crc32c-intel) checksum algorithm [ 4.562221] snd_hda_intel 0000:00:1f.3: enabling device (0000 -> 0002) [ 4.562523] snd_hda_intel 0000:01:00.1: enabling device (0000 -> 0002) [ 4.562588] snd_hda_intel 0000:01:00.1: Disabling MSI [ 4.562594] snd_hda_intel 0000:01:00.1: Handle vga_switcheroo audio client [ 4.563902] ACPI Warning: \_SB.PC00.XHCI.RHUB.HS14._DSM: Argument #4 type mismatch - Found [Integer], ACPI requires [Package] (20230628/nsarguments-61) [ 4.576437] Bluetooth: hci0: Found device firmware: intel/ibt-1040-0041.sfi [ 4.665530] iwlwifi 0000:00:14.3: Detected Intel(R) Wi-Fi 6E AX211 160MHz, REV=0x430 [ 4.965931] intel_tcc_cooling: Programmable TCC Offset detected [ 5.611589] ACPI Warning: \_SB.NPCF._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20230628/nsarguments-61) [ 5.611679] ACPI Warning: \_SB.PC00.PEG1.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20230628/nsarguments-61) [ 6.082916] Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-1040-0041.ddc [ 6.084846] Bluetooth: hci0: Applying Intel DDC parameters completed [ 6.559986] i915 0000:00:02.0: [drm] Skipping 
intel_backlight registration [ 6.561227] ACPI: video: [Firmware Bug]: ACPI(PEGP) defines _DOD but not _DOS [ 6.561245] ACPI: video: Video Device [PEGP] (multi-head: yes rom: no post: no) [ 6.562516] ACPI: video: Video Device [GFX0] (multi-head: yes rom: no post: no) [ 6.563122] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915]) [ 6.770440] input: HDA Intel PCH Mic as /devices/pci0000:00/0000:00:1f.3/sound/card0/input26 [ 6.770481] input: HDA Intel PCH Headphone as /devices/pci0000:00/0000:00:1f.3/sound/card0/input27 [ 6.770546] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input28 [ 6.770619] input: HDA Intel PCH HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input29 [ 6.770718] input: HDA Intel PCH HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input30 [ 6.770932] input: HDA Intel PCH HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input31 [ 9.331323] ucsi_acpi USBC000:00: error -ETIMEDOUT: PPM init failed ``` I do not see anything untoward in there. I have added acpi_osi='Windows 2022' to my kernel parameters, but there was no change in the throttling. I will attach the dump to this ticket. I had a look at the *.dsl files, but I can not make sense of them. Created attachment 305437 [details]
dsdt.dat
Acpi dump.
Could you directly send me all of the *.dsl files in a .zip file? iasl disassembles everything with the context of your PC's ACPI tables, so I cannot completely disassemble them from my end... or at least not completely on MSYS (I'm at work right now) -- I can do that when I'm home otherwise. Thanks! Created attachment 305439 [details]
Dsl_files zipped
All the dsl files, zipped
Just added all the dsl files in a zip file to this report. Thanks for looking into this, I am quite curious if we can ever get to the bottom of this.

Have you tried acpi_osi=Linux? Your ACPI tables suggest that it supports it separately from Windows, maybe you could try this out first.

First, thank you so much for looking at this! Looks like you might be keeping your promise of fixing this issue for me.

I tested out acpi_osi='Linux'. At the same time, I removed the ignore_ppc kernel flag. I never really liked it in the first place, and I wanted to see if the acpi_osi flag actually fixes my problem.

Just booting up like that, and not setting the CPU frequencies at all, they are all capped at 2.2GHz. Just making sure that nothing switched on the ignore_ppc:
```
[evert@Evert scripts]$ cat /sys/module/processor/parameters/ignore_ppc
0
```
When running s-tui, I can confirm that the system is maxing out at 2.2GHz, and power draw is about 90W for the whole system.

What is interesting now is that I am able to set the CPU frequency ranges with cpupower, even though ignore_ppc is not set. Previously this was not possible without the ignore_ppc flag. The CPU operates in its designed ranges, i.e. 16 cores go up to 5.4GHz, and 16 go up to 3.9GHz, confirmed by cpupower-gui and s-tui. It also draws a healthy 200W at full pull.

So, this is a massive improvement, and I feel we are closing in on the issue. Unfortunately, the cpuinfo_max_freq still does not reflect the actual max frequencies that the cores run at:
```
/sys/devices/system/cpu/cpu*/cpufreq/base_frequency | sort | uniq
1600000
2200000
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq | sort | uniq
1600000
2200000
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_min_freq | sort | uniq
800000
```
It's not a train smash, but a lot of utilities read these frequencies, and they seem to be read-only as not even root can change the values.
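For anyone wanting to repeat that check without shell globbing, here is a minimal Python sketch that collects the distinct values of the same three attributes across all CPUs. The function name and the `root` parameter are my own (the parameter is only there so the code can be pointed at a test directory); nothing here is taken from cpupower or the kernel.

```python
from collections import defaultdict
from pathlib import Path

# Attribute names taken from the listing above.
ATTRIBUTES = ("base_frequency", "cpuinfo_max_freq", "cpuinfo_min_freq")


def unique_cpufreq_values(root="/sys/devices/system/cpu"):
    """Map each cpufreq attribute to the sorted distinct values seen across CPUs."""
    seen = defaultdict(set)
    for attr in ATTRIBUTES:
        for path in Path(root).glob(f"cpu[0-9]*/cpufreq/{attr}"):
            try:
                seen[attr].add(int(path.read_text().strip()))
            except (OSError, ValueError):
                continue  # attribute unreadable or not numeric on this CPU
    # Attributes that no CPU exposes are simply absent from the result.
    return {attr: sorted(values) for attr, values in seen.items()}
```

Calling `unique_cpufreq_values()` on the affected machine should list two distinct `cpuinfo_max_freq` values (one per core type), in kHz, matching the shell output.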
As at the beginning of this bug report, my goal is still for the values for cpuinfo_max_freq to be correct. Created attachment 305442 [details]
Filtered dmesg for ACPI Events with osi=Linux
Created attachment 305443 [details]
decoded dsl files for acpi_osi=Linux
I added the filtered ACPI messages for acpi_osi=Linux, as well as an ACPI dump that was decoded. Not sure if it will help, or if anything has changed, but they are there for posterity now.

What about with acpi_osi=Linux intel_pstate=passive? It is perfectly normal that those files cannot be written -- those are part of a sysdev structure which can be pretty much seen as imaginary/ghost files: https://man7.org/linux/man-pages/man5/sysfs.5.html I'll take a closer look at this in a few moments.

The intel_pstate driver far outclasses the cpufreq one, and even when passive, it's still the backend that actually controls the CPU. The following link supplies a lot of useful information on the driver: https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html

What is interesting, though: when I set the frequency ranges to what the CPU supports, the cpuinfo_max_freq stays unchanged. However, when I run cpupower-gui, it shows the correct values. This makes me wonder where cpupower-gui reads those values from. When using the intel_pstate driver, should the cpuinfo_max_freq files even exist? Again, thank you very much for taking a look at this.

Here is something interesting for you:
```
[evert@Evert scripts]$ sudo cpupower --cpu 0 frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: Cannot determine or is not supported.
  hardware limits: 800 MHz - 2.20 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 800 MHz and 5.20 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 2.27 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes
```
Note the difference between the max frequency provided by "hardware" and that which is provided by the policy.
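The same mismatch can be checked straight from sysfs. The sketch below assumes that the "hardware limits" line corresponds to the `cpuinfo_*` files and the "current policy" line to the `scaling_*` files in a CPU's cpufreq directory; the helper names are made up for illustration.

```python
from pathlib import Path


def read_khz(cpufreq_dir, name):
    """Return the integer kHz value of one cpufreq attribute, or None if absent."""
    try:
        return int((Path(cpufreq_dir) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def limits(cpufreq_dir):
    """Collect the reported hardware ("cpuinfo") and policy ("scaling") limits."""
    names = ("cpuinfo_min_freq", "cpuinfo_max_freq",
             "scaling_min_freq", "scaling_max_freq")
    return {name: read_khz(cpufreq_dir, name) for name in names}


def policy_exceeds_hardware(values):
    """True when the policy maximum is above the reported hardware maximum."""
    hw, pol = values["cpuinfo_max_freq"], values["scaling_max_freq"]
    return hw is not None and pol is not None and pol > hw
```

With the numbers from the output above (2.20 GHz hardware limit, 5.20 GHz policy limit), `policy_exceeds_hardware(limits("/sys/devices/system/cpu/cpu0/cpufreq"))` would be expected to return `True`, which is exactly the inconsistency this report is about.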
Surely this should not even be possible to have a difference there? (Of course, I am quite happy that I _DO_ have that difference displayed there)

This is probably because it reads from different sysfs files. Looking at the source code of cpupower, everything is read from /sys/devices/system/cpu/cpuX/cpufreq/:
- Hardware limits are read from "cpuinfo" prefixed files
- Policy limits are read from "scaling" prefixed files

$ cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
800000
$ cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
5200000
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
800000
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
5200000

I personally get normal values, but the TDP is (still) the factor that throttles my performance.

I wanted to add that those files are independent from the scaling driver you're using -- they are always provided by the kernel, no matter what. The readings that you get are determined off the MSR registers of your processor IIRC, which can be altered at boot by your UEFI/ACPI firmware.

Those sysfs files are directly provided by intel-speed-select, and cannot be directly altered -- they can only be modified through a
The readings that you get are (I think) determined off the MSR registers of your processor IIRC, which can be altered at boot by your UEFI/ACPI firmware.

Whoops, ignore comment #62, this was part of an answer that I was writing. My second paragraph is wrong and has nothing to do with what I mentioned. :X

I believe I should just go to sleep, I (again) wrote something wrong. I actually wanted to mention comment #60.. Anyways, let me know if you find something else interesting.

Intel Speed Select Technology.
Here is some interesting information for you: https://www.kernel.org/doc/html/latest/admin-guide/pm/intel-speed-select.html And some discussion on dragging it into Arch: https://bbs.archlinux.org/viewtopic.php?id=289464 And a request from October for the same: https://bugs.archlinux.org/index.php?do=details&action=details.addvote&task_id=79932 Right now I am cloning the entire kernel source just to build it locally to see if it can shine a light on my issues. I couldn't run it on my end since the associated module isn't loaded by my kernel. No idea why, our CPU supports SST: https://en.wikipedia.org/wiki/List_of_Intel_Core_processors#%22Raptor_Lake-HX%22_(Intel_7) Same here. I guess it's not enabled in the kernel, and I am now figuring out how to do that. Looks like it's a module: ``` [evert@Evert linux-tools]$ zgrep CONFIG_INTEL_SPEED_SELECT_INTERFACE /proc/config.gz CONFIG_INTEL_SPEED_SELECT_INTERFACE=m ``` Now I'm trying to find the name of the module.... Hmm, according to intel page: https://www.intel.com/content/www/us/en/products/sku/232171/intel-core-i913900hx-processor-36m-cache-up-to-5-40-ghz/specifications.html It looks like Speed Shift is supported, but I see no mention of Speed Select. According to intel-speed-select the CPU is not supported: ``` [evert@Evert linux-tools]$ sudo intel-speed-select --help Intel(R) Speed Select Technology Executing on CPU model:183[0xb7] Intel speed select drivers are not loaded on this system. Verify that kernel config includes CONFIG_INTEL_SPEED_SELECT_INTERFACE. If the config is included then this is not a supported platform. ``` I think I have the drivers installed, but can't be sure as I have not found a definitive list of kernel module names for it. 
```
[evert@Evert linux-tools]$ lsmod | grep sst | grep -v snd
isst_tpmi 12288 0
isst_tpmi_core 20480 1 isst_tpmi
intel_vsec_tpmi 16384 1 isst_tpmi_core
isst_if_mmio 12288 0
isst_if_mbox_pci 12288 0
isst_if_common 24576 3 isst_if_mmio,isst_tpmi_core,isst_if_mbox_pci
```
So there's definitely a mix up there -- I think this is more for server or professional grade CPUs, I believe.

Bummer, but at least I learned something. The burning question I have is how is the /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq determined? Unfortunately I don't have the coding knowledge to sit and read the kernel source to trace where it gets its inputs from, how it's calculated and so on. Once we can verify the inputs, then it would be possible to determine if this is a bug in the kernel, or there is an issue with my hardware. I guess the same thing goes for your hardware, Alex.

/sys/devices/system/cpu/cpuX/cpufreq/cpuinfo_max_freq is determined by /drivers/cpufreq/cpufreq.c, and is mapped to function cpufreq_get_hw_max_freq:

    __weak unsigned int cpufreq_get_hw_max_freq(unsigned int cpu)
    {
        struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
        unsigned int ret_freq = 0;

        if (policy) {
            ret_freq = policy->cpuinfo.max_freq;
            cpufreq_cpu_put(policy);
        }

        return ret_freq;
    }

The logic for altering the policy is done by cpufreq_set_policy, which depends on the cpufreq_driver -- in our case, it can either be intel_pstate or acpi-cpufreq:

    static int cpufreq_set_policy(struct cpufreq_policy *policy,
                                  struct cpufreq_governor *new_gov,
                                  unsigned int new_pol)
    {
        struct cpufreq_policy_data new_data;
        struct cpufreq_governor *old_gov;
        int ret;
        ...
        if (cpufreq_driver->setpolicy) {
            policy->policy = new_pol;
            pr_debug("setting range\n");
            return cpufreq_driver->setpolicy(policy);
        }
        ...
    }

For intel_pstate, it relies internally on the intel_pstate_update_perf_limits and intel_pstate_set_policy functions -- and the source code provides... "interesting" comments:

    static void intel_pstate_update_perf_limits(struct cpudata *cpu,
                                                unsigned int policy_min,
                                                unsigned int policy_max)
    {
        ...
        /*
         * HWP needs some special consideration, because HWP_REQUEST uses
         * abstract values to represent performance rather than pure ratios.
         */
        if (hwp_active && cpu->pstate.scaling != perf_ctl_scaling) {
            int scaling = cpu->pstate.scaling;
            int freq;

            freq = max_policy_perf * perf_ctl_scaling;
            max_policy_perf = DIV_ROUND_UP(freq, scaling);
            freq = min_policy_perf * perf_ctl_scaling;
            min_policy_perf = DIV_ROUND_UP(freq, scaling);
        }
        ...
    }

    static int intel_pstate_set_policy(struct cpufreq_policy *policy)
    {
        ...
        if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) {
            /*
             * NOHZ_FULL CPUs need this as the governor callback may not
             * be invoked on them.
             */
            intel_pstate_clear_update_util_hook(policy->cpu);
            intel_pstate_max_within_limits(cpu);
        } else {
            intel_pstate_set_update_util_hook(policy->cpu);
        }

        if (hwp_active) {
            /*
             * When hwp_boost was active before and dynamically it
             * was turned off, in that case we need to clear the
             * update util hook.
             */
            if (!hwp_boost)
                intel_pstate_clear_update_util_hook(policy->cpu);
            intel_pstate_hwp_set(policy->cpu);
        }
        /*
         * policy->cur is never updated with the intel_pstate driver, but it
         * is used as a stale frequency value. So, keep it within limits.
         */
        policy->cur = policy->min;
        ...
    }

Hmm, like I said, my skills in this direction are limited.

On a conceptual level, cpupower reads something to get the hardware limits. I have a suspicion it just reads from /sys/devices/system/cpu/cpuX/cpufreq/cpuinfo_max_freq. It would be wasteful to re-calculate that every time, as surely the hardware capabilities can't change? So, it then follows that the first time the kernel boots, it reads/calculates the hardware capabilities of each CPU from somewhere else, and then populates /sys/devices/system/cpu/cpuX/cpufreq/cpuinfo_max_freq.
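As a side note on the HWP branch in the intel_pstate_update_perf_limits excerpt: the arithmetic is small enough to replay by hand. The sketch below mirrors only that one conversion. The two constants are assumptions of mine and not values from this thread (100000 kHz per PERF_CTL ratio step, and 78741 kHz per HWP unit, which I believe is the hybrid P-core scaling factor in intel_pstate.c), so please verify them against the kernel source before drawing conclusions.

```python
PERF_CTL_SCALING_KHZ = 100_000      # assumed: kHz per PERF_CTL ratio step
HYBRID_P_CORE_SCALING_KHZ = 78_741  # assumed: kHz per HWP unit on hybrid P-cores


def div_round_up(n, d):
    """Integer ceiling division, like the kernel's DIV_ROUND_UP macro."""
    return (n + d - 1) // d


def ratio_to_hwp_perf(policy_perf, scaling, perf_ctl_scaling=PERF_CTL_SCALING_KHZ):
    """Convert a PERF_CTL ratio into HWP performance units.

    Mirrors the quoted branch (the hwp_active check is left out): when the
    two scaling factors agree, nothing needs converting.
    """
    if scaling == perf_ctl_scaling:
        return policy_perf
    freq = policy_perf * perf_ctl_scaling  # back to kHz
    return div_round_up(freq, scaling)     # then up to the next HWP unit
```

Under those assumptions a 5.2 GHz policy limit (ratio 52) becomes HWP level 67, and a 2.2 GHz limit (ratio 22) becomes level 28. This only shows how a policy frequency is translated for HWP; it does not by itself explain where cpuinfo.max_freq gets its value.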
Now, when you run cpupower --cpu 0 frequency-info, it shows 2200 MHz as the hardware limit, and 5200 MHz as the scaling limit on my machine. It is not logical to me that the scaling limit can be higher than the hardware limit, and since the CPU happily and hotly runs at the scaling limit, the hardware limit must be wrong. So, whatever sets the hardware limit the first time around must be either getting wrong information or interpreting it wrong. It does not make sense to me that a policy sets the hardware limits; surely the policy must read the limits and not the other way around? With my limited capabilities, it looks like this source path is concerned with reading the hardware limits and setting the parameters of a policy with them. In any case, thanks for looking at this!

Looking through the whole ticket again, and re-considering your suggestion about looking at efivars. There are a handful of files in there that have interesting file names, but I am currently unable to decode their contents; even when looking at them with a hex editor, the contents are just gibberish. How does one go about decoding the contents of these files?
The files that have interesting names are
```
[evert@Evert efivars]$ ls | grep -i setup
BoardInfoSetup-1e785e1a-8ec4-49e4-8275-fbbdeded18e7
CpuSetup-b08f97ff-e6e8-4193-a997-5e9e9b0adb32
CpuSetupVolatileData-b08f97ff-e6e8-4193-a997-5e9e9b0adb32
InitSetupVariable-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9
OemNvPcfDcPwrLimitSetup-f3c83344-34e3-47f8-bd71-4c99ad16e55f
OemSetup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9
OemVbtSetup-5fc6ad29-04bd-4253-b8ae-8bd851627a4e
Setup-a04a27f4-df00-4d42-b552-39511302113d
Setup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9
SetupCpuFeatures-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9
SetupMode-8be4df61-93ca-11d2-aa0d-00e098032b8c
```
Putting the standard header aside, there's no real tool to decode them since the efivar contents are firmware-dependent, if I'm correct. The only way would be to disassemble (and reverse engineer) the UEFI firmware.

Here I am now, reverse engineering /sys/kernel/debug/ec/ec0/io (available with modprobe ec_sys). This is the source of the TDP problems that I have -- it pretty much controls everything from TDP, to package temperature, to PL1/PL2/PL4 values... in other words, this is going to take a bit of time to figure out all of the necessary fields.

Oh, man. I'm so glad you are still poking at it, and good luck! I had a look around the internet, and there are precious few utilities in Linux to even look at the stuff in efivars, and almost all of them come with dire warnings about destroying your hardware. For a community of tinkerers, this area of Linux is woefully underrepresented.

Exactly what I thought. Modifying the EC registers directly through /sys/kernel/debug/ec/ec0/io allows the CPU to go to higher TDP values -- at the expense of causing a thermal shutdown ;-) This means that the native hp-wmi module doesn't handle TDP scaling through the embedded controller at all.
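Regarding the "standard header" on the efivars files mentioned above: each file under efivarfs starts with a 4-byte little-endian attribute word, and the rest is the raw variable data. A minimal read-only sketch for separating the two and hex-dumping the payload is below; the function names are my own, and the payload layout itself stays firmware-specific, so this does not decode anything.

```python
import struct

# Attribute bits as defined by the UEFI specification.
ATTRIBUTE_FLAGS = {
    0x01: "NON_VOLATILE",
    0x02: "BOOTSERVICE_ACCESS",
    0x04: "RUNTIME_ACCESS",
}


def split_efivar(raw):
    """Split an efivarfs file image into (attribute names, payload bytes)."""
    if len(raw) < 4:
        raise ValueError("efivarfs entries start with a 4-byte attribute word")
    (attrs,) = struct.unpack_from("<I", raw)
    names = [name for bit, name in ATTRIBUTE_FLAGS.items() if attrs & bit]
    return names, raw[4:]


def hexdump(data, width=16):
    """Yield 'offset  hex bytes' lines for a byte string."""
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        yield f"{offset:08x}  {chunk.hex(' ')}"
```

Usage would be along the lines of reading one of the files listed above in binary mode (for example with `pathlib.Path(...).read_bytes()`), passing the bytes to `split_efivar`, and printing the lines from `hexdump(payload)`. It only reads, so the dire warnings about writing efivars do not apply.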
I found a project which *could* do what I've craved for: https://github.com/ranisalt/hp-omen-linux-module And lucky me, an associated AUR package has been created just yesterday: https://aur.archlinux.org/packages/hp-omen-wmi-dkms I'll probably contribute to it as there are hardware features that my model has which aren't implemented. I wonder if we could directly integrate this improved hp-wmi module in the kernel source tree, when it's stable enough.

Mhh, that was a complete failure. I guess I'll directly work on hp-wmi instead then. :(

> This means that the native hp-wmi module doesn't handle TDP scaling through
> the embedded controller at all.
This is actually wrong and hp-wmi does support it, looking at the source code. Looking at omen_thermal_profile_boards shows that my board (8BAD) isn't listed.
I believe I'll have to build the module myself & add it myself to see if TDP scaling works.
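To check a machine against such a list from user space, something like the sketch below could be used. It assumes the driver does an exact match of the DMI board name against omen_thermal_profile_boards, and that the same string is visible in /sys/class/dmi/id/board_name. The tuple here is a placeholder containing only the board ID mentioned in this thread; the real entries live in the hp-wmi source.

```python
from pathlib import Path

# Placeholder list: only "8BAD" comes from this thread. The real entries are
# in omen_thermal_profile_boards in the hp-wmi driver source.
CANDIDATE_BOARDS = ("8BAD",)


def board_name(dmi_dir="/sys/class/dmi/id"):
    """Return the DMI board name exposed by the kernel, or None if unreadable."""
    try:
        return (Path(dmi_dir) / "board_name").read_text().strip()
    except OSError:
        return None


def is_listed(name, boards=CANDIDATE_BOARDS):
    """Exact-match lookup, assumed to mirror how the driver matches its list."""
    return name is not None and name in boards
```

`is_listed(board_name())` then tells whether the running board would be matched by the (placeholder) list, without needing to rebuild the module first.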
There you go, I was right -- adding my motherboard ID to the list makes my CPU hit up to 138W on load!! Woohoo!! Now I need to submit a patch and we should be good! :)

Now there's another problem, which is that the fans don't seem to spin down... I wonder why. I'll try to disable the "Always spin fans" option in my UEFI setup utility and see if this works. I also (still) have the correct readings for my CPU.

I, for one, am really glad that you managed to get your computer running properly. I wonder if there is anything similar for the hardware that I have.

Probably not -- I'm wondering if you've installed intel-ucode?

Yeah, I tried with and without intel-ucode. As I have a working workaround for getting normal performance on my laptop, this is not a serious issue for me... there are just some tools that do not work properly, but that is about it.

From what we've all seen, I believe there are three possible reasons as to why you're getting this kind of readout:
- The firmware reports lower values to the kernel when querying the WMI interface
- The MSR registers were altered at boot
- The loaded WMI module (if there is one) makes bad calls to the WMI interface

The first two are firmware related and thus can only be fixed with workarounds, but the third one could be fixed if it happened to be true. Can you do lsmod | grep wmi please?

Hi there!
Here is the output:
```
[evert@Evert ~]$ lsmod | grep wmi
snd_rawmidi 53248 2 snd_usbmidi_lib,snd_ump
snd_seq_device 16384 4 snd_seq,snd_seq_oss,snd_ump,snd_rawmidi
mxm_wmi 12288 0
nvidia_wmi_ec_backlight 12288 0
clevo_wmi 16384 0
snd 155648 26 snd_hda_codec_generic,snd_seq,snd_seq_device,snd_hda_codec_hdmi,snd_hwdep,snd_seq_oss,snd_hda_intel,snd_usb_audio,snd_usbmidi_lib,snd_hda_codec,snd_hda_codec_realtek,snd_sof,snd_timer,snd_compress,snd_soc_core,snd_ump,snd_pcm,snd_rawmidi
video 77824 3 nvidia_wmi_ec_backlight,i915,nvidia_modeset
tuxedo_keyboard 94208 3 clevo_acpi,tuxedo_io,clevo_wmi
wmi 45056 4 video,nvidia_wmi_ec_backlight,clevo_wmi,mxm_wmi
```
I don't see anything in there that pertains to CPU or power, unfortunately. Could it be that I am missing some sort of driver?

It seems that the correct clevo_wmi (keyboard backlighting) & mxm_wmi (switchable graphics) modules are loaded. Looking at your ACPI tables again, I can definitely say that there's no obvious path with the EC that would indirectly control what cpuinfo_max_freq says -- out of curiosity, have you also tried intel_pstate=disable + acpi_osi='Linux'?

Hi there! Thanks for your patience. So, I just tried with intel_pstate=disable + acpi_osi='Linux'. My system now respects /sys/module/processor/parameters/ignore_ppc, and I am unable to set the proper frequencies with this switch disabled.

With intel_pstate=disable, there is now a /sys/devices/system/cpu/cpufreq/policy0/bios_limit file, and this file has the contents of 2400000. When intel_pstate is active, this file does not exist. From the name of this file, I am assuming that it reads this value from the BIOS, or calculates this value from a variable that is in the BIOS. intel_pstate must read the same variable from the BIOS, and also sets the speed limit on my CPU until I ignore the ppc.

What is the ppc? How does it work, and where can I find out more about it?

PPC stands for "Performance Present Capabilities", this comes from ACPI.
It basically defines how fast your CPU is allowed to run in terms of performance states. Your CPU can go from P0 (full speed) to Pn (slowest speed) -- depending on the CPU that you have, you can have n states available to you. Some hardware manufacturers alter the _PPC object to cap how fast your CPU may run under certain conditions defined by said manufacturer -- this means that the OS is no longer the master of tuning the CPU performance, but rather the ACPI tables that are built into your laptop. You can find more about it here: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/08_Processor_Configuration_and_Control/declaring-processors.html#ppc-performance-present-capabilities

Some manufacturers have a more aggressive approach to it, like HP, where the embedded controller directly controls how much wattage the CPU can draw. That embedded controller can be controlled through the Windows Management Instrumentation (WMI) interface that is exposed in the ACPI tables/MOF blobs.

From my understanding, the intel_pstate driver governs how the CPU shall scale up with the help of pseudo-states that aren't normally accessible by the OS. In other words, it has more granularity and privilege than the acpi_cpufreq driver. While acpi_cpufreq exposes the bios_limit file, intel_pstate doesn't since it is "closer" to the CPU, so to speak -- it writes directly to the MSR registers to control power consumption, frequency, HWP, etc.

I can't shake the feeling that I am missing something. In ssdt11.dsl, about halfway down the file, it talks about performance capabilities. But is it reading values from somewhere, or are they contained in the file? If it's reading from somewhere, how do I figure out where it's reading from? If it's in the file, why are there if statements?

It seems to first check if there's a _PPC object, and if there's one, then it writes whatever value it has received.
It then checks TCNT and notifies PR00 to PR31, depending on that same value -- I believe PR represents the CPU cores, so this would then notify each of them to indicate that we've changed speed, I guess?

I came across these files to help understand what's going on:
- https://github.com/Lekensteyn/acpi-stuff/blob/master/Clevo-P651RA/notes.txt
- https://github.com/tianocore/edk2-platforms/blob/master/Platform/Intel/KabylakeOpenBoardPkg/Acpi/BoardAcpiDxe/Dsdt/Platform.asl

*Maybe* your firmware writes bytes directly through the system bus rather than going through an intermediary (like mine does, with an embedded controller). However, I still have doubts about my hypothesis, so take this with a grain of salt :]

How can I check what's in the _PPC object?
This CPU has 32 logical cores, and I guess that the max "hardware" speed for each CPU is being set.
I'll go read the links you provided and see if I can figure out why this laptop gets the hardware frequency wrong.

I think we should rather understand what TCNT is in this situation; I don't think it is what determines the values that you're getting in those sysfs files.

I just discovered a fascinating read: https://wiki.archlinux.org/title/DSDT
It seems Arch users have a wiki for everything!
It seems that I can override an ACPI variable that is causing an issue.
I wonder if this laptop will even boot with acpi=off?

Just for interest's sake, I booted this laptop with acpi=off. It was nearly unusable! The built-in keyboard was not detected, and only one CPU core was active... and still throttled to 2.2GHz. I did not investigate anything else, and turned ACPI back on in a hurry.

Here is an interesting read; not sure if it will help, but it seems tangentially related: https://wiki.archlinux.org/title/DSDT

ACPI provides SMP info, so this was pretty much expected. The old APM system is not implemented in newer hardware, so there's no other way Linux can determine that you have more than one CPU core.
Considering that the CPU was still throttled, it would appear that the throttling on my computer is not from the ACPI. |
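For anyone following along, the sysfs checks discussed in this thread can be gathered in one pass. What follows is only a minimal sketch, not something taken from the kernel or from any tool mentioned here: the helper names (`read_khz`, `policy_limits`, `capped_by_bios`) are made up for illustration, and it assumes the usual cpufreq layout under /sys/devices/system/cpu/cpufreq/policyN. The `scaling_max_freq` attribute is not mentioned above; it is included on the assumption that the standard cpufreq attributes are present. `bios_limit` is treated as optional, since it was only seen here with intel_pstate disabled.

```python
from pathlib import Path

# Default location of the per-policy cpufreq directories (policy0, policy1, ...).
CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")

# Attributes to collect; bios_limit only exists with some drivers (e.g. acpi_cpufreq).
ATTRIBUTES = ("cpuinfo_max_freq", "scaling_max_freq", "bios_limit")


def read_khz(path):
    """Return the integer kHz value stored in a sysfs file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def policy_limits(root=CPUFREQ_ROOT):
    """Collect the frequency limits of every cpufreq policy found under root."""
    report = {}
    for policy in sorted(Path(root).glob("policy*")):
        report[policy.name] = {name: read_khz(policy / name) for name in ATTRIBUTES}
    return report


def capped_by_bios(limits):
    """True when bios_limit is present and lower than the reported hardware maximum."""
    bios = limits["bios_limit"]
    hw_max = limits["cpuinfo_max_freq"]
    return bios is not None and hw_max is not None and bios < hw_max


if __name__ == "__main__":
    for name, limits in policy_limits().items():
        note = " <- bios_limit is below cpuinfo_max_freq" if capped_by_bios(limits) else ""
        print(f"{name}: {limits}{note}")
```

Running it once with intel_pstate active and once with it disabled should make it easy to compare what each driver reports. Note that if cpuinfo_max_freq itself is reported too low, as in this bug, the `capped_by_bios` comparison alone will not reveal the problem.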