Most recent kernel where this bug did *NOT* occur: 2.6.18
Distribution: Sinux 8.0 (private distro with vanilla kernel)
Hardware Environment: MSI MS-7210 motherboard with Pentium-D CPU
Software Environment: BIOS Information (from dmidecode)
    Vendor: American Megatrends Inc.
    Version: 080012
    Release Date: 11/24/2005

Problem Description:

1) Different info shown for CPU cores:

patrol@arcus:~$ cat /proc/acpi/processor/CPU1/info
processor id: 0
acpi id: 1
bus mastering control: no
power management: no
throttling control: yes
limit interface: yes

patrol@arcus:~$ cat /proc/acpi/processor/CPU2/info
processor id: 1
acpi id: 2
bus mastering control: no
power management: no
throttling control: no
limit interface: no

2) C-states are missing entirely; both CPUs are reported as being permanently in C0 (running):

patrol@arcus:~$ cat /proc/acpi/processor/CPU*/power
active state: C0
max_cstate: C8
bus master activity: 00000000
maximum allowed latency: 2000 usec
states:
active state: C0
max_cstate: C8
bus master activity: 00000000
maximum allowed latency: 2000 usec
states:

3) (maybe not ACPI-related) The second core reports a much higher bogomips value than the first:

patrol@arcus:~$ cat /proc/cpuinfo
...
bogomips : 6403.55
...
bogomips : 8110.87

(The clock is 3.2G, so the first value seems OK.)

All these problems occurred in 2.6.19; all was OK in 2.6.18 and prior.

Steps to reproduce:
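For anyone comparing the two cores on a similar box, the per-core difference in item 1 can be extracted mechanically. This is only a minimal sketch: `parse_info` and `diff_cores` are hypothetical helper names, and the two dumps are copied from the output above rather than read from /proc.

```python
def parse_info(text):
    """Parse 'key: value' lines as printed by /proc/acpi/processor/*/info."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def diff_cores(a, b, ignore=("processor id", "acpi id")):
    """Return the fields whose values differ, skipping the per-core ids."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k))
            for k in keys
            if k not in ignore and a.get(k) != b.get(k)}


# Dumps copied from the report above (assumption: one field per line).
CPU1_INFO = """processor id: 0
acpi id: 1
bus mastering control: no
power management: no
throttling control: yes
limit interface: yes"""

CPU2_INFO = """processor id: 1
acpi id: 2
bus mastering control: no
power management: no
throttling control: no
limit interface: no"""

differences = diff_cores(parse_info(CPU1_INFO), parse_info(CPU2_INFO))
print(differences)
```

With the values reported here, only "throttling control" and "limit interface" differ between the two cores.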
Created attachment 10283: dmesg and acpidump output
Just verified that CPU1 and CPU2 not only report different info but, most importantly, behave differently: the limit and throttling files of CPU1 can be read and written, while both files read as <not supported> for CPU2. I also grepped my old logs and found that, while C1 disappeared in 2.6.19, the throttling/limit interface of the second core may already have been unavailable in earlier kernels.
I've found that when the CPU gets hot under high load, the kernel logs this:

Feb 6 06:04:16 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 1)
Feb 6 06:04:45 arcus kernel: Machine check events logged
Feb 6 06:07:51 arcus kernel: CPU0: Temperature above threshold, cpu clock throttled (total events = 1)
Feb 6 06:09:16 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 33925)
Feb 6 06:09:45 arcus kernel: Machine check events logged
Feb 6 06:14:16 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 44822)
Feb 6 06:14:20 arcus kernel: CPU0: Temperature above threshold, cpu clock throttled (total events = 17)
Feb 6 06:14:45 arcus kernel: Machine check events logged
Feb 6 06:19:54 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 61607)
Feb 6 06:24:45 arcus kernel: Machine check events logged
Feb 6 06:28:14 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 67911)
Feb 6 06:29:45 arcus kernel: Machine check events logged
Feb 6 06:31:34 arcus kernel: CPU0: Temperature above threshold, cpu clock throttled (total events = 19)
Feb 6 06:33:14 arcus kernel: CPU1: Temperature/speed normal
Feb 6 06:34:45 arcus kernel: Machine check events logged
Feb 6 06:39:41 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 151883)
Feb 6 06:39:45 arcus kernel: Machine check events logged
Feb 6 06:46:47 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 154679)
Feb 6 06:49:45 arcus kernel: Machine check events logged
Feb 6 06:51:47 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 179484)
Feb 6 06:52:28 arcus kernel: CPU0: Temperature above threshold, cpu clock throttled (total events = 31)
Feb 6 06:54:45 arcus kernel: Machine check events logged

I see the following important things:

- Although the CPU load has now been over for a while (even the fan runs idle now), there was just ONE message (for one CPU) saying that the CPU returned to normal, somewhere in the middle of the throttling messages. Does that mean the CPUs now remain throttled (or at least that there is an attempt to throttle them)?
- They are named CPU0 and CPU1 here, while ACPI knows them as CPU1 and CPU2! Maybe the confusion comes from there?
- Even during the computation, and immediately after the "throttled" message was printed, reading /proc/acpi/processor/CPU1/throttling showed zero throttling, and no visible performance drop was observed. Reading /proc/cpuinfo also showed full clock speed.
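To put numbers on the first observation, the log can be summarized per CPU. The following is a rough sketch, not a tool that exists anywhere: `summarize` and the two regular expressions are names made up for illustration, and the patterns are assumed from the message text quoted above. The sample is a five-line excerpt of that log.

```python
import re

# Patterns assumed from the kernel messages quoted in this comment.
THROTTLED = re.compile(
    r"(CPU\d+): Temperature above threshold, cpu clock throttled "
    r"\(total events = (\d+)\)"
)
NORMAL = re.compile(r"(CPU\d+): Temperature/speed normal")


def summarize(lines):
    """Count throttle/normal messages per CPU and keep the last event total."""
    summary = {}

    def entry(cpu):
        return summary.setdefault(
            cpu, {"throttle_msgs": 0, "normal_msgs": 0, "last_total": 0}
        )

    for line in lines:
        match = THROTTLED.search(line)
        if match:
            cpu = entry(match.group(1))
            cpu["throttle_msgs"] += 1
            cpu["last_total"] = int(match.group(2))
            continue
        match = NORMAL.search(line)
        if match:
            entry(match.group(1))["normal_msgs"] += 1
    return summary


SAMPLE = [
    "Feb 6 06:28:14 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 67911)",
    "Feb 6 06:31:34 arcus kernel: CPU0: Temperature above threshold, cpu clock throttled (total events = 19)",
    "Feb 6 06:33:14 arcus kernel: CPU1: Temperature/speed normal",
    "Feb 6 06:34:45 arcus kernel: Machine check events logged",
    "Feb 6 06:39:41 arcus kernel: CPU1: Temperature above threshold, cpu clock throttled (total events = 151883)",
]

report = summarize(SAMPLE)
for cpu in sorted(report):
    print(cpu, report[cpu])
```

The "total events" counter jumps by tens of thousands between consecutive messages, so clearly not every event is logged individually; counting "normal" lines alone therefore cannot show whether a CPU is still throttled.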
Re: comment #3

> CPU1: Temperature above threshold, cpu clock throttled (total events = 1)
> Machine check events logged

This is due to TM1 (or TM2), which is a mechanism in the processor hardware used to control temperature when ACPI, fan, and everything else have failed. ACPI doesn't actually know anything about TM1/TM2 -- they are supposed to be extremely infrequent and very short in duration.

Is the processor cooling device attached properly?
Processor (CPU1, 0x01, 0x00000810, 0x06)
Processor (CPU2, 0x02, 0x00000000, 0x00)

The DSDT shows the 2nd processor is declared w/o a PBLK (address, length). That explains the 2nd processor with:

throttling control: no
limit interface: no

Also, it seems that the idle code doesn't bother putting any entries in /proc/acpi/processor/CPU?/power when a processor has no _CST and no PBLK, because it is always using just C1 anyway. I see this on one of my boxes too. Indeed, the question is why it bothers to put a C1 entry in there even for systems with a PBLK, because the entry isn't actually used by the C1 idle code. So this is just cosmetic.

> bogomips : 6403.55
> bogomips : 8110.87

This is the only mystery on this system -- though maybe it will be explained when you figure out why the hardware throttling is kicking in...
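The PBLK check can be repeated on any disassembled DSDT. A minimal sketch follows, assuming the four-argument form Processor (name, id, PBLK address, PBLK length) shown above; `find_processors` and the `has_pblk` field are hypothetical names, and the input is just the two declarations from this system.

```python
import re

# Matches the four-argument Processor declaration as it appears in the DSDT.
PROCESSOR = re.compile(
    r"Processor\s*\(\s*(\w+)\s*,"
    r"\s*(0x[0-9A-Fa-f]+)\s*,"
    r"\s*(0x[0-9A-Fa-f]+)\s*,"
    r"\s*(0x[0-9A-Fa-f]+)\s*\)"
)


def find_processors(dsl_text):
    """Return one record per Processor declaration, flagging a missing PBLK."""
    records = []
    for name, acpi_id, address, length in PROCESSOR.findall(dsl_text):
        pblk_address = int(address, 16)
        pblk_length = int(length, 16)
        records.append({
            "name": name,
            "acpi_id": int(acpi_id, 16),
            "pblk_address": pblk_address,
            "pblk_length": pblk_length,
            # A zero address or zero length means no PBLK was declared.
            "has_pblk": pblk_address != 0 and pblk_length != 0,
        })
    return records


DSDT_EXCERPT = """
Processor (CPU1, 0x01, 0x00000810, 0x06)
Processor (CPU2, 0x02, 0x00000000, 0x00)
"""

cpus = find_processors(DSDT_EXCERPT)
for cpu in cpus:
    print(cpu["name"], "PBLK present" if cpu["has_pblk"] else "no PBLK")
```

For the excerpt above this reports a PBLK for CPU1 and none for CPU2, matching the "throttling control" and "limit interface" values seen in /proc.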
Regarding the missing PBLK: is it OK, or does the DSDT need to be fixed?

Regarding TM1/TM2: ACPI doesn't provide any thermal zones, but there is a hardware mechanism (which can be set up in the BIOS as "target temperature") that substantially increases the CPU fan speed when the CPU is hot. It is working perfectly. I've inspected the fans - even the PSU fan is working and sucking the hot air out of the case. The cooler is well seated, but maybe a bit dusty. I'll shut the system down and clean it soon.
Pavel, do you still have this issue on the latest kernel? I searched and found that there is a new BIOS upgrade from MSI for this mobo. Maybe you want to try it first.
Hi Pavel, will you please try the latest kernel and check whether the problem still exists after the BIOS is updated? Thanks.
The only ACPI problem I see here is the cosmetic part about C1 not being displayed. The real problem seems to be that the hardware thermal throttling is kicking in, and that has nothing to do with ACPI. Please re-open if there is still a problem seen using software from this year.