Created attachment 21249 [details]
Clevo M720R acpidump

My Clevo M720R notebook has an Intel Core 2 Duo T8100 processor. Upon boot, Core 1 shows throttling states T0--T7 available and operates in T0, with full performance. Core 0, on the other hand, has only T3--T7 available and is stuck in T3, and its performance suffers noticeably. There is no thermal reason for this; the CPU is well within its safe temperature zone. If I boot with acpi=off, this problem does not manifest.

After a suspend/resume cycle, both cores end up in T8 (outside the reported available range), but both have full performance. Another difference after suspend/resume is that CPU0 goes from SMI thermal monitoring to TM2 (Core 1 is TM2 throughout).

This is with kernel-2.6.29.2-129.fc11.x86_64, but I believe I have seen this as far back as roughly 2.6.24, IIRC (the first kernel I used with this notebook).

Attached are the acpidump output, /proc/acpi info before and after suspend/resume, and a dmesg log for a session with a suspend/resume cycle.
Created attachment 21250 [details] /proc/acpi info from cold
Created attachment 21251 [details] /proc/acpi info following suspend/resume
Created attachment 21252 [details] A dmesg log
Will you please attach the output of acpidump? Thanks.
Sorry, the acpidump is already attached. Please ignore comment #4. Thanks.
Will you please attach the output of "cat /proc/acpi/thermal_zone/*/*"? Thanks.
Hi, James

From the acpidump it seems that this issue is related to the BIOS. There is a _TPC object for the CPU, and the _TPC object defines the processor throttling limit. The throttling limit is used when the T-state is updated. (In the boot phase the T-state is changed according to the current T-state.)

> CPU0. Name (_TPC, 0x03)

This means that the active T-state must be equal to or greater than T3, so the info in comment #1 reflects the correct T-state.

After the suspend/resume cycle, the CPU T-state is perhaps configured by the BIOS. Reading /proc/acpi/processor/*/throttling only obtains the current throttling state; it does not update it, so in that case the throttling limit is not applied. From the info in comment #2 we know that T8 is an incorrect T-state; it is already beyond the T-state limit.

The CPU will still use T3 when we try to change the T-state for CPU0 (echo T0 > /proc/acpi/processor/CPU0/throttling). When the T-state is changed, the T-state limit is taken into account, so the active T-state will be equal to or greater than T3.

So IMO this is a BIOS bug, and it had better be fixed by upgrading the BIOS.

Thanks.
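The effect of the _TPC limit described in this comment can be sketched in a few lines of C. This is only an illustrative model of the behaviour as described here (a request shallower than the limit ends up at the limit); the struct and function names are hypothetical and are not the kernel's, and the real driver may handle such a request differently, for example by rejecting it.

```c
/* Hypothetical model of one CPU's throttling setup; not kernel code. */
struct tstate_model {
    int state_count;    /* number of T-states listed in _TSS, e.g. 8 */
    int platform_limit; /* value reported by _TPC, e.g. 3 on CPU0 here */
};

/*
 * Return the T-state that ends up active when `requested` is asked for,
 * under the assumption that the request is clamped: never shallower than
 * the _TPC limit and never deeper than the last available state.
 */
int effective_tstate(const struct tstate_model *m, int requested)
{
    if (requested < m->platform_limit)
        requested = m->platform_limit;
    if (requested > m->state_count - 1)
        requested = m->state_count - 1;
    return requested;
}
```

With state_count = 8 and platform_limit = 3 (the CPU0 case reported here), a request for T0 yields T3, while a CPU with a limit of 0 can still reach T0.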
I don't understand why _TPC is implemented for CPU0 only, and why it has a hard-coded value of 0x03. All this suggests that you are using a BIOS that is not well implemented, and I agree with Yakui that this is a BIOS bug. But anyway, I think it is a good idea to support enabling/disabling throttling control in the processor driver, either by kernel config options or module parameters. I'll attach a patch later.
Created attachment 21253 [details]
patch: ignore _TPC

Please apply this patch and either
1. build in the ACPI processor driver and boot with processor.ignore_tpc=1, or
2. build the ACPI processor driver as a module and load it with the parameter ignore_tpc=1.

Please verify if this patch helps.
(In reply to comment #6)
> Will you please attach the output of "cat /proc/acpi/thermal_zone/*/*"?
> Thanks.

$ cat /proc/acpi/thermal_zone/*/*
0 - Active; 1 - Passive
<polling disabled>
state: ok
temperature: 28 C
critical (S5): 155 C

(This is about 2 minutes after switching on, having been left off overnight...)

I'll try the patch as soon as possible and report back. My notebook's original supplier has not mentioned any BIOS updates, although there are some around for the same chassis; none of the changelogs mention anything about throttling bugfixes. I'll send some e-mails and see what I can get.
The patch worked as intended. Straight after booting:

[james@rhapsody ~]$ cat /proc/acpi/processor/CPU0/throttling
state count: 8
active state: T0
state available: T0 to T7
states:
   *T0: 100%
    T1: 88%
    T2: 75%
    T3: 63%
    T4: 50%
    T5: 38%
    T6: 25%
    T7: 13%

I'm not sure I understand what was meant in Comment #7 about the T8 state after suspend/resume, though. It still ends up in T8 after an S3 cycle:

[james@rhapsody ~]$ cat /proc/acpi/processor/CPU0/throttling
state count: 8
active state: T8
state available: T0 to T7
states:
    T0: 100%
    T1: 88%
    T2: 75%
    T3: 63%
    T4: 50%
    T5: 38%
    T6: 25%
    T7: 13%
Hi, James

Thanks for the test. After the boot option "processor.ignore_tpc" is added, the T-state limit is ignored, so CPU0 won't be set to the T3 state in the boot phase.

For the suspend/resume issue: maybe I didn't explain it very clearly in comment #7. The command "cat /proc/acpi/processor/CPU0/throttling" re-reads the current T-state. On this box that is done as follows:

a. Read the T-state MSR status register.
b. If the status value can be found in the _TSS package, the correct T-state is obtained. Otherwise an invalid T-state is returned.

Unfortunately the T-state status is not found in the _TSS package after suspend/resume, so it is reported as being beyond the available T-states.

Thanks.
Created attachment 21270 [details] patch: reset t-state once it's invalid please apply this patch on top of the previous one and see if it helps.
(In reply to comment #13) > Created an attachment (id=21270) [details] > patch: reset t-state once it's invalid > > please apply this patch on top of the previous one and see if it helps. It doesn't appear to have made any difference, the processor is still in T8 after resume. I'm sure it applied correctly... I see no "Invalid throttling state, reset" messages.
What if instead of "if (state == -1)", the test were if ((state < pr->throttling_platform_limit) || (state >= pr->throttling.state_count)) ?
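The range test proposed in this comment can be written as a small standalone predicate. This is a sketch only: the struct is a hypothetical stand-in for the two driver fields the test reads, and it assumes the platform limit is 0 when _TPC is ignored.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields used by the proposed test. */
struct throttling_model {
    int platform_limit; /* models pr->throttling_platform_limit */
    int state_count;    /* models pr->throttling.state_count */
};

/*
 * True when `state` lies outside the usable range
 * [platform_limit, state_count - 1], i.e. when a reset would be needed.
 */
bool tstate_out_of_range(const struct throttling_model *t, int state)
{
    return state < t->platform_limit || state >= t->state_count;
}
```

Unlike a plain comparison against -1, this catches both a lookup that returns -1 and one that returns state_count (the T8 case seen on this machine).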
Hi, James The patch in comment #13 is based on the latest upstream kernel. Will you please apply the patch on the latest upstream kernel(2.6.30-rc4) and see whether the issue still exists? If you use the 2.6.29.xx kernel, what you have done in comment #15 is also OK. Thanks.
(In reply to comment #16)
> Hi, James
> The patch in comment #13 is based on the latest upstream kernel.
> Will you please apply the patch on the latest upstream kernel(2.6.30-rc4)
> and see whether the issue still exists?

It worked on that kernel, T0 after resume.

> If you use the 2.6.29.xx kernel, what you have done in comment #15 is also
> OK.
> Thanks.

I'll give this a go since 2.6.30 is unstable in other places for me at the moment.
Also, the modification from Comment #15 works OK on 2.6.29.
Hah, I see. That's because I made the patch based on 2.6.30-rc, i.e. after this patch is merged.

commit 53af9cfb37af5e03ee2b24c5d5c4963c34e5b765
Author: Len Brown <lenb@kernel.org>

diff --git a/drivers/acpi/processor_throttling.c b/drivers/acpi/processor_throttling.c
index d278381..5f09901 100644
--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -783,11 +783,9 @@ static int acpi_get_throttling_state(struct acpi_processor *pr,
 		    (struct acpi_processor_tx_tss *)&(pr->throttling.
 						      states_tss[i]);
 		if (tx->control == value)
-			break;
+			return i;
 	}
-	if (i > pr->throttling.state_count)
-		i = -1;
-	return i;
+	return -1;
 }

So my patch also works if you apply it on top of the latest git kernel, right?

Patches are available.
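Why the "state == -1" test never fired on 2.6.29 can be seen in a small standalone model of the hunk above. This is only a sketch: the function names and the flat array of control values are made up for illustration, and it assumes the loop (not shown in the hunk) runs i from 0 to state_count - 1.

```c
#include <stdint.h>

/*
 * Model of the pre-patch lookup: when no entry matches, the loop ends
 * with i == count, so the `i > count` check is never true and the
 * function returns count instead of -1.
 */
int lookup_old(const uint64_t *control, int count, uint64_t value)
{
    int i;

    for (i = 0; i < count; i++) {
        if (control[i] == value)
            break;
    }
    if (i > count)
        i = -1;
    return i;
}

/* Model of the post-patch lookup: a miss is reported as -1. */
int lookup_new(const uint64_t *control, int count, uint64_t value)
{
    int i;

    for (i = 0; i < count; i++) {
        if (control[i] == value)
            return i;
    }
    return -1;
}
```

With an 8-entry table and a value that is not in it, lookup_old returns 8 (which shows up as T8) while lookup_new returns -1; for a value that is present both return the same index.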
(In reply to comment #19)
>
> patches are available.

which are also known as
http://patchwork.kernel.org/patch/22833/
http://patchwork.kernel.org/patch/22834/
patches in comment #20 applied to acpi tree
workaround for this _TPC BIOS bug shipped in Linux-2.6.30-rc6-git1
Mostly just for your information. I have an HP 2510p notebook and am now getting this new warning after upgrading to 2.6.30-rc6 (x86_64).

From dmesg:
<snip>
ACPI: AC Adapter [C23B] (on-line)
input: Power Button as /class/input/input4
ACPI: Power Button [PWRF]
input: Sleep Button as /class/input/input5
ACPI: Sleep Button [C2BF]
input: Lid Switch as /class/input/input6
ACPI: Lid Switch [C155]
ACPI: SSDT 000000007e7dbd42 0027F (v01 HP Cpu0Ist 00003000 INTL 20060317)
ACPI: SSDT 000000007e7dc046 005FA (v01 HP Cpu0Cst 00003001 INTL 20060317)
ACPI Warning (processor_throttling-0843): Invalid throttling state, reset
 [20090320]
Monitor-Mwait will be used to enter C-1 state
Monitor-Mwait will be used to enter C-2 state
Marking TSC unstable due to TSC halts in idle
ACPI: CPU0 (power states: C1[C1] C2[C2])
processor ACPI_CPU:00: registered as cooling_device7
ACPI: Processor [CPU0] (supports 8 throttling states)
ACPI: SSDT 000000007e7dbc7a 000C8 (v01 HP Cpu1Ist 00003000 INTL 20060317)
ACPI: SSDT 000000007e7dbfc1 00085 (v01 HP Cpu1Cst 00003000 INTL 20060317)
ACPI Warning (processor_throttling-0843): Invalid throttling state, reset
 [20090320]
ACPI: CPU1 (power states: C1[C1] C2[C2])
processor ACPI_CPU:01: registered as cooling_device8
ACPI: Processor [CPU1] (supports 8 throttling states)
</snip>

Is this expected or cause for concern? Any debugging I can do to check whether the warning is really valid or not?

Cheers,
FJP
This warning message is printed because the T-state we get from the hardware is not one of the known ones; in this case, we reset the T-state to T0. So this is a bug we should be aware of, but at the same time we can handle it well in the processor driver.

> Any debugging I can do to check whether the warning is really valid or not?

I don't think it's worth doing. :)
The analysis in the bug report was that the cause is a buggy BIOS. As long as we're talking about one single (relatively obscure?) system that seems fine.

However, the problem now also shows up for a completely different system from a different manufacturer with probably a completely different BIOS. A system which has never shown any problems in this area, at least not that I have noticed.

My question is: are you still 100% sure that this *is* a BIOS or hardware problem, or is that new info reason to reconsider and examine if maybe after all there is an error somewhere in the kernel itself that results in these invalid readings?

The Clevo had a Phoenix BIOS. From dmidecode for my system:

BIOS Information
    Vendor: Hewlett-Packard
    Version: 68MSP Ver. F.0C
    Release Date: 06/18/2008

P.S. I've just sent a patch to remove the spurious newline from the warning message: the ACPI_CA_VERSION should not be printed on a separate line.
(In reply to comment #25)
> The analysis in the bug report was that the cause is a buggy BIOS. As long as
> we're talking about one single (relatively obscure?) system that seems fine.
>
> However, the problem now also shows up for a completely different system from a
> different manufacturer with probably a completely different BIOS. A system
> which has never shown any problems in this area, at least not that I have
> noticed.
>
> My question is: are you still 100% sure that this *is* a BIOS or hardware
> problem, or is that new info reason to reconsider and examine if maybe after
> all there is an error somewhere in the kernel itself that results in these
> invalid readings?

I also have doubts. I saw the message "ACPI Warning (processor_throttling-0843): Invalid throttling state, reset" in 2.6.30-rc7, and found via Google that many people see the same.

In 2.6.30-rc6, after s2ram, I had:

maciek@gumis:~$ cat /proc/acpi/processor/*/throttling
state count: 8
active state: T0
state available: T0 to T7
states:
   *T0: 100%
    T1: 88%
    T2: 75%
    T3: 63%
    T4: 50%
    T5: 38%
    T6: 25%
    T7: 13%
state count: 8
active state: T-1
state available: T0 to T7
states:
    T0: 100%
    T1: 88%
    T2: 75%
    T3: 63%
    T4: 50%
    T5: 38%
    T6: 25%
    T7: 13%

I never had any problems before.

Some interesting things: in 2.6.30-rc7, whenever I run cat /proc/acpi/processor/*/throttling, I see in dmesg:

[ 1467.264416] ACPI Warning (processor_throttling-0843): Invalid throttling state, reset
[ 1467.264429]  [20090320]
[ 1467.264816] ACPI Warning (processor_throttling-0843): Invalid throttling state, reset
[ 1467.264827]  [20090320]
[ 1531.644406] ACPI Warning (processor_throttling-0843): Invalid throttling state, reset
[ 1531.644419]  [20090320]
[ 1531.646978] ACPI Warning (processor_throttling-0843): Invalid throttling state, reset
[ 1531.646988]  [20090320]

One warning for each processor, for each "cat...". This never happened before.
Confirmed. 'cat /proc/acpi/processor/*/throttling' triggers the warning for me too (twice: once for each core of my Intel Core Duo).
Hi, Frans

Thanks for the confirmation and the additional info. What Rui said in comment #24 is right. The message is printed because the obtained T-state is outside the range of available T-states. It is harmless, although of course it is confusing. Maybe this should be handled directly by the kernel without printing such a message.

Thanks.
Created attachment 21553 [details] remove the superfluous warning messages hmm, what about this incremental patch?
I'm still not happy with this. Here's some additional info.

If I boot the system with -rc5, /proc/acpi/processor/CPU*/throttling correctly shows the active state as T0 for both processors. This means that my system does NOT have the same problem as originally reported in this BR, as that system really did show an invalid state (T8 with only T0-T7 supported).

If I boot with -rc6, I get the new warning *every time* I do
cat /proc/acpi/processor/CPU*/throttling
despite the fact that you supposedly reset the value to a valid throttling state.

I then tried a manual change of the throttling state as follows:
echo -n T4 >/proc/acpi/processor/CPU0/throttling
echo -n T4 >/proc/acpi/processor/CPU1/throttling
cat /proc/acpi/processor/CPU*/throttling
echo -n T0 >/proc/acpi/processor/CPU0/throttling
echo -n T0 >/proc/acpi/processor/CPU1/throttling
cat /proc/acpi/processor/CPU*/throttling

Both 'cat' statements show the correct active state (T4 resp. T0) and after the changes to T4 the warning no longer triggers.

If I just 'echo T0' for both processors without first changing to T4, the warning will still be displayed. Apparently a "real" state change is needed to avoid the warning. That also means that your "reset to T0" is not seen as a real state change (otherwise it should also prevent further warnings) which again seems to confirm that the initial state after boot is not incorrect.

I'm still convinced that there are two *different* issues here: the original bug where a machine actually reported an invalid state, and a separate issue that causes the warning to trigger even though the initial state after boot is correctly at T0.

I will see if I can add some debug printks to find out exactly what is going on here.
A few days ago I opened a new BR to track the possible regression discussed here: http://bugzilla.kernel.org/show_bug.cgi?id=13389. There is at least a problem with the reset as no actual reset takes place. Follow up to the new BR please.