Bug 11345 - CPUIDLE does not see C2, C3 on SMP Opteron system
Summary: CPUIDLE does not see C2, C3 on SMP Opteron system
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Processor
Hardware: All Linux
Importance: P1 normal
Assignee: Venkatesh Pallipadi
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-14 19:15 UTC by vit
Modified: 2008-10-24 23:12 UTC
4 users

See Also:
Kernel Version: 2.6.26.2(mainline)
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
acpidump (137.09 KB, text/plain)
2008-08-19 16:17 UTC, vit
dmesg output (62.57 KB, text/plain)
2008-08-19 16:18 UTC, vit
grep . /sys/devices/system/cpu/cpu*/cpuidle/state*/* (671 bytes, text/plain)
2008-08-19 16:19 UTC, vit
grep . /proc/acpi/processor/*/power (2.78 KB, text/plain)
2008-08-22 10:54 UTC, vit
Test patch (1.11 KB, patch)
2008-08-25 22:24 UTC, Venkatesh Pallipadi
Test patch (1.11 KB, patch)
2008-08-29 16:15 UTC, Venkatesh Pallipadi
Refreshed patch (1.20 KB, patch)
2008-09-23 14:07 UTC, Venkatesh Pallipadi

Description vit 2008-08-14 19:15:53 UTC
Latest working kernel version:
Earliest failing kernel version: 2.6.25.[9-14](fedora) 2.6.26.2(mainline)
Distribution: Fedora 9
Hardware Environment:  2 Opteron 2356 on S2915 Tyan MB
Software Environment:
Problem Description: Cpuidle does not work on a dual-CPU quad-core Opteron system. With no load, sensors reports around 48C and 61C for the first and second CPUs respectively. Both the menu and ladder governors give similar results. Recompiling the kernel with "CONFIG_CPU_IDLE is not set" gives around 35C and 41C with no load.

Steps to reproduce:
Comment 1 ykzhao 2008-08-17 07:34:47 UTC
Hi, Vit
   Will you please attach the output of acpidump?
   Do you mean that the sensor temperature is higher with CONFIG_CPU_IDLE enabled than with CONFIG_CPU_IDLE unset? Will you please add the boot option "maxcpus=1" in both cases and see whether the difference still exists?

Thanks.
Comment 2 vit 2008-08-18 08:07:38 UTC
Hi,

   That is right. With CONFIG_CPU_IDLE enabled the sensors temperature is higher.

   With maxcpus=1 the difference still exists.

   acpidump is attached.

   Thanks.

Comment 3 Venkatesh Pallipadi 2008-08-18 16:44:57 UTC
Also, can you attach the output of 'dmesg' and the output of command
# grep . /sys/devices/system/cpu/cpu*/cpuidle/state*/*
with CPU_IDLE enabled.
Comment 4 vit 2008-08-19 13:37:13 UTC
dmesg and the output of grep . /sys/devices/system/cpu/cpu*/cpuidle/state*/* are attached.

Comment 5 vit 2008-08-19 16:17:12 UTC
Created attachment 17316 [details]
acpidump
Comment 6 vit 2008-08-19 16:18:19 UTC
Created attachment 17317 [details]
dmesg output
Comment 7 vit 2008-08-19 16:19:20 UTC
Created attachment 17318 [details]
grep . /sys/devices/system/cpu/cpu*/cpuidle/state*/*
Comment 8 Len Brown 2008-08-22 10:41:23 UTC
dmesg says that ACPI exports 3 C-states:

ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])


But CPUIDLE sees and uses just C1.
Further, CPUIDLE has a NULL description for C1, which doesn't look right:

/sys/devices/system/cpu/cpu0/cpuidle/state1/desc:<null>
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/name:C1
/sys/devices/system/cpu/cpu0/cpuidle/state1/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/time:202562479
/sys/devices/system/cpu/cpu0/cpuidle/state1/usage:30678

Please _paste_ (or attach if you must), the output from
grep . /proc/acpi/processor/*/power

when CPUIDLE is disabled.
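
For comparing such dumps across CPUs, a minimal sketch along the following lines may help. It assumes the input is the `path:value` text produced by the grep command requested in comment 3; the function names `parse_cpuidle_dump` and `state_names` are made up for illustration and are not part of any kernel tool. The sample lines are the cpu0 lines quoted above.

```python
from collections import defaultdict

# Sample lines copied from the cpu0 output quoted in this comment.
SAMPLE = """\
/sys/devices/system/cpu/cpu0/cpuidle/state1/desc:<null>
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/name:C1
/sys/devices/system/cpu/cpu0/cpuidle/state1/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/time:202562479
/sys/devices/system/cpu/cpu0/cpuidle/state1/usage:30678
"""


def parse_cpuidle_dump(text):
    """Group 'path:value' lines by CPU and state directory (hypothetical helper)."""
    cpus = defaultdict(dict)
    for line in text.splitlines():
        # Split on the first colon only; values such as '<null>' are kept verbatim.
        path, sep, value = line.strip().partition(":")
        parts = path.split("/")
        # Expected tail of the path: cpuN/cpuidle/stateM/attribute
        if not sep or len(parts) < 4 or parts[-3] != "cpuidle":
            continue
        cpu, state, attr = parts[-4], parts[-2], parts[-1]
        cpus[cpu].setdefault(state, {})[attr] = value
    return dict(cpus)


def state_names(cpus):
    """Return the sorted C-state names each CPU exposes through cpuidle."""
    return {
        cpu: sorted(attrs.get("name", "?") for attrs in states.values())
        for cpu, states in cpus.items()
    }


if __name__ == "__main__":
    print(state_names(parse_cpuidle_dump(SAMPLE)))
```

Run on a full dump, CPUs whose name lists differ from cpu0 stand out immediately, which is the comparison being made by eye in this comment.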
Comment 9 vit 2008-08-22 10:54:45 UTC
Created attachment 17377 [details]
grep . /proc/acpi/processor/*/power
Comment 10 Howard Chu 2008-08-25 01:37:12 UTC
Not sure if this is the exact same problem, but it looks like cpuidle doesn't work at all for me. Two different machines, one an Opteron 185 dual-core SMP kernel (Asus A8V) and one a Turion ZM-80 dual-core SMP (laptop, HP dv5z again). Both are running 2.6.26.3.

In /sys/devices/system/cpu/cpu*/* there is no cpuidle directory at all. As far as I can tell the system is never entering the C1 state.

I also have an Asus M6Ne laptop running this same kernel version, and cpuidle appears to be working correctly there (processor is a Pentium M 755).
Comment 11 Venkatesh Pallipadi 2008-08-25 06:53:27 UTC
Looks like the BIOS is advertising C1, C2 and C3 states only for CPU 0 and there are no C-states exported for other CPUs at all. !CPU_IDLE case seems to be using C1 on all CPUs.
This is possibly confusing the cpuidle governor. Will look at the code and post a test/workaround patch soon.

Regardless of this CPUIDLE bug, if the CPU really supports C2, C3 and only CPU 0 advertises these states, then there is something wrong in the BIOS. We will not be using C2, C3 either with or without CPUIDLE. Mark should be able to shed more light on this part.
Comment 12 Venkatesh Pallipadi 2008-08-25 22:24:05 UTC
Created attachment 17451 [details]
Test patch

Attached patch should resolve the problem. Can you please try it and verify.

Thanks,
Venki
Comment 13 Mark Langsdorf 2008-08-26 08:41:37 UTC
The BIOS is clearly buggy.  C1 should be supported on all processors and C3 is not available on SMP Opteron systems with quad-core.
Comment 14 vit 2008-08-27 17:50:28 UTC
With the test patch kernel just hangs after 

cpuidle: using governor menu
Comment 15 Venkatesh Pallipadi 2008-08-29 16:15:27 UTC
Created attachment 17536 [details]
Test patch


Sorry. I overlooked things in the earlier patch. This patch should fare better....
Comment 16 vit 2008-09-10 09:19:00 UTC
This patch works. The temperature is lower now (although it is a little bit higher than with !CPUIDLE)
Comment 17 Venkatesh Pallipadi 2008-09-23 14:07:00 UTC
Created attachment 17977 [details]
Refreshed patch
Comment 18 Venkatesh Pallipadi 2008-09-23 14:12:12 UTC
I am not sure why the temperature is not the same as in the !CPUIDLE case. Is the number of wakeups or the C-state residency in powertop any different with and without CPUIDLE?
Comment 19 Shaohua 2008-10-15 00:10:19 UTC
Venki, what's the status of the patch?
Comment 20 Len Brown 2008-10-16 16:00:34 UTC
patch in comment #17 applied to acpi-test cpuidle branch
Comment 21 Len Brown 2008-10-16 16:09:50 UTC
For the record, the bug here isn't that C2 and C3 were
not seen, as originally reported.

While C2 and C3 values are reported in the FADT,
FADT.P_LVL2_UP is 0, which means that C2 is _not_
supported in SMP mode on this box.

So ACPI is properly not using C2 and C3 on CPU0.
Indeed, we should fix /proc/acpi/processor/*/power
so that it doesn't even mention them in SMP mode.

We could probably use them in UP mode, but this
box would be pretty un-interesting in UP mode.

But the more important bug, which is fixed by venki's patch
above, is that when the processors did not have a _CST,
we were polling instead of using C1 on all processors.
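
The two points above can be restated as a small decision model. This is only an illustrative sketch of the reasoning in this comment, not the kernel code: the function names and the `fixed` flag are hypothetical, and the model ignores everything else the ACPI processor driver checks.

```python
def fadt_deep_states_usable(smp, p_lvl2_up):
    """Model of the FADT.P_LVL2_UP rule described in this comment.

    A value of 0 means the FADT C2 (and, per this comment, C3) values
    apply only in UP mode; a value of 1 allows them in SMP mode too.
    """
    return (not smp) or p_lvl2_up == 1


def idle_method(has_cst, fixed):
    """Model of what a processor does when idle.

    `fixed` is a hypothetical flag standing for "the patch from
    comment #17 is applied"; it does not exist in the kernel.
    """
    if has_cst:
        return "states from _CST"
    # No _CST: before the fix the CPU polled, after it uses C1.
    return "C1 via default_idle" if fixed else "polling"


if __name__ == "__main__":
    # This box: SMP with FADT.P_LVL2_UP = 0, and processors without a _CST.
    print(fadt_deep_states_usable(smp=True, p_lvl2_up=0))
    print(idle_method(has_cst=False, fixed=False))
    print(idle_method(has_cst=False, fixed=True))
```

Under these assumptions the first call models why C2 and C3 are correctly left unused on this box, and the last two calls model the behaviour change that accounts for the temperature difference in the original report.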
Comment 22 Len Brown 2008-10-24 23:12:47 UTC
shipped in linux-2.6.28-rc1
closed

commit 89cedfefca1d446ee2598fd3bcbb23ee3802e26a
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Oct 16 19:00:08 2008 -0400

    cpuidle: upon BIOS bug, default to default_idle rather than polling
