Bug 12389 - acpi-cpufreq doesn't detect speed limit when cold booted on battery
Status: REJECTED DOCUMENTED
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Processors
Hardware: All Linux
Importance: P1 low
Assignee: ykzhao
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-09 01:53 UTC by James Ettle
Modified: 2010-12-29 00:42 UTC

See Also:
Kernel Version: 2.6.36
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg straight after booting on battery (85.42 KB, text/plain)
2010-11-08 18:25 UTC, James Ettle
dmesg straight after booting on battery, then pressing a Magic Power button (93.98 KB, text/plain)
2010-11-08 18:26 UTC, James Ettle
Output of acpidump (127.59 KB, text/plain)
2010-11-08 18:27 UTC, James Ettle
acpidump --addr 0x7F6D52F8 --length 0x000004AA (1.17 KB, application/octet-stream)
2010-11-08 21:21 UTC, James Ettle
acpidump --addr 0x7F6D5827 --length 0x000002A8 (680 bytes, application/octet-stream)
2010-11-08 21:21 UTC, James Ettle
acpidump --addr 0x7F6D57A2 --length 0x00000085 (133 bytes, application/octet-stream)
2010-11-08 21:23 UTC, James Ettle
acpidump --addr 0x7F6D5ACF --length 0x000000B1 (177 bytes, application/octet-stream)
2010-11-08 21:23 UTC, James Ettle

Description James Ettle 2009-01-09 01:53:51 UTC
Latest working kernel version: 2.6.26?
Earliest failing kernel version: 2.6.27.10-167.fc10 (perhaps some earlier)
Distribution: Fedora 10
Hardware Environment: x86-64, Clevo M720R notebook with Intel Core 2 Duo T8100 processor, X3100 graphics
Software Environment: x86-64

Problem Description:
When on battery power, my notebook's BIOS specifies that the maximum CPU speed
should be 1.2 GHz. But immediately after booting on battery, acpi-cpufreq
thinks the maximum speed is 2.101 GHz (what it should be on mains power).
Switching briefly to mains, or pressing the "power save" button (which imposes
a 1.2 GHz limit when on mains), seems to kick it all into place and thereafter
the limit is 1.2 GHz on battery, 2.101 on mains.

Steps to reproduce:
1. Cold boot on battery power.
2. Observe CPU will scale up to 2.101 GHz.
3. Briefly switch to mains or press power-save button.
4. Note maximum frequency on battery is now 1.2 GHz.
Comment 1 James Ettle 2009-01-22 03:25:57 UTC
Still there in 2.6.29-0.43.rc2.git1.
Comment 2 James Ettle 2010-11-05 18:11:45 UTC
Still present in 2.6.36.
Comment 3 Thomas Renninger 2010-11-06 20:04:54 UTC
This should be an ACPI, not a cpufreq bug.
Some things you could check:
  - BIOS up-to-date?
  - Is the limitation done through _PPC:
cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
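
For anyone who wants to script this check, here is a minimal Python sketch. It assumes the bios_limit file holds a single decimal frequency in kHz, as cpufreq sysfs files generally do. The helper names (parse_khz, khz_to_ghz, read_bios_limit) are made up for illustration and are not part of any kernel tool.

```python
from pathlib import Path
from typing import Optional

# Path as given in the comment above.
BIOS_LIMIT = Path("/sys/devices/system/cpu/cpu0/cpufreq/bios_limit")


def parse_khz(text: str) -> int:
    """Parse the content of a cpufreq sysfs file (assumed to be decimal kHz)."""
    return int(text.strip())


def khz_to_ghz(khz: int) -> float:
    """Convert kHz to GHz for easier reading."""
    return khz / 1_000_000


def read_bios_limit(path: Path = BIOS_LIMIT) -> Optional[int]:
    """Return the BIOS limit in kHz, or None if the file is missing or unreadable."""
    try:
        return parse_khz(path.read_text())
    except (OSError, ValueError):
        return None
```

On a kernel that exposes the file, khz_to_ghz(read_bios_limit()) gives the current limit in GHz; a return value of None only means the file could not be read, not that there is no limit.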


The BIOS limit (_PPC) used not to be evaluated at driver (processor.ko) load time, but only when CPU notification events were thrown. It sounds like this is what you are seeing.
This got fixed by commit:
455c0d71d46e86b0b7ff2c9dcfc19bc162302ee9

The above patch could cause trouble on other systems, and you set "Fedora tree".
Can you double-check that the above commit is included in the kernel you tested?
Comment 4 James Ettle 2010-11-08 08:54:28 UTC
As far as I can see, the kernel I'm using has that commit.

  $ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
  2101000

which is the highest clock speed. This is from cold on battery power. If I plug then unplug(1), or press some "Magic Power" button(2), this drops to 1200000. Would this indicate that the BIOS isn't setting things up right under these conditions? Odd because I'm sure it worked some time in the past... (Maybe I was mistaken?)

Footnotes

1. Events associated with plugging in:

  ac_adapter ACPI0003:00 00000080 00000001
  battery PNP0C0A:00 00000080 00000001
  battery PNP0C0A:00 00000081 00000001
  processor LNXCPU:00 00000080 00000000
  processor LNXCPU:00 00000081 00000000
  processor LNXCPU:00 00000081 00000000
  processor LNXCPU:01 00000080 00000000
  processor LNXCPU:01 00000081 00000000
  processor LNXCPU:01 00000081 00000000
   pnp0c14:00 000000d0 00000000

and unplugging:

  ac_adapter ACPI0003:00 00000080 00000000
  battery PNP0C0A:00 00000080 00000001
  battery PNP0C0A:00 00000081 00000001
  processor LNXCPU:00 00000080 00000003
  processor LNXCPU:00 00000081 00000000
  processor LNXCPU:00 00000081 00000000
  processor LNXCPU:01 00000080 00000003
  processor LNXCPU:01 00000081 00000000
  processor LNXCPU:01 00000081 00000000
   pnp0c14:00 000000d0 00000000

2. Activating this produces events

  processor LNXCPU:00 00000080 00000003
  processor LNXCPU:00 00000081 00000000
  processor LNXCPU:01 00000080 00000003
  processor LNXCPU:01 00000081 00000000
   pnp0c14:00 000000d0 00000000

and when deactivated,

  processor LNXCPU:00 00000080 00000000
  processor LNXCPU:00 00000081 00000000
  processor LNXCPU:01 00000080 00000000
  processor LNXCPU:01 00000081 00000000
   pnp0c14:00 000000d0 00000000


Will see if there's a more recent BIOS for this notebook, but I've found nothing so far.
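
The event lines in the footnotes above can be compared mechanically. The following Python sketch parses them, under two assumptions that are not stated in this report: that the columns are device class, bus ID, notify type (hex) and data (hex), and that for processor events type 0x80 is the performance (_PPC changed) notification whose data field carries the new _PPC value. The names AcpiEvent, parse_event and ppc_from_events are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple


class AcpiEvent(NamedTuple):
    device_class: str
    bus_id: str
    type: int
    data: int


# Assumed notify type for "performance capabilities changed" on processors.
PROCESSOR_NOTIFY_PERFORMANCE = 0x80


def parse_event(line: str) -> AcpiEvent:
    """Parse one event line; lines such as ' pnp0c14:00 ...' have no class."""
    parts = line.split()
    if len(parts) == 3:
        parts.insert(0, "")  # empty device class
    device_class, bus_id, ev_type, data = parts
    return AcpiEvent(device_class, bus_id, int(ev_type, 16), int(data, 16))


def ppc_from_events(lines: Iterable[str]) -> Dict[str, int]:
    """Map each processor bus ID to the data of its last performance event."""
    result: Dict[str, int] = {}
    for line in lines:
        if not line.strip():
            continue
        ev = parse_event(line)
        if ev.device_class == "processor" and ev.type == PROCESSOR_NOTIFY_PERFORMANCE:
            result[ev.bus_id] = ev.data
    return result
```

If those assumptions hold, the unplug sequence above yields 3 for both CPUs and the plug-in sequence yields 0, which would match a limit being imposed on battery and lifted on mains.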
Comment 5 Thomas Renninger 2010-11-08 09:19:24 UTC
> As far as I can see, the kernel I'm using has that commit.
It's important that you find that out.
This commit exactly addresses this issue.
Matthew should be able to tell you whether the latest Fedora kernel has it included or not (you may need to tell him which kernel exactly you used or where you got it from).
It's about commit: 455c0d71d46e86b0b7ff2c9dcfc19bc162302ee9
Q: Does the Fedora kernel James used have this commit included?
Comment 6 James Ettle 2010-11-08 09:27:17 UTC
(In reply to comment #5)
> > As far as I can see, the kernel I'm using has that commit.
> It's important that you find that out.
> This commit exactly addresses this issue.

I built this kernel from Fedora sources, kernel-2.6.36-1.fc15:

In processor_perflib.c, starting at line 414 I see:

	if (result)
		goto update_bios;

	/* We need to call _PPC once when cpufreq starts */
	if (ignore_ppc != 1)
		result = acpi_processor_get_platform_limit(pr);

	return result;

	/*
	 * Having _PPC but missing frequencies (_PSS, _PCT) is a very good hint
Comment 7 Thomas Renninger 2010-11-08 10:00:52 UTC
> I built this kernel from Fedora sources
That's great.
Please make sure you have:
CONFIG_CPU_FREQ_DEBUG=y
enabled and boot with cpufreq.debug=1
You should see this message then:
     cpufreq_printk("CPU %d: _PPC is %d - frequency %s limited\n", pr->id,
             (int)ppc, ppc ? "" : "not");             
at boot up (when processor driver gets loaded) and later when you plug/unplug AC.
When you get a "not limited" at boot up, there is something fishy with your BIOS and you should provide acpidump output.
If this is the case, you could try to avoid autoloading by adding:
blacklist acpi-cpufreq
blacklist processor
blacklist thermal
to /etc/modprobe.conf and then loading acpi-cpufreq (which in turn should load processor) later, after some more ACPI drivers have been loaded...

If _PPC is not 0 at boot up, but no limit gets applied to bios_limit cpufreq sysfs file, there is a kernel bug.
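
The decision rule in this comment can be written down as a small Python sketch. It assumes the machine was booted on battery (so a limit is expected) and that the debug line has exactly the format of the cpufreq_printk call quoted above. The function names and the returned labels are invented for illustration.

```python
import re
from typing import Optional, Tuple

# Matches "CPU %d: _PPC is %d - frequency %s limited", where %s is "" or "not".
PPC_RE = re.compile(r"CPU (\d+): _PPC is (\d+) - frequency\s+(?:not\s+)?limited")


def parse_ppc_line(line: str) -> Optional[Tuple[int, int]]:
    """Return (cpu, ppc) from a debug line, or None if it does not match."""
    m = PPC_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def diagnose(ppc_at_boot: int, bios_limit_khz: int, max_freq_khz: int) -> str:
    """Classify the situation after a boot on battery.

    ppc_at_boot    -- _PPC value logged when the processor driver loaded
    bios_limit_khz -- content of the bios_limit sysfs file
    max_freq_khz   -- highest frequency the CPU supports (caller-supplied)
    """
    if ppc_at_boot == 0:
        return "suspect BIOS"    # firmware reported "not limited"
    if bios_limit_khz >= max_freq_khz:
        return "suspect kernel"  # _PPC was non-zero but no limit was applied
    return "limit applied"
```

With the figures from this report, diagnose(0, 2101000, 2101000) returns "suspect BIOS".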
Comment 8 James Ettle 2010-11-08 18:24:25 UTC
> When you get a "not limited" at boot up, there is something fishy with your
> BIOS and you should provide acpidump output.

OK, I've done this test (though not with the blacklisting described). I've attached two dmesgs below --- +1 for BIOS fishiness... the first is just from booting, and the second was taken after I'd pressed the Magic Power button.

Also attached the acpidump output.
Comment 9 James Ettle 2010-11-08 18:25:53 UTC
Created attachment 36762 [details]
dmesg straight after booting on battery
Comment 10 James Ettle 2010-11-08 18:26:26 UTC
Created attachment 36772 [details]
dmesg straight after booting on battery, then pressing a Magic Power button
Comment 11 James Ettle 2010-11-08 18:27:00 UTC
Created attachment 36782 [details]
Output of acpidump
Comment 12 Thomas Renninger 2010-11-08 21:04:57 UTC
_PPC is 0 at cpufreq activation -> no limit
The later _PPC evaluations seem to get suppressed by the ratelimit:
[  206.046151] cpufreq-core: __cpufreq_governor for CPU 1, event 3
[  206.164035] cpufreq_debug_printk: 102 callbacks suppressed
We should see _PPC evaluation with non-zero result (when hitting the button and the limit is applied) by disabling the ratelimit:
module_param(debug_ratelimit, uint, 0644);
MODULE_PARM_DESC(debug_ratelimit, "CPUfreq debugging:"
                                        " set to 0 to disable ratelimiting.");
cpufreq.debug_ratelimit=0
No need to try that out; at first I was just totally confused about where the _PPC calls with the limit had gone...

I don't have much time this week, and it all looks like the BIOS should not return zero on the first _PPC evaluation. For some reason it might not do so on Windows, or the firmware throws an event when some function gets evaluated that we don't evaluate... For now I am lowering the severity of this one (this is not a critical bug at all), and it will take some time until I can have a look at it.

Hm, I can't resist having a quick look at the dump... Something is missing: with these tables, _PPC can return nothing but zero.
Can you please dump some additional tables with:
acpidump --addr 0x7F6D5827 --length 0x000002A8 > CPU0IST.dat
acpidump --addr 0x7F6D5ACF --length 0x000000B1 > CPU1IST.dat
acpidump --addr 0x7F6D52F8 --length 0x000004AA > CPU0CST.dat
acpidump --addr 0x7F6D57A2 --length 0x00000085 > CPU1CST.dat
and attach them.
Comment 13 James Ettle 2010-11-08 21:21:18 UTC
Created attachment 36792 [details]
acpidump --addr 0x7F6D52F8 --length 0x000004AA
Comment 14 James Ettle 2010-11-08 21:21:41 UTC
Created attachment 36802 [details]
acpidump --addr 0x7F6D5827 --length 0x000002A8
Comment 15 James Ettle 2010-11-08 21:23:32 UTC
Created attachment 36812 [details]
acpidump --addr 0x7F6D57A2 --length 0x00000085
Comment 16 James Ettle 2010-11-08 21:23:48 UTC
Created attachment 36822 [details]
acpidump --addr 0x7F6D5ACF --length 0x000000B1
Comment 17 James Ettle 2010-11-08 21:26:25 UTC
(In reply to comment #12)
> For
> now I lower the severity (this is not a critical bug at all) for this one and it
> will take some time until I can have a look at it.

Understood, the workaround is trivial anyway.

> Can you please dump some additional tables

Done.
Comment 18 Thomas Renninger 2010-11-10 20:12:35 UTC
OK, there is a CPPC function in the CPU0IST table, which probably means "Change PPC".
But this one is only called via _PCT, which is only called at cpufreq driver load time.
So I have no idea how _PPC gets modified when this magic button is hit, when switching to battery, or at any other time after boot.
Can you verify that, when the CPU is limited, you see the limit here:
cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
Do you know of any Clevo or related platform-specific driver you may have running?
I may be missing something... I looked through drivers/platform/x86/* for a driver that directly modifies VPPC or calls the CPPC function, but couldn't find anything.
I am stuck here.
Comment 19 James Ettle 2010-11-10 20:21:59 UTC
(In reply to comment #18)
> Ok, there is CPPC function which probably means "Change" PPC in CPU0IST table.
> But this one is only called via _PCT which is only called at cpufreq driver
> load time.
> So I have no idea how _PPC gets modified when this magic button is hit, when
> switching to battery or at any other time after boot up.
> Can you verify that when the CPU is limited, that you see this limit here:
> cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit

Yeah. Plugged in at the moment:

$ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
2101000

Pull the plug:

$ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
1200000

Plug back in and it's as before; push the fancy button and bios_limit gets set to 1.2 GHz again.

> Do you know about any Clevo or related platform specific driver you may have
> running?

Not so far as I am aware. wmi, cpufreq_ondemand, freq_table, acpi_cpufreq and mperf are loaded.
Comment 20 ykzhao 2010-12-28 05:47:42 UTC
Hi James,
     Do you mean that the BIOS limit changes when the AC adapter is plugged/unplugged? From the ACPI AML code, it seems that the firmware updates _PPC and sends the notification event when AC is plugged/unplugged.
     Method (_Q16, 0, NotSerialized)
                    {
                        Store (0x16, P80H)
                        If (LEqual (ADP, Zero))
                        {
                            Store (\_PR.CPU0.VPPC, \_PR.CPU0._PPC)
                            Store (\_PR.CPU0.VPPC, \_PR.CPU1._PPC)
                            TRAP (0x42)
                        }
                        Else
                        {
                            If (LEqual (SILM, One))
                            {
                                Store (\_PR.CPU0.VPPC, \_PR.CPU0._PPC)
                                Store (\_PR.CPU0.VPPC, \_PR.CPU1._PPC)
                                TRAP (0x42)
                            }
                            Else
                            {
                                Store (Zero, \_PR.CPU0._PPC)
                                Store (Zero, \_PR.CPU1._PPC)
                                TRAP (0x43)
                            }
                        }

   It also updates the _PPC value in the course of querying the AC status through the _PSR object:
   Method (_PSR, 0, NotSerialized)
            {
                If (LEqual (^^PCI0.LPCB.EC.ADP, Zero))
                {
                    If (THRF) {}
                    Else
                    {
                        Store (\_PR.CPU0.VPPC, \_PR.CPU0._PPC)
                        Store (\_PR.CPU0.VPPC, \_PR.CPU1._PPC)
                        Notify (\_PR.CPU0, 0x80)
                        Sleep (0x64)
                        Notify (\_PR.CPU0, 0x81)
                        Sleep (0x0A)
                        Notify (\_PR.CPU1, 0x80)
                        Sleep (0x64)
                        Notify (\_PR.CPU1, 0x81)
                        TRAP (0x42)
                    }
                }


     Could you please poll the AC status again and then check whether the speed limit is correct?
     For example: cat /proc/acpi/ac_adapter/*/state

Thanks.
   Yakui
Comment 21 James Ettle 2010-12-28 10:05:42 UTC
(In reply to comment #20)

> Will you please poll the AC status again and then check whether the speed
> limit is correct?
> >for example: /proc/acpi/ac_adapter/*/state

A-ha! Yes. Having just cold-booted on battery (and still on battery),

[james@rhapsody ~]$ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
2101000
[james@rhapsody ~]$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 
2101000
[james@rhapsody ~]$ cat /proc/acpi/ac_adapter/*/state
state:                   off-line
[james@rhapsody ~]$ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
1200000
[james@rhapsody ~]$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 
1200000

So the BIOS limit isn't set until the AC state is polled.
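
A toy Python model of the on-battery branch of the _PSR method quoted in comment 20 illustrates this. It is only a sketch of the excerpt: the rest of the method is not shown in this report and is not modelled, the Sleep and TRAP calls are left out, and the VPPC value of 3 is an assumption suggested by the 00000003 event data in comment 4. The class and attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FirmwareModel:
    vppc: int = 3   # assumed value of \_PR.CPU0.VPPC
    adp: int = 0    # EC.ADP; 0 means running on battery
    thrf: int = 0   # THRF flag; non-zero skips the update
    ppc: Dict[str, int] = field(default_factory=lambda: {"CPU0": 0, "CPU1": 0})
    notifications: List[Tuple[str, int]] = field(default_factory=list)

    def evaluate_psr(self) -> None:
        """Mimic the quoted branch: copy VPPC into _PPC and notify both CPUs."""
        if self.adp == 0 and not self.thrf:
            self.ppc["CPU0"] = self.vppc
            self.ppc["CPU1"] = self.vppc
            for cpu in ("CPU0", "CPU1"):
                self.notifications.append((cpu, 0x80))
                self.notifications.append((cpu, 0x81))
```

In this model _PPC stays at 0 for both CPUs after "boot" and only becomes 3 once evaluate_psr() is called, which mirrors the bios_limit values in the transcript above.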
Comment 22 ykzhao 2010-12-29 00:36:50 UTC
Hi, James
    Thanks for testing. The result shows that the BIOS doesn't configure the _PPC limit correctly until the AC state is polled, so the issue is caused by the buggy BIOS.
Comment 23 ykzhao 2010-12-29 00:42:00 UTC
As this issue is caused by the buggy BIOS, it would be better to fix it in the BIOS, so this bug will be rejected.

Of course, the issue can be worked around by polling the AC state after booting on battery.
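
One way to script this workaround is sketched below in Python. It simply reads every AC adapter state file once, using the /proc path from comment 20; that path is taken from this thread and may not exist on other kernels. The function names are made up, and the sketch has not been tested on the affected hardware.

```python
import glob
from typing import Dict

# Path pattern as used earlier in this report.
AC_STATE_GLOB = "/proc/acpi/ac_adapter/*/state"


def parse_ac_state(text: str) -> str:
    """Turn 'state:                   off-line' into 'off-line'."""
    _key, _sep, value = text.partition(":")
    return value.strip()


def poll_ac_state(pattern: str = AC_STATE_GLOB) -> Dict[str, str]:
    """Read each matching state file once and return {path: state}."""
    states: Dict[str, str] = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as fh:
            states[path] = parse_ac_state(fh.read())
    return states
```

Calling poll_ac_state() once from a boot-time script should have the same effect as the manual cat in comment 21; an empty result means no state file matched the pattern.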

Thanks.
