Distribution: 2.6 base kernel Hardware Environment: Software Environment: Problem Description: BIOS may not expose EST feature by default, even when platform supports it. OS needs to communicate its capability to handle EST, to BIOS, using _PDC method. This has to be done before OS looks at other P-state related methods (like _PSS, _PCT, etc). The attached patches adds this functionality to the ACPI processor driver. Steps to reproduce:
Created attachment 2864 [details] pdc.patch
Created attachment 2865 [details] msr.patch
pdc.patch) Enable setting of _PDC for ACPI P-state driver, with the IO based transitions. msr.patch) Add support for MSR based transitions in ACPI P-state driver and reflect the same in _PDC, so that BIOS can use this information and expose MSR based EST mechanism when possible.
Both patches are _completely_ invalid, the whole bug can be rejected: the speedstep-centrino.c driver already does this, at least if CONFIG_X86_SPEEDSTEP_CENTRINO_ACPI is enabled [still marked experimental, but will be changed to be used as default soon]: 1.) "pdc.patch": set_pdc is part of the ACPI P-States library inside drivers/acpi/processor.c; maybe the original author of that function got a few values wrong which need fixing. However, the _PDC handling can be and is done in the ACPI perflib core; maybe other cpufreq drivers will need _PDC in future as well. 2.) "msr.patch": there's no need to add that capability to acpi.c as it is already present in the speedstep-centrino driver. Having two drivers providing exactly the same is unnecessary, causes maintenance headaches, ...
First patch is nothing specific to MSR. With the first patch, pdc hhandling is still done in perflib. But, the default value there is not good enough for the acpi driver requirements. And I am just initializing a PDC structure that ACPI driver needs. But, still set is done using the infrastructure. Note, we are still using IO ports and speedstep_centrino will not be of much help. Second patch. OK. Agreed that it may be easier to support MSR transitions at one place instead of multiple maintenance headaches. But, long term, it may be better, if that one place is ACPI. Isn't it? Say if the same thing has to be supported in x86-64 tree. As I said, speedstep can do this too. But, it will need more changes, like SMP support (I am not sure at the moment. But, it may also need the _PPC callbacks). If you prefer that, I can rework the second patch to make changes onto speedstep_centrino driver and keep ACPI driver restricted to IO ports. Note: We want to have this on DP (4way with HT) as well. So, speedstep_centrino SMP is needed anyways.
> First patch is nothing specific to MSR. With the first patch, pdc hhandling > is > still done in perflib. But, the default value there is not good enough for > the > acpi driver requirements. Well, as the set_pdc there was written without access to the _PDC datasheet, that is no surprise to me. Could you update the set_pdc in drivers/acpi/processor.c so that it works correctly, please? > And I am just initializing a PDC structure that ACPI > driver needs. But, still set is done using the infrastructure. Note, we are > still using IO ports and speedstep_centrino will not be of much help. Well, AFAICS acpi.c init will fail if you're only using patch 1, as the registers will be FIXED_HARDWARE... So patch 1 only does make sense if patch 2 is applied, too. > Second patch. OK. Agreed that it may be easier to support MSR transitions at > one place instead of multiple maintenance headaches. But, long term, it may > be > better, if that one place is ACPI. Isn't it? No. It's better to have _small_ drivers doing one thing. the acpi.c driver doing ACPI P-States baesd I/O-transitions; the speedstep-centrino.c driver [we can discuss renaming it to module_alias'ed or enhancedspeedstep or acpi-est] doing MSR-based EST transitions. Just look at the length of the code path when setting a new state. With your approach, it has many more jumps and ifs, with two different drivers, it's more straightforward and _faster_. > Say if the same thing has to be supported in x86-64 tree. As I said, > speedstep can do this too. But, it will need more changes, like SMP support SMP support should become much easier if the cpumask_t transition I suggested a few days ago on the cpufreq mailinglist is done. Even SMP+HT will be much easier then. > (I am not sure at the moment. But, it may also need the _PPC callbacks). _PPC is already handled whenever a driver is registered with the acpi P-States library. > If you prefer that, I can rework the second patch to make changes onto > speedstep_centrino driver and keep ACPI driver restricted to IO ports. That'd be great. > Note: We want to have this on DP (4way with HT) as well. So, > speedstep_centrino SMP is needed anyways. Excellent.
Agreed about the second patch. I will work on making changes to speedstep_centrino to handle SMP, proper PDC and all. The story behind the first patch and PDC is quite long.... _PDC format itself is not coming from ACPI. ACPI just says about the method. The value to be written is architecture specific. The current value I am trying to write in this patch is defined for Enhanced Speedstep. And I dont think we should make that change in acpi/processor.c It may cause issues for other CPUs and other drivers as well. And _PDC is not just for choosing MSR or IO port access. There is one more dimension to it. SMP. To work without any issues on legacy OSes, which doesn't know how to handle P-state in SMP config, by default BIOS disables EST. When OS writes some value to _PDC, BIOS understands that OS can handle SMP P-states and enables the feature. _PDC can also be used to tell BIOS that OS can handle IO based and or MSR transitions. In the first patch, The value I am writing to PDC says, OS can handle IO ports and SMP. Once patch in bug 2615 is applied, ACPI driver becomes SMP capable and then we need patch 1 here, to tell BIOS that we can handle SMP P-state with IO ports. First patch doesn't know anything about MSR transitions. Second patch, I am writing value to _PDC, which says OS can handle MSR too. But, that we can move to speedstep_centrino.
OK, thanks for this information. This changes my opinion somewhat. Now I only have some (small) issues with patch 1: 1.) it's leaking memory. kfree() obj, obj_list after the call to acpi_processor_register_performance(). 2.) #define ACPI_PDC_CAPABILITY_SMP 0xa [or, maybe ACPI_PDC_CAPABILITY_EST_SMP] and move the #definition of ACPI_PDC_CAPABILITY_ENHANCED_SPEEDSTEP to the same place, either asm/acpi.h or acpi/processor.h (latter with #ifdef "ARCH", maybe); then use that for + buf[2] = ACPI_PDC_CAPABILITY_SMP; 3.) Add a comment like /* * acpi_processor_cpu_init_pdc_est - inform BIOS of the SMP capabilities of this cpufreq driver * @perf: processor-specific acpi_io_data struct * @cpu: CPU being initialized * * To avoid problems on legacy OS, some BIOSes require to be informed of the SMP capabilities * of an ACPI-based cpufreq driver to enable P-States. So, set the _PDC bits accordingly for * Enhanced Speedstep -- the actual setting of _PDC is done in drivers/acpi/processor.c. */ Then, when adding SMP capabilities to the speedstep-centrino driver, you can use + buf[2] = ACPI_PDC_CAPABILITY_ENHANCED_SPEEDSTEP & ACPI_PDC_CAPABILITY_SMP; One other issue: assuming a SMP system with 2 phyiscal HT-capable CPUs has 4 logical CPUs: will the MSR need to be set on both siblings of each CPU, just like the THERM_CONTROL MSR needs to be set to the same value on both siblings? Will there be _P{CT,DC,SS,PC} entries in the {S,D}SDT for each sibling, or only for one sibling of each physical CPU?
Created attachment 2907 [details] pdc.patch Reworked pdc.patch, based on the feedback. This patch is complete by itself, and enables changing frequency using IO ports on an SMP system, with Enhanced speedstep feature. msr.patch - Support for MSR based transition, on an SMP platform. Will be done later, with a seperate bugzilla.
Yes. _P{CT,DC,SS,PC} entries will be there in the {S,D}SDT for each sibling. And AFAIK the MSR is shared across the siblings, where as THERM_CONTROL MSR is duplicated. However, MSR based P-state transition on SMP with HT, OS has to ensure the P-state coordination across the siblings.
Just a suggestion for a small improvement: instead of + /* allocate and initialize pdc. It will be used later. */ + if (obj_list) { + if (obj_list->count && obj_list->pointer) { please do + if (!obj_list) + return_VOID; + + /* Initialize pdc. It will be used later. */ + if (obj_list->count && obj_list->pointer) { It reduces indentiation. Besides that, patch looks OK. Oh, 0xa is bitmask 10010 , so two bits are set... /me wonders whether this being two bits is something we should be aware of.
Created attachment 2919 [details] pdc.patch OK. Here goes a new version. One other change is to make pr->pdc NULL, after all the initialization, so that someone don't refer to stack addresses in future. 0xa is setting 2 bits specific to MP. The actual document should be there in developer.intel anytime now. Will forward the link as soon as I see it there.
FYI. http://developer.intel.com/technology/iapc/acpi/downloads/30222302.pdf has the _PDC bit definitions.
shipped in 2.6.9 closing