Bug 60839 - scaling_max_freq cannot be set to values larger than bios_limit
Summary: scaling_max_freq cannot be set to values larger than bios_limit
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Power Management
Classification: Unclassified
Component: cpufreq
Hardware: All Linux
Importance: P1 normal
Assignee: cpufreq
 
Reported: 2013-09-03 12:47 UTC by Sven Köhler
Modified: 2013-10-14 07:58 UTC
Kernel Version: 3.10.10
Regression: No


Attachments
cpufreq_sys.patch (1.03 KB, patch)
2013-09-12 14:07 UTC, Lan Tianyu

Description Sven Köhler 2013-09-03 12:47:50 UTC
Background:

On my Dell Latitude E6410, the following happens whenever the power supply is plugged in or unplugged: the value of bios_limit drops to cpuinfo_min_freq for all CPUs. Within seconds, it is then gradually increased until bios_limit equals cpuinfo_max_freq. I assume that this serves the purpose of detecting the type of attached power supply. The BIOS temporarily limits the CPU frequency in order to keep power consumption low until it has determined that the power supply is the correct one.

Problem description:

Assume that nobody is writing to scaling_max_freq and that scaling_max_freq=cpuinfo_max_freq initially. Upon switching the power supply, you observe that the value of scaling_max_freq decreases and then gradually increases along with bios_limit. From this I conclude that
(a) internally, scaling_max_freq can hold values larger than bios_limit; however, when scaling_max_freq is read through sysfs, the value is clipped to bios_limit
(b) there is no good reason why the internal value of scaling_max_freq shouldn't be set to a value larger than bios_limit.

However, any write to scaling_max_freq through sysfs is also clipped to bios_limit. As a result, if the value of cpuinfo_max_freq is written to scaling_max_freq while bios_limit is low, the internal value of scaling_max_freq is set to whatever bios_limit happens to be at that moment.

In my case, this was causing the following problems:
laptop-mode-tools would write to scaling_max_freq shortly after the power supply was plugged in or unplugged. As bios_limit was low at that point, the internal value of scaling_max_freq was set to a low value. Hence, cpufreq would never raise the frequencies of my CPUs again, even after bios_limit had increased. I worked around it by disabling the cpufreq-related parts of laptop-mode-tools.

Also note that it isn't currently possible to determine the true value of scaling_max_freq, as the value returned through sysfs is clipped to bios_limit.
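
The failure mode described above can be illustrated with a toy model. This is only a sketch of the reporter's hypothesis, not kernel code: it assumes that a write stores min(requested, bios_limit) and that a read returns min(stored, bios_limit). The function names are hypothetical and the two frequencies are illustrative values for this machine.

```shell
#!/bin/sh
# Toy model of the suspected clipping; it does not touch sysfs.
# Assumption: writes and reads are both clipped to bios_limit.

# min A B: print the smaller of two integers.
min() {
  if [ "$1" -le "$2" ]; then echo "$1"; else echo "$2"; fi
}

# stored_after_write REQUESTED BIOS_LIMIT: value retained after a clipped write.
stored_after_write() {
  min "$1" "$2"
}

# visible_max STORED BIOS_LIMIT: value a later read of scaling_max_freq returns.
visible_max() {
  min "$1" "$2"
}

# Write cpuinfo_max_freq (2534000) while bios_limit is low (1199000) ...
stored=$(stored_after_write 2534000 1199000)
# ... then let bios_limit recover to 2534000 and read the value back.
echo "visible after recovery: $(visible_max "$stored" 2534000)"
```

Under these assumptions the read-back value stays at the low frequency even after bios_limit has recovered, which matches the symptom reported here.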
Comment 1 Lan Tianyu 2013-09-10 08:28:55 UTC
(In reply to Sven Köhler from comment #0)
> Background:
> 
> On my Dell Latitude E6410, the following happens if the power supply is
> plugged in or plugged out: The value of bios_limit drops for all CPUs to
> cpuinfo_min_freq. Within seconds, it is then gradually increased until
> bios_limit is equal to cpuinfo_max_freq. I assume, that this serves the
> purpose of detecting the type of attached power supply. The BIOS temporarily
> limits the CPU frequency in order to keep the power consumption low until it
> is determined that the power supply is the correct one.
> 
> Problem description:
> 
> Assume that nobody is writing to scaling_max_freq and that
> scaling_max_freq=cpuinfo_max_freq initially. Upon switching the power
> supply,  you observe that the value of scaling_max_freq decreases and then
> gradually increases along with bios_limit. From this I conclude that
> (a) internally, the value for scaling_max_freq can assume values larger than
> bios_limit. However, when reading scaling_max_freq through sysfs the value
> is clipped by bios_limit
> (b) there is no good reason, why the internal value of scaling_max_freq
> shouldn't be set to a value larger than bios_limit.
> 
> However, any write to scaling_max_freq through sysfs will also be clipped by
> bios_limit. In result, if the value of cpuinfo_max_freq is written to
> scaling_max_freq while bios_limit is low, then the internal value of
> scaling_max_freq will be set to whatever the value of bios_limit is.
> 
> In my case, this was causing the following problems:
> laptop-mode-tools would write to scaling_max_freq shortly after the power
> supply was plugged in / unplugged. As the bios_limit would be low at that
> point, the internal value of scaling_max_freq would be set to a low value.
> Hence, cpufreq would never raise the frequencies of my CPUs again even after
> bios_limit increased. I solved it by disabling the cpufreq related parts of
> laptop-mode-tools.

Hi:
     This sounds like a user-space issue. The BIOS limit rises again after AC is plugged in or unplugged, and laptop-mode-tools should update scaling_max_freq accordingly. The cpufreq core sets the frequency range according to the user-space configuration. If the user-space tool does not widen the range as the BIOS limit rises after an AC change, the range stays at the low frequency.


> 
> Also note, that it isn't currently possible to determine the true value of
> scaling_max_freq as the value returned through sysfs is clipped by
> bios_limit.
Comment 2 Sven Köhler 2013-09-10 09:46:17 UTC
(In reply to Lan Tianyu from comment #1)
> Hi:
>      This sounds like a user space issue. Bios limit will rise after
> plugging/unplugging AC and laptop-mode-tools also should update
> scaling_max_freq. Cpufreq core schedules freq scope according user space
> configuration. If user space tool doesn't extend the scope according bios
> limit after plugging/unplugging AC, the scope will keep low cpufreq.

First of all, does the kernel provide any hook to run a script every time bios_limit changes or do you expect userspace to do polling?

Second, you can't be serious that the sysfs interface should remain as inconsistent as it is now. Clearly, the internal state of scaling_max_freq and bios_limit can be one where scaling_max_freq is larger than bios_limit. This is very clear, since I observed that (unless userspace tries to change scaling_max_freq) the value of scaling_max_freq increases as bios_limit increases. At a time when bios_limit is low, userspace cannot even find out the true state of scaling_max_freq (which is larger than bios_limit), as the value obtainable via sysfs is always clipped. Knowing that such a state exists, it is dubious why it can't be configured.

Thirdly, I'd really like to know the rationale behind the decision that
a) userspace should never be able to observe, that scaling_max_freq is actually kernel internally larger than bios_limit 
b) userspace should never be able to set scaling_max_freq to a value larger than bios_limit

IMHO, both (a) and (b) are wrong, in the sense that there is no good reason in favour of (a) and (b) and many reasons against (a) and (b). A very strong reason against (a) is the very simple fact that userspace cannot tell whether the current maximum CPU frequency is limited by the BIOS or the value of scaling_max_freq.
Comment 3 Lan Tianyu 2013-09-10 12:45:12 UTC
(In reply to Sven Köhler from comment #2)
> (In reply to Lan Tianyu from comment #1)
> > Hi:
> >      This sounds like a user space issue. Bios limit will rise after
> > plugging/unplugging AC and laptop-mode-tools also should update
> > scaling_max_freq. Cpufreq core schedules freq scope according user space
> > configuration. If user space tool doesn't extend the scope according bios
> > limit after plugging/unplugging AC, the scope will keep low cpufreq.
> 
> First of all, does the kernel provide any hook to run a script every time
> bios_limit changes or do you expect userspace to do polling?

When the BIOS limit changes, the ACPI processor driver sends an ACPI event to user space.
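
A user-space tool could react to that event instead of polling. The following is a minimal sketch, assuming the event arrives through acpid in the form "<class> <bus id> <type> <data>" with class "processor" and type 00000080 (the ACPI performance-change notification). The helper name is hypothetical, and the exact event string should be checked with acpi_listen on the affected machine.

```shell
#!/bin/sh
# is_perf_event LINE: succeed if LINE looks like a processor performance
# notification. Assumed line format: "<class> <bus id> <type> <data>".
is_perf_event() {
  echo "$1" | {
    read -r class _bus type _data
    [ "$class" = "processor" ] && [ "$type" = "00000080" ]
  }
}

# Possible use (not executed here; acpi_listen ships with acpid):
#   acpi_listen | while read -r line; do
#     is_perf_event "$line" && echo "bios_limit may have changed"
#   done
```

Whether rewriting scaling_max_freq from such a handler actually restores the limit is exactly what this report is about, so the sketch only covers detection.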

> 
> Second, you can't be serious that the sysfs interface should remain as
> inconsistent as it is now.

scaling_max_freq and scaling_min_freq show the current policy limits. The kernel can also change these limits independently of the user-space configuration, so the value read back may differ from the value written to the attribute.

> Clearly, the internal state of scaling_max_freq
> and bios_limit can be one, 

No, actually there are other factors that may change the policy limit (e.g. processor throttling controlled by the thermal core).

> where scaling_max_freq is larger than bios_limit.
> This is very clear since I observed that (unless userspace tries to change
> scaling_max_freq) the value of scaling_max_freq will increase as bios_limit
> increases. At time when bios_limit is low, userspace cannot even find out
> about the true state of scaling_max_freq (which is larger than bios_limit)
> as the value obtainable via sysfs is always clipped. Obviously, knowing that
> such a state exists, it is dubious why it can't be configured.
>

The kernel stores the user-space configuration but does not expose it. I think we can introduce new attributes to show the user-space configuration if necessary.

> Thirdly, I'd really like to know the rationale behind the decision that
> a) userspace should never be able to observe, that scaling_max_freq is
> actually kernel internally larger than bios_limit

I think you mean the user-space configuration rather than the current policy (which scaling_min/max_freq shows). The current policy should take bios_limit into account.

> b) userspace should never be able to set scaling_max_freq to a value larger
> than bios_limit

Actually, you can set a value larger than bios_limit, and it will take effect when bios_limit rises.

> 
> IMHO, both (a) and (b) are wrong, in the sense that there is no good reason
> in favour of (a) and (b) and many reasons against (a) and (b). A very strong
> reason against (a) is the very simple fact that userspace cannot tell
> whether the current maximum CPU frequency is limited by the BIOS or the
> value of scaling_max_freq.

Actually, the CPU frequency may be limited by other factors besides the BIOS limit, so new attributes may help to identify whether the kernel has changed the cpufreq range.
Comment 4 Sven Köhler 2013-09-10 13:15:48 UTC
(In reply to Lan Tianyu from comment #3)
> > b) userspace should never be able to set scaling_max_freq to a value larger
> > than bios_limit
> 
> Actually, you can set the value larger than bios_limit and it will works
> when bios_limit raises.

That is exactly what didn't work! A write to scaling_max_freq via sysfs was clipped to bios_limit, as far as I can tell. The value never went up again after a write to scaling_max_freq had occurred. I reproduced this with a small while loop in bash that wrote the maximum CPU frequency to scaling_max_freq while bios_limit was low.

I will try to reproduce the issue again and post the bash loop code here.
Comment 5 Lan Tianyu 2013-09-10 14:19:33 UTC
OK, I was confused. Please try the following patch.

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 5c75e31..de9e6e4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -449,7 +449,7 @@ static ssize_t store_##file_name                                    \
                return -EINVAL;                                         \
                                                                        \
        ret = __cpufreq_set_policy(policy, &new_policy);                \
-       policy->user_policy.object = policy->object;                    \
+       policy->user_policy.object = new_policy.object;                 \
                                                                        \
        return ret ? ret : count;                                       \
 }
Comment 6 Sven Köhler 2013-09-11 19:49:59 UTC
Nope, the patch didn't help. Here's the bash-code I used to test:

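# Run from a cpufreq sysfs directory, e.g. /sys/devices/system/cpu/cpu0/cpufreq
cmax=$(cat cpuinfo_max_freq)
echo "$cmax" >scaling_max_freq
bmax=$cmax
# Phase 1: keep writing the maximum until bios_limit drops below it
while [ "$bmax" -ge "$cmax" ]; do
  bmax=$(cat bios_limit)
  smax=$(cat scaling_max_freq)
  echo "$cmax" >scaling_max_freq
  echo "$bmax $smax"
done
# Phase 2: stop writing and watch whether scaling_max_freq recovers
while true; do
  echo "$(cat bios_limit) $(cat scaling_max_freq)"
  sleep 0.5
done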

The output is like this:

2534000 2534000
2534000 2534000
2534000 2534000
2534000 2534000
2534000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1199000 1199000
1333000 1199000
1333000 1199000
1333000 1199000
1333000 1199000
1466000 1199000
1466000 1199000


As you can see, scaling_max_freq remains low. When the script is not running, scaling_max_freq increases along with bios_limit.
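
To check a longer log of such output, the pairs can be filtered mechanically. This is a small sketch, assuming the log has the same two-column "bios_limit scaling_max_freq" layout as above; the function name count_stuck is made up for illustration.

```shell
#!/bin/sh
# count_stuck: read "bios_limit scaling_max_freq" pairs on stdin and print
# how many samples show scaling_max_freq below bios_limit.
count_stuck() {
  awk 'NF == 2 && $2 < $1 { n++ } END { print n + 0 }'
}

# Possible use: ./test.sh | tee log.txt, then: count_stuck < log.txt
```

A non-zero count after bios_limit has started rising again indicates that scaling_max_freq did not follow it.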
Comment 7 Lan Tianyu 2013-09-12 14:07:58 UTC
Created attachment 108201
cpufreq_sys.patch

Please apply this patch together with the previous one. It adds two new attributes, user_min/max_freq, which show the user-space configuration. Please change your script so that it prints their values as well.
Comment 8 Lan Tianyu 2013-09-12 14:49:13 UTC
Please also provide the dmesg output after testing with cpufreq debugging enabled.
To enable it, run: echo "file cpufreq.c +p" > /sys/kernel/debug/dynamic_debug/control
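
The requested script change could look roughly like the sketch below. It assumes the patched kernel exposes a file named user_max_freq next to scaling_max_freq (the name is inferred from "user_min/max_freq" and is not confirmed here); read_attr and print_limits are hypothetical helpers. On an unpatched kernel the third column is simply reported as n/a.

```shell
#!/bin/sh
# read_attr DIR NAME: print the attribute value, or "n/a" if it is missing.
read_attr() {
  if [ -r "$1/$2" ]; then cat "$1/$2"; else echo "n/a"; fi
}

# print_limits DIR: print bios_limit, scaling_max_freq and the (assumed)
# user_max_freq attribute from the given cpufreq directory on one line.
print_limits() {
  echo "$(read_attr "$1" bios_limit) $(read_attr "$1" scaling_max_freq) $(read_attr "$1" user_max_freq)"
}

# Possible use in the monitoring loop:
#   print_limits /sys/devices/system/cpu/cpu0/cpufreq
```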
Comment 9 Lan Tianyu 2013-09-18 03:27:22 UTC
ping ...
Comment 10 Zhang Rui 2013-10-14 07:58:39 UTC
Bug closed as there has been no response for more than a month.
Please feel free to reopen it when you can provide the information requested in comment #7.
