Bug 75121

Summary: Intel Pstate driver - powersave mode - CPU frequency too low
Product: Power Management Reporter: Doug Smythies (dsmythies)
Component: cpufreq Assignee: cpufreq
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: normal CC: dirk.brandewie, jacobecc9, lenb, rjw, tomi
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.15rc3 Subsystem:
Regression: No Bisected commit-id:
Attachments: The phoronix ffmpeg test is an easy way to show the issue
CPU 7 frequency vs load. Turbo on. various kernels
CPU 7 frequency vs load / idle frequency. Turbo on. various kernels
The phoronix ffmpeg test graph again with one more data point
output from perf script as requested
Perf data for Kernel 3.15RC3 CPU at 85% load at 200 Hertz load / noload frequency.
Perf data for Kernel 3.15RC3 CPU at 10% load at 200 Hertz load / noload frequency.
Perf data for Kernel 3.15RC3-doug CPU at 10% load at 200 Hertz load / noload frequency.
Perf data for Kernel 3.15RC3 CPU at 0.5% load at 200 Hertz load / noload frequency.
Perf data for Kernel 3.15RC3-doug2 CPU at 85% load at 200 Hertz load / noload frequency.
CPU 7 frequency vs load. Turbo on. Modified C0 inclusion
The phoronix ffmpeg test graph again with the C0 weight 25% data point
CPU 7 frequency vs load / idle frequency. Turbo on. Modified C0 inclusion. Various kernels
Just another method I tried. Phoronix ffmpeg test is as fast as performance mode with this method.
CPU 7 frequency vs load. Turbo on. Dirk patch applied
CPU 7 frequency vs load / idle frequency. Turbo on. Dirk patch applied.
Just showing Performance mode CPU frequency response curve

Description Doug Smythies 2014-04-29 23:47:53 UTC
Until recently, the difference between the intel_pstate driver in performance mode versus powersave mode was minimal to imperceptible. Now, the frequency response (meaning the response to the rate of load / noload switching on a CPU) is drastically different from what it used to be, and often not even as good as the acpi-cpufreq driver in ondemand mode.

This is a negative side effect from Commit fcb6a15c2e7e - intel_pstate: Take core C0 time into account for core busy calculation.

I can revert to the same performance as previously for the intel_pstate driver in powersave mode by changing one line in the code (kernel = 3.15rc3-doug on the graphs). From this:

sample->core_pct_busy = mul_fp(core_pct, c0_pct);

to this:

sample->core_pct_busy = core_pct;

basically, reverting the commit.
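
To make the effect of that one line concrete, here is a minimal user-space sketch of the two calculations. It is not the driver code: the helpers are simplified stand-ins for the kernel's fixed-point routines, FRAC_BITS is assumed to be 6, and busy_without_c0() / busy_with_c0() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 6 fractional bits of fixed-point precision. */
#define FRAC_BITS 6

/* Simplified stand-ins for the driver's fixed-point helpers. They use
 * 64-bit math and assume non-negative inputs to keep the sketch simple. */
static int64_t int_tofp(int64_t x) { return x * (1 << FRAC_BITS); }
static int64_t fp_toint(int64_t x) { return x / (1 << FRAC_BITS); }
static int64_t mul_fp(int64_t x, int64_t y) { return (x * y) / (1 << FRAC_BITS); }
static int64_t div_fp(int64_t x, int64_t y)
{
        assert(y != 0);
        return (x * (1 << FRAC_BITS)) / y;
}

/* Busy percentage from APERF/MPERF alone (the one-line change above). */
static int64_t busy_without_c0(int64_t aperf, int64_t mperf)
{
        return mul_fp(div_fp(int_tofp(aperf), int_tofp(mperf)), int_tofp(100));
}

/* The same value scaled by the C0 fraction MPERF/TSC (the commit's form). */
static int64_t busy_with_c0(int64_t aperf, int64_t mperf, int64_t tsc)
{
        int64_t c0_pct = div_fp(int_tofp(mperf), int_tofp(tsc));

        return mul_fp(busy_without_c0(aperf, mperf), c0_pct);
}
```

With made-up counter values where the CPU is in C0 for 10% of the sample and runs at full speed while busy (aperf = mperf = 100, tsc = 1000), fp_toint() of the first result is 100 and of the second is 9. At 85% in C0 (aperf = mperf = 850, tsc = 1000) the C0-scaled result is 84.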

There have been other bug reports and threads:
References:
https://bugzilla.kernel.org/show_bug.cgi?id=70941
https://bugzilla.kernel.org/show_bug.cgi?id=66581
https://lkml.org/lkml/2014/2/19/626

In a moment, I will add 3 attachment graphs that detail the issue.
Comment 1 Doug Smythies 2014-04-29 23:54:04 UTC
Created attachment 134311 [details]
The phoronix ffmpeg test is an easy way to show the issue

Various kernels, showing the increase in test execution time as of and since the commit referenced herein.

For reference, some acpi-cpufreq modes are included, as is an intel_pstate performance mode run.
Comment 2 Doug Smythies 2014-04-29 23:56:33 UTC
Created attachment 134321 [details]
CPU 7 frequency vs load. Turbo on. various kernels

This graph is more generic, and shows the CPU not ramping up in CPU frequency as the load becomes significant.
Comment 3 Doug Smythies 2014-04-29 23:59:10 UTC
Created attachment 134331 [details]
CPU 7 frequency vs load / idle frequency. Turbo on. various kernels

This graph is also more generic and shows the frequency response (meaning the load /idle frequency response) under constant load.
Comment 4 Doug Smythies 2014-04-30 00:08:52 UTC
Created attachment 134341 [details]
The phoronix ffmpeg test graph again with one more data point

I forgot to add the results from my "doug" test kernel to the phoronix graph before I posted it, so adding it now.
Comment 5 Dirk Brandewie 2014-05-02 14:25:20 UTC
Hi Doug,

I am trying to get phoronix installed to reproduce this (without success so far)

Could you run:
  perf record -a -c 1 -e power:pstate_sample phoronix-test-suite ffmpeg

And attach the output of:
  perf script
Comment 6 Doug Smythies 2014-05-02 23:42:23 UTC
Created attachment 134771 [details]
output from perf script as requested

I also had trouble with the phoronix stuff when I tried to run it under "perf record". It seemed to think it needed to re-install, and then I couldn't find where the test profile was, so that I could change it to 1 run instead of the default 3 (I typically run it 10 times, and always abort the first run because it has an inconsistent run time due to the extra time needed to load the file, which is cached thereafter).

In the end I ran this:

sudo /home/doug/bin/perf record -a -c 1 -e power:pstate_sample phoronix-test-suite benchmark pts/ffmpeg

The output from "perf script" has been truncated to just one of the 3 tests.

Myself, I find it much easier to interpret perf data captured for the "consume" program that was used to make the other graphs. I'll attach some of that data also.
Comment 7 Doug Smythies 2014-05-02 23:58:00 UTC
Created attachment 134781 [details]
Perf data for Kernel 3.15RC3 CPU at 85% load at 200 Hertz load / noload frequency.

Under these conditions, the CPU is never in the C0 state for an entire intel_pstate sample time. In my opinion the C0 inclusion unduly biases the target p-state downwards.
Comment 8 Doug Smythies 2014-05-03 00:03:21 UTC
Created attachment 134791 [details]
Perf data for Kernel 3.15RC3 CPU at 10% load at 200 Hertz load / noload frequency.

Under these conditions, the CPU used to have started to ramp up in frequency.
Comment 9 Doug Smythies 2014-05-03 00:05:09 UTC
Created attachment 134801 [details]
Perf data for Kernel 3.15RC3-doug CPU at 10% load at 200 Hertz load / noload frequency.

This time with the C0 inclusion removed
Comment 10 Doug Smythies 2014-05-03 00:11:03 UTC
Created attachment 134811 [details]
Perf data for Kernel 3.15RC3 CPU at 0.5% load at 200 Hertz load / noload frequency.

The purpose of this data is to show two things: the math seems to have underflowed or something a couple of times; and there should be more samples, so where is all the data?
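
One possible contributor to results like these, offered only as an unconfirmed illustration, is truncation in the C0 fraction itself. The sketch below assumes 6 fractional bits and uses simplified stand-ins for the driver's helpers; c0_fraction() is a hypothetical name. With that precision, any C0 residency below 1/64 (about 1.56%) truncates to zero, and multiplying by it then zeroes the busy value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 6 fractional bits of fixed-point precision. */
#define FRAC_BITS 6

/* Simplified stand-ins for the driver's helpers (non-negative inputs). */
static int64_t int_tofp(int64_t x) { return x * (1 << FRAC_BITS); }
static int64_t div_fp(int64_t x, int64_t y)
{
        assert(y != 0);
        return (x * (1 << FRAC_BITS)) / y;
}

/* C0 residency MPERF/TSC as a fixed-point fraction (1.0 == 1 << FRAC_BITS). */
static int64_t c0_fraction(int64_t mperf, int64_t tsc)
{
        return div_fp(int_tofp(mperf), int_tofp(tsc));
}
```

For made-up counters at 0.5% residency (mperf = 5, tsc = 1000) this returns 0, while 1.6% residency (mperf = 16, tsc = 1000) returns 1, the smallest non-zero fraction. Whether this is what happened in the attached data is not established here.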
Comment 11 Doug Smythies 2014-05-03 00:19:21 UTC
Created attachment 134821 [details]
Perf data for Kernel 3.15RC3-doug2 CPU at 85% load at 200 Hertz load / noload frequency.

For this one, I tried the following, which includes C0 but applied to core_pct on a scale from min_perf_pct rather than from 0. However, it made no difference, as the C0 inclusion still dominates.

static inline void intel_pstate_calc_busy(struct cpudata *cpu,
                                        struct sample *sample)
{
        int32_t core_pct;
        int32_t c0_pct;
        int32_t temp;

// need to figure out how to do fractional weight
#define C0_WEIGHT 1

        core_pct = div_fp(int_tofp((sample->aperf)),
                        int_tofp((sample->mperf)));
        core_pct = mul_fp(core_pct, int_tofp(100));
        FP_ROUNDUP(core_pct);

        c0_pct = div_fp(int_tofp(sample->mperf), int_tofp(sample->tsc));

        sample->freq = fp_toint(
                mul_fp(int_tofp(cpu->pstate.max_pstate * 1000), core_pct));

//      sample->core_pct_busy = core_pct;
//      sample->core_pct_busy = mul_fp(core_pct, c0_pct);
        temp = core_pct - limits.min_perf_pct;
        c0_pct = int_tofp(1) - mul_fp((int_tofp(1) - c0_pct), int_tofp(C0_WEIGHT));
        temp = mul_fp(temp, c0_pct);
        sample->core_pct_busy = temp + limits.min_perf_pct;
}
Comment 12 Doug Smythies 2014-05-03 03:49:52 UTC
Created attachment 134831 [details]
CPU 7 frequency vs load. Turbo on. Modified C0 inclusion

There was a stupid mistake in the code change I made in the previous post. Here it is again, fixed.

static inline void intel_pstate_calc_busy(struct cpudata *cpu,
                                        struct sample *sample)
{
        int32_t core_pct;
        int32_t c0_pct;
        int32_t temp;

// 0.25 as a fixed point value with 6 FRAC_BITS: ( (1 << FRAC_BITS) / 4 )
#define C0_WEIGHT 16

        core_pct = div_fp(int_tofp((sample->aperf)),
                        int_tofp((sample->mperf)));
        core_pct = mul_fp(core_pct, int_tofp(100));
        FP_ROUNDUP(core_pct);

        c0_pct = div_fp(int_tofp(sample->mperf), int_tofp(sample->tsc));

        sample->freq = fp_toint(
                mul_fp(int_tofp(cpu->pstate.max_pstate * 1000), core_pct));

//      sample->core_pct_busy = core_pct;
//      sample->core_pct_busy = mul_fp(core_pct, c0_pct);
        temp = core_pct - int_tofp(limits.min_perf_pct);
        c0_pct = int_tofp(1) - mul_fp((int_tofp(1) - c0_pct), C0_WEIGHT);
        temp = mul_fp(temp, c0_pct);
        sample->core_pct_busy = temp + int_tofp(limits.min_perf_pct);
}

And the graph shows the effect, for C0_WEIGHT of 1 (Doug3) and 0.25 (Doug 4)
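
As a rough user-space sketch of what the weighted form computes, the function below applies the same arithmetic with simplified stand-in helpers. Assumptions: FRAC_BITS is 6, weighted_busy() is a hypothetical name, core_pct is at or above the floor, and the min_perf_pct value of 42 in the note that follows is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 6 fractional bits of fixed-point precision. */
#define FRAC_BITS 6

/* Simplified stand-ins for the driver's helpers (non-negative inputs). */
static int64_t int_tofp(int64_t x) { return x * (1 << FRAC_BITS); }
static int64_t fp_toint(int64_t x) { return x / (1 << FRAC_BITS); }
static int64_t mul_fp(int64_t x, int64_t y) { return (x * y) / (1 << FRAC_BITS); }

/*
 * core_pct, c0_pct and c0_weight are fixed-point values; min_perf_pct is a
 * plain integer percentage. Only the part of core_pct above the floor is
 * scaled, and c0_weight controls how strongly idle time pulls it down:
 *   scale = 1 - (1 - c0_pct) * c0_weight
 */
static int64_t weighted_busy(int64_t core_pct, int64_t c0_pct,
                             int64_t min_perf_pct, int64_t c0_weight)
{
        int64_t floor_fp = int_tofp(min_perf_pct);
        int64_t scale = int_tofp(1) - mul_fp(int_tofp(1) - c0_pct, c0_weight);

        return mul_fp(core_pct - floor_fp, scale) + floor_fp;
}
```

Taking core_pct = int_tofp(100), c0_pct = 6 (roughly 9% in C0) and a floor of 42, a weight of 64 (1.0) gives 47 after fp_toint(), a weight of 16 (0.25) gives 87, and a weight of 0 gives 100, which is the same as leaving C0 out entirely.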
Comment 13 Doug Smythies 2014-05-03 03:58:29 UTC
Created attachment 134841 [details]
The phoronix ffmpeg test graph again with the C0 weight 25% data point

Just showing some improvement in the phoronix ffmpeg test with the code mods shown previously.
Comment 14 Doug Smythies 2014-05-03 04:59:27 UTC
Created attachment 134851 [details]
CPU 7 frequency vs load / idle frequency. Turbo on. Modified C0 inclusion. Various kernels

Adding C0 weight 25% data to the previously posted load / idle frequency graph. Actually, it turned out better than I thought it would.
Comment 15 Doug Smythies 2014-05-04 19:18:50 UTC
Created attachment 135151 [details]
Just another method I tried. Phoronix ffmpeg test is as fast as performance mode with this method.

Hi Dirk: Did you ever figure out for certain why some people had the "CPU frequency remains high after suspend" issue? It remains unclear to me, and I do not know how to even try to re-create it here on my test computer.

References (some but not all):
https://bugzilla.kernel.org/show_bug.cgi?id=66581#c21
https://bugzilla.kernel.org/show_bug.cgi?id=66581#c26
Comment 16 Dirk Brandewie 2014-05-05 18:05:11 UTC
(In reply to Doug Smythies from comment #15)
> Created attachment 135151 [details]
> Just another method I tried. Phoronix ffmpeg test is as fast as performance
> mode with this method.
> 
> Hi Dirk: Did you ever figure out for certain why some people had the "CPU
> frequency remains high after suspend" issue? It remains unclear to me, and I
> do not know how to even try to re-create it here on my test computer.
> 

I could make it happen occasionally on my ivy bridge laptop test system.

I think it has to do with the hardware coordination on the chip, but I haven't
proven that.
 
> References (some but not all):
> https://bugzilla.kernel.org/show_bug.cgi?id=66581#c21
> https://bugzilla.kernel.org/show_bug.cgi?id=66581#c26
Comment 17 Doug Smythies 2014-05-05 18:52:26 UTC
(In reply to Dirk Brandewie from comment #16)
> (In reply to Doug Smythies from comment #15)
> > Hi Dirk: Did you ever figure out for certain why some people had the "CPU
> > frequency remains high after suspend" issue? It remains unclear to me, and
> > I do not know how to even try to re-create it here on my test computer.
> > 
> 
> I could make it happen occasionally on my ivy bridge laptop test system.
> 
> I think it has to do with the hardware coordination on the chip, but I haven't
> proven that.
>  

I have figured out how to "suspend" my test computer. I have tried a few times, but so far haven't been able to re-create the issue. I wanted to be able to re-create the issue as a baseline reference, so that I would know how far I can adjust C0_WEIGHT or C0_MINIMUM and still have the issue never occur.

It still doesn't make sense to me that the intel_pstate driver would work fine before but not after a "suspend". If the root issue is some hardware coordination on the chip, then shouldn't that be fixed (if possible via whatever re-initialization) rather than messing with the intel_pstate servo loop? If it is not some hardware coordination on the chip, but rather due to some flaw in the servo loop, then we should be able to re-create the issue without any intervening "suspend".
Comment 18 Doug Smythies 2014-05-08 22:19:24 UTC
Created attachment 135491 [details]
CPU 7 frequency vs load. Turbo on. Dirk patch applied

Reference: https://lkml.org/lkml/2014/5/8/574

For the Phoronix ffmpeg test I get the exact same numbers as Dirk (we have the same CPU), so I am not posting a new graph.

For CPU 7 frequency vs. load, this is a new graph with Dirk's patches applied (I have removed some previous test data to reduce clutter).

I'll post a new CPU 7 frequency vs load / idle frequency in a moment.
Comment 19 Doug Smythies 2014-05-08 22:21:14 UTC
Created attachment 135501 [details]
CPU 7 frequency vs load / idle frequency. Turbo on. Dirk patch applied.
Comment 20 Dirk Brandewie 2014-05-08 22:26:04 UTC
Fix sent to LKML, along with the fix to the FP_ROUNDUP() macro:
https://lkml.org/lkml/2014/5/8/574
Comment 21 Doug Smythies 2014-05-28 21:22:55 UTC
Created attachment 137641 [details]
Just showing Performance mode CPU frequency response curve

The graph shows the performance mode response curve (done twice), the powersave mode response curve with min set to 100% (done twice), and the normal powersave mode response curve with the C0 inclusion reduced or removed (two versions).

I mentioned I would post this graph in
http://www.spinics.net/lists/cpufreq/msg10167.html
Comment 22 Len Brown 2015-07-21 19:28:44 UTC
re: comment #20

There is no longer an FP_ROUNDUP macro in intel_pstate.c
It was removed here:

commit f0fe3cd7e12d8290c82284b5c8aee723cbd0371a
Author: Dirk Brandewie <dirk.j.brandewie@intel.com>
Date:   Thu May 29 09:32:23 2014 -0700

    intel_pstate: Correct rounding in busy calculation

which shipped in Linux-3.15

Please re-open this report if the issue here was not fixed by that change.