Bug 64261
Created attachment 113231 [details]
Shows rounded pstate and current actual frequency Vs. requested - Turbo on
Just adding a turbo on version of the same attachment provided in the original posting, which was with turbo off.
Can you attach the script you are using for this test? Is the requested P state in response to a load?

Dirk: The script is the same one as was used and posted back in bug 59481. However, now that that main issue is fixed, it makes little sense to use that script in the same way, as it takes about 10 hours. In a few hours I will post a modified version, as there is no reason not to speed it up a lot now. Yes, the requested P state is in response to a full load, but both min and max percent have been set to whatever is shown on the x-axis.

So you are trying to emulate the userspace governor here, not really the goal of intel_pstate, but OK :-) The ability to select a single P state was not the intended usage of {min,max}_perf_pct; the intent was to allow users to select a floor and a ceiling for the range of available P states. The absolute meaning of a percent of available performance changes based on the SKU of the part (the P states available). The driver can only select integer values 16 through 38 in your turbo-on test case. The P states are 2.6315789 percent wide in terms of the turbo frequency and 2.9411765 percent wide in terms of the non-turbo max on your CPU.

.42 * 38 = 15.96
.43 * 38 = 16.34
.44 * 38 = 16.72
.45 * 38 = 17.1

45% is where your test goes from 16 to 17. For the 3 percent shelves in the turbo-off case:

.50 * 34 = 17
.51 * 34 = 17.34
.52 * 34 = 17.68
.53 * 34 = 18.03 (so we are off by 3 percent of a pstate due to truncation)

Hi Dirk,

There are cases, i.e. when investigating error in load averages, where one wants to lock the CPU at a given frequency. I realize it was not the goal of intel_pstate, but it still needs to be allowed for. Regardless, I am merely using this as a way to easily demonstrate the issue.

".53 * 34 = 18.03 (so we are off by 3 percent of a pstate due to truncation)"

No, I am arguing that it is off by 103 percent of a pstate due to truncation.
I am also arguing that if rounding is used there will never be more than half of a P state discrepancy between desired and actual, instead of 1 P state. I am also arguing that it will help at the 100% end, where right now it might struggle to get to 100% on some processors. For your examples, I am saying it should be:

Turbo:
int(.42 * 38 + 0.5) = 16
int(.43 * 38 + 0.5) = 16
int(.44 * 38 + 0.5) = 17
int(.45 * 38 + 0.5) = 17

Turbo off:
int(.50 * 34 + 0.5) = 17
int(.51 * 34 + 0.5) = 17
int(.52 * 34 + 0.5) = 18
int(.53 * 34 + 0.5) = 18

(In reply to Doug Smythies from comment #5)
> Hi Dirk,
>
> There are cases, i.e. when investigating error in load averages where one
> wants to lock the CPU at whatever frequency. I realize it was not the goal
> of intel_pstate, but still needs to be allowed for.

It is allowed for; clearly you are using the mechanism. The percentage values to get to a given frequency are SKU dependent. We could change it to make your graph look the way you want on your system. Then someone else comes along and says it goes to the higher P state too soon on their system. In the normal case, where intel_pstate is being used to save energy, being conservative is a good thing. If you can't get to a selected (measured) frequency with the current interface, then that is a bug.

> Regardless, I am merely
> using this as a way to easily demonstrate the issue.
>
> ".53 * 34 = 18.03 (so we are off by 3 percent of a pstate due to truncation)"
> No, I am arguing that it is off by 103 percent of a pstate due to truncation.

int(.53 * 34 + 0.5) = 18

How is it off by a whole P state?

> I am also arguing that if rounding is used there will never be more than a
> half of a pstate discrepancy between desired and actual instead of 1 pstate.
> I am also arguing that it will help at the 100% end, where right now it
> might struggle to get to 100% on some processors.
> For your examples, I am saying it should be:
>
> Turbo:
> int(.42 * 38 + 0.5) = 16
> int(.43 * 38 + 0.5) = 16
> int(.44 * 38 + 0.5) = 17
> int(.45 * 38 + 0.5) = 17
>
> Turbo off:
> int(.50 * 34 + 0.5) = 17
> int(.51 * 34 + 0.5) = 17
> int(.52 * 34 + 0.5) = 18
> int(.53 * 34 + 0.5) = 18

(In reply to Dirk Brandewie from comment #6)
> The percentage
> values to get to a given frequency are SKU dependent.

Yes, of course.

> int(.53 * 34 + 0.5) = 18
>
> How is it off by a whole P state?

Because it actually goes to 1.7 GHz, not 1.8 GHz. CPU 7 is fully loaded:

doug@s15:~/temp$ cat /sys/devices/system/cpu/intel_pstate/*
53
42
1
doug@s15:~/temp$ cat /sys/devices/system/cpu/cpu7/cpufreq/cpuinfo_cur_freq
1699867

By rounding, we stay away from finite math issues at integer boundaries.

This is NOT how you pin a single P state; make max == min. With the config above the driver is free to select any P state between 42% and 53%.

I have done it both ways and get the same result, if the CPU is fully loaded. All of my tests until this morning were always done with max == min. But O.K.:

doug@s15:~/temp$ sudo ./set_cpu_turbo_off
doug@s15:~/temp$ echo "53" | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct
53
doug@s15:~/temp$ echo "53" | sudo tee /sys/devices/system/cpu/intel_pstate/min_perf_pct
53
doug@s15:~/temp$ cat /sys/devices/system/cpu/intel_pstate/*
53
53
1
doug@s15:~/temp$ cat /sys/devices/system/cpu/cpu7/cpufreq/cpuinfo_cur_freq
1700000

Created attachment 113491 [details]
Test done with min=max percent and min left at 42 percent
This was the test I did so that I knew I could leave min percent at 42 percent without an unknown side effect. (as long as CPU 7 was under full load, of course)
Created attachment 113581 [details]
CPU 7 frequency Vs. requested percent with rounding. Turbo off.
I added generic rounding to intel_pstate.c on the kernel 3.12 build for my test computer.
This attachment is similar to previous attachments, but with a new line added using the new code. Turbo off. The two occurrences of a 4-percent-wide span of samples without a frequency step are still there, just moved. I'll look into that sometime.
Created attachment 113591 [details]
CPU 7 frequency vs load. Turbo off. With and without rounding.
The load on cpu 7 varies from 0.005 to 0.995 in steps of 0.005 at 10 seconds per step. The frequency is monitored at 10 Hertz.
Created attachment 113601 [details]
CPU 7 frequency vs load. Turbo on. With and without rounding.
The load on cpu 7 varies from 0.005 to 0.995 in steps of 0.005 at 10 seconds per step. The frequency is monitored at 10 Hertz.
Test 1 of 2. A subsequent test, where the load was held at 0.83 and the load/sleep frequency was varied from 50 hertz to 300 hertz, showed much less frequency jitter for the un-rounded case (that graph is not posted herein, as it was otherwise uneventful between rounding and no rounding). Therefore it was decided to repeat this test.
Created attachment 113611 [details]
CPU 7 frequency vs load. Turbo on. With and without rounding. 2
The load on cpu 7 varies from 0.005 to 0.995 in steps of 0.005 at 10 seconds per step. The frequency is monitored at 10 Hertz.
Turbo on, test 2 of 2. The significant difference for the un-rounded data at the higher frequencies is not understood. I'll look into it.
Note that gains and setpoints have not been altered, yet.
Created attachment 113621 [details]
The rounding code.
The changes to the code. Probably some incorrect formatting.
(In reply to Doug Smythies from comment #12)
> Created attachment 113591 [details]
> CPU 7 frequency vs load. Turbo off. With and without rounding.
>
> The load on cpu 7 varies from 0.005 to 0.995 in steps of 0.005 at 10 seconds
> per step. The frequency is monitored at 10 Hertz.

Are you still setting {min/max}_perf_pct, or are we changing horses and talking about using the driver in "normal" mode?

(In reply to Dirk Brandewie from comment #16)
> Are you still setting {min/max}_perf_pct, or are we changing horses and
> talking about using the driver in "normal" mode?

No, I am not setting min=max=whatever. Yes, I was "changing horses" here. The settings are all default from boot up, except with respect to turbo on or off. I wanted to demonstrate improvement from rounding in a real operational sense, and as the frequency got closer to 100%. Test 1, turbo on, did show improvement and much less jitter. However, Test 2, turbo on, did not. I still need to go back and investigate why those two tests, which should have been the same, were not.

Sorry, I should have been clearer in my description. I have two methods for loading CPUs to various levels and working/sleeping frequencies: One uses a program called "consume", originally from Peter Zijlstra of the kernel.org sched maintainers. It will apply the desired load at the desired work/sleep frequency, regardless of the CPU frequency. I.E. it does not respond to the CPU frequency going up, but rather modifies its work load accordingly to hold to what was asked for, unlike a real system. The other is a program called "waiter" (the name is from the text book I started it from).
It will spin out the desired number of processes at the desired load and the desired work/sleep frequency, but what it actually does depends on the CPU frequency and the number of running processes. I.E. it responds to the CPU frequency going up by getting its work done faster, just like a real system. The user interface for waiter is NOT good, and I tend to use another program to create scripts that provide the operational parameters. I used "consume" for these graphs.

Created attachment 133231 [details]
Shows rounded pstate and actual for freq. Vs. Requested. Kernel 3.15rc1
I saw that this bug was set to resolved, and so I did a new version of the graphs previously posted herein. It looks to me as though the rounding is not even as good as it was before.
However, it is the next graph, which I will post in a moment, that is cause for concern.
Created attachment 133241 [details]
CPU 7 frequency vs load. Turbo on. Kernels 3.15RC2 and 3.12
This graph, similar to others posted herein, shows old data from Kernel 3.12 as a reference and adds data from Kernel 3.15RC2. With Kernel 3.15RC2 the CPU 7 frequency never increases even though the load gets as high as 99%.
The sleep frequency was fixed at 200 Hertz for this test. Meaning, for a 99% load, CPU 7 is busy for 4.95 milliseconds and idle for 0.05 milliseconds.
Note: I do not know when this change occurred, but I tried Kernel 3.13 and it seems similar to kernel 3.15RC2 for this. It is unlikely that this has anything to do with truncation; it is just that I had similar graphs already in this bug report.
There will be one more graph...
Created attachment 133251 [details]
Sleep / load frequency sweep from 2 to 250 Hertz. Kernel 3.12 and 3.15RC2
In this graph CPU 7 load was always 85%, however the sleep / load frequency was swept from 2 to 250 hertz. At 2 hertz, CPU 7 is busy for 425 milliseconds and idle for 75 milliseconds (we might expect to see some CPU frequency oscillations at this low sleep/work frequency). At 250 hertz, CPU 7 is busy for 3.4 milliseconds and idle for 0.6 milliseconds.
Note the dramatic difference in sleep/work frequency response between kernel 3.12 and 3.15 RC2.
I don't see the suggested patch upstream or in linux-next, so it seems that Resolved/Code_fix isn't the right state for this report. Re-opening, though I'd not be surprised if this one gets explained and then closed as Documented...

Is the issue of rounding vs truncation of the target frequency in the driver independent of the use of the min_perf_pct and max_perf_pct sysfs interface? If that is the case, it seems that intel_pstate might be choosing "the next lower P state" more often than if it had the luxury of actually using floating point arithmetic. Is that what this report is about? If yes, that is an interesting realization, and perhaps a good suggestion for optimization.

If that is not the case, and this is about the precision of control using the sysfs interface, then I think you got what you got. For better or for worse, it is defined in terms of percent, not in terms of tenths or hundredths of a percent, so high precision on the right side of the decimal place is not implied. If there is a need for more precise control via sysfs, please share the use-case.

Hi Len,

The intel_pstate driver has changed a lot since I entered this bug report. Extra bits are maintained throughout. Rounding was added, but in the end it was of a different form than in the example given in this bug report. In the end, I myself was O.K. with this one being closed as resolved. My memory is vague, but I believe I re-did several of the tests and was O.K. with the results. Yes, my original concerns were about struggling to get to the max P state, and about servo response oddities that might result from digital anomalies. (Some of my concerns turned out to be unfounded.)
Created attachment 113151 [details]
Shows rounded pstate and current actual frequency Vs. requested.

The Intel P state driver seems to truncate its calculations to the lower integer P state. The suggestion is that it should round to the nearest P state. Recent (kernel 3.12RC7) math improvements have made achieving 100% frequency better, but rounding would make it more robust. The attachment demonstrates this.