Bug 23551

Summary: Export ACPI _PSS to i2c/sensors
Product: Drivers Reporter: Artem S. Tashkinov (aros)
Component: Hardware MonitoringAssignee: Guenter Roeck (linux)
Status: RESOLVED WILL_NOT_FIX    
Severity: enhancement CC: acpi-bugzilla, anton.kochkov, fenghua.yu, jdelvare, lenb, linux, rui.zhang, zheng.z.yan
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.11 Subsystem:
Regression: No Bisected commit-id:
Bug Depends on:    
Bug Blocks: 56331    

Description Artem S. Tashkinov 2010-11-22 20:56:22 UTC
New Intel Core (i3/i5/i7) CPUs have an internal sensor which contains the information about the current power draw.

Please, export this information to i2c/sensors.
Comment 1 Len Brown 2010-11-30 03:21:33 UTC
The sensor used to monitor current power draw in Nehalem
for purposes of controlling turbo mode is not visible to software.

On the upcoming Sandy Bridge generation of processors, there
are energy counters, and Rui is working to export those counters
via perf.
Comment 2 Artem S. Tashkinov 2010-11-30 08:15:53 UTC
Len, most Windows monitoring utilities (like Hardware Monitor, Everest and others) show power draw of the mentioned CPUs and and even some Core 2 CPUs.

How do they manage to pull this trick?
Comment 3 Zhang Rui 2012-01-18 02:25:18 UTC
Ping Len...
Comment 4 Artem S. Tashkinov 2012-05-10 10:02:47 UTC
Ping everyone :)
Comment 5 Zhang Rui 2012-05-24 07:47:17 UTC
Len, I think you mean the RAPL stuff, right?
RAPL will not go to perf any more.
and actually, I'm not sure what we can do using this feature in kernel.
but there is a patch about this feature in case you are interested.
https://github.com/razvanlupusoru/Intel-RAPL-via-Sysfs/blob/fa283b4d955dff7661b3c9c8de017d7f76209c8c/intel_rapl_power.c
Comment 6 Len Brown 2012-10-30 02:10:02 UTC
> New Intel Core (i3/i5/i7) CPUs have an internal sensor which contains the
> information about the current power draw.

I'm not aware of a way to get to this information using open source software.

> I think you mean the RAPL stuff, right?

We can print the RAPL power meter from user-space
without any kernel modification, as they are simply MSRs.

This is a feature request for a feature that doesn't exist,
rather than a bug against the kernel, and so this report is now closed.
Comment 7 Artem S. Tashkinov 2012-10-30 18:22:29 UTC
(In reply to comment #6)
> > New Intel Core (i3/i5/i7) CPUs have an internal sensor which contains the
> > information about the current power draw.
> 
> I'm not aware of a way to get to this information using open source software.
> 
> > I think you mean the RAPL stuff, right?
> 
> We can print the RAPL power meter from user-space
> without any kernel modification, as they are simply MSRs.
> 
> This is a feature request for a feature that doesn't exist,
> rather than a bug against the kernel, and so this report is now closed.

I've never seen such a confusing "resolution" of a bug report.

lm-sensors don't show this information and they have no way of getting it since, according to my interpretation of your words, it involves querying MSR registers which are not available to userspace applications unless you load MSR kernel modules.

So, are you basically saying "We don't want to export this information, let lm-sensors guy implement this feature?".

If so, please, reassign this feature request instead of closing it. Your attitude is kinda insulting - Windows users have been able to fetch and see this information using various utilities for three years already and instead of offering ways of implementing it you say "it's none of my business, let's close it".

I feel like it's somewhat unprofessional and surely insulting.
Comment 8 Jean Delvare 2012-10-30 20:16:33 UTC
If the power values can be queried through MSRs, then we can have a kernel driver implementing the standard hwmon sysfs interface for libsensors to consume. We could even add this to the coretemp driver. Just because MSRs can be read from user-space (by root only!) is not a valid excuse for not doing it in a kernel driver, as this is how hardware monitoring is implemented on Linux.

Where are these MSRs documented?

Unfortunately these MSRs seem to have been introduced after Nehalem-EP so I will have no easy way to test.
Comment 9 Artem S. Tashkinov 2012-10-30 20:52:44 UTC
(In reply to comment #8)
> If the power values can be queried through MSRs, then we can have a kernel
> driver implementing the standard hwmon sysfs interface for libsensors to
> consume. We could even add this to the coretemp driver. Just because MSRs can
> be read from user-space (by root only!) is not a valid excuse for not doing
> it
> in a kernel driver, as this is how hardware monitoring is implemented on
> Linux.
> 
> Where are these MSRs documented?
> 
> Unfortunately these MSRs seem to have been introduced after Nehalem-EP so I
> will have no easy way to test.

http://download.intel.com/products/processor/manual/325462.pdf

Vol. 3B 14-19: 14.7.1 RAPL Interfaces - but I'm not sure that's what we need.
Comment 10 Zhang Rui 2012-10-31 02:46:46 UTC
we have the code to export these energy consumption values via per tool.
Zheng,
what is the status of the code?
Comment 11 Artem S. Tashkinov 2012-12-18 13:57:22 UTC
Len Brown himself has added this feature into power tools:

http://lkml.indiana.edu/hypermail/linux/kernel/1212.2/00553.html

[git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git next]

Alas, power tools need the MSR kernel module loaded and root permissions to run.
Comment 12 Guenter Roeck 2012-12-19 16:23:38 UTC
coretemp seems to be the right place for this.

The MSR counters provide energy readings. Those can of course be converted into power (or displayed as energy), but since the counters can wrap every 60 seconds or so that means we would have to run a periodic activity (maybe once a second or so) to collect the values. Is everyone ok with this ? If so I can give it a shot  - I have both SandyBridge and IvyBridge CPUs to test it with.
Comment 13 Guenter Roeck 2012-12-22 01:32:43 UTC
I submitted a set of patches to the lm-sensors mailing list. If anyone out there is interested, review and test feedback would be very much appreciated.
Comment 14 Artem S. Tashkinov 2013-02-10 13:40:07 UTC
(In reply to comment #13)
> I submitted a set of patches to the lm-sensors mailing list. If anyone out
> there is interested, review and test feedback would be very much appreciated.

Thank you!

Guenter,

I wonder if you could export turbo boost CPU frequency via sensors as well. There's a utility which fetches and shows this information (kernel/tools/power/x86/turbostat) - but it's far from handy since you should have your terminal window open or write scripts to pass this data to monitoring applications.

Thank you.
Comment 15 Jean Delvare 2013-02-10 13:46:53 UTC
I fail to see how turbo boost fits into hardware monitoring.
Comment 16 Guenter Roeck 2013-02-10 14:01:07 UTC
Same here.

Besides, there does not seem to be much if any real interest in this feature. There has been no feedback whatsoever to my patch. Maybe this bug should be closed.
Comment 17 Artem S. Tashkinov 2013-02-10 17:13:16 UTC
(In reply to comment #16)
> Same here.
> 

Does kernel 3.8 include those patches or they haven't been merged yet? If they are merged I can test this new feature right away. And don't be deceived by the seemingly absent response from the testers - very few average people are subscribed to the lm-sensors mailing lists, so it's more than expected.

> Besides, there does not seem to be much if any real interest in this feature.
> There has been no feedback whatsoever to my patch. Maybe this bug should be
> closed.

As for CPU turbo frequency - it relates to TDP quite directly, so by observing it you can estimate your power draw and I believe it's quite important information to have. If you are against the inclusion of this feature, I will keep my mouth shut then.
Comment 18 Jean Delvare 2013-02-10 17:26:22 UTC
(In reply to comment #17)
> As for CPU turbo frequency - it relates to TDP quite directly, so by
> observing
> it you can estimate your power draw (...)

No, you can't. It's only one of the many factors which determine how many power your CPU draws. And AFAIK CPU turbo frequency is only possible on one core when the other cores are idle, so the overall power draw isn't necessarily higher when turbo frequency is used.
Comment 19 Guenter Roeck 2013-02-10 17:37:33 UTC
I'll need code review and testing feedback before the patch will make it upstream. I can provide a standalone driver for testing if needed.
Comment 20 Artem S. Tashkinov 2013-02-10 18:52:26 UTC
(In reply to comment #19)
> I'll need code review and testing feedback before the patch will make it
> upstream. I can provide a standalone driver for testing if needed.

That would be great if it's not too much of a trouble for you.

(In reply to comment #18)
> 
> No, you can't. It's only one of the many factors which determine how many
> power
> your CPU draws. And AFAIK CPU turbo frequency is only possible on one core
> when
> the other cores are idle, so the overall power draw isn't necessarily higher
> when turbo frequency is used.

Actually all CPU cores can be in a turbo boost mode simultaneously (especially if your relax TDP restrictions and enable overclocking in your BIOS):

./turbostat
cor CPU    %c0  GHz  TSC    %c1    %c3    %c6    %c7   %pc2   %pc3   %pc6   %pc7
         82.82 3.43 3.31   3.96   1.39  11.82   0.00   0.00   0.00   0.00   0.00
  0   0  86.16 3.43 3.31   0.38   0.00  13.45   0.00   0.00   0.00   0.00   0.00
  1   1  75.23 3.43 3.31   7.42   2.93  14.42   0.00
  2   2  79.84 3.44 3.31   7.12   2.46  10.57   0.00
  3   3  90.06 3.44 3.31   0.92   0.17   8.84   0.00

Normal maximum for my CPU is 3.3GHz - so, 3.45Ghz obviously represents a turbo mode.
Comment 21 Guenter Roeck 2013-02-10 19:09:49 UTC
The standalone driver is available at https://github.com/groeck/coretemp
or "wget https://github.com/groeck/coretemp/archive/master.tar.gz".
Comment 22 Artem S. Tashkinov 2013-02-10 20:32:46 UTC
(In reply to comment #21)
> The standalone driver is available at https://github.com/groeck/coretemp
> or "wget https://github.com/groeck/coretemp/archive/master.tar.gz".

With your Makefile I couldn't compile it, so I just replaced coretemp.c from Linux 3.7 and off we go.

I've been monitoring `watch sensors` for an hour already and everything seems totally fine. The only nuisance is that I've no idea how to make gkrellm show these two values (Watts and Joles) as this little utility knows only about voltages, fans and temperatures.

So,

Tested-by: Artem S. Tashkinov

and thank you for you work!

P.S. I still hope you will reconsider monitoring the real CPU frequency as running console utilities for this task seems totally counterproductive. If you don't want to waste your time implementing it I can try to find people who will do that (I'm not really a programmer) but I need to know if you agree to merge it.
Comment 23 Guenter Roeck 2013-02-10 22:45:17 UTC
Now we'll have to figure out how this behaves across suspend/restore.

Regarding frequencies, implementation is not the issue. The hardware monitoring subsystem monitors environmental data, not CPU frequencies. Even if someone would implement frequency monitoring into the coretemp driver, both Jean and myself would reject it as out of scope.
Comment 24 Artem S. Tashkinov 2013-02-10 23:37:11 UTC
(In reply to comment #23)
> Now we'll have to figure out how this behaves across suspend/restore.
> 
> Regarding frequencies, implementation is not the issue. The hardware
> monitoring
> subsystem monitors environmental data, not CPU frequencies. Even if someone
> would implement frequency monitoring into the coretemp driver, both Jean and
> myself would reject it as out of scope.

To my knowledge Linux has quite a limited number of APIs to allow user space applications to read hardware related information, and since the coretemp driver has all the necessary code, it makes perfect sense to implement this feature right here, as reimplementing it differently means perpetuating the NIH syndrome which often plagues open source projects.

I'm almost sure that the inclusion of this code won't make the driver less pretty or more difficult to maintain, at the same time users who don't want to see this information can easily disable its monitoring via a standard means (e.g. /etc/sensors.d/*conf).

In short, it's not just nice to have this functionality here, it's technically feasible and appropriate even though it's out of the scope (but the scope quite often exists only in our heads).

Ideally, this information belongs to /proc/cpuinfo or/and /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq but

/proc/cpuinfo is a very simple limited interface, and I'm almost sure MSR polling for this interface will be outright rejected by Linus

and

cpufreq subsystem is unaware of turbo boost and CPUfreq developers have stated numerous times that turbo boost is beyond their scope.

So, we are back to running (as root) the turbostat utility?

That's ugly.

I wonder if you have any other suggestions as to how this information must be exported to user space. However mind that even in Windows turbo boost stats are reported by ... hardware monitoring applications like hwinfo.
Comment 25 Jean Delvare 2013-02-11 07:21:53 UTC
So you perfectly know that turbo boost belongs to the cpufreq subsystem, and you tried to get it there already, but it was rejected, so you try somewhere else?

If turbo boost belongs somewhere in the kernel, that's in the cpufreq subsystem. hwmon has nothing to do with frequencies.
Comment 26 Artem S. Tashkinov 2013-02-11 08:27:54 UTC
(In reply to comment #25)
> So you perfectly know that turbo boost belongs to the cpufreq subsystem, and
> you tried to get it there already, but it was rejected, so you try somewhere
> else?

No, actually I haven't tried, I'm speaking from what I've read.

> If turbo boost belongs somewhere in the kernel, that's in the cpufreq
> subsystem. hwmon has nothing to do with frequencies.

The cpufreq subsystem is not a hardware monitoring subsystem unfortunately. Anyways, this argument isn't going anywhere, so I'll keep using the turbostat utility as kernel developers cannot reach agreement on this petty topic. I've got no hard feelings, I just want to sigh.
Comment 27 Artem S. Tashkinov 2013-02-11 08:30:49 UTC
(In reply to comment #23)
> Now we'll have to figure out how this behaves across suspend/restore.
 
It seems to be working perfectly here - I'm not sure about the reported values though:

sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +29.0°C  (high = +80.0°C, crit = +99.0°C)
Core 0:        +27.0°C  (high = +80.0°C, crit = +99.0°C)
Core 1:        +26.0°C  (high = +80.0°C, crit = +99.0°C)
Core 2:        +26.0°C  (high = +80.0°C, crit = +99.0°C)
Core 3:        +25.0°C  (high = +80.0°C, crit = +99.0°C)
Pkg 0 power:    5.93 W
Pkg 0 energy: 196.71 kJ

Yesterday energy was above 100k so everything seems normal.
Comment 28 Guenter Roeck 2013-02-25 02:50:34 UTC
New standalone version of the coretemp driver is available at https://github.com/groeck/coretemp. Output now looks as follows.

SandyBridge, idle:

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +32.0°C  (high = +80.0°C, crit = +98.0°C)
Core 0:         +21.0°C  (high = +80.0°C, crit = +98.0°C)
Core 1:         +26.0°C  (high = +80.0°C, crit = +98.0°C)
Core 2:         +31.0°C  (high = +80.0°C, crit = +98.0°C)
Core 3:         +26.0°C  (high = +80.0°C, crit = +98.0°C)
Pkg power:       7.22 W  (max =  95.00 W, cap =  95.00 W)
Core power:    935.00 mW (max =  95.00 W)
Pkg energy:     43.81 kJ
Core energy:    48.28 kJ

SandyBridge, under load:

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +82.0°C  (high = +80.0°C, crit = +98.0°C)
Core 0:         +77.0°C  (high = +80.0°C, crit = +98.0°C)
Core 1:         +80.0°C  (high = +80.0°C, crit = +98.0°C)
Core 2:         +82.0°C  (high = +80.0°C, crit = +98.0°C)
Core 3:         +78.0°C  (high = +80.0°C, crit = +98.0°C)
Pkg power:      68.79 W  (max =  95.00 W, cap =  95.00 W)
Core power:     64.05 W  (max =  95.00 W)
Pkg energy:     54.10 kJ
Core energy:    56.86 kJ

IvyBridge, idle:

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +29.0°C  (high = +85.0°C, crit = +105.0°C)
Core 0:         +28.0°C  (high = +85.0°C, crit = +105.0°C)
Core 1:         +26.0°C  (high = +85.0°C, crit = +105.0°C)
Core 2:         +24.0°C  (high = +85.0°C, crit = +105.0°C)
Core 3:         +30.0°C  (high = +85.0°C, crit = +105.0°C)
Pkg power:       6.33 W  (max =  77.00 W, cap = 1000.00 W)
Core power:    545.00 mW  (max =  77.00 W, cap =   1.20 kW)
Pkg energy:     75.98 kJ
Core energy:     8.43 kJ

The large cap on IvyBridge is not a bug; torbostat reports the same values.

Key changes:
- report core power consumption
- dropped cap min and cap_max attributes
- Report cap only if enabled
- power_max now reports TDP

TODO before the code can be sent upstream:
- code review
- testing with suspend/resume and power consumption impact on laptops
Comment 29 Jean Delvare 2013-02-25 07:39:57 UTC
What do "Core power" and "Core energy" refer to, given that the CPU has 4 cores? I would expect 4 entries of each, one per core.

How can "Core energy" be more than "Pkg energy"?

WRT laptop testing, I should receive a new laptop at work within one month and it is IvyBridge-based, so I can do some testing then.
Comment 30 Artem S. Tashkinov 2013-02-25 07:53:43 UTC
(In reply to comment #28)

Working nicely here (Intel Core i5 2500), thank you.

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +34.0°C  (high = +80.0°C, crit = +99.0°C)
Core 0:        +30.0°C  (high = +80.0°C, crit = +99.0°C)
Core 1:        +31.0°C  (high = +80.0°C, crit = +99.0°C)
Core 2:        +31.0°C  (high = +80.0°C, crit = +99.0°C)
Core 3:        +29.0°C  (high = +80.0°C, crit = +99.0°C)
Pkg power:      5.42 W
Core power:     1.03 W (never goes below 1W, I guess your CPU is less powerful)
Pkg energy:    30.52 kJ
Core energy:   10.72 kJ

> power consumption impact on laptops

I don't have a laptop.

> testing with suspend/resume

Will test soon after I switch back to MBR (UEFI suspend is broken here).
Comment 31 Guenter Roeck 2013-02-25 08:23:35 UTC
(In reply to comment #29)
> What do "Core power" and "Core energy" refer to, given that the CPU has 4
> cores? I would expect 4 entries of each, one per core.
> 
Unfortunately the CPU only reports one value for all cores combined, not per CPU. You can see that clearly when the load is high.

> How can "Core energy" be more than "Pkg energy"?
> 
The initial energy reading starts with the current value in the registers. The counter can wrap every 60 seconds, and the pkg and core counters wrap at different times. This can result in the initially higher value for the core energy.

> WRT laptop testing, I should receive a new laptop at work within one month
> and
> it is IvyBridge-based, so I can do some testing then.

Great. Thanks!
Comment 32 Guenter Roeck 2013-02-25 08:31:29 UTC
(In reply to comment #30)
> (In reply to comment #28)
> 
> Working nicely here (Intel Core i5 2500), thank you.
> 
thanks for the feedback.

> coretemp-isa-0000
> Adapter: ISA adapter
> Physical id 0: +34.0°C  (high = +80.0°C, crit = +99.0°C)
> Core 0:        +30.0°C  (high = +80.0°C, crit = +99.0°C)
> Core 1:        +31.0°C  (high = +80.0°C, crit = +99.0°C)
> Core 2:        +31.0°C  (high = +80.0°C, crit = +99.0°C)
> Core 3:        +29.0°C  (high = +80.0°C, crit = +99.0°C)
> Pkg power:      5.42 W
> Core power:     1.03 W (never goes below 1W, I guess your CPU is less
> powerful)

Or more efficient. The 5+ years old CPU on my Windows PC takes much more power, but that doesn't mean it is more powerful. IvyBridge CPUs have a lower TDP than SandyBridge CPUs, yet are more powerful.
Comment 33 Jean Delvare 2013-02-25 08:44:07 UTC
(In reply to comment #31)
> Unfortunately the CPU only reports one value for all cores combined, not per
> CPU. You can see that clearly when the load is high.

This is odd. But then shouldn't "Core power" be changed to "Cores power"?

Also, not sure how "Pkg power" can be so much higher than "Core power". What else in the CPU package is drawing significant power?

> The initial energy reading starts with the current value in the registers.
> The
> counter can wrap every 60 seconds, and the pkg and core counters wrap at
> different times. This can result in the initially higher value for the core
> energy.

I see. Then I question the value of these measurements for user-space. I assume that the driver uses these values to compute the power figures, so the energy values are not only confusing but also redundant. Don't you think exposing the power values would be sufficient and less confusing?
Comment 34 Guenter Roeck 2013-02-25 14:45:10 UTC
Graphics(In reply to comment #33)
> (In reply to comment #31)
> > Unfortunately the CPU only reports one value for all cores combined, not
> per
> > CPU. You can see that clearly when the load is high.
> 
> This is odd. But then shouldn't "Core power" be changed to "Cores power"?
> 
Ok, no problem.

> Also, not sure how "Pkg power" can be so much higher than "Core power". What
> else in the CPU package is drawing significant power?
> 
Graphics, most likely, in my case. On servers with no graphics it might be the DRAM controller.

Note that I don't report those since it would require restructuring the entire driver. The code is already a bit kludgy, as the core cpower is reported with CPU 0. If someone takes CPU 0 offline, the core power/attributes would disappear. So all power/energy should really be reported through the package instance, but if I do that I would have more than one power attribute associated with it, which would make things a bit difficult.

Question is if I should rearrange the code and report all power domains. Thoughts on that ?

> > The initial energy reading starts with the current value in the registers.
> The
> > counter can wrap every 60 seconds, and the pkg and core counters wrap at
> > different times. This can result in the initially higher value for the core
> > energy.
> 
> I see. Then I question the value of these measurements for user-space. I
> assume
> that the driver uses these values to compute the power figures, so the energy
> values are not only confusing but also redundant. Don't you think exposing
> the
> power values would be sufficient and less confusing?

No, because power is instantaneous and energy is cumulative. Both values are important: With one the user can see how much energy the chip consumed since the driver was loaded, and with the other how much power it consumes now.

We have a couple of options: I can start the count from the time the driver is loaded (which would miss some energy, but be a one-liner), I could add a wraparound value to the pkg energy if the core energy reading is higher (a bit complicated), or I could leave it is as.
Comment 35 Guenter Roeck 2013-02-26 06:25:34 UTC
Latest changes: Report graphics controller and DRAM controller power and energy consumption if supported. Add support for suspend/resume (turns out that was necessary).

Output:

SandyBridge:

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:    +31.0°C  (high = +80.0°C, crit = +98.0°C)
Core 0:           +21.0°C  (high = +80.0°C, crit = +98.0°C)
Core 1:           +29.0°C  (high = +80.0°C, crit = +98.0°C)
Core 2:           +31.0°C  (high = +80.0°C, crit = +98.0°C)
Core 3:           +27.0°C  (high = +80.0°C, crit = +98.0°C)
Pkg power:         3.60 W  (max =  95.00 W, cap =  95.00 W)
Cores power:     407.00 mW
Graphics power:  251.00 mW
Pkg energy:       48.30 kJ
Cores energy:      7.02 kJ
Graphics energy:   3.34 kJ

IvyBridge:

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:    +29.0°C  (high = +85.0°C, crit = +105.0°C)
Core 0:           +27.0°C  (high = +85.0°C, crit = +105.0°C)
Core 1:           +28.0°C  (high = +85.0°C, crit = +105.0°C)
Core 2:           +26.0°C  (high = +85.0°C, crit = +105.0°C)
Core 3:           +28.0°C  (high = +85.0°C, crit = +105.0°C)
Pkg power:         4.04 W  (max =  77.00 W, cap = 1000.00 W)
Cores power:     818.00 mW  (cap =   1.20 kW)
Graphics power:    0.00 W  (cap =   1.20 kW)
Pkg energy:        3.93 kJ
Cores energy:    493.78 J
Graphics energy:  14.64 J

IvyBridge graphics power is quite low and zero if the console is in sleep mode.

turbostat reports the same values.
Comment 36 Jean Delvare 2013-02-26 07:47:20 UTC
(In reply to comment #34)
> (...) The code is already a bit kludgy, as the core power is reported with
> CPU 0. If someone takes CPU 0 offline, the core power/attributes would
> disappear.

This is no good. It took quite some time to get the coretemp driver to properly support CPU removal, I would be sad to see it go wrong again.

> So all power/energy should really be reported through the package
> instance, but if I do that I would have more than one power attribute
> associated with it, which would make things a bit difficult.
> 
> Question is if I should rearrange the code and report all power domains.
> Thoughts on that ?

I honestly don't see how hard this can be. Having multiple power attributes should be no harder than having multiple voltage or temperature attributes, many drivers do that and libsensors supports it for years.

You are adding a new feature to the driver, let's do it right.

> No, because power is instantaneous and energy is cumulative. Both values are
> important: With one the user can see how much energy the chip consumed since
> the driver was loaded, and with the other how much power it consumes now.

I don't get it. If the energy counters are wrapping then they tell how much energy was consumed since a random point in time, and this random point is different for each counter. I see no value in this information.

> We have a couple of options: I can start the count from the time the driver
> is
> loaded (which would miss some energy, but be a one-liner), I could add a
> wraparound value to the pkg energy if the core energy reading is higher (a
> bit
> complicated), or I could leave it is as.

I'm a bit lost, this would probably be better discussed through a patch review than in bugzilla. Please post an updated version of your patch and I'll review it.
Comment 37 Guenter Roeck 2013-02-26 15:56:40 UTC
(In reply to comment #36)
> (In reply to comment #34)
> > (...) The code is already a bit kludgy, as the core power is reported with
> > CPU 0. If someone takes CPU 0 offline, the core power/attributes would
> > disappear.
> 
> This is no good. It took quite some time to get the coretemp driver to
> properly
> support CPU removal, I would be sad to see it go wrong again.
> 
Agreed.

> > So all power/energy should really be reported through the package
> > instance, but if I do that I would have more than one power attribute
> > associated with it, which would make things a bit difficult.
> > 
> > Question is if I should rearrange the code and report all power domains.
> > Thoughts on that ?
> 
> I honestly don't see how hard this can be. Having multiple power attributes
> should be no harder than having multiple voltage or temperature attributes,
> many drivers do that and libsensors supports it for years.
> 
> You are adding a new feature to the driver, let's do it right.
> 
Ok. Already done, actually.

> > No, because power is instantaneous and energy is cumulative. Both values
> are
> > important: With one the user can see how much energy the chip consumed
> since
> > the driver was loaded, and with the other how much power it consumes now.
> 
> I don't get it. If the energy counters are wrapping then they tell how much
> energy was consumed since a random point in time, and this random point is
> different for each counter. I see no value in this information.
> 
Not really true. Wrapping is only a problem if the driver is not loaded when the system boots. Hopefully loading it during boot is the normal way of operation. Even when loaded later I personally find the information quite useful.

> > We have a couple of options: I can start the count from the time the driver
> is
> > loaded (which would miss some energy, but be a one-liner), I could add a
> > wraparound value to the pkg energy if the core energy reading is higher (a
> bit
> > complicated), or I could leave it is as.
> 
> I'm a bit lost, this would probably be better discussed through a patch
> review
> than in bugzilla. Please post an updated version of your patch and I'll
> review
> it.

Ok.
Comment 38 Artem S. Tashkinov 2013-03-11 04:47:12 UTC
So, is the driver ready? I fail to see it included into 3.9 :(
Comment 39 Jean Delvare 2013-03-11 07:08:59 UTC
Not yet, it is pending my review and I have no idea when I will find the time for it. I suspect this won't happen before I receive my new laptop, in a few weeks.
Comment 40 Artem S. Tashkinov 2013-06-27 11:06:32 UTC
May I interrupt you? ;-)
Comment 41 Jean Delvare 2013-06-27 11:33:36 UTC
I have my new laptop meanwhile, but did not find the time to test the updated coretemp driver on it yet, sorry.
Comment 42 Artem S. Tashkinov 2013-09-05 11:38:36 UTC
Any updates?
Comment 43 Guenter Roeck 2013-12-04 16:35:46 UTC
Won't fix; the functionality overlaps with other functionality introduced into the kernel in the meantime, and I was unable to get review or test feedback other than my own on the outstanding patch.
Comment 44 Artem S. Tashkinov 2013-12-04 16:52:49 UTC
(In reply to Guenter Roeck from comment #43)
> Won't fix; the functionality overlaps with other functionality introduced
> into the kernel in the meantime, and I was unable to get review or test
> feedback other than my own on the outstanding patch.

How can I get this information then? Reboot into Windows? What's the point of trying to create a Windows replacement then?
Comment 45 Guenter Roeck 2013-12-05 02:43:39 UTC
Through the Intel powerclamp driver, which also exposes the power average through turbostat.

https://www.linux.com/news/hardware/drivers/667190-intel-introduces-powerclamp-driver-for-linux