On our 11th to 13th gen Framework Laptops, we see high power usage when the i2c-hid connected touchpad is in use. We suspect this bug will also impact other Intel systems using this type of touchpad. We would like to see if the kernel could group the interrupts that service the touchpad so that both the ACPI GPIO interrupt and the designware i2c interrupt are bound to the same core, avoiding waking two separate cores whenever the touchpad is in use and reducing power consumption. Or the kernel maintainers may have a better idea for improving this.

Background: when the touchpad is in use, the interaction with the CPU follows this sequence: the touchpad generates a GPIO interrupt to the CPU, and the CPU then performs a series of i2c reads using the i2c controller in master mode to retrieve an input report from the touchpad. The touchpad generates GPIO interrupts at a rate of around 140 interrupts/second, while the Intel designware-i2c driver generates interrupts at a rate of around 1000 interrupts/second.

Intel Premier Support agents asked us to open a bug here to track this issue.

Reproduction steps:
1. Boot to a GUI environment and let the system idle, e.g. a Fedora desktop with no applications running.
2. Run a program like s-tui as root to measure the CPU / Psys power consumption. Observe the power consumption of the system for several seconds without touching the touchpad.
3. Touch the touchpad and move your finger around; note that the system package power consumption increases by about 2W.

We measured the touchpad itself, and it does not cause a measurable increase in system power consumption. Additionally, the s-tui results show the additional power consumption is coming from the CPU.

I looked at this using Fedora, and it seems the GPIO interrupt is pinned to one core while the i2c interrupts are pinned to a separate core. This means every time the touchpad fires an interrupt, both a high-performance core and an efficiency core have to wake up to service the touchpad. These are probably not even in the same cluster, so there may be a lot of cache evictions etc. as both core clusters power on and off and caches are cleared and refilled.

What core type are touchpad interrupts handled by? The second thing was to find out which cores the interrupts are handled on:

cat /proc/interrupts
watch -n 1 "cat /proc/interrupts | grep designware"
watch -n 1 "cat /proc/interrupts | grep PIXA"

      CPU0 CPU1 CPU2    CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15    CPU16 CPU17 CPU18 CPU19
 43:  0    0    0       0    0    0    0    0    0    0    0     0     0     0     0     43694516 0     0     0     0     IR-IO-APIC 43-fasteoi  i2c_designware.2, idma64.2
135:  0    0    1711105 0    0    0    0    0    0    0    0     0     0     0     0     0        0     0     0     0     intel-gpio 3  PIXA3854:00

Move the touchpad and you will see the interrupt counts increasing on both. However, the PIXA ACPI GPIO interrupt is pinned to a high-performance core, while the i2c-designware interrupt is pinned to an efficiency core.

Let's pin them both to the same core, CPU2:

# echo 00004 > /proc/irq/43/smp_affinity
# echo 00004 > /proc/irq/135/smp_affinity    # DOESN'T WORK, I/O error?

Power consumption when using the touchpad seems to drop from about 7W to 5W! (Baseline is 4W.)

Background is here: https://community.frame.work/t/tracking-touchpad-interrupts-battery-usage-issues-idma64-2/13630/35
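For reference, here is a minimal userspace sketch of the pinning experiment above. The IRQ numbers 43 and 135 are the ones from this machine and will differ elsewhere; this is only an illustration of the /proc interface, run as root, and the write can fail with an I/O error where the kernel refuses to change an IRQ's affinity (as seen with the GPIO IRQ here).

#include <stdio.h>

/* Write a hex CPU mask to /proc/irq/<irq>/smp_affinity. */
static int pin_irq(int irq, const char *mask)
{
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		return -1;
	}
	/* May fail (e.g. EIO) if the kernel does not allow changing
	 * this IRQ's affinity. */
	if (fprintf(f, "%s\n", mask) < 0)
		perror(path);
	return fclose(f);
}

int main(void)
{
	/* CPU2 => mask 0x4; pin both the designware IRQ and the
	 * touchpad GPIO IRQ there (numbers taken from the report). */
	pin_irq(43, "4");
	pin_irq(135, "4");
	return 0;
}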
As a first step your suggestion sounds good to me, but I think the common denominator here is the I2C-HID module, which might have the information about both the actual I2C controller and the GPIO interrupt. Hence I'm wondering if this can somehow be managed there. Benjamin?
+ Hans for gathering ideas. Hans, I believe you have a lot of experience with userspace management tools / UIs and may bring a good point to this bug report. I'm now wondering whether userspace can actually do what the reporter suggested, because in the kernel it might not be so easy to solve, while userspace could do it with ease.
First of all, thank you Kieran for filing this interesting bug report; there indeed seems to be a significant opportunity here to save a nice amount of power when the touchpad is in use.

I agree with Andy's take on this that it sounds like something which is best fixed inside the kernel, specifically inside i2c-hid core. I think this will need at least a few kernel patches / some extra kernel infra:

1. A patch adding:
   1.1 an irq field to struct i2c_adapter, with 0 meaning IRQ unknown
   1.2 an "int i2c_get_adapter_irq(struct i2c_client *client);" helper (can probably be a static inline, the client here just being a convenient way to get the adapter)
2. A patch setting adapter->irq for i2c_designware controllers
3. A patch for i2c-hid-core to check whether a non-zero adapter IRQ is configured and, if so, set the IRQ affinity for both the i2c-hid IRQ and the adapter IRQ to the same core.

It is unclear to me how to pick a core for this, though. Maybe just read the GPIO IRQ affinity and then apply the same affinity to the i2c-adapter IRQ?

Note I do not have any time / bandwidth to work on this myself, so hopefully the above rough sketch is helpful for someone else to implement something like this.
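A minimal sketch of how steps 1.2 and 3 might look, assuming the new adapter->irq field from step 1 (which does not exist upstream today) and reusing the existing irq_set_affinity_hint()/irq_get_affinity_mask() helpers. This is a rough illustration of the idea only, not a tested patch.

#include <linux/i2c.h>
#include <linux/interrupt.h>
#include <linux/irq.h>

/* Step 1.2: trivial accessor, assuming struct i2c_adapter grows an
 * 'irq' field (0 == unknown). */
static inline int i2c_get_adapter_irq(struct i2c_client *client)
{
	return client->adapter->irq;
}

/* Step 3: called from i2c-hid-core after the client (GPIO) IRQ has been
 * requested; mirrors the touchpad IRQ's affinity onto the I2C
 * controller IRQ so both are serviced by the same core. */
static void i2c_hid_align_irq_affinity(struct i2c_client *client)
{
	int adap_irq = i2c_get_adapter_irq(client);

	if (adap_irq > 0)
		irq_set_affinity_hint(adap_irq,
				      irq_get_affinity_mask(client->irq));
}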
Hey folks. I'm on a Thinkpad P1 gen 6, which also uses the i2c_designware controller, and I can confirm that this issue impacts it. My baseline is 1W (idling DE).

For comparison...
- ...moving the trackpoint around only increases it by around 2 watts. (3W)
- ...playing back FLAC audio via the pulseaudio API only increases it by around a single watt. (2W)
^ Forgot to say that trackpad usage increases power consumption by around 4 watts (5W). This means trackpad usage makes the CPU draw twice as much power as trackpoint usage.
I wonder if this is still better done in userspace? I mean, the kernel can guess the CPU, but its choice might be sub-optimal for the user; say the IRQs are put on a "little" core instead of a "big" core. I think one place where this could belong is the systemd hwdb, in the same way there are those autosuspend entries. If there is a known "optimal" configuration for a given combination, it could have its own rule there. Or something like that.
Created attachment 305853 [details]
Script for testing core pinning.

Values reflect an i7-13800H on a Thinkpad P1 Gen 6. Values depend on the trackpad vendor, platform/machine and CPU topology; make sure to change them if you're running this.

Also included: trackpad-pinning-report.tar.xz, with sample results.
Created attachment 305854 [details]
Test results for trackpad-test.sh.

CPU: Intel Core i7-13800H
Trackpad vendor: ELAN
Machine: Thinkpad P1 Gen 6
I made a quick and dirty script to test some different pinning configurations. I ran it under the following conditions:
• All radios (wifi, bluetooth, etc.) disabled.
• Logged into a TTY, no graphical session.
• Display manager turned off (systemctl stop gdm).
• Platform profile set to powersave.
• TTY blinking cursor disabled (/sys/class/graphics/fbcon/cursor_blink set to 0).
• Running on battery.
• powertop tunings applied.

This doesn't remove every independent factor, but should make the measurements somewhat more accurate. Here's what I observed:
• Every single combination tested consumed less power than the default one (where an e-core and a p-core were both in use).
• Doing literally anything other than the current default provides better results.
• Over 0.3W can be saved, even without userspace moving a cursor around.
• Measurements where only e-cores were used show slightly lower power draw compared to measurements with only p-cores (50 mW).
• No noticeable difference in input latency, at least from my subjective observation.
• Couldn't determine the best way of pinning to e-cores. I measured slightly better results when pinning to two e-cores in different clusters, but this might be caused by the difference in the number of interrupts. More testing needed.
Created attachment 305897 [details]
Test results for trackpad-test.sh. EPB set to balanced (via power-profiles-daemon).

CPU: Intel Core i7-13800H
Trackpad vendor: ELAN
Machine: Thinkpad P1 Gen 6
Created attachment 305898 [details]
Test results for trackpad-test.sh. EPB set to powersave (via power-profiles-daemon).

CPU: Intel Core i7-13800H
Trackpad vendor: ELAN
Machine: Thinkpad P1 Gen 6
I made a major breakthrough! There was a limitation in [power-profiles-daemon](https://gitlab.freedesktop.org/upower/power-profiles-daemon) which prevented it from setting the energy-performance bias if platform profiles were available. This was fixed with [MR #127](https://gitlab.freedesktop.org/upower/power-profiles-daemon/-/merge_requests/127). I was unaware of this until now.

I've retested with the new 0.20 release, which applies this fix, and the results are *very* apparent. Energy consumption decreases by over two watts with p-cores, going under 2W when the two IRQs are assigned to different P threads. On the other hand, the e-cores only see a slight power reduction, still being comfortably within the 4W range.

This shows that it's actually preferable to pin to P cores instead of E cores. I suspect this is because the P cores finish processing the interrupts sooner, which lets them idle deeper, resulting in the whole package idling for longer and therefore reducing power consumption.
If the plan is to handle this in userspace, intel-lpmd seems like the most appropriate place to do so: https://github.com/intel/intel-lpmd/issues/25
(In reply to José Relvas from comment #13)
> If the plan is to handle this in userspace, intel-lpmd seems like the most
> appropriate place to do so: https://github.com/intel/intel-lpmd/issues/25

lpmd is targeting Meteor Lake and newer; its primary goal is to move tasks which still run when the system is mostly idle to the 2 E-cores on the IO die and then power off the CPU die to save power.

Since we are talking about Alder Lake and Raptor Lake here, a more appropriate place would be thermald.
> lpmd is targeting Meteor Lake and newer; its primary goal is to move tasks
> which still run when the system is mostly idle to the 2 E-cores on the IO die
> and then power off the CPU die to save power.

That's correct, but it seems like Alder/Raptor Lake are supported as well. I gave it a quick test on my own machine and it pinned all processes to a single E-core cluster when usage was low enough. This was effective and reduced power consumption in some circumstances, such as video playback. I've found some documentation which implies core pinning (sometimes referred to as "core parking") was a strategy already used on Windows: https://www.intel.com/content/dam/develop/external/us/en/documents-tps/348851-optimizing-x86-hybrid-cpus.pdf

I might be missing something though. If thermald already has other energy-aware heuristics, it could be the right place to do this.
> That's correct, but it seems like Alder/Raptor Lake are supported as well. I
> gave it a quick test on my own machine and it pinned all processes to a
> single E-core cluster when usage was low enough.

Ah interesting, I did not know that. So I guess that lpmd will be enabled by distros on Alder Lake and Raptor Lake too. Yes, in that case lpmd might be a good candidate for dealing with this, assuming that lpmd upstream is willing to accept patches for something like this.
I've come to find out that this same issue affects AMD laptops in the same fashion.

As the interrupts are initially set up by the kernel drivers, and the kernel does have knowledge of how the hardware is arranged (it will or can know which CCD or core type is in use), it conceptually seems better to have the two drivers coordinate. I don't think a userspace hwdb approach is that great because it's going to be an ever-growing list of information that "should" be discoverable. Likewise, although this can be corrected by a daemon, I don't feel that's appropriate.

The GPIO controller interrupt is specified by ACPI, so I don't think it's worth moving that around. I do think it's better to try to coordinate where to put the interrupt for the I2C client. I can see that i2c_acpi_get_irq() gets the information either from ACPI by using i2c_acpi_add_irq_resource() or by trying acpi_dev_gpio_irq_wake_get()/acpi_dev_gpio_irq_wake_get_by(). I didn't see the interrupt specified in _CRS, so it should be using acpi_dev_gpio_irq_wake_get_by() to pick the interrupt. So I think it might be best to adjust some of the gpiolib logic to influence the interrupt selection.
Created attachment 306302 [details]
possible patch for amd systems

Here's one idea I have for this issue. Basically, add a to_irq callback in the GPIO controller driver so that the driver can figure out what affinity its own IRQ has and then apply the same affinity to the IRQ that is handed out to the I2C device. (This is only compile tested.)
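To illustrate the general shape of the idea, here is a rough sketch only, not the attached patch; "example_gpio" and its "parent_irq" field are hypothetical stand-ins for the driver's real private data and the controller interrupt it already stores at probe time.

#include <linux/gpio/driver.h>
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/irqdomain.h>

/* Hypothetical driver state; parent_irq is the GPIO controller's own
 * interrupt as obtained via platform_get_irq() at probe time. */
struct example_gpio {
	struct gpio_chip gc;
	int parent_irq;
};

static int example_gpio_to_irq(struct gpio_chip *gc, unsigned int offset)
{
	struct example_gpio *priv = gpiochip_get_data(gc);
	int virq = irq_find_mapping(gc->irq.domain, offset);

	if (virq > 0)
		/* Mirror the controller IRQ's affinity onto the child
		 * IRQ handed out to the I2C client (e.g. the touchpad). */
		irq_set_affinity_hint(virq,
				      irq_get_affinity_mask(priv->parent_irq));

	return virq;
}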
Besides the need to assign the interrupts to the right CPUs, I was wondering if it's feasible to reduce the number of interrupts the I2C HID generates... According to Framework, the underlying hardware only generates 140 interrupts per second...
I tested my patch on an F13 AMD and it seems to work. Kieran, would you be able to measure the power with this patch and confirm that it's fixed?

$ cat /proc/interrupts | grep "PIXA\|pinctrl"
  7:  0  0  0  0  0  3984  0  0  0  0  0  0  IR-IO-APIC  7-fasteoi  pinctrl_amd
 69:  0  0  0  0  0  3982  0  0  0  0  0  0  amd_gpio  8  PIXA3854:00

However, there is one side effect to work out, whether it's OK or not. This warning is now emitted when the GPIO controller is set up:

[ 0.485880] gpio gpiochip0: (AMDI0030:00): to_irq is redefined in gpiochip_irqchip_add_allocated_domain and you shouldn't rely on it
(In reply to Mario Limonciello (AMD) from comment #20)
> However, there is one side effect to work out, whether it's OK or not. This
> warning is now emitted when the GPIO controller is set up:
>
> [ 0.485880] gpio gpiochip0: (AMDI0030:00): to_irq is redefined in
> gpiochip_irqchip_add_allocated_domain and you shouldn't rely on it

Right, this approach is a PoC hack and must not be applied in this form. We should think about how, in general, to coordinate the different subsystems to pin the interrupt to the same CPU (core) as the GPIO controller itself.
Created attachment 306306 [details]
possible patch for amd systems (v2)

Right; I didn't think it was the correct solution to land. What do you think of this one? Or can you think of a better location?
(In reply to Mario Limonciello (AMD) from comment #22)
> Created attachment 306306 [details]
> possible patch for amd systems (v2)
>
> Right; I didn't think it was the correct solution to land. What do you
> think of this one?

Interesting. It may have two downsides:
1) it's a single-driver-specific solution (AMD in this case);
2) it puts all GPIO interrupts on a single CPU.
The second might have undesired effects and I'm not sure if it scales well.

> Or can you think of a better location?

The idea (as far as I remember the discussion) is to make it a consumer-based request, i.e. the I2C HID driver should be able to tell the IRQ framework that the affinity it wants is the same as the GPIO IRQ's.
> Interesting. It may have two downsides:
> 1) it's a single-driver-specific solution (AMD in this case);
> 2) it puts all GPIO interrupts on a single CPU.
> The second might have undesired effects and I'm not sure if it scales well.

Right; I think the same thing could pretty easily be done for the Intel drivers if we agree on this direction. At least in the AMD case irq_ack() is a no-op and the interrupts "really" are handled by the GPIO controller IRQ handler, so I 'think' it makes sense there as is.

> The idea (as far as I remember the discussion) is to make it a
> consumer-based request, i.e. the I2C HID driver should be able to tell the
> IRQ framework that the affinity it wants is the same as the GPIO IRQ's.

AFAICT they're not actually linked in any way. Neither the Intel nor the AMD driver uses IRQ_DOMAIN_HIERARCHY, so the GPIO core can't really discover the "relationship". But considering that irq_ack() is a no-op, I'm not sure making that conversion really makes sense.
Created attachment 306344 [details]
Possible patch to reduce i2c_dw interrupts

(In reply to José Relvas from comment #19)
> Besides the need to assign the interrupts to the right CPUs, I was wondering
> if it's feasible to reduce the number of interrupts the I2C HID generates...
>
> According to Framework, the underlying hardware only generates 140
> interrupts per second...

Tried this, which reduces the interrupts from around 2.8k/second when vigorously using the touchpad to ~280/second, with a 400-ish max.

Testing (using powerstat and RAPL, IRQs pinned to CPU14 already, base commit dd5a440a31f):
Idle baseline: psys ~8.7W, pkg-0 ~3.2W
Vigorously spamming the touchpad without the patch: psys ~12W, pkg-0 ~4.1W
Vigorously spamming the touchpad with the patch: psys ~10.7W, pkg-0 ~3.8W

I have no idea whether this is widely safe to do with respect to overflows and such, but even something like 8 bytes, or having a quirk, would be a lot better than the current behaviour of often reading ~2 bytes per interrupt. `rx_fifo_depth / 2` is 32 in this case, by the way.
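For readers without the attachment, the general direction is roughly the following. This is a sketch only, not the attached patch; it assumes the regmap-based FIFO setup in current i2c-designware-master.c, where the RX threshold is, as far as I can tell, normally written as 0 (interrupt after every received byte), and the safety questions above still apply.

/* Driver-internal header providing struct dw_i2c_dev, DW_IC_* register
 * offsets and the regmap handle. */
#include "i2c-designware-core.h"

static void i2c_dw_configure_fifo_master(struct dw_i2c_dev *dev)
{
	/* Configure Tx/Rx FIFO threshold levels */
	regmap_write(dev->map, DW_IC_TX_TL, dev->tx_fifo_depth / 2);
	/* Sketch of the change: raise the RX threshold (was 0) so
	 * RX_FULL fires once per batch of bytes rather than per byte. */
	regmap_write(dev->map, DW_IC_RX_TL, dev->rx_fifo_depth / 2);

	/* Configure the I2C master */
	regmap_write(dev->map, DW_IC_CON, dev->master_cfg);
}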
Created attachment 306345 [details]
Possible (broken) patch to optimize intel_pinctrl's gpio irq handler

Note that the patch is incomplete and breaks the touchpad after resuming from sleep. Also, this one may not be worth doing, but I'm throwing it out there anyway.

`perf top` when vigorously using the touchpad can show `intel_gpio_irq` at between 25-40% of time after a while, despite there being only ~140 interrupts/second. This is mostly in the reads done by `intel_gpio_community_irq_handler`. Testing shows only 2 of 15 pad groups are enabled, yet they are all read on every interrupt. Reading only from enabled pads (assuming they are only enabled/disabled from the driver; if this isn't true, the patch may be invalid) brings `intel_gpio_irq` down to around 5-8%, so there is an improvement there, but the battery impact seems quite small in my testing, so it's probably not worth it if that is the metric to go by.

Testing (using RAPL, IRQs pinned to CPU14 already, base commit dd5a440a31f + the i2c_dw patch):
Idle baseline: psys ~8.7W, pkg-0 ~3.2W
Vigorously spamming the touchpad without the patch: psys ~12W, pkg-0 ~4.1W
Vigorously spamming the touchpad with this and the i2c_dw patch: psys ~10.5W, pkg-0 ~3.7W

This may make writing to `GPI_IE` redundant, unless I'm missing something (which I very well may be).
> I have no idea whether this is widely safe to do with respect to overflows
> and such

Hmm, my understanding is that an overflow shouldn't happen as long as all the data in the queue is read.

> but the battery impact seems quite small in my testing, so it's probably not
> worth it if that is the metric to go by.

The only issue you've hit with this patch so far was resuming from sleep, right? If we can figure that out without making invasive changes, then it could be a nice-to-have, even if there's only a minor improvement :)