Created attachment 284029 [details] journalctl -k, tlp-stat, journalctl -b hello, i have this power management issue on my laptop with cpu frequency scaling. when i do an initial boot, everything works fine, i have tlp installed and my cpu frequency is correctly scaled as requested at minimum (800mhz) to prevent heating and keep longer battery. but, when i put the computer to sleep, after i wake it up, the cpu frequency scaling doesn't seem to work anymore. no matter what i try my cpu freq stays at 2.6ghz. i added kernel argument: GRUB_CMDLINE_LINUX_DEFAULT="text debug systemd.log_level=info" and then from terminal i did this just after a fresh boot: echo 'file intel_pstate.c +p' | sudo dd of=/sys/kernel/debug/dynamic_debug/control next, i did a sleep/wakeup cycle, and finaly i took the output of: journalctl -k --since='20 minutes ago' and i attached it to this report. my laptop is a macbookpro (12,1 early 2015) running ubuntu 19.04 as single OS. so, basicaly to sum it up, regarding to cpu scaling, everything works fine, until i do a sleep/wakeup cycle, then afterwards the cpu freq just won't scale anymore
How about put all non boot CPU offline and online, then check their cpufreq? Besides, do you set both the min, max to 5 before running S3?
I believe I have this same issue with the intel_pstate driver with a new HP Zbook x360 with the i9-8950HK Intel CPU. After some pause/resume states, I get two issues: 1) CPU frequencies across all threads go to 100% even when system load is close to 0%. 2) Some times the system will crash hard with a complete lockup (can't ssh to the machine remotely, current ssh sessions (doing a "tail -f /var/log/*log") hang ) as described in Bug# 200133. I'm seeing this in /var/log/kern.log around the time of the CPU going to max frequency. Sep 10 19:22:09 REDACTED kernel: [ 5.116718] intel_pstate: Intel P-state driver initializing Sep 10 19:22:09 REDACTED kernel: [ 5.116763] intel_pstate: Disabling energy efficiency optimization Details: $cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 158 model name : Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz stepping : 10 microcode : 0xb4 cpu MHz : 2400.000 cache size : 12288 KB physical id : 0 siblings : 12 core id : 0 cpu cores : 6 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology n onstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsave s dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs bogomips : 5808.00 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: .... [others identical] $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver intel_pstate intel_pstate intel_pstate intel_pstate intel_pstate intel_pstate intel_pstate intel_pstate intel_pstate intel_pstate intel_pstate intel_pstate $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor powersave powersave powersave powersave powersave powersave powersave powersave powersave powersave powersave powersave $ uname -a Linux ao-x360 5.0.0-27-generic #28-Ubuntu SMP Tue Aug 20 19:53:07 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux $ cat /etc/issue Ubuntu 19.04 \n \l $ cpufreq-info analyzing CPU 0: driver: intel_pstate CPUs which run at the same hardware frequency: 0 CPUs which need to have their frequency coordinated by software: 0 maximum transition latency: 4294.55 ms. hardware limits: 800 MHz - 4.80 GHz available cpufreq governors: performance, powersave current policy: frequency should be within 800 MHz and 4.80 GHz. The governor "powersave" may decide which speed to use within this range. current CPU frequency is 4.80 GHz. ... [similar for all the other CPU threads] All the CPU threads all show they are running at the maximum CPU allowed frequency. I can change the max allowed frequency to a lower setting with for i in {0..11}; do sudo cpufreq-set -c $i --max "$DESIRED_MAX"GHz; done which lowers all to whatever I've set $DESIRED_MAX to, but they all stay pegged at 100% clock frequency continuously. Suspend and resume does not solve the issue. Only after reboots do CPU frequencies go back to normal behavior with each thread having independent and fluctuating frequencies. Happy to put in more information or debug if necessary. Would be nice to not have to reboot so frequently.
More info I was running BIOS="Q71 Ver 01.07.00 4/15/2019" with the above report/comment. I just checked and saw there's a newer HP BIOS available and installed it. NewBIOS="(Q71) Ver 01.08.00 Rev.A 08/14/2019" Will update this ticket if the BIOS update does or does not result in a fix for this issue.
The BIOS update did not solve this issue. On restore: "intel_pstate: Disabling energy efficiency optimization" is reported Some lines from /var/log/kern.log near intel_pstate and ACPI messages. Sep 11 19:40:08 REDACTED kernel: [22319.285340] ACPI: Low-level resume complete Sep 11 19:40:08 REDACTED kernel: [22319.285432] ACPI: EC: EC started Sep 11 19:40:08 REDACTED kernel: [22319.285433] PM: Restoring platform NVS memory Sep 11 19:40:08 REDACTED kernel: [22319.286327] Enabling non-boot CPUs ... Sep 11 19:40:08 REDACTED kernel: [22319.286372] x86: Booting SMP configuration: Sep 11 19:40:08 REDACTED kernel: [22319.286373] smpboot: Booting Node 0 Processor 1 APIC 0x2 Sep 11 19:40:08 REDACTED kernel: [22319.288313] cache: parent cpu1 should not be sleeping Sep 11 19:40:08 REDACTED kernel: [22319.288346] intel_pstate: Disabling energy efficiency optimization Sep 11 19:40:08 REDACTED kernel: [22319.288502] CPU1 is up Sep 11 19:40:08 REDACTED kernel: [22319.288788] smpboot: Booting Node 0 Processor 2 APIC 0x4 Sep 11 19:40:08 REDACTED kernel: [22319.289372] cache: parent cpu2 should not be sleeping Sep 11 19:40:08 REDACTED kernel: [22319.289534] CPU2 is up Sep 11 19:40:08 REDACTED kernel: [22319.289554] smpboot: Booting Node 0 Processor 3 APIC 0x6 Sep 11 19:40:08 REDACTED kernel: [22319.290138] cache: parent cpu3 should not be sleeping Sep 11 19:40:08 REDACTED kernel: [22319.290310] CPU3 is up ... Sep 11 19:40:08 REDACTED kernel: [22319.309924] ACPI: Waking up from system sleep state S3 Sep 11 19:40:08 REDACTED kernel: [22320.409692] ACPI: EC: interrupt unblocked Sep 11 19:40:08 REDACTED kernel: [22320.410302] nvidia 0000:01:00.0: Enabling HDA controller Sep 11 19:40:08 REDACTED kernel: [22320.748846] nvme nvme0: 8/0/0 default/read/poll queues Sep 11 19:40:08 REDACTED kernel: [22320.848849] usb 1-14: reset full-speed USB device number 4 using xhci_hcd Sep 11 19:40:08 REDACTED kernel: [22320.975491] ata4: SATA link down (SStatus 4 SControl 300) Sep 11 19:40:08 REDACTED kernel: [22320.975522] ata3: SATA link down (SStatus 4 SControl 300) Sep 11 19:40:08 REDACTED kernel: [22321.125262] usb 1-8: reset full-speed USB device number 3 using xhci_hcd Sep 11 19:40:08 REDACTED kernel: [22321.142143] ish-hid {33AECD58-B679-4E54-9BD9-A04D34F0C226}: [hid-ish]: enum_devices_done OK, num_hid_devices=3 Sep 11 19:40:08 REDACTED kernel: [22321.401408] usb 1-7: reset high-speed USB device number 2 using xhci_hcd Sep 11 19:40:08 REDACTED kernel: [22323.193031] ACPI: EC: event unblocked Sep 11 19:40:08 REDACTED kernel: [22323.256586] acpi LNXPOWER:09: Turning OFF Sep 11 19:40:08 REDACTED kernel: [22323.256643] acpi LNXPOWER:05: Turning OFF Sep 11 19:40:08 REDACTED kernel: [22323.260798] OOM killer enabled. Sep 11 19:40:08 REDACTED kernel: [22323.260798] Restarting tasks ... done. Sep 11 19:40:09 REDACTED kernel: [22323.358096] PM: suspend exit ...
More investigation: $cat /proc/cmdline BOOT_IMAGE=/boot/vmlinuz-5.0.0-27-generic root=REDACTED ro quiet splash vt.handoff=1 On boot these seem to be the intel drivers. $grep -a intel_ /var/log/kern.log Sep 13 08:09:41 REDACTED kernel: [ 4.988078] intel_idle: MWAIT substates: 0x11142120 Sep 13 08:09:41 REDACTED kernel: [ 4.988079] intel_idle: v0.4.1 model 0x9E Sep 13 08:09:41 REDACTED kernel: [ 4.988569] intel_idle: lapic_timer_reliable_states 0xffffffff Sep 13 08:09:41 REDACTED kernel: [ 5.099840] intel_pstate: Intel P-state driver initializing Sep 13 08:09:41 REDACTED kernel: [ 5.099883] intel_pstate: Disabling energy efficiency optimization Sep 13 08:09:41 REDACTED kernel: [ 5.100412] intel_pstate: HWP enabled Sep 13 08:09:41 REDACTED kernel: [ 5.133952] intel_pmc_core: initialized Sep 13 08:09:41 REDACTED kernel: [ 10.263525] intel_rapl: Found RAPL domain package Sep 13 08:09:41 REDACTED kernel: [ 10.263526] intel_rapl: Found RAPL domain core Sep 13 08:09:41 REDACTED kernel: [ 10.263527] intel_rapl: Found RAPL domain uncore Sep 13 08:09:41 REDACTED kernel: [ 10.263528] intel_rapl: Found RAPL domain dram
Are you saying that you don't see "intel_pstate: Disabling energy efficiency optimizations" during boot process. Only appears during system resume?
I see "intel_pstate: Disabling energy efficiency optimization" appearing both during boot and system resume. Here's the last boot dmesg filtered by egrep: $ dmesg | egrep 'intel_|ENERGY' [ 0.509321] ENERGY_PERF_BIAS: Set to 'normal', was 'performance' [ 0.509321] ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8) [ 4.988078] intel_idle: MWAIT substates: 0x11142120 [ 4.988079] intel_idle: v0.4.1 model 0x9E [ 4.988569] intel_idle: lapic_timer_reliable_states 0xffffffff [ 5.099840] intel_pstate: Intel P-state driver initializing [ 5.099883] intel_pstate: Disabling energy efficiency optimization [ 5.100412] intel_pstate: HWP enabled [ 5.133952] intel_pmc_core: initialized [ 10.263525] intel_rapl: Found RAPL domain package [ 10.263526] intel_rapl: Found RAPL domain core [ 10.263527] intel_rapl: Found RAPL domain uncore [ 10.263528] intel_rapl: Found RAPL domain dram Comment #4 above is what appeared in /var/log/kern.log right before the CPU frequencies went and stayed at 100% max.
Can you run turbostat and then do suspend/resume while the tool is running and collect the log till you observe the problem.
Created attachment 284955 [details] Turbotstat Before Suspend. All working normally. First of 3 files to upload: turbostat.1.log 1. Reboot. 2. $sudo turbostat -d -o turbostat.1.log 3. Ctrl-C Baseline file when all is working normally.
Created attachment 284957 [details] Turbotstat Before Suspend and After Suspend. CPU issue 2nd of three files to upload 1. sudo turbostat -d -o turbostat.2.log 2. Close lid 3. Open lid CPUs frequencies now all identical and all close to 100% (not quite).
Created attachment 284959 [details] Turbotstat After Suspend. CPU freq at max (limited to 3.40 GHz) 3rd of 3 files to upload. I had forgotten that to prevent overheating when this happens I run for i in {0..11}; do sudo cpufreq-set -c $i --max 3.40GHz; done which really shows how CPU gets pegged to max. 1. Limit max CPU freq as above 2. sudo turbostat -d -o turbostat.3.log 3. Ctrl-C
This is happening because after resume you are getting too many interrupts (~2000 vs ~123000). So some device after resume just getting continuous interrupts. I suggest look at cat /proc/interrupts before and after and check which IRQ is this.
Created attachment 284965 [details] After resume, interrups report I did a suspend and resume and did a $watch "cat /proc/interrupts" (see attached file) The interrupts which jumped and/or kept increasing like mad were 14,16,17,20,122,126,136 The largest initial jump was from 17: IR-IO-APIC 17-fasteoi i2c_designware.1, idma64.1, snd_hda_intel:card1 The ones which seemed to increase per unit time the fastest of the above were 126, 136, 16, 14 136: intel-gpio 258 SYNA307B:00 126: intel-gpio 77 WCOM48A0:00 16: IR-IO-APIC 16-fasteoi i801_smbus, i2c_designware.0, idma64.0 14: IR-IO-APIC 14-fasteoi INT3450:00 The ones which increased but not as quickly were 20 and 122 20: IR-IO-APIC 20-fasteoi intel_ish_ipc 122: IR-PCI-MSI 327680-edge xhci_hcd I didn't do a "interrupts/time" chart but can if there's a need (and if you know a better way than using awk to sum the rows). -------------------------------------- In "top", ordered by CPU, SNA307B and WCOM48A are the top two processes. ---- top - 08:34:19 up 26 min, 1 user, load average: 2.04, 1.98, 1.65 Tasks: 280 total, 1 running, 279 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.1 us, 0.6 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st MiB Mem : 64229.5 total, 60860.0 free, 1672.9 used, 1696.7 buff/cache MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 61751.3 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 329 root -51 0 0 0 0 S 7.0 0.0 1:34.69 irq/136-SYNA307 327 root -51 0 0 0 0 D 1.3 0.0 0:17.84 irq/126-WCOM48A 460 root 19 -1 119652 75752 74292 S 1.0 0.1 0:13.79 systemd-journal 1002 root 20 0 8260 4816 1692 S 1.0 0.0 0:10.76 haveged 1009 syslog 20 0 224292 5088 3556 S 0.7 0.0 0:04.06 rsyslogd ...
They seems to be touchpad drivers. Can you blacklist intel-gpio via kernel command line and boot? It is better to check rate of interrupts before and after resume. I think you will not see them as at the top in the "top" report. One more experiment to try if you see this issue by doing #echo freeze > /sys/power/state Then resume.
Created attachment 284985 [details] Interrupts Graph before suspend, during, and after resume I didn't find any programs which recorded changes in interrupts over time by each irq so I wrote one in bash (let me know if you think it useful to upload) which parses /proc/interrupts and then passes aggregated data to gnuplot (see attached .svg). The attached graph shows: * xaxis = seconds since program started. You see that I started the program, then did a suspend and resume (closed, waited, then opened the lid). Suspend kicked in at about t=5 seconds. About 25 seconds later (about t=30) the system resumed reporting interrupts. * yaxis = # interrupts per IRQ since program started. This # reported is cumulative. IRQ 16 = "IR-IO-APIC 16-fasteoi i801_smbus," IRQ 17 = "IR-IO-APIC 17-fasteoi i2c_designware.1," Interrupts for IRQ 17 continue to increase rapidly when the system pegs CPU frequencies. As you asked, I also tried #echo freeze > /sys/power/state Since the system resumed normally I tested it a few times. I resumed it in different ways. On resume the system always acted normally. In this freeze test, Interrupts for IRQ 17 increased rapidly but only until about time=50 seconds and then stopped increasing rapidly. Interrupts for IRQ 16 did not skyrocket. I can upload that graph as an attachment if you think it worthwhile.
So some i2c devices going crazy after resume. Can you try to blacklist i2c_designware and i801_smbus module?
Also blacklist snd_hda_intel.
Created attachment 284987 [details] Suspend and Resume: interrupts with modules disabled. Added to /etc/modprobe.d/blacklist-aaatest.conf the following lines blacklist i2c_designware blacklist i801_smbus blacklist snd_hda_intel After a reboot. The laptop does not have sound. Suspend and resume still results in CPU frequencies going to max. IRQs 16 and 17 increase continuously. (see attached)
Also tried blacklist i2c_i801 in the file, but it looks like it gets loaded anyway. $ lsmod | grep i2c i2c_algo_bit 16384 1 i915 i2c_i801 32768 0 i2c_hid 28672 0 hid 126976 8 i2c_hid,wacom,usbhid,hid_multitouch,hid_sensor_hub,intel_ishtp_hid,hid_generic,uhid And with "blacklist i2c_i801" in /etc/modprobe.d/blacklist-aaatest.conf, suspend and resume still has the CPU frequencies going to max.
Added Jarkko to see if there is a way to silence i2c interrupts after resume.
Couple thoughts: "In "top", ordered by CPU, SNA307B and WCOM48A are the top two processes.". Does interrupt storm go away if you rmmod i2c_hid? I see both of them generate also bunch of interrupts and IIRC each touchscreen/touchpad interrupt cause many I2C transactions and interrupts. Are you able to try newest Linux v5.3 kernel to see does it have the same issue?
Created attachment 285031 [details] Interrupt graph while running 'rmmod i2c_hid' Yes, if you run "rmmod i2c_hid" then the interrupt storm immediately stops. In the attached SVG image you see interrupts for IRQs 16 and 17 skyrocketing until T=4 seconds which is about when I ran the rmmod command. It also kills the touchpad and touchscreen leaving only keyboard for an interface. Unfortunately running "insmod i2c_hid" did not work (I think the message was something like "could not find i2c_hid") I can try 5.3 with https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.3/ and let you know what happens.
After you rmmod i2c_hid, do you see the frequencies drop also? I don't see turbostate output after that.
Created attachment 285037 [details] Interrupt graph while running kernel 5.3 Test with Kernel 5.3 installed with .deb files ( linux-headers-5.3.0-050300_5.3.0-050300.201909152230_all.deb linux-headers-5.3.0-050300-generic_5.3.0-050300.201909152230_amd64.deb linux-image-unsigned-5.3.0-050300-generic_5.3.0-050300.201909152230_amd64.deb linux-modules-5.3.0-050300-generic_5.3.0-050300.201909152230_amd64.deb ) after I did a suspend/resume the results were: * touchpad/touchscreen worked normally for only a few seconds, then was significantly laggy, then stopped responding entirely (keyboard worked though) * IRQ 17 interrupts normalized (see attached image) * IRQ 16 interrupts continued to increase without bound (see attached image) * CPU frequencies were intermittently at max and normal * full crash/system-lockup when trying to reboot Messages on installation of kernel were ------------------------------ $ sudo dpkg -i *.deb Selecting previously unselected package linux-headers-5.3.0-050300. (Reading database ... 229726 files and directories currently installed.) Preparing to unpack linux-headers-5.3.0-050300_5.3.0-050300.201909152230_all.deb ... Unpacking linux-headers-5.3.0-050300 (5.3.0-050300.201909152230) ... Selecting previously unselected package linux-headers-5.3.0-050300-generic. Preparing to unpack linux-headers-5.3.0-050300-generic_5.3.0-050300.201909152230_amd64.deb ... Unpacking linux-headers-5.3.0-050300-generic (5.3.0-050300.201909152230) ... Selecting previously unselected package linux-image-unsigned-5.3.0-050300-generic. Preparing to unpack linux-image-unsigned-5.3.0-050300-generic_5.3.0-050300.201909152230_amd64.deb ... Unpacking linux-image-unsigned-5.3.0-050300-generic (5.3.0-050300.201909152230) ... Selecting previously unselected package linux-modules-5.3.0-050300-generic. Preparing to unpack linux-modules-5.3.0-050300-generic_5.3.0-050300.201909152230_amd64.deb ... Unpacking linux-modules-5.3.0-050300-generic (5.3.0-050300.201909152230) ... Setting up linux-headers-5.3.0-050300 (5.3.0-050300.201909152230) ... Setting up linux-headers-5.3.0-050300-generic (5.3.0-050300.201909152230) ... /etc/kernel/header_postinst.d/dkms: * dkms: running auto installation service for kernel 5.3.0-050300-generic Kernel preparation unnecessary for this kernel. Skipping... Building module: cleaning build area... unset ARCH; env NV_VERBOSE=1 'make' -j12 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.3.0-050300-generic IGNORE_XEN_PRESEN CE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.3.0-050300-generic/build LD=/usr/bin/ld.bfd modules.....(bad exit status : 2) ERROR (dkms apport): kernel package linux-headers-5.3.0-050300-generic is not supported Error! Bad return status for module build on kernel: 5.3.0-050300-generic (x86_64) Consult /var/lib/dkms/nvidia/418.56/build/make.log for more information. ...done. Setting up linux-modules-5.3.0-050300-generic (5.3.0-050300.201909152230) ... Setting up linux-image-unsigned-5.3.0-050300-generic (5.3.0-050300.201909152230) ... I: /vmlinuz.old is now a symlink to boot/vmlinuz-5.0.0-27-generic I: /initrd.img.old is now a symlink to boot/initrd.img-5.0.0-27-generic I: /vmlinuz is now a symlink to boot/vmlinuz-5.3.0-050300-generic I: /initrd.img is now a symlink to boot/initrd.img-5.3.0-050300-generic Processing triggers for linux-image-unsigned-5.3.0-050300-generic (5.3.0-050300.201909152230) ... /etc/kernel/postinst.d/dkms: * dkms: running auto installation service for kernel 5.3.0-050300-generic Kernel preparation unnecessary for this kernel. Skipping... Building module: cleaning build area... unset ARCH; env NV_VERBOSE=1 'make' -j12 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.3.0-050300-generic IGNORE_XEN_PRESEN CE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.3.0-050300-generic/build LD=/usr/bin/ld.bfd modules......(bad exit statu s: 2) ERROR (dkms apport): kernel package linux-headers-5.3.0-050300-generic is not supported Error! Bad return status for module build on kernel: 5.3.0-050300-generic (x86_64) Consult /var/lib/dkms/nvidia/418.56/build/make.log for more information. ...done. /etc/kernel/postinst.d/initramfs-tools: update-initramfs: Generating /boot/initrd.img-5.3.0-050300-generic cryptsetup: WARNING: The initramfs image may not contain cryptsetup binaries nor crypto modules. If that's on purpose, you may want to uninstall the 'cryptsetup-initramfs' package in order to disable the cryptsetup initramfs integration and avoid this warning. W: Possible missing firmware /lib/firmware/i915/icl_guc_32.0.3.bin for module i915 W: Possible missing firmware /lib/firmware/i915/glk_guc_32.0.3.bin for module i915 W: Possible missing firmware /lib/firmware/i915/kbl_guc_32.0.3.bin for module i915 W: Possible missing firmware /lib/firmware/i915/bxt_guc_32.0.3.bin for module i915 W: Possible missing firmware /lib/firmware/i915/skl_guc_32.0.3.bin for module i915 W: Possible missing firmware /lib/firmware/i915/icl_huc_ver8_4_3238.bin for module i915 W: Possible missing firmware /lib/firmware/i915/glk_huc_ver03_01_2893.bin for module i915 /etc/kernel/postinst.d/zz-update-grub: Sourcing file `/etc/default/grub' Sourcing file `/etc/default/grub.d/init-select.cfg' Generating grub configuration file ... Found linux image: /boot/vmlinuz-5.3.0-050300-generic Found initrd image: /boot/initrd.img-5.3.0-050300-generic Found linux image: /boot/vmlinuz-5.0.0-27-generic Found initrd image: /boot/initrd.img-5.0.0-27-generic Found linux image: /boot/vmlinuz-5.0.0-25-generic Found initrd image: /boot/initrd.img-5.0.0-25-generic Adding boot menu entry for EFI firmware configuration done ---------------------
> After you rmmod i2c_hid, do you see the frequencies drop also? I don't see > turbostate output after that. Yes - frequencies drop to normal. Do you want me to re-test the 'rmmod i2c_hid' command with a turbostate report?
No, I just want to make sure that this is the issue.
Created attachment 285047 [details] Frequencies over time with modprobe [-r] i2c_hid I have a workaround: On reading the docs it appears modprobe is smarter than insmod and can handle more dependencies. So on resume I ran $ sudo modprobe -r i2c_hid and then $ sudo modprobe i2c_hid Immediately after "modprobe -r i2c_hid" the frequencies return to normal and IRQ interrupts stop increasing. On running "modprobe i2c_hid" the mouse and touchscreen start working again.
Created attachment 285049 [details] Interrupts Graph before suspend, during, and after resume with systemd To see if I could automate the modprobe commands on sleep and resume, I created 2 systemd unit files and 2 executables which disable i2c_hid on sleep and enable it on resume. The results were as follows 1) Resume is now almost instantaneous. Before it took 2-5 seconds and I'd have to type furiously to get the keyboard to be recognized for login. 2) interrupts do not climb as quickly or as much 3) interrupts do not climb without stopping. (see attached image) The 4 files to disabling/enabling i2c_hid are as follows. ::::FILE 1:::::::::: /etc/systemd/system/aaa.removei2c.onsuspend.service :::::::::::::: [Unit] Description=Disables i2c_ihd at sleep to fix module issue [Service] Type=oneshot ExecStart=/usr/local/bin/disable_touchpad TimeoutSec=0 StandardOutput=syslog [Install] WantedBy=sleep.target ::::::FILE 2:::::::: /etc/systemd/system/aaa.returni2c.onresume.service :::::::::::::: [Unit] Description=Re-Enables intel i2c mouse driver at resume from sleep After=suspend.target [Service] User=root Type=oneshot ExecStart=/usr/local/bin/enable_touchpad TimeoutSec=0 StandardOutput=syslog [Install] WantedBy=suspend.target ::::FILE 3:::::::::: /usr/local/bin/disable_touchpad :::::::::::::: #!/bin/bash # logger "Disabling i2c_hid" modprobe -r i2c_hid ::::FILE 4:::::::::: /usr/local/bin/enable_touchpad :::::::::::::: #!/bin/bash logger "Enabling i2c_hid" modprobe i2c_hid
Cc'ed Andy if he knows from pinctrl point of view what might explain that those touchpad and touchscreen GPIO interrupts starts producing spurious interrupts after resume. Can this be the same issue than in linux-next commit 6cb0880f0822 ("pinctrl: intel: remap the pin number to gpio offset for irq enabled pin")?
(In reply to Jarkko Nikula from comment #29) > Can this be the same issue than in linux-next commit 6cb0880f0822 ("pinctrl: > intel: remap the pin number to gpio offset for irq enabled pin")? Yes, it's being discussed in bug #203995.
I'd like to test on my machine when that patch (or a similar one) makes its way to a Ubuntu kernel repository, but I'm not sure if I have found the right spot to track this fix through github or git.kernel.org So, apologies if this is an ignorant question and for not being able to figure this out, but it looks like a possible patch for this issue was submitted here: https://github.com/endlessm/linux/pull/494 ( remap the pin number to gpio offset for irq enabled pin )? but it appears that code didn't make it into a kernel merge? The closest thing I found was: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a0a5f7661e1c49de0c1f93ca435ba2880b88fc5c Is that the code I should make sure is in a kernel when I next test to see if this solves that problem with my HP Zbook?
According to 203995 , the patch is queued for 5.4. *** This bug has been marked as a duplicate of bug 203995 ***
I saw that the latest Ubuntu distro kernel update had a bunch of i2c fixes and HID fixes. https://launchpad.net/ubuntu/+source/linux/5.0.0-31.33 So I installed that kernel and tested while disabling the workaround (see above) that removed/added the i2c_hid module on sleep/resume. The system on resuming from sleep now: * No longer has an interrupt storm * CPU frequencies behave normally * keyboard/mouse is accessible after resume without a long delay. It appears this kernel update has resolved the issue for this brand of HP Zbook. Thanks!