I picked up an HP Victus 15.6 inch Gaming Laptop 15-fb2000 during the mid-July sales frenzy. I did a fresh install of Ubuntu 24.04, which comes with kernel 6.8.0-31. The load after boot completes shoots up to over 80! I found kacpi_notify to be the cause. The snippets below keep the description brief; attached files to come.

---
top - 20:00:05 up 10 min, 1 user, load average: 86.24, 76.27, 42.44
Tasks: 453 total, 2 running, 451 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 8.5 sy, 0.0 ni, 91.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 62040.5 total, 51595.7 free, 3166.0 used, 8146.3 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 58874.4 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
127 root 20 0 0 0 0 D 1.3 0.0 0:03.71 [kworker/11:1+kacpi_notify]
280 root 20 0 0 0 0 D 1.3 0.0 0:06.63 [kworker/5:5+kacpi_notify]
314 root 20 0 0 0 0 D 1.3 0.0 0:06.68 [kworker/5:12+kacpi_notify]

ubuntu@ubuntu:~$ grep . -r /sys/firmware/acpi/interrupts/ | awk '$2 > 0'
/sys/firmware/acpi/interrupts/sci: 1907
/sys/firmware/acpi/interrupts/gpe_all: 1907
/sys/firmware/acpi/interrupts/gpe17: 1907 EN enabled unmasked
---

I found a previous bug that mentions unloading the ucsi_acpi module:

Bug 217076 - Charging causes high CPU usage on LG Gram laptops series Z90Q

I would like some help identifying whether this is similar in nature, since that patch was very specific to those LG models.

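The counters under /sys/firmware/acpi/interrupts/ are cumulative since boot, so a single reading does not show whether gpe17 is still firing. A minimal sketch for sampling one counter twice follows; read_count and counter_delta are hypothetical helper names, not existing tools, and the 5-second default interval is arbitrary.

```shell
# Print the first field (the event count) of one ACPI interrupt counter file.
read_count() {
    awk '{ print $1; exit }' "$1"
}

# Sample a counter, wait, sample again, and report the growth.
# $1: path to the counter file, $2: interval in seconds (assumed default: 5)
counter_delta() {
    file=$1
    interval=${2:-5}
    before=$(read_count "$file")
    sleep "$interval"
    after=$(read_count "$file")
    echo "$(basename "$file") +$((after - before)) in ${interval}s"
}
```

For example, `counter_delta /sys/firmware/acpi/interrupts/gpe17 10` reports how many times gpe17 fired during ten seconds.
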
Created attachment 306580
output from ubuntu-bug

This is a text file which contains all of the output from ubuntu-bug.

Hi Bill,

I took a look at the logs you attached, and this issue does not have the same root cause. In the logs we have:

[ 5.860144] ucsi_acpi USBC000:00: failed to reset PPM!
[ 5.860150] ucsi_acpi USBC000:00: error -ETIMEDOUT: PPM init failed

meaning that the UCSI core could not even initialize, so the problem is already happening at that point. Maybe you could try some profiling with the perf utility to figure out what is actually being notified, as the kacpi_notify function serves many purposes.

Diogo

(In reply to Diogo Ivo from comment #2)
> already happening at that point. Maybe you could try some profiling with the
> perf utility to figure out what is actually being notified, as the
> kacpi_notify function serves many purposes.

Diogo,

I am happy to do some additional debugging if you could guide me. It has been a long time since I used perf.

Diogo,

Some additional information:

If I run "sudo rmmod ucsi_acpi", the CPU load issue resolves, but the GPE17 counter keeps increasing.

I cannot just add that module to /etc/modprobe.d/blacklist.conf; it still loads after reboot. I had to append to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub:

"sudo vi /etc/default/grub"
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash module_blacklist=ucsi_acpi"
then, "sudo update-grub"
then, "sudo reboot"

That finally prevents the module from loading.

---

Having explored all that, I have an interim solution. I have restored everything back to defaults so we can continue gathering evidence.

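The manual edit of /etc/default/grub described above can also be scripted. This is only a sketch: add_cmdline_param is a hypothetical helper, it assumes the GRUB_CMDLINE_LINUX_DEFAULT line is a single double-quoted string as in the example, and it writes to stdout so the result can be reviewed before anything is replaced.

```shell
# Append one kernel parameter inside the quotes of GRUB_CMDLINE_LINUX_DEFAULT.
# Reads the grub defaults file on stdin and writes the result to stdout.
# A line that already contains the parameter is left unchanged.
add_cmdline_param() {
    param=$1
    awk -v p="$param" '
        /^GRUB_CMDLINE_LINUX_DEFAULT="/ && index($0, p) == 0 {
            # Replace the closing quote with " <param>" plus a new closing quote.
            sub(/"$/, " " p "\"")
        }
        { print }
    '
}
```

A possible use is `add_cmdline_param module_blacklist=ucsi_acpi < /etc/default/grub > /tmp/grub.new`, followed by a review of /tmp/grub.new, copying it into place, and then "sudo update-grub" and a reboot as in the steps above.
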
Some additional data I thought might be helpful:

$ sudo su -
# echo l > /proc/sysrq-trigger

---/var/log/syslog---
2024-07-19T12:45:33.815231-04:00 victus kernel: NMI backtrace for cpu 11
2024-07-19T12:45:33.815232-04:00 victus kernel: CPU: 11 PID: 271 Comm: kworker/11:3 Not tainted 6.8.0-38-generic #38-Ubuntu
2024-07-19T12:45:33.815232-04:00 victus kernel: Hardware name: HP Victus by HP Gaming Laptop 15-fb2xxx/8C2F, BIOS F.03 05/20/2024
2024-07-19T12:45:33.815233-04:00 victus kernel: Workqueue: kacpi_notify acpi_os_execute_deferred
2024-07-19T12:45:33.815233-04:00 victus kernel: RIP: 0010:acpi_ps_get_arguments.constprop.0+0x8a/0x360
2024-07-19T12:45:33.815234-04:00 victus kernel: Code: 02 20 0f 85 9c 01 00 00 41 0f b7 44 24 0a 66 83 f8 0e 0f 86 6b 01 00 00 66 83 f8 2d 75 75 48 8d 73 38 b9 01 00 00 00 4c 89 e2 <48> 89 df e8 4e f3 ff ff 85 c0 0f 85 1f 02 00 00 c7 43 28 00 00 00
2024-07-19T12:45:33.815235-04:00 victus kernel: RSP: 0018:ffffbaf5008c3b68 EFLAGS: 00000246
2024-07-19T12:45:33.815236-04:00 victus kernel: RAX: 000000000000002d RBX: ffff94bd8d60a800 RCX: 0000000000000001
2024-07-19T12:45:33.815236-04:00 victus kernel: RDX: ffff94bd82635190 RSI: ffff94bd8d60a838 RDI: 0000000000000000
2024-07-19T12:45:33.815237-04:00 victus kernel: RBP: ffffbaf5008c3b98 R08: 0000000000000000 R09: 0000000000000000
2024-07-19T12:45:33.815237-04:00 victus kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff94bd82635190
2024-07-19T12:45:33.815238-04:00 victus kernel: R13: ffffbaf500037da7 R14: ffffbaf500037da7 R15: ffffbaf5008c3ba8
2024-07-19T12:45:33.815238-04:00 victus kernel: FS: 0000000000000000(0000) GS:ffff94cc20180000(0000) knlGS:0000000000000000
2024-07-19T12:45:33.815239-04:00 victus kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2024-07-19T12:45:33.815239-04:00 victus kernel: CR2: 00007f51bddf9bf8 CR3: 0000000df9e3c000 CR4: 0000000000f50ef0
2024-07-19T12:45:33.815240-04:00 victus kernel: PKRU: 55555554
2024-07-19T12:45:33.815240-04:00 victus kernel: Call Trace:
2024-07-19T12:45:33.815241-04:00 victus kernel: <NMI>
2024-07-19T12:45:33.815241-04:00 victus kernel: ? show_regs+0x6d/0x80
2024-07-19T12:45:33.815241-04:00 victus kernel: ? nmi_cpu_backtrace+0xb5/0x120
2024-07-19T12:45:33.815242-04:00 victus kernel: ? sched_clock_noinstr+0x9/0x10
2024-07-19T12:45:33.815243-04:00 victus kernel: ? nmi_cpu_backtrace_handler+0x11/0x20
2024-07-19T12:45:33.815243-04:00 victus kernel: ? nmi_handle+0x64/0x180
2024-07-19T12:45:33.815244-04:00 victus kernel: ? default_do_nmi+0x47/0x140
2024-07-19T12:45:33.815244-04:00 victus kernel: ? exc_nmi+0x1c2/0x290
2024-07-19T12:45:33.815245-04:00 victus kernel: ? end_repeat_nmi+0xf/0x60
2024-07-19T12:45:33.815245-04:00 victus kernel: ? acpi_ps_get_arguments.constprop.0+0x8a/0x360
2024-07-19T12:45:33.815248-04:00 victus kernel: message repeated 2 times: [ ? acpi_ps_get_arguments.constprop.0+0x8a/0x360]
2024-07-19T12:45:33.815249-04:00 victus kernel: </NMI>
2024-07-19T12:45:33.815253-04:00 victus kernel: <TASK>
2024-07-19T12:45:33.815253-04:00 victus kernel: acpi_ps_parse_loop+0x331/0x780
2024-07-19T12:45:33.815254-04:00 victus kernel: acpi_ps_parse_aml+0x226/0x600
2024-07-19T12:45:33.815254-04:00 victus kernel: acpi_ps_execute_method+0x172/0x3e0
2024-07-19T12:45:33.815255-04:00 victus kernel: acpi_ns_evaluate+0x175/0x5f0
2024-07-19T12:45:33.815255-04:00 victus kernel: acpi_evaluate_object+0x1a2/0x4a0
2024-07-19T12:45:33.815256-04:00 victus kernel: acpi_evaluate_dsm+0xc7/0x150
2024-07-19T12:45:33.815256-04:00 victus kernel: ucsi_acpi_dsm+0x57/0xa0 [ucsi_acpi]
2024-07-19T12:45:33.815257-04:00 victus kernel: ucsi_acpi_read+0x2f/0x70 [ucsi_acpi]
2024-07-19T12:45:33.815257-04:00 victus kernel: ucsi_acpi_notify+0x42/0xe0 [ucsi_acpi]
2024-07-19T12:45:33.815258-04:00 victus kernel: acpi_ev_notify_dispatch+0x56/0xa0
2024-07-19T12:45:33.815258-04:00 victus kernel: acpi_os_execute_deferred+0x17/0x40
2024-07-19T12:45:33.815259-04:00 victus kernel: process_one_work+0x16c/0x350
2024-07-19T12:45:33.815259-04:00 victus kernel: worker_thread+0x306/0x440
2024-07-19T12:45:33.815260-04:00 victus kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-07-19T12:45:33.815260-04:00 victus kernel: ? _raw_spin_lock_irqsave+0xe/0x20
2024-07-19T12:45:33.815261-04:00 victus kernel: ? __pfx_worker_thread+0x10/0x10
2024-07-19T12:45:33.815261-04:00 victus kernel: kthread+0xef/0x120
2024-07-19T12:45:33.815262-04:00 victus kernel: ? __pfx_kthread+0x10/0x10
2024-07-19T12:45:33.815262-04:00 victus kernel: ret_from_fork+0x44/0x70
2024-07-19T12:45:33.815263-04:00 victus kernel: ? __pfx_kthread+0x10/0x10
2024-07-19T12:45:33.815263-04:00 victus kernel: ret_from_fork_asm+0x1b/0x30
2024-07-19T12:45:33.815263-04:00 victus kernel: </TASK>

$ ps auxwww|egrep kacpi_notify|tail -5
root 1584 1.0 0.0 0 0 ? D 12:01 0:14 [kworker/11:92+kacpi_notify]
root 1585 1.0 0.0 0 0 ? D 12:01 0:14 [kworker/11:93+kacpi_notify]
root 1591 1.0 0.0 0 0 ? D 12:01 0:14 [kworker/11:94+kacpi_notify]
root 1620 1.0 0.0 0 0 ? D 12:01 0:15 [kworker/11:95+kacpi_notify]

$ sudo perf stat -p 1620 sleep 10

Performance counter stats for process id '1620':

109.43 msec task-clock # 0.011 CPUs utilized
427 context-switches # 3.902 K/sec
0 cpu-migrations # 0.000 /sec
0 page-faults # 0.000 /sec
539,425,119 cycles # 4.930 GHz
235,983,004 stalled-cycles-frontend # 43.75% frontend cycles idle
357,489,266 instructions # 0.66 insn per cycle
# 0.66 stalled cycles per insn
85,793,863 branches # 784.038 M/sec
9,109,923 branch-misses # 10.62% of all branches

10.006953974 seconds time elapsed

$ sudo perf record -p 1620 -g -a sleep 10
Warning:
PID/TID switch overriding SYSTEM
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.114 MB perf.data (480 samples) ]

$ sudo perf report -I
Samples: 480 of event 'cycles:P', Event count (approx.): 562755334
Children Self Command Shared Object Symbol
- 100.00% 0.00% kworker/11:95+k [kernel.kallsyms] [k] ret_from_fork_asm
ret_from_fork_asm
ret_from_fork
kthread
- worker_thread
- 99.76% process_one_work
- acpi_os_execute_deferred
- acpi_ev_notify_dispatch
- 99.34% ucsi_acpi_notify
ucsi_acpi_read
ucsi_acpi_dsm
acpi_evaluate_dsm
- acpi_evaluate_object
- acpi_ns_evaluate
- 97.87% acpi_ps_execute_method
- 96.80% acpi_ps_parse_aml
- 92.02% acpi_ps_parse_loop
- 58.41% acpi_ds_exec_end_op
- 30.33% acpi_ds_evaluate_name_path
- 20.74% acpi_ex_resolve_to_value
- acpi_ex_resolve_node_to_value
- 2.10% acpi_ex_read_data_from_field
0.84% acpi_ut_create_integer_object
- 0.61% acpi_ex_extract_from_field
acpi_ex_field_datum_io

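On Linux the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, which fits a load above 80 on a mostly idle CPU when many kacpi_notify workers are blocked. A small sketch for counting them follows; count_blocked_workers is a hypothetical helper name, and matching on the string "kacpi_notify" assumes the worker names look like the ps output above.

```shell
# Count tasks whose state starts with D and whose name mentions kacpi_notify.
# Expects "STAT COMMAND" pairs on stdin, e.g. from: ps -eo stat=,args=
count_blocked_workers() {
    awk '$1 ~ /^D/ && $2 ~ /kacpi_notify/ { n++ } END { print n + 0 }'
}
```

A possible invocation is `ps -eo stat=,args= | count_blocked_workers`; a number close to the load average would suggest that these workers account for most of it.
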
Created attachment 306588
perf data file

Hi @Bill,

It seems like the problem is strongly related to the error message in:

[ 5.860144] ucsi_acpi USBC000:00: failed to reset PPM!
[ 5.860150] ucsi_acpi USBC000:00: error -ETIMEDOUT: PPM init failed

However, the clue is not that clear. Could you please try the following steps to help us get more debug info:
1. Blacklist ucsi_acpi.ko and boot
2. Log in as root, then
   $ echo 'file ucsi.c +p' > /sys/kernel/debug/dynamic_debug/control
3. Load ucsi_acpi.ko
4. Dump the dmesg here

En-Wei.

(In reply to En-Wei Wu from comment #7)
> Hi @Bill,
>
> It seems like the problem is strongly related to the error message in:
>
> [ 5.860144] ucsi_acpi USBC000:00: failed to reset PPM!
> [ 5.860150] ucsi_acpi USBC000:00: error -ETIMEDOUT: PPM init failed
>
> However, the clue is not that clear. Could you please try the following
> steps to help us get more debug info:
> 1. Blacklist the ucsi_acpi.ko and boot
> 2. Login as root, then $ echo 'file ucsi.c +p' >
> /sys/kernel/debug/dynamic_debug/control
> 3. Load the ucsi_acpi.ko
> 4. Dump the dmesg here
>
> En-Wei.

En-Wei,

First, I dropped to root:
$ sudo su -

I was unable to echo to /sys/kernel/debug:
# echo 'file ucsi.c +p' > /sys/kernel/debug/dynamic_debug/control
-bash: /sys/kernel/debug/dynamic_debug/control: Operation not permitted

So I tried /proc instead:
# cat /proc/dynamic_debug/control|egrep ucsi
drivers/usb/typec/ucsi/ucsi.c:978 [typec_ucsi]ucsi_connector_change =_ "Bogus connector change event\n"
drivers/usb/typec/ucsi/ucsi.c:1616 [typec_ucsi]ucsi_register =_ "Registered UCSI interface with version %x.%x.%x"

# echo 'file ucsi.c +p' > /proc/dynamic_debug/control

# cat /proc/dynamic_debug/control|egrep ucsi
drivers/usb/typec/ucsi/ucsi.c:978 [typec_ucsi]ucsi_connector_change =p "Bogus connector change event\n"
drivers/usb/typec/ucsi/ucsi.c:1616 [typec_ucsi]ucsi_register =p "Registered UCSI interface with version %x.%x.%x"

# modprobe ucsi_acpi

# dmesg|egrep ucsi
[ 874.792132] ucsi_acpi USBC000:00: Registered UCSI interface with version 1.0.0
[ 879.815432] ucsi_acpi USBC000:00: failed to reset PPM!
[ 879.815437] ucsi_acpi USBC000:00: error -ETIMEDOUT: PPM init failed

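In the dynamic debug control file, the third column shows the flags per call site: "=_" means no flags are set and "=p" means the message is printed. A short sketch for summarizing that column follows; ddebug_sites is a hypothetical helper name, and it assumes the whitespace-separated layout shown in the output above.

```shell
# List "<flags> <file:line> <[module]function>" for every dynamic-debug call
# site whose source path contains the given string.
# Expects the contents of the dynamic_debug control file on stdin.
ddebug_sites() {
    pattern=$1
    awk -v pat="$pattern" 'index($1, pat) { print $3, $1, $2 }'
}
```

Run as root, `ddebug_sites ucsi.c < /proc/dynamic_debug/control` gives a quick before/after check around the `echo 'file ucsi.c +p'` step.
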
@Bill, I have the exact same model as you and had been struggling with the same issue for quite some time. Now CPU loads are back to normal and fans are no longer running all the time! I also used to have terrible battery life (less than two hours) due to that ACPI bug. Blacklisting the ucsi_acpi module worked like a charm! I'd be glad to help you debug, or to compare logs to see how else this could be fixed in a later kernel patch. Cheers!
*** Bug 219393 has been marked as a duplicate of this bug. ***
(In reply to Evan Frouin from comment #9)
> @Bill,
>
> I have the exact same model as you and had been struggling with the same
> issue for quite some time. Now CPU loads are back to normal and fans are no
> longer running all the time! I also used to have terrible battery life (less
> than two hours) dues to that ACPI bug. Blacklisting the ucsi_acpi module
> worked like a charm! I'd be glad to help you debug or if you want to compare
> logs to see how else this could be fixed in a later kernel patch.
>
> Cheers!

Evan,

I got further along within the Ubuntu bug, here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2073538

Life got in the way and I have yet to get back to debugging. In my specific use case, I just needed the CPU to be normal. I am using the laptop to run docker containers using the NVidia GPU, so for now I unload ucsi_acpi after boot.

wrmcd@victus:~$ sudo su -
root@victus:~# crontab -l
#kernel bug
@reboot sleep 10 && rmmod ucsi_acpi

If you have the time and energy, perhaps you can take what I documented of my kernel debugging within the Ubuntu bug and run with it.

I found something suspicious in your ACPI table:

Method (UWFE, 0, Serialized)
{
    ...
    ^^^^UBTC.CCI0 = CCI0 /* \_SB_.PCI0.SBRG.EC0_.CCI0 */
    ^^^^UBTC.CCI1 = CCI1 /* \_SB_.PCI0.SBRG.EC0_.CCI1 */
    ^^^^UBTC.CCI2 = CCI2 /* \_SB_.PCI0.SBRG.EC0_.CCI2 */
    ^^^^UBTC.CCI3 = CCI3 /* \_SB_.PCI0.SBRG.EC0_.CCI3 */
    CCI0 = Zero
    CCI3 = Zero
    ...
    Notify (UBTC, 0x80) // Status Change
}

Method (_Q81, 0, Serialized) // _Qxx: EC Query, xx=0x00-0xFF
{
    UWFE ()
}

_Q81 (which in turn calls UWFE()) is the method called when the EC generates an SCI (System Control Interrupt) to notify the OS that the CCI (USB Type-C Command Status and Connector Change Indication) status has been changed by the USB-C controller; ucsi_acpi_notify() is what finally gets called.

Looking at UWFE(), CCI0 and CCI3 (which are in system RAM) are zeroed after being copied to EC RAM. Here is the point: if the OS tries to read this CCI event by only reading system memory rather than explicitly reading from EC RAM, the OS might read the wrong value.

Although it is not 100% related to the issue I mentioned above, could you try this patch from linux-next anyway?

Link: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20241021&id=fa48d7e81624efdf398b990a9049e9cd75a5aead

fa48d7e81624efdf39 usb: typec: ucsi: Do not call ACPI _DSM method for UCSI read operations

Many thanks.

En-Wei Wu,

After a lot of trial and error, I was able to build linux-next and disable secure boot so I could run the kernel. I confirmed my copy of linux-next already included the referenced patch.

I am now able to report that it appears to solve the high CPU complaint. While there still are interrupts as Greg is reporting, I am so happy to see the CPU is idle.

My solution until today has been to rmmod after boot with root's crontab:

---
$ sudo crontab -l
#Ansible: kernel bug
@reboot sleep 10 && rmmod ucsi_acpi
---

$ grep . -r /sys/firmware/acpi/interrupts/ | awk '$2 > 0'
/sys/firmware/acpi/interrupts/sci: 1247
/sys/firmware/acpi/interrupts/gpe_all: 1247
/sys/firmware/acpi/interrupts/gpe17: 1247 EN enabled unmasked

$ lsmod|egrep ucsi
typec_ucsi 61440 0
typec 114688 1 typec_ucsi

$ sudo modprobe -a /lib/modules/6.12.0-rc5-next-20241030-custom/kernel/drivers/usb/typec/ucsi/ucsi_acpi.ko

$ lsmod|egrep ucsi
ucsi_acpi 12288 0
typec_ucsi 61440 1 ucsi_acpi
typec 114688 1 typec_ucsi

$ grep . -r /sys/firmware/acpi/interrupts/ | awk '$2 > 0'
/sys/firmware/acpi/interrupts/sci: 1319
/sys/firmware/acpi/interrupts/gpe_all: 1319
/sys/firmware/acpi/interrupts/gpe17: 1319 EN enabled unmasked

$ uptime
10:01:36 up 8 min, 2 users, load average: 0.08, 0.09, 0.07

$ uname -a
Linux victus 6.12.0-rc5-next-20241030-custom #4 SMP PREEMPT_DYNAMIC Fri Nov 1 08:33:33 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

Thank you for your continued efforts, and thanks to Greg Kroah-Hartman <gregkh@linuxfoundation.org> for the commit.

Just wondering, as I'm very new to the kernel dev process: has this patch been merged into the mainline kernel yet? If not, how can I find out where the patch stands?
(In reply to Evan Frouin from comment #15)
> Just wondering, as I'm very new to the kernel dev process. Has this patch
> been implemented into the mainline kernel yet? If not, how can I know where
> the patch stands at?

Evan,

I compared 6.12 to 6.13-rc1 this morning and found the patch contents within 6.13-rc1.

I did a search for "static int ucsi_acpi_read_cci", looking for:

+ if (UCSI_COMMAND(ua->cmd) == UCSI_PPM_RESET) {
+         ret = ucsi_acpi_dsm(ua, UCSI_DSM_FUNC_READ);
+         if (ret)
+                 return ret;
+ }

The patch is here:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20241021&id=fa48d7e81624efdf398b990a9049e9cd75a5aead

The 6.13-rc1 file I examined is here:
https://github.com/torvalds/linux/blob/v6.13-rc1/drivers/usb/typec/ucsi/ucsi_acpi.c

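The same comparison can be repeated against any local kernel tree. The sketch below is only a heuristic built on the lines quoted above: has_ucsi_read_fix is a hypothetical helper name, and it assumes the UCSI_PPM_RESET condition appears in ucsi_acpi.c only once the patch is applied.

```shell
# Succeed if the given source file contains the conditional added by the patch.
# $1: path to a copy of drivers/usb/typec/ucsi/ucsi_acpi.c
has_ucsi_read_fix() {
    # -F matches the text literally, -q suppresses output.
    grep -qF 'UCSI_COMMAND(ua->cmd) == UCSI_PPM_RESET' "$1"
}
```

From the top of a kernel source tree, `has_ucsi_read_fix drivers/usb/typec/ucsi/ucsi_acpi.c && echo "fix present"` prints the message only when the condition is found.
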
(In reply to Evan Frouin from comment #15)
> Just wondering, as I'm very new to the kernel dev process. Has this patch
> been implemented into the mainline kernel yet? If not, how can I know where
> the patch stands at?

I forgot to summarize at the end. The answer you are looking for is:

The fix will be in 6.13 once it is released.

Thank you!
So to fully fix this issue as of now (correct me if I'm wrong), all I have to do is update to kernel version 6.13, right? No blacklisting ucsi_acpi, no masking gpe17, no crontabs, etc.? Thanks in advance.
Hello, Today kernel 6.13.1 was finally released for Arch, and I updated my laptop. I removed the ucsi blacklist parameter, but I still have a CPU thread maxed out at 100%. Am I doing anything wrong?
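A report like this is easier to triage when it states which kernel is actually running and whether the module is loaded. A minimal sketch follows; version_at_least is a hypothetical helper name, it relies on `sort -V` (available in GNU coreutils), and the 6.13 threshold comes from the earlier comment that the fix lands in 6.13.

```shell
# Succeed if version $2 sorts at or after version $1 in version order.
version_at_least() {
    want=$1
    have=$2
    # The smaller of the two versions sorts first; it must be the wanted one.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}
```

For example, `version_at_least 6.13 "$(uname -r)" && echo "running 6.13 or newer"`, together with the output of `lsmod | grep ucsi` and the current gpe17 counter, would show whether the fixed driver is the one in use.
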