Bug 13013
Summary: | kacpid 100% cpu utilization | ||
---|---|---|---|
Product: | ACPI | Reporter: | Ales Seifert (seifert) |
Component: | ACPICA-Core | Assignee: | Zhang Rui (rui.zhang) |
Status: | REJECTED INSUFFICIENT_DATA | ||
Severity: | high | CC: | dzhonw, rui.zhang, yakui.zhao |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.30-24 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
lspci -vxxx output
dmesg output
grep . /sys/firmware/acpi/interrupts/* output
acpidump output
l /proc/acpi/thermal_zone/*/* output
lspci -vxxx output
cat /proc/acpi/thermal_zone/*/* |
Created attachment 20818 [details]
dmesg output
Created attachment 20819 [details]
grep . /sys/firmware/acpi/interrupts/* output
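A storm on a single GPE is usually easiest to spot by comparing two snapshots of these counters taken a few seconds apart. The sketch below is only an illustration: it assumes each line of the grep output has the shape `<path>/<name>: <count> <status>`, and the helper names `parse_gpe_snapshot` and `busiest_sources` are made up for this example.

```python
import re


def parse_gpe_snapshot(text):
    """Parse `grep . /sys/firmware/acpi/interrupts/*` output into {name: count}.

    Assumed line shape: "/sys/firmware/acpi/interrupts/gpe00:  123  enabled".
    Lines that do not match this shape are skipped.
    """
    counts = {}
    for line in text.splitlines():
        # Take the file name after the last "/" and the first integer after ":".
        match = re.match(r"^(?:.*/)?([^/:]+):\s*(\d+)", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def busiest_sources(before, after, top=3):
    """Return up to `top` (name, delta) pairs whose counters grew the most."""
    first = parse_gpe_snapshot(before)
    second = parse_gpe_snapshot(after)
    deltas = {name: second[name] - first.get(name, 0) for name in second}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return [(name, delta) for name, delta in ranked[:top] if delta > 0]
```

Feeding it the text of two saved grep outputs lists the sources that fired in between, which should point at a candidate such as gpe00 if it is the one storming.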
Created attachment 20820 [details]
acpidump output
Will you please attach the output of /proc/acpi/thermal_zone/*/* when this issue happens? Will you please also confirm whether the issue happens right after the box is booted or only when it is overheating? Thanks.

Is this a regression? I mean, is there any earlier kernel release that worked for you?

Created attachment 20848 [details]
l /proc/acpi/thermal_zone/*/* output
This issue always happens, right from the moment the notebook is booted. Yes, this is probably a regression: 2.6.27.19 was the previous kernel I used and it worked fine.

Created attachment 20849 [details]
lspci -vxxx output
Hi Ales, sorry for my mistake. Please attach the output of "cat /proc/acpi/thermal_zone/*/*". From the description it seems that this is a regression. Will you please use git-bisect to identify the commit which causes the regression? Thanks.

Created attachment 20877 [details]
cat /proc/acpi/thermal_zone/*/*
I tested 2.6.27.19 again today and found that it has the same bug; I just hadn't noticed it before. So it no longer seems to be a regression. Is there anything I can provide to help you resolve the bug?

Hi Ales, thanks for the info. Does this issue also happen on earlier kernels, for example 2.6.26.xx or 2.6.17? And from the info in comment #10 it seems that the temperature is far lower than the threshold. Thanks.

I will try earlier kernels as soon as I get them downloaded... :) In the meantime I found that sometimes, when my notebook wakes up from hibernation, the problem goes away. Sometimes it comes back after the next hibernation cycle. It definitely has nothing to do with the temperature.

ping Ales.

(In reply to comment #14)
> ping Ales.
Sorry for the delay. I have tried 2.6.25.5 and it has the same problem. Older kernels don't even boot correctly for me because of some disk partition problems, so I cannot test them (tried 2.6.22, 2.6.19, 2.6.17). Just to confirm comment #13: hibernation helps in 99% of cases. After hibernation I can work normally without any kacpid utilization until the next restart.

Some more information: after wakeup from sleep (not hibernation) the problem comes back every time. After hibernation I can work normally without any kacpid utilization until the next restart or sleep. Any idea what I could do to analyze where the problem could be?

Hi, I tried to enable debugging and here is some output:
exregion-0290 [00] ex_system_io_space_han: System-IO (width 8) R/W 0 Address=000000000000EF80
This line is repeated endlessly, many times per second.

(In reply to comment #14)
> ping Ales.
ping Zhang Rui :)

Some more info: since I updated to 2.6.29-163, hibernation doesn't help anymore; kacpid has 100% CPU utilization after resume. The kernel boot parameter acpi=ht makes the system usable, but with limited ACPI functionality.

Well, this bug looks like a duplicate of bug 13268 to me. Please double-check.

Hi, thanks for pointing me to that bug, the solution provided there works for me as well!!
"echo disable > /sys/firmware/acpi/interrupts/gpe00"
But a few symptoms are different:
1. The temperature of my system is low.
2. The problem occurs right after boot, even in single-user mode.
I will try to test it under Windows and will let you know whether the same interrupt storm is there too.

One more question... Is disabling this GPE harmless? Its handler is not blank like the one in bug 13268.

Method (_L00, 0, NotSerialized)
{
    \_TZ.THEV ()
}

Method (THEV, 0, Serialized)
{
    Store (\_SB.PCI0.LPCB.SMAB (0x19, 0x00, 0x00), Local0)
    Store (0x00, Local1)
    Store (0x38, Local3)
    Store (\_SB.PCI0.LPCB.SMAB (0x5D, 0x23, 0x00), Local0)
    If (LEqual (And (Local0, 0xFF00), 0x00))
    {
        If (And (Local0, 0x02))
        {
            Store (\_SB.PCI0.LPCB.SMAB (0x5D, 0x25, 0x00), Local2)
            If (LEqual (And (Local2, 0xFF00), 0x00))
            {
                If (And (Local2, 0x01))
                {
                    Or (Local1, 0x20, Local1)
                    And (Local3, Not (0x20), Local3)
                }
                If (And (Local2, 0x02))
                {
                    Or (Local1, 0x08, Local1)
                    And (Local3, Not (0x08), Local3)
                }
                If (And (Local2, 0x04))
                {
                    Or (Local1, 0x10, Local1)
                    And (Local3, Not (0x10), Local3)
                }
            }
        }
        If (And (Local0, 0x04))
        {
            Store (\_SB.PCI0.LPCB.SMAB (0x5D, 0x24, 0x00), Local2)
            If (LEqual (And (Local2, 0xFF00), 0x00))
            {
                If (And (Local2, 0x01))
                {
                    Or (Local1, 0x20, Local1)
                }
                If (And (Local2, 0x02))
                {
                    Or (Local1, 0x08, Local1)
                }
                If (And (Local2, 0x04))
                {
                    Or (Local1, 0x10, Local1)
                }
            }
        }
    }
    Else
    {
        Store (0x38, Local1)
    }
    Acquire (THER, 0xFFFF)
    Or (THSC, Local1, THSC)
    And (WHTR, Not (0x38), Local4)
    Or (Local4, Local3, WHTR)
    Release (THER)
    If (And (Local1, 0x20))
    {
        Notify (LOCZ, 0x80)
    }
    If (And (Local1, 0x08))
    {
        Notify (CPUZ, 0x80)
    }
    If (And (Local1, 0x10))
    {
        Notify (CP2Z, 0x80)
    }
}

No, we can not disable GPE00 in this case. I guess that:
1. GPE00 is fired.
2. \_TZ.THEV sends a notification to the ACPI thermal driver.
3. The ACPI thermal driver receives the notification and re-evaluates the thermal zone temperature.
4. Re-evaluating the temperature triggers another GPE00 interrupt.
So this is an endless loop.
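To see which thermal zones THEV would notify for a given set of readings, its bit logic can be modelled outside the firmware. The following is only a rough sketch of the ASL quoted in this comment thread: the function name `thev_events` and the parameter names `reg23`, `reg25` and `reg24` (standing for the values returned by the three SMAB calls with 0x23, 0x25 and 0x24) are made up for illustration, and the THSC/WHTR updates under the THER mutex are not modelled.

```python
# Event bits in Local1 and the thermal zone each one notifies (order as in THEV).
ZONE_BITS = {0x20: "LOCZ", 0x08: "CPUZ", 0x10: "CP2Z"}

# (flag in Local2, event bit it sets in Local1), as in the ASL If blocks.
FLAG_TO_EVENT = ((0x01, 0x20), (0x02, 0x08), (0x04, 0x10))


def thev_events(reg23, reg25, reg24):
    """Model THEV's bit logic; returns (local1, local3, notified_zones).

    reg23, reg25 and reg24 are hypothetical stand-ins for the results of
    SMAB (0x5D, 0x23, 0x00), SMAB (0x5D, 0x25, 0x00) and SMAB (0x5D, 0x24, 0x00).
    """
    local1, local3 = 0x00, 0x38
    if (reg23 & 0xFF00) == 0:
        # Branch guarded by bit 0x02: sets event bits and clears them in Local3.
        if (reg23 & 0x02) and (reg25 & 0xFF00) == 0:
            for flag, event in FLAG_TO_EVENT:
                if reg25 & flag:
                    local1 |= event
                    local3 &= ~event & 0xFF
        # Branch guarded by bit 0x04: only sets event bits.
        if (reg23 & 0x04) and (reg24 & 0xFF00) == 0:
            for flag, event in FLAG_TO_EVENT:
                if reg24 & flag:
                    local1 |= event
    else:
        # Error in the high byte: all three zones are flagged.
        local1 = 0x38
    notified = [zone for bit, zone in ZONE_BITS.items() if local1 & bit]
    return local1, local3, notified
```

Note that whenever the high byte of the first reading is non-zero, all three zones are notified, so under this model a failing SMAB read alone would be enough to keep the thermal driver re-evaluating temperatures.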
Please clear CONFIG_ACPI_THERMAL or blacklist the ACPI thermal driver and see if it helps.

I've cleared CONFIG_ACPI_THERMAL (now I'm running 2.6.30-24). After make and make install I got:
....
Kernel image: /boot/vmlinuz-2.6.30-24
Initrd image: /boot/initrd-2.6.30-24
. . .
FATAL: Module thermal not found.
WARNING: no dependencies for kernel module 'thermal' found.
Kernel Modules: scsi_mod libata ahci hwmon thermal_sys processor fan jbd mbcache ext3 edd crc-t10dif sd_mod usbcore ohci-hcd uhci-hcd ehci-hcd hid usbhid
Features: block usb resume.userspace resume.kernel
Bootsplash: openSUSE (800x600)
48274 blocks
After reboot the system was OK, but after sleep (to RAM) and resume the kacpid utilization is back again. The only thing that helps is: "echo disable > /sys/firmware/acpi/interrupts/gpe00". Any idea? Do you think it is a BIOS bug or an ACPI bug?

Please open the "/etc/modprobe.d/blacklist" file and add "blacklist thermal" at the end of this file. Reboot and see if the problem still exists.

Yes, the problem still exists. The thermal module gets loaded despite "blacklist thermal" in "/etc/modprobe.d/blacklist", and unloading it with "modprobe -r thermal" doesn't help.

I edited /etc/sysconfig/kernel and removed "thermal" from the initrd, so now thermal is not loaded. After reboot the system is OK without kacpid utilization... I will try sleep and resume to see if it stays the same...

After sleep and resume the kacpid utilization is back, and "echo disable > /sys/firmware/acpi/interrupts/gpe00" doesn't seem to help anymore... "thermal" is still not loaded.

"echo disable > /sys/firmware/acpi/interrupts/gpe00" works when the notebook is out of the dock... it is really alchemy.

This doesn't seem like a software problem to me. It would be great if you could verify whether the problem still exists in Windows.

This problem affects me when the lid is closed, using either a home-compiled kernel or Debian stock kernels.
Filed with Debian, without response: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=536240
Trying "echo disable > /sys/firmware/acpi/interrupts/gpe00" returns an error (echo: write error: invalid argument). Getting rid of the thermal module doesn't change the behaviour.
Linux terek 2.6.30.1-nimloth #2 SMP Thu Jul 9 14:36:36 KGT 2009 i686 GNU/Linux

Jonathan, this is a different problem. Please open a new bug and attach:
1. acpidump output
2. output of "grep . /sys/firmware/acpi/interrupts/*", both before and after closing the lid.

New bug posted at: http://bugzilla.kernel.org/show_bug.cgi?id=13802
I wasn't sure what category of ACPI it belonged in, so that might have to be changed.

Ales, can you please verify whether the problem still exists in Windows? This doesn't look like a Linux kernel bug to me.

No response from the bug reporter for more than two months. Closing it. |
Created attachment 20817 [details]
lspci -vxxx output

2.6.29-59-default #1 SMP Sun Apr 5 12:34:54 CEST 2009 x86_64 x86_64 x86_64 GNU/Linux - openSUSE 11.1
HP EliteBook 8730w

PID USER  PR  NI VIRT RES SHR S %CPU %MEM     TIME+ COMMAND
 34 root  15  -5    0   0   0 R  101  0.0 196:35.70 kacpid

It is the same with or without X running. Booting with acpi=off helps, but as I need all cores and power management it is not a solution.

Reproducible: always.