Latest working kernel version: N/A
Earliest failing kernel version: 2.6.20
Distribution: Vanilla kernel; also tested with the default kernels of Zenwalk, Ubuntu, openSUSE and Fedora
Hardware Environment:
  Processor: Core 2 Duo E4400 @ 2.00 GHz
  Motherboard: Intel Desktop Board D945GCNL

Problem Description:
Processes kacpid and kacpi_notify consume around 75% and 10% respectively of the first processor at all times.

Workaround:
Boot with acpi=ht, which disables ACPI but leaves hyper-threading enabled. This, however, leaves the computer unable to power off by itself, i.e. the power button must be pressed.

Useful information:

$ top
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   53 root      15  -5     0    0    0 R   76  0.0  13:31.75 kacpid
   54 root      15  -5     0    0    0 R    9  0.0   1:50.00 kacpi_notify

$ cat /proc/interrupts
           CPU0       CPU1
  0:    1234009          0   IO-APIC-edge      timer
  1:       3378          0   IO-APIC-edge      i8042
  7:          0          0   IO-APIC-edge      parport0
  8:          0          0   IO-APIC-edge      rtc
  9:   25224917          0   IO-APIC-fasteoi   acpi
 14:      13664          0   IO-APIC-edge      libata
 15:          0          0   IO-APIC-edge      libata
 16:     121633          0   IO-APIC-fasteoi   uhci_hcd:usb4, i915@pci:0000:00:02.0
 17:       4562          0   IO-APIC-fasteoi   eth0
 18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 19:      56486          0   IO-APIC-fasteoi   uhci_hcd:usb2, libata
 22:        496          0   IO-APIC-fasteoi   HDA Intel
 23:          2          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb5
NMI:          0          0
LOC:    1233974    1233892
ERR:          0

Steps to reproduce: None needed, always present
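The /proc/interrupts listing in the report already points at IRQ 9 (acpi), whose count is roughly twenty times that of the timer. A minimal Python sketch of how such a listing could be scanned for the busiest device IRQ follows. The function names parse_interrupts and busiest_device_irq are hypothetical helpers, not part of any kernel tool, and SAMPLE is a shortened excerpt of the listing from the report.

```python
# Sketch only: the helper names are hypothetical, SAMPLE is an excerpt
# of the /proc/interrupts output quoted in the report.
SAMPLE = """\
           CPU0       CPU1
  0:    1234009          0   IO-APIC-edge      timer
  9:   25224917          0   IO-APIC-fasteoi   acpi
 17:       4562          0   IO-APIC-fasteoi   eth0
NMI:          0          0
ERR:          0
"""


def parse_interrupts(text):
    """Return {label: (count summed over CPUs, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")  # split on the first colon only
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def busiest_device_irq(table):
    """Pick the numbered IRQ with the highest total (skips NMI/LOC/ERR)."""
    numbered = {k: v for k, v in table.items() if k.isdigit()}
    irq = max(numbered, key=lambda k: numbered[k][0])
    return irq, numbered[irq]


irq, (count, description) = busiest_device_irq(parse_interrupts(SAMPLE))
print(irq, count, description)
```

On a live system the same functions could be fed the contents of /proc/interrupts read from disk instead of SAMPLE.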
Created attachment 14511 [details] attached result of acpidump
So this was a problem in 2.6.20 and is still a problem in 2.6.23.11. Is there any version of ACPI-enabled Linux where this was not a problem?
Hi, Dennis
Will you please enable the ACPI debug function in the kernel configuration and run the following test after the system has booted?
a. cat /proc/interrupts > interrupts_before and dmesg > dmesg_before
b. killall acpid
c. echo 0x04 > /sys/module/acpi/parameters/debug_layer and echo 0x08000000 > /sys/module/acpi/parameters/debug_layer
d. cat /proc/acpi/event > acpi_event
e. After the system has run for about 1 minute, cat /proc/interrupts > interrupts_after and dmesg > dmesg_after
It would be great if you could attach the above outputs.
Thanks.
Len Brown: I know the bug is present in kernels from 2.6.20 to 2.6.23.11 because that is as far as I have tested. I cannot say whether it is present in any older or newer kernels.
ykzhao: I have the results of the commands you asked for. In step b. I found no process called acpid; perhaps you meant kacpid, but that cannot be killed anyway. Also, in step d. /proc/acpi/event does not seem to be readable by cat; the terminal just stays idle for some time. I'll try to look into this further. As for the last step, I left my system idle for a minute after step c., while d. was running, and then ran the command. I can't upload the files now, but I'll repeat the steps and upload them later.
Created attachment 14532 [details] dmesg just after booting
Created attachment 14533 [details] /proc/acpi/interrupts just after booting
Created attachment 14534 [details] dmesg one minute after passing the commands
Created attachment 14535 [details] /proc/acpi/interrupts one minute after passing the commands
Hi, Dennis
Thanks for the info, but it is not helpful. Will you please enable the ACPI debug function in the kernel configuration and run the test as described in comment #3?
To read /proc/acpi/event, do the following:
a. Run "lsof /proc/acpi/event" to list the PID of the process that is using /proc/acpi/event.
b. kill -9 PID
c. cat /proc/acpi/event > acpi_event
After the test, please attach the above output.
Thanks.
I had enabled these in .config and compiled the kernel before running the test:
CONFIG_ACPI_DEBUG=y
CONFIG_ACPI_DEBUG_FUNC_TRACE=y
What else do I need to enable ACPI debug?
cat /proc/acpi/event stays waiting and reporting events until I press Ctrl+C. How long should I leave it running? It doesn't really report anything most of the time.
I also wanted to confirm that the bug is present with a Core 2 Duo E4500 on the same motherboard, with kernel 2.6.23.12.
Some additional info (tested on both desktops): if you boot into an NTFS system like Vista or XP, hard reset the computer (press the reset button) and boot into Linux, the bug is not present, kacpid and kacpi_notify work normally, and /proc/interrupts doesn't show so many interrupts on CPU0 either. This doesn't work if you hard reset from Linux.
If you have enabled the following in .config, that is fine and the ACPI debug function is enabled.
>CONFIG_ACPI_DEBUG=y
>CONFIG_ACPI_DEBUG_FUNC_TRACE=y
cat /proc/acpi/event stays waiting and reports events only when they happen, so it is necessary to wait for some time. After the system has run for about 1 minute, press Ctrl+C to quit cat /proc/acpi/event and then get the output of dmesg and /proc/interrupts.
Will you please attach the output after finishing the test as described in comment #3? The output of lspci -vxxx would also be helpful.
It would be great if you could also attach the output of dmesg, /proc/interrupts and lspci -vxxx when Linux works normally (first boot into Vista or XP and then into Linux).
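The before/after snapshots of /proc/interrupts requested here are mainly useful for working out how fast each IRQ is firing. A rough Python sketch of that comparison is shown below. The helper names irq_counts and irq_rates are hypothetical, and the BEFORE/AFTER figures are made-up placeholders, not values taken from the attachments.

```python
# Sketch only: hypothetical helpers; BEFORE and AFTER hold invented
# numbers for illustration, not data from this bug's attachments.
BEFORE = """\
           CPU0       CPU1
  0:       1000          0   IO-APIC-edge      timer
  9:      50000          0   IO-APIC-fasteoi   acpi
"""

AFTER = """\
           CPU0       CPU1
  0:      16000          0   IO-APIC-edge      timer
  9:     650000          0   IO-APIC-fasteoi   acpi
"""


def irq_counts(text):
    """Map each IRQ label to its count summed over all CPUs."""
    counts = {}
    for line in text.splitlines()[1:]:      # skip the CPU header row
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        total = 0
        for field in rest.split():
            if not field.isdigit():         # stop at the controller name
                break
            total += int(field)
        counts[label.strip()] = total
    return counts


def irq_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in the later snapshot."""
    b, a = irq_counts(before), irq_counts(after)
    return {irq: (a[irq] - b.get(irq, 0)) / seconds for irq in a}


for irq, rate in irq_rates(BEFORE, AFTER, 60).items():
    print(f"IRQ {irq}: {rate:.0f} interrupts/s")
```

An acpi rate that dwarfs the timer rate over the one-minute window would be consistent with an interrupt flood.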
Created attachment 14606 [details] dmesg just after booting
Created attachment 14607 [details] /proc/acpi/interrupts just after booting
Created attachment 14608 [details] dmesg after
Created attachment 14609 [details] acpi interrupts after
Created attachment 14610 [details] lspci -vxxx (when the bug is present)
Created attachment 14611 [details] dmesg when the bug is not present
Created attachment 14612 [details] lspci -vxxx (when the bug is not present)
OK, this time I tried to increase the verbosity of the kernel and here are the results; I hope they are helpful. Only cat /proc/acpi/event > acpi_event left an empty file, so I didn't attach it.
Hi, Dennis
Thanks for the info. There is an error in comment #3:
> echo 0x04 > /sys/module/acpi/parameters/debug_layer and echo 0x08000000 > /sys/module/acpi/parameters/debug_layer
(the second path should be debug_level). Sorry, it is my fault.
Will you please test your system again as follows?
a. echo 0x04 > /sys/module/acpi/parameters/debug_layer and echo 0x08000000 > /sys/module/acpi/parameters/debug_level
b. After the system has run for about 1 minute, cat /proc/interrupts > interrupts_after and dmesg > dmesg_after
I am very sorry for the mistake in comment #3.
Thanks.
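Since the layer/level mix-up came from typing two nearly identical paths, a small script that names both parameters explicitly may help avoid repeating it. The following is only a sketch: set_acpi_debug is a hypothetical helper, the base argument is an assumption added so the function can be tried against a scratch directory, and writing to the real sysfs files needs root and a kernel built with CONFIG_ACPI_DEBUG.

```python
import os

# Path taken from the commands in this thread.
ACPI_PARAMS = "/sys/module/acpi/parameters"


def set_acpi_debug(layer, level, base=ACPI_PARAMS):
    """Write debug_layer and debug_level as hex strings under `base`.

    `base` is an assumption for dry runs; by default it points at the
    sysfs directory used in the thread.
    """
    for name, value in (("debug_layer", layer), ("debug_level", level)):
        with open(os.path.join(base, name), "w") as f:
            f.write(f"0x{value:08x}\n")


# Values requested in the thread (run as root on the affected machine):
# set_acpi_debug(0x04, 0x08000000)
```

Passing a temporary directory as base lets the function be checked without touching the running kernel.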
Never mind, we all make mistakes. I have the new results. With the debug level set so high, both CPUs run at 100%, and dmesg_after does show something loop-like.
Created attachment 14621 [details] dmesg just after booting
Created attachment 14622 [details] /proc/acpi/interrupts just after booting
Created attachment 14623 [details] dmesg one minute after increasing debug layer and level
Only the end of dmesg was captured by the command since it became too big. If you need the full dmesg, tell me how to get it and I'll upload it here.
Created attachment 14624 [details] /proc/acpi/interrupts one minute after increasing debug layer and level
Created attachment 14625 [details] sorry uploaded the wrong file
Created attachment 14626 [details] dmesg when the bug is not present and the debug layer and level are set as you told me
Comment on attachment 14625 [details] sorry uploaded the wrong file this is the right dmesg after the commands
Thanks for the info.
Created attachment 14645 [details] use the attached tool to read from I/O port
Hi, Dennis
Will you please use the attached tool to read the following I/O ports and attach the output?
a. 0x504 16-bit access
b. 0x50c 16-bit access
c. 0x52c 16-bit access
The readme explains how to use the attached tool.
Thanks.
Hi, Dennis
From the logs in comment #22 and #24 it seems that this is a problem of ACPI interrupt floods, caused by GPE 0x1D.
Will you please attach the output of the tool attached in comment #30? It would be great if you could do the test in two modes: hard reset and warm reset (first boot into XP or Vista and then into Linux).
Will you please also do the following test after the system is booted (hard reset)?
a. write 0x2000 into the 0x52C I/O port (16-bit access)
b. cat /proc/interrupts > interrupts_before ; dmesg > dmesg_before;
c. echo 0xa. echo 0x04 > /sys/module/acpi/parameters/debug_layer and echo 0x08000000 > /sys/module/acpi/parameters/debug_level
d. After the system has run for about 1 minute, cat /proc/interrupts > interrupts_after and dmesg > dmesg_after. At the same time please confirm whether the problem disappears.
Thanks.
Sure, happy to help. These are the results of the tests using the tool:
When the bug is present (normal Linux operation):
a. ./ior --addr 0x504 --width 16
   the value of IO port 0x504 is fefb
b. ./ior --addr 0x50c --width 16
   the value of IO port 0x50c is 1eda
c. ./ior --addr 0x52c --width 16
   the value of IO port 0x52c is 3000
When the bug is not present (after resetting from XP):
a. ./ior --addr 0x504 --width 16
   the value of IO port 0x504 is fefb
b. ./ior --addr 0x50c --width 16
   the value of IO port 0x50c is 3eda
c. ./ior --addr 0x52c --width 16
   the value of IO port 0x52c is 3000
As for the second set of tests, shall I run them when the bug is present, not present, or in both cases? Also, in step c, what do you mean by echo 0xa.?
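Only the reading at port 0x50c differs between the two cases. A short Python sketch of how the two sets of readings could be compared bit by bit follows. The helper differing_bits and the two dictionaries are illustrative names, the values are the ones reported in this comment, and no meaning is assigned to the registers themselves.

```python
# Readings copied from the comment above; the names are illustrative.
BUG_PRESENT = {0x504: 0xFEFB, 0x50C: 0x1EDA, 0x52C: 0x3000}
BUG_ABSENT = {0x504: 0xFEFB, 0x50C: 0x3EDA, 0x52C: 0x3000}


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


for port in sorted(BUG_PRESENT):
    bits = differing_bits(BUG_PRESENT[port], BUG_ABSENT[port])
    print(f"port 0x{port:03x}: differing bits {bits}")
```

For these readings the only difference is bit 13 of the value at port 0x50c, which is clear when the bug is present and set when it is not.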
Hi, Dennis
Thanks for the test.
> echo 0xa
is a mistake; please ignore it. You can run the test when the bug is present. Note that if bit 13 of the value read from I/O port 0x52c is 1, you should write 0x0000 into port 0x52c (otherwise please write 0x2000 into port 0x52c).
Thanks.
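The rule for choosing the value to write can be expressed as a single bit test. The Python sketch below only encodes the instruction given in this comment; value_to_write is a hypothetical name and the snippet does not touch any I/O port itself.

```python
BIT13 = 1 << 13  # 0x2000


def value_to_write(current):
    """Value to write to port 0x52c, following the rule in this comment:
    0x0000 if bit 13 of the value read is set, otherwise 0x2000."""
    return 0x0000 if current & BIT13 else 0x2000


# The reporter read 0x3000 from port 0x52c, so bit 13 is set.
print(hex(value_to_write(0x3000)))
```

The result would then be passed to the I/O write tool from the attachment.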
Created attachment 14656 [details] defer the gpe enable
Hi, Dennis
Will you please try the attached patch and see whether it fixes the bug?
Thanks.
I did the tests in comment 31. Since the value of IO port 0x52c is 3000, I wrote 0 to 0x52c and the bug disappeared! I'll attach the results. I'll try the patch in a moment, once I get the bug back. Thanks for your help.
Created attachment 14657 [details] dmesg just after inserting 0 to the port
Created attachment 14658 [details] interrupts just after inserting zero to the port, when the bug stopped
Created attachment 14659 [details] dmesg after one minute
Created attachment 14660 [details] interrupts after one minute
Hi, Dennis
Thanks for such a quick response. From the log it seems that the problem is caused by GPE 0x1D (maybe it is related to the GPIO), and I will investigate it.
Will you please test the attached patch in comment #34 and see whether it fixes the bug? It would be great if you could attach the output of dmesg and /proc/interrupts with the boot option "acpi.debug_layer=0x04 acpi.debug_level=0x08000010".
Thanks.
Yes, I patched and recompiled the kernel, but the bug was still present and behaving in the same way. I couldn't boot the system with "acpi.debug_layer=0x04 acpi.debug_level=0x08000010" because of the heavy load of printing the debug messages. I left it for some time, but the screen just kept showing the same loop from comment 26.
Thanks for the test. It seems that the patch in the comment #34 can't solve the problem. I will investigate the GPE 0x1D issue. Thanks.
Hi, Dennis
The problem is caused by ACPI interrupt floods, which are related to GPE 0x1D. GPE 0x1D is shared by three ACPI devices (SLEEP button, PS2K, PS2M), but from the log it seems that the PS2M PNP device isn't registered and the OS can't register the mouse device.
> PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
Will you please try the boot option "i8042.nopnp" and see whether the bug still exists?
Thanks.
Hi! I tried i8042.nopnp but the bug still exists...
dmesg | grep PNP
[ 33.978863] i8042: PNP detection disabled
I also tried actually plugging in a PS/2 mouse, but the bug is still there.
Thanks for the test. It seems that the boot option "i8042.nopnp" can't resolve the bug. Will you please check whether there is a BIOS option that enables/disables the wakeup function of the PS/2 keyboard and mouse? Thanks.
Created attachment 15223 [details] try the debug patch
Will you please try the debug patch and see whether the problem still exists?
Hi, Dennis
It seems that this bug is related to the shared GPE 0x1D. It is a wakeup GPE shared by three devices: Sleep button, PS2K and PS2M. The button driver configures GPE 0x1D as a Run_Wake (runtime wakeup) GPE. According to the ACPI spec, wake events should not be intermixed with non-wake (runtime) events on the same GPE input. On this desktop, the wake events are intermixed with non-wake (runtime) events on the same GPE 0x1D input.
Please try the debug patch in comment #46. With this patch, GPE 0x1D is configured only as a wakeup GPE.
Thanks.
I searched the BIOS but couldn't find anything related. I compiled the patch with kernel 2.6.24 and the bug was solved! kacpid behaves normally, there's no loop in dmesg after increasing the debug layer and level, and /proc/interrupts doesn't show large numbers. I captured dmesg before and after, as before; should I upload them? Thanks for your efforts so far on this bug.
Created attachment 15258 [details] try the debug patch
Will you please try the debug patch and see whether the problem still exists? In this patch OSPM will call the _PSW method to disable the device's ability to wake up the system.
Thanks.
I've compiled with the patch in comment #49 and it fixed the bug for me! Thank you all. I'm using kernel 2.6.24.3.
Thanks for the test. It seems that the patch in comment #49 fixes the bug. I will try to send it to the upstream kernel. In the meantime this bug will be marked as a duplicate of bug 10224.
*** This bug has been marked as a duplicate of bug 10224 ***
Hello, I originally reported bug 9781 because kacpid was eating 80% of the CPU. A patch was made and it was included in the upstream kernel. I've been using the 2.6.25.4 kernel, which didn't show the bug and had been working well; however, today I realized kacpid was once again consuming a lot of CPU, though less than before (44%).
# uname -a
Linux zenwalk 2.6.25.4 #1 SMP PREEMPT Fri May 16 14:10:46 CEST 2008 i686 Intel(R) Core(TM)2 Duo CPU E4400 @ 2.00GHz GenuineIntel GNU/Linux
I'm pretty sure it was working OK before. Running
iow --addr 0x52c --width 16 --value 0
still fixes the problem. Am I using the correct kernel version? I've been on this system for months without kacpid problems.
The bug is fixed by the following commit:
> commit 729b2bdbfa19dd9be98dbd49caf2773b3271cc24
> Author: Zhao Yakui <yakui.zhao@intel.com>
> Date: Wed Mar 19 13:26:54 2008 +0800
> ACPI : Disable the device's ability to wake the sleeping system in the boot phase
Unfortunately this patch is not in the stable 2.6.25.4 kernel. You can apply the patch manually to the 2.6.25.4 kernel or try the latest stable kernel.
Thanks.