Bug 11000 - kacpi_notify still takes 90% of cpu
Summary: kacpi_notify still takes 90% of cpu
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P1 high
Assignee: ykzhao
URL:
Keywords:
: 8049 (view as bug list)
Depends on:
Blocks:
 
Reported: 2008-06-28 13:53 UTC by Joey Adams
Modified: 2008-10-24 23:00 UTC (History)
4 users (show)

See Also:
Kernel Version: 2.6.25.6
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
This is the output of acpidump run as root. (116.30 KB, text/plain)
2008-06-30 17:54 UTC, Joey Adams
Details
This is the output of /sbin/lspci -vxxx run as root. (17.33 KB, text/plain)
2008-06-30 17:56 UTC, Joey Adams
Details
The output of dmesg before suspending (nothing wrong) (32.66 KB, text/plain)
2008-06-30 17:58 UTC, Joey Adams
Details
The output of dmesg after resuming from suspend. kacpi_notify is consuming CPU, and CONFIG_ACPI_DEBUG is set. (120.25 KB, text/plain)
2008-06-30 17:59 UTC, Joey Adams
Details
Like the above dmesg, but ACPI_LV_FUNCTIONS is enabled with /proc/acpi/debug_level (if I'm not mistaken, this enabled function tracing). (122.16 KB, text/plain)
2008-06-30 18:01 UTC, Joey Adams
Details
Patch against thermal.c to add temperature sanity check (2.44 KB, patch)
2008-07-01 23:41 UTC, Joey Adams
Details | Diff
Output of each file in /sys/firmware/acpi/interrupts/* both before and after suspend. (1.55 KB, text/plain)
2008-07-01 23:43 UTC, Joey Adams
Details
Output of dmesg before suspending. (22.79 KB, text/plain)
2008-07-01 23:47 UTC, Joey Adams
Details
cat /proc/interrupts before suspending. (1.06 KB, text/plain)
2008-07-01 23:49 UTC, Joey Adams
Details
Output of each file in /sys/firmware/acpi/interrupts/* after suspend (same as before suspend) (1.55 KB, text/plain)
2008-07-01 23:50 UTC, Joey Adams
Details
cat /proc/interrupts after suspending. (1.06 KB, text/plain)
2008-07-01 23:51 UTC, Joey Adams
Details
Output of dmesg after suspending. (122.78 KB, text/plain)
2008-07-01 23:52 UTC, Joey Adams
Details
Output of dmidecode (it's the same before and after suspending). (10.09 KB, text/plain)
2008-07-01 23:54 UTC, Joey Adams
Details
try the attached workround patch series(tar.gz package) (3.74 KB, patch)
2008-07-02 23:05 UTC, ykzhao
Details | Diff
Patch adds a flood protector to kacpid/kacpi_notify queues (20.19 KB, patch)
2008-07-05 00:55 UTC, Joey Adams
Details | Diff
A patch series consisting of all of the patches posted so far (10.16 KB, application/x-bzip-compressed-tar)
2008-07-05 01:01 UTC, Joey Adams
Details
The dmesg after pressing 'undock' with ACPI debug enabled. (1.41 KB, application/x-gzip)
2008-08-23 07:35 UTC, Maciej Grela
Details
Output of acpidump run as root. (54.34 KB, application/x-gzip)
2008-08-23 07:35 UTC, Maciej Grela
Details
lspci -vvvvxxx output (22.25 KB, application/x-gzip)
2008-08-23 07:36 UTC, Maciej Grela
Details
Patch 1/3: Attach the ACPI device to the ACPI handle as early as possible (1.46 KB, patch)
2008-09-11 18:19 UTC, ykzhao
Details | Diff
Patch 2/3: Get the device power state in the course of scanning device (4.47 KB, patch)
2008-09-11 18:20 UTC, ykzhao
Details | Diff
Patch 3/3: :Add "acpi.power_nocheck=1" to disable power state check in power transition (5.79 KB, patch)
2008-09-11 18:22 UTC, ykzhao
Details | Diff

Description Joey Adams 2008-06-28 13:53:47 UTC
Distribution:  Fedora 9 (problem has been seen in Ubuntu Hardy and earlier versions of both Fedora and Ubuntu).
Hardware Environment:  HP Pavilion 503n, Intel Celeron 1.7GHz and Pentium 4 Northwood (single core).
Problem Description:

Like the bugs mentioned at http://ubuntuforums.org/showthread.php?t=399619http://ubuntuforums.org/showthread.php?t=399619 and http://bugzilla.kernel.org/show_bug.cgi?id=10224http://bugzilla.kernel.org/show_bug.cgi?id=10224 , the kacpi_notify workqueue process consumes 90% of the CPU after things like a suspend or the CPU getting hot.

After a lot of printk tracing, I found that acpi_thermal_notify (in thermal.c) with event type ACPI_THERMAL_NOTIFY_THRESHOLDS is triggered, leading to acpi_power_off_device (in power.c).  When it gets to:

status = acpi_evaluate_object(resource->device->handle, "_OFF", NULL, NULL);

It does turn down the fan like it should, but it also triggers the acpi_ex_opcode_2A_0T_0R opcode, leading to a notify event, and the cycle begins again.  Moreover, the ACPI driver can't determine the state of the fan (I've only heard it in high power and low power, never off).  It also thinks it should turn down the fan when the CPU gets hot.  However, these are separate issues that I'm not too concerned about.

The kacpid and kacpi_notify problems appear in many different manifestations, and all of them are ostensibly due to buggy ACPI BIOS implementations.  What I posted above is likely going to be a different call chain than someone else's.  Nevertheless, these problems indicate that the ACPI driver should be able to detect when a workqueue is getting flooded and queue delayed events when it is.

Steps to reproduce:

On my system, letting the CPU get above 51C or so for a while or suspending and resuming will turn down (that's right, turn down) the fan and trigger the infinite CPU hogging.  echo 3 > /proc/acpi/fan/FAN1/state will turn down the fan, but will not trigger the spinning.

Proposed solution:

Create a flood protection mechanism for events queued to kacpid and kacpi_notify.  For instance, if more than 10 events are sent to one of these queues within a tenth of a second, more events should be delayed by 10 seconds until a timer elapses.  Thus, normal operation should not be interfered with to a severe extent, and the kacpi_notify and kacpid queue flooding will cause unnoticeable impact.

However, if a whole bunch of ACPI activity happens before the timer is put in place (e.g. the timer source is based on ACPI), my proposed general solution could possibly stall correct ACPI operation and not get the timer up and running at all.  Thus, flood protection should be disabled until, say, ten seconds of ACPI driver uptime (thus verifying that the clock works).
Comment 1 ykzhao 2008-06-29 18:57:51 UTC
Will you please attach the output of acpidump , lspci -vxxx?

Will you please set "CONFIG_ACPI_DEBUG" in kernel configuration and attach the output of dmesg ?(Please boot the system with acpi enabled)

Thanks.
Comment 2 Joey Adams 2008-06-30 17:54:25 UTC
Created attachment 16665 [details]
This is the output of acpidump run as root.
Comment 3 Joey Adams 2008-06-30 17:56:41 UTC
Created attachment 16666 [details]
This is the output of /sbin/lspci -vxxx run as root.
Comment 4 Joey Adams 2008-06-30 17:58:03 UTC
Created attachment 16667 [details]
The output of dmesg before suspending (nothing wrong)
Comment 5 Joey Adams 2008-06-30 17:59:39 UTC
Created attachment 16668 [details]
The output of dmesg after resuming from suspend.  kacpi_notify is consuming CPU, and CONFIG_ACPI_DEBUG is set.
Comment 6 Joey Adams 2008-06-30 18:01:22 UTC
Created attachment 16669 [details]
Like the above dmesg, but ACPI_LV_FUNCTIONS is enabled with /proc/acpi/debug_level (if I'm not mistaken, this enabled function tracing).
Comment 7 Joey Adams 2008-06-30 18:57:19 UTC
I moved from Fedora 9's 2.6.25.6 to the 2.6.25.9 kernel from kernel.org and used the config file from Fedora 9 as well as enabled CONFIG_ACPI_DEBUG and CONFIG_ACPI_DEBUG_FUNC_TRACE (the latter doesn't seem to affect the output of dmesg unless I enable ACPI_LV_FUNCTIONS with /proc/acpi/debug_level (like I did in the last dmesg)).  The bug is still present.

Did I attach the above 5 files correctly, or is there an option in Bugzilla to attach a bunch of files at once without sending 5 emails to you guys?

I wrote a workqueue flood protector similar to what I proposed above.  I'm cleaning it up and getting ready to test it.  Afterward, I will post it here (or should I post it to the mailing list instead?) as a patch against drivers/acpi/osl.c in the 2.6.25.9 kernel.  The main thing it lacks currently is an interface (perhaps a /proc/acpi filesystem entry) through which it can be enabled/disabled and configured.
Comment 8 ykzhao 2008-07-01 00:28:39 UTC
Thanks for the info.
From the acpidump it seems that GPE 0x1D is shared by several ACPI devices: Power button, KBC, Mouse. And the issue is fixed by the following commit:
    commit 729b2bdbfa19dd9be98dbd49caf2773b3271cc24
    Author: Zhao Yakui <yakui.zhao@intel.com>
    Date:   Wed Mar 19 13:26:54 2008 +0800
      ACPI : Disable the device's ability to wake the sleeping system in the boot phase

Will you please try the latest kernel(for example: 2.6.26-rc7) and see whether the problem still exists?
   After the system is booted, please do the following test.
   a. echo 0x00010004 > /sys/module/acpi/parameters/debug_layer && echo 0x08000017 > /sys/module/acpi/parameters/debug_level 
   b. cat /sys/firmware/interrupts/* 
   c. when the problem appears, cat /sys/firmware/interrupts/* and /proc/interrupts/
   d. dmesg >dmesg_after

   After the test, please attach the output.

   Thanks.
Comment 9 ykzhao 2008-07-01 02:05:12 UTC
From the dmesg in comment #5 it seems that OS complains the following warning message.
   ACPI: Transitioning device [FAN1] to D3
   ACPI: Unable to turn cooling device [f78122a0] 'off'

From the acpidump we can know that :
   PowerResource (FN01, 0x00, 0x0000)
            {  
               Method (_STA, 0, NotSerialized)
                {
                    Return (0x01)  // 1 means that the power resource is on; 0 means that the power resource is off.
                }
   And this powerresource is referred by the Fan1 device. 
     Device (FAN1)
            {
               Name (_PR0, Package (0x01)
                {
                    FN01
                })

   When the thermal temperature is below pre-defined temperature, OS will try to turn off the Fan device. But as the _STA objec of FN01 always reports that the power resource is on, OS will complain that ACPI: Unable to turn cooling device [f78122a0] 'off'.
   At the same time when OS try to turn off the Fan device, it will call the _OFF object of FN01, in which the notification event(0x81) will be sent to thermal zone. When the thermal zone receives the 0x81 notification event, OS will recheck the trip-point of thermalzone and try to turn off the FAN device again. So the notification event(0x81) will be sent again to the thermalzone and OS will try to process the event again, which causes that OS enter the infinite loop.
   
   IMO this is a very obvious BIOS bug and had better be fixed by bios upgrading.
   
Comment 10 ykzhao 2008-07-01 02:15:05 UTC
Will you please attach the output of dmidecode ?
   According to the analysis in comment #9 it seems that it is a BIOS bug. In the upstream kernel when the _OFF object is called to turn off the power resource, OS will check the status of power resource to verify whether the power resource is turned off. Maybe for the bogus bios it is unnecessary to check whether the power resource is turned off.
   I will try to write a workaround patch to fix this issue.
Comment 11 Joey Adams 2008-07-01 23:34:08 UTC
When I compiled and tested the 2.6.26-rc8 kernel from kernel.org, the kacpi_notify spinning symptom went away, but it's because the thermal_zone THRM device (/proc/acpi/thermal_zone/THRM) did not load (it did in earlier kernels).  The problem is that my BIOS reports temperatures in degrees Celsius (i.e. 40 degrees Celsius would be '40'), but the ACPI spec clearly states that it should report temperatures in kelvins times 10 (i.e. 300 kelvins would be '3000').  The ACPI driver doesn't know this, so it acts like the reported values are in kelvins and does the conversion, resulting in temperature values like -269C and -273C.  Because of this additional code:

/*
		 * Treat freezing temperatures as invalid as well; some
		 * BIOSes return really low values and cause reboots at startup.
		 * Below zero (Celcius) values clearly aren't right for sure..
		 * ... so lets discard those as invalid.
		 */
		if (ACPI_FAILURE(status) ||
				tz->trips.critical.temperature <= 2732) {
			tz->trips.critical.flags.valid = 0;
			ACPI_EXCEPTION((AE_INFO, status,
					"No or invalid critical threshold"));
			return -ENODEV;

The THRM device fails to load because the critical temperature looks like it's below freezing.

I added a simple sanity check to thermal.c for all the temperature reads I could find.  It will work on correct BIOSes as well, provided that the CPU really isn't -216.7 degrees Celsius or colder.  The patch is posted below.

When I made this change, THRM came back, and the kacpi_notify spinning problem returned as before.  The requested information is posted below.  I triggered the kacpi_notify spinning by suspending and resuming.

By the way, my computer is from around December 2001, if you're interested.

I'm finishing up the event queue flood protector.  It's sort of large because it now has a procfs folder for configuration.  I'll post the patch for that when I'm done and have tested it on my machine.
Comment 12 Joey Adams 2008-07-01 23:41:14 UTC
Created attachment 16683 [details]
Patch against thermal.c to add temperature sanity check

This patch against thermal.c adds a sanity check to make buggy BIOSes that report temperature in degrees Celsius rather than kelvins times 10 work correctly.
Comment 13 Joey Adams 2008-07-01 23:43:49 UTC
Created attachment 16684 [details]
Output of each file in /sys/firmware/acpi/interrupts/* both before and after suspend.
Comment 14 Joey Adams 2008-07-01 23:47:16 UTC
Created attachment 16685 [details]
Output of dmesg before suspending.
Comment 15 Joey Adams 2008-07-01 23:49:13 UTC
Created attachment 16686 [details]
cat /proc/interrupts before suspending.
Comment 16 Joey Adams 2008-07-01 23:50:15 UTC
Created attachment 16687 [details]
Output of each file in /sys/firmware/acpi/interrupts/* after suspend (same as before suspend)
Comment 17 Joey Adams 2008-07-01 23:51:17 UTC
Created attachment 16688 [details]
cat /proc/interrupts after suspending.
Comment 18 Joey Adams 2008-07-01 23:52:14 UTC
Created attachment 16689 [details]
Output of dmesg after suspending.
Comment 19 Joey Adams 2008-07-01 23:54:03 UTC
Created attachment 16690 [details]
Output of dmidecode (it's the same before and after suspending).
Comment 20 ykzhao 2008-07-02 23:05:51 UTC
Created attachment 16708 [details]
try the attached workround patch series(tar.gz package)

Will you please try the attached workaround patch to see whether the problem still exists?(Please add the boot option of "acpi.power_nocheck=1")

Note: the attached tar package contains four patches and the corresponding patch series.

After testing, please attach the output of dmesg.
Thanks.
Comment 21 Joey Adams 2008-07-05 00:55:08 UTC
Created attachment 16742 [details]
Patch adds a flood protector to kacpid/kacpi_notify queues

This is the workqueue flood protector I proposed earlier.  This is my first submission of code to the Linux kernel, so please bear with and inform me of any incorrect spinlocks, poorly formatted code, etc. in my patch :)  It now has a /proc/acpi/flood directory for options and help, but it lacks boot command line configuration options (e.g. acpi.flood=on acpi.flood.start_wait_time=10000).  This can serve as a generic quick fix to any instances of kacpid and/or kacpi_notify spinning, but I'd think users should still file bug reports if they have to enable this.

I added a copyright signature for myself at the top of osl.c (where the vast majority of the flood protector code resides).  Is this correct protocol?

I did test your patch series, and it does indeed seem to clear up all the problems I've mentioned here.  I did remember to include the thermal sanity patch I posted earlier, so /proc/thermal_zone/THRM was there.  The kacpi_notify spinning problem is completely gone, and I can echo 0 and 3 to /proc/fan/FAN1/state to turn the fan up and down respectively (rather than it being stuck at down).  I can't force the acpi.power_nocheck option off with the command line, and this appears to be because my machine type is listed under power_nocheck_dmi_table and set with dmi_check_system(power_nocheck_dmi_table); .  To finish up testing on my flood protector, I commented out that line in my working code (but the dmesg's I will attach as well as this patch will not reflect this).

I'm recompiling the kernel to get rid of any stray printk traces and to make sure all the patches posted so far work with the 2.6.26-rc8 kernel.  I'll post the dmesg before and after suspend pretty soon after that finishes (the flood protector will probably not affect things since there won't be flooding, but will post a warning if there is flooding).
Comment 22 Joey Adams 2008-07-05 01:01:20 UTC
Created attachment 16743 [details]
A patch series consisting of all of the patches posted so far

This is all of the patches posted so far (along with a series file):  They are (in this order):

Power patch series
Thermal sanity check (thermal_sanitycheck.patch)
Flood protector (acpi_floodprotector.patch)

I ran patch -p1 with all the patches in the order of the series file within a vanilla 2.6.26-rc8 source directory, and there were no errors or offsets.



Thanks for fixing my ACPI issues :)
Comment 23 ykzhao 2008-07-06 18:22:58 UTC
Hi, Joey
   Thanks for your work.
   Your proposal about the workqueue flood detector is very good. But maybe it is too agressive. If it is applied, it will have side effect:
   the kacpid/kacpid_notify workqueue will be deferred for some time. And it will have an impact on the performance. For example: The notifiy event will be sent to some device when GPE interrupt is processed. It will take more time to process the notification event. Before the notification event is processed, the corresponding GPE will be disabled. If some delay time is added, it will have impact on the performance.
   
   At the same time the workqueue flood detector can't solve the problem. OS still needs to process a lot of work tasks caused by the broken BIOS. Of course the kacpid/kacpid_notify usage will be decreased.

   IMO the patch about the workqueue flood detector is too agressive and have some side effects. Anyway we can discuss it in acpi_mail list.

   thanks.
Comment 24 ykzhao 2008-07-14 00:46:15 UTC
After the patch in comment #20 is applied, the problem disappears. So the bug will be marked as the resolved.
Comment 25 Maciej Grela 2008-08-23 07:34:12 UTC
Hi,

I have a similar problem with kacpi_notify running out of control and eating all my CPU. It`s triggered by pressing the 'undock' button on the dockstation my Dell D600 is attached. I`m running 2.6.26 and the patches you provided in commment #20 don`t help in my case. I`ve attached my lspci, my acpidump and dmesg messages with ACPI debug enabled. I`m running Gentoo and I`ve created a report on their bugzilla too: http://bugs.gentoo.org/show_bug.cgi?id=235404
Comment 26 Maciej Grela 2008-08-23 07:35:07 UTC
Created attachment 17392 [details]
The dmesg after pressing 'undock' with ACPI debug enabled.
Comment 27 Maciej Grela 2008-08-23 07:35:45 UTC
Created attachment 17393 [details]
Output of acpidump run as root.
Comment 28 Maciej Grela 2008-08-23 07:36:24 UTC
Created attachment 17394 [details]
lspci -vvvvxxx output
Comment 29 Zhang Rui 2008-08-24 20:57:38 UTC
Maciej, that's another problem.
Please file a new bug report. :)
Comment 30 ykzhao 2008-09-11 18:19:23 UTC
Created attachment 17735 [details]
Patch 1/3: Attach the ACPI device to the ACPI handle as early as possible
Comment 31 ykzhao 2008-09-11 18:20:12 UTC
Created attachment 17736 [details]
Patch 2/3: Get the device power state in the course of scanning device
Comment 32 ykzhao 2008-09-11 18:22:00 UTC
Created attachment 17737 [details]
Patch 3/3: :Add "acpi.power_nocheck=1" to disable power state check in power transition
Comment 33 ykzhao 2008-09-11 18:25:03 UTC
The tar.gz in comment #20 is not appropriate. So the patches set are attached instead of the tar package.
Thanks.
Comment 34 Len Brown 2008-10-16 18:21:41 UTC
patch in comment #30 included in acpi test tree:

uthor: Zhao Yakui <yakui.zhao@intel.com>  2008-08-11 02:54:16
Committer: Len Brown <len.brown@intel.com>  2008-09-19 14:57:44
Parent: 9a252fe1069f759a4fe76fdffce3b589b899bcdf (ACPI: Get the device power state in the course of scanning device)
Child:  f2e969acd6d5981e6b1272810002558650d0736e (ACPI: Add "acpi.power_nocheck=1" to disable power state check in power transition)
Branches: bugfixes, test
Follows: v2.6.27-rc6
Precedes: 

    ACPI: Attach the ACPI device to the ACPI handle as early as possible

patch in comment #31:

Author: Zhao Yakui <yakui.zhao@intel.com>  2008-08-11 02:55:05
Committer: Len Brown <len.brown@intel.com>  2008-09-19 14:57:43
Parent: a7be432189eb771907fb52b3f28f3e9c9309683d (acpi: replace AE_BAD_ADDRESS exception code with AE_ERROR)
Child:  05afbbf73abcbb4f6fc9fcf0a6cbbea623928a7a (ACPI: Attach the ACPI device to the ACPI handle as early as possible)
Branches: bugfixes, test
Follows: v2.6.27-rc6
Precedes: 

    ACPI: Get the device power state in the course of scanning device

patch in commednt #32:
Author: Zhao Yakui <yakui.zhao@intel.com>  2008-08-11 02:57:50
Committer: Len Brown <len.brown@intel.com>  2008-09-19 14:57:45
Parent: 05afbbf73abcbb4f6fc9fcf0a6cbbea623928a7a (ACPI: Attach the ACPI device to the ACPI handle as early as possible)
Child:  2f76f1bd9b3ed0992b987d0b3a9b60ee698467d4 (ACPI: Add DMI check to disable power state check in power transition)
Branches: bugfixes, test
Follows: v2.6.27-rc6
Precedes: 

    ACPI: Add "acpi.power_nocheck=1" to disable power state check in power transition

I don't know what the attachment in comment #33 is supposed to be,
it appears to be html
Comment 35 ykzhao 2008-10-16 19:40:12 UTC
Hi, Len
   What I said in comment #33 is that the tar.gz package is attached in comment #20. So the patch set is attached again instead of the tar.gz package.
 
   thanks.
    
Comment 36 Len Brown 2008-10-16 23:56:04 UTC
re: comment #33 confusion
pilot error on my part, for some reason i must have thought the link
was an attachment and saved comment #22 as a file...  sorry, and thanks.
Comment 37 Len Brown 2008-10-16 23:57:29 UTC
*** Bug 8049 has been marked as a duplicate of this bug. ***
Comment 38 Len Brown 2008-10-24 23:00:07 UTC
shipped in linux-2.6.28-rc1
closed

commit 6415e12ba0f92a54f02d9c4ecaa3c82f35f3d335
Author: Zhao Yakui <yakui.zhao@intel.com>
Date:   Mon Aug 11 14:59:59 2008 +0800

    ACPI: Add DMI check to disable power state check in power transition
...
commit f5adfaa372c76423b6e8e4727a9701330374f364
Author: Zhao Yakui <yakui.zhao@intel.com>
Date:   Mon Aug 11 14:57:50 2008 +0800

    ACPI: Add "acpi.power_nocheck=1" to disable power state check in power transition
...
commit eab4b645769fa2f8703f5a3cb0cc4ac090d347af
Author: Zhao Yakui <yakui.zhao@intel.com>
Date:   Mon Aug 11 14:54:16 2008 +0800

    ACPI: Attach the ACPI device to the ACPI handle as early as possible

Note You need to log in before you can comment on or make changes to this bug.