Bug 42981 - Processor Aggregator Device is not stable causing FW-OS communication to stop
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Processor
Hardware: x86-64 Linux
Importance: P1 high
Assignee: Len Brown
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-03-23 17:01 UTC by Sebastian Jarosz
Modified: 2012-07-14 14:50 UTC

See Also:
Kernel Version: 2.6.32-131.0.15.el6.x86_64, 3.1.4+
Subsystem:
Regression: No
Bisected commit-id:


Attachments
patch vs 3.5-rc2 (2.36 KB, patch)
2012-06-15 03:01 UTC, Len Brown

Description Sebastian Jarosz 2012-03-23 17:01:38 UTC
Processor idling, as defined in the "Processor Aggregator Device" chapter of the ACPI 4.0a spec, isn't stable: it doesn't handle large changes in the number of active cores well. Gradual changes (e.g. 2 cores idled or activated every second) are OK, but drastic changes in the number of active cores cause all FW-OS communication over the ACPI-defined interface to stop. It happens especially when a larger number of cores (8 or more) is reactivated.

Version-Release number of selected component (if applicable):
I've tested it using kernel 2.6.32 and newer, e.g. 3.1.4 with the same results.

Reproducible: Always

Steps to Reproduce:
1. Request OS to idle 8 or more cores as described in "Processor Aggregator
Device" chapter of ACPI 4.0a spec.
2. Request OS to make all cores active.
3. Repeat with a larger number of cores to idle if necessary.
Actual Results:  
Cores are active again, but communication with the firmware through ASL code
stops: the OS sends no _OST response, and the power_saving threads hang.


Expected Results:  
Cores are active again.
Comment 1 stuart hayes 2012-06-08 18:25:06 UTC
In drivers/acpi/acpi_pad.c, the isolated_cpus_lock mutex is held by destroy_power_saving_task(), which calls kthread_stop() on each power_saving thread.  kthread_stop() waits for the thread to end.  But the power_saving thread tries to get the isolated_cpus_lock mutex in round_robin_cpu().  If any of the power_saving threads tries to get this mutex after destroy_power_saving_task() starts killing the threads, there's a deadlock: kthread_stop() is waiting for the thread to end, and the thread is waiting to get the mutex.

I would suggest (and have tested with an older kernel) creating a new mutex, round_robin_cpus_lock, and changing round_robin_cpu() to use that mutex instead of isolated_cpus_lock.  It fixed the issue, and I didn't see the point of having round_robin_cpu() use the same lock as the other functions which use isolated_cpus_lock.
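
The locking pattern described above can be illustrated in user space. The following is only a minimal sketch using POSIX threads as a stand-in for kthreads; it is not the kernel patch. The names isolated_lock, round_robin_lock, pick_cpu(), worker() and run_demo() are hypothetical, and pthread_join() plays the role of kthread_stop(). The stopper holds one lock while it waits for the workers, and the workers only ever take the other lock, so the wait can always complete.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_WORKERS 8

/* Hypothetical stand-ins for the two kernel mutexes. */
static pthread_mutex_t isolated_lock = PTHREAD_MUTEX_INITIALIZER;    /* held while stopping workers */
static pthread_mutex_t round_robin_lock = PTHREAD_MUTEX_INITIALIZER; /* guards CPU selection only */

static atomic_bool should_stop;
static atomic_long selections; /* how many times a worker "picked a CPU" */

/* Analogue of round_robin_cpu(): takes only its own, separate lock. */
static void pick_cpu(void)
{
    pthread_mutex_lock(&round_robin_lock);
    atomic_fetch_add(&selections, 1);
    pthread_mutex_unlock(&round_robin_lock);
}

/* Analogue of a power_saving thread: loops until asked to stop. */
static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&should_stop))
        pick_cpu();
    return NULL;
}

/* Starts the workers, then stops them while holding isolated_lock.
 * Returns 0 on success, -1 if a thread could not be created. */
int run_demo(void)
{
    pthread_t threads[NUM_WORKERS];
    int started = 0;
    int rc = 0;

    atomic_store(&should_stop, false);
    atomic_store(&selections, 0);

    while (started < NUM_WORKERS) {
        if (pthread_create(&threads[started], NULL, worker, NULL) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* Let the workers contend on round_robin_lock for a while. */
    while (rc == 0 && atomic_load(&selections) < 1000)
        sched_yield();

    /* Analogue of destroy_power_saving_task(): wait for every worker
     * to end while holding the lock the workers never touch. */
    pthread_mutex_lock(&isolated_lock);
    atomic_store(&should_stop, true);
    for (int i = 0; i < started; i++) {
        int err = pthread_join(threads[i], NULL);
        assert(err == 0);
        (void)err;
    }
    pthread_mutex_unlock(&isolated_lock);
    return rc;
}
```

If pick_cpu() took isolated_lock instead, a worker blocked on that lock could never observe should_stop while the stopper held it, and the join would wait forever, which is the shape of the hang reported here.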
Comment 2 stuart hayes 2012-06-11 20:24:02 UTC
Comment 3 stuart hayes 2012-06-12 16:08:57 UTC
(Please ignore comment #2... it was an accidental repost of comment #1.)
Comment 4 Len Brown 2012-06-14 02:03:09 UTC
How to reproduce this issue:

First load the acpi_pad driver.

acpi_pad binds to ACPI000C, the processor aggregator device.
If you have one of those, the driver will load and you'll see
ACPI000C in sysfs with an "idlecpus" attribute beneath it:

/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/ACPI000C:00/idlecpus

If you don't have ACPI000C, then hack acpi_pad.c to replace it
with another pnp-id that is present in the DSDT, but has no driver
bound.  Here I used PNP0C14 (and I removed acpi_wmi from my kernel,
as it would otherwise bind to PNP0C14):

/sys/devices/LNXSYSTM:00/device:00/PNP0C14:00/idlecpus

cd to the sysfs directory with this acpi_pad attribute.

Here we are running on a system with 32 logical processors,
so we'll repeatedly idle 31 of them, then attempt to re-enable them
all at once by asking for 0 idle.  But first we reduce
the round-robin time, to make the failure more likely:

# echo 1 > rrtime
# echo 31 > idlecpus; echo 0 > idlecpus
# echo 31 > idlecpus; echo 0 > idlecpus
# echo 31 > idlecpus; echo 0 > idlecpus
Repeat until the echo does not return
(it usually takes only a few attempts).

Subsequent writes to idlecpus will then hang as well.

# rmmod acpi_pad
will now hang.

# ps -ef |grep power_saving
will show a bunch of hung power saving threads.

The only way to clear this condition and to again
have the capability of acpi_pad forcing cpus to idle
is to reboot.
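
Because a hung write also blocks the shell that issued it, the loop above can be driven from a small helper that performs each write in a child process and gives up after a timeout. This is only a sketch under stated assumptions: write_with_timeout() and stress_idlecpus() are hypothetical names, the attribute path is whatever idlecpus path exists on the test system, and the timeout is an arbitrary choice.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Write `value` to `path` in a child process and wait up to timeout_ms.
 * Returns 0 if the write completed, 1 if it timed out (likely hung),
 * and -1 on any other failure. */
int write_with_timeout(const char *path, const char *value, int timeout_ms)
{
    assert(path != NULL && value != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: a write that hangs blocks only this process. */
        int fd = open(path, O_WRONLY);
        if (fd < 0)
            _exit(2);
        ssize_t len = (ssize_t)strlen(value);
        int ok = write(fd, value, (size_t)len) == len;
        close(fd);
        _exit(ok ? 0 : 2);
    }

    /* Parent: poll for the child in 10 ms steps. */
    const struct timespec step = { 0, 10 * 1000 * 1000 };
    for (int waited = 0; waited <= timeout_ms; waited += 10) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
        if (r < 0)
            return -1;
        nanosleep(&step, NULL);
    }
    /* The child is still blocked. It is left behind on purpose,
     * since a write stuck in the kernel may not be killable. */
    return 1;
}

/* Alternately writes `idle` and 0 to the attribute, `rounds` times.
 * Returns the 1-based round in which a write hung, 0 if none hung,
 * and -1 on any other failure. */
int stress_idlecpus(const char *attr_path, int idle, int rounds, int timeout_ms)
{
    char buf[16];

    snprintf(buf, sizeof(buf), "%d", idle);
    for (int i = 1; i <= rounds; i++) {
        int rc = write_with_timeout(attr_path, buf, timeout_ms);
        if (rc == 0)
            rc = write_with_timeout(attr_path, "0", timeout_ms);
        if (rc != 0)
            return rc == 1 ? i : -1;
    }
    return 0;
}
```

A call such as stress_idlecpus(path, 31, 100, 5000), with path pointing at the idlecpus attribute, would mirror the manual steps; a positive return value is the round in which a write stopped returning.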
Comment 5 Len Brown 2012-06-15 03:01:59 UTC
Created attachment 73661
patch vs 3.5-rc2

patch from Stuart Hayes, as applied.
Comment 6 Florian Mickler 2012-07-01 09:38:44 UTC
A patch referencing this bug report has been merged in Linux v3.5-rc5:

commit 5f1601261050251a5ca293378b492a69d590dacb
Author: Stuart Hayes <Stuart_Hayes@Dell.com>
Date:   Wed Jun 13 16:10:45 2012 -0500

    acpi_pad: fix power_saving thread deadlock
