Bug 6687
Summary: | AE_AML_MUTEX_NOT_ACQUIRED - Acer TM3012, HP NX1625 | ||
---|---|---|---|
Product: | ACPI | Reporter: | Norbert Preining (preining) |
Component: | ACPICA-Core | Assignee: | Robert Moore (Robert.Moore) |
Status: | REJECTED UNREPRODUCIBLE | ||
Severity: | normal | CC: | acpi-bugzilla, protasnb, trenn |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.17-rc6-mm2 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
acpidmp output
acpidmp output
Do not abort method execution if asked to release not acquired mutex
acoidump
warnings
2.6.18-rc1 patch to fix acpi_os_get_thread_id()
namespace fix from upcoming release
test
/var/log/warn
Patch to move deletion of methods' namespace under its mutex |
Description
Norbert Preining
2006-06-13 16:37:06 UTC
We are interested in "Most recent kernel where this bug did not occur".
Please try the latest stable kernel; 2.6.16.20 is available now.
Please try 2.6.17-rc6-mm2 with the option ec_intr=0.

With 2.6.16.20: no error messages.
With 2.6.17-rc6-mm2 and ec_intr=0: same error messages as without ec_intr=0.
More tests?

Please attach acpidump output.

Created attachment 8316 [details]
acpidmp output
Here is it (above was the link)
Created attachment 8317 [details]
acpidmp output
Please try the latest stable kernel 2.6.17 - it will give us more info.

As with 2.6.16.20, no error messages with 2.6.17.

So, do you think the bug should be closed? If NOT, please reopen. If yes, and you want to know why 2.6.17 works, please take a look at the changelog, in which you can find the individual patch.

For me it is OK to close the bug. I only thought that what is in the -mm kernels for ACPI will go into the main kernel in the next rc round, so I thought this bug might pop up again. We will see.

Created attachment 8360 [details]
Do not abort method execution if asked to release not acquired mutex
There is no reason to abort method execution if mutex is not acquired. Printing
of error message should be sufficient.
What is really going on here? In the DSDT, all the releases have a corresponding acquire with an infinite wait, so this error should not happen.

It seems that there is an error in our code that, for example, releases an ASL mutex twice. In either case the correct behavior seems to be not to abort execution. The error seems to vanish in 2.6.17, so we should just apply this patch.

Alexey confirms that this regression is seen on an HP NX6125, and that it was not present in 2.6.17 (ACPICA 20060127).

ACPI Error (exmutex-0248): Cannot release Mutex [C112], not acquired [20060623]
ACPI Error (psparse-0537): Method parse/execution failed [\_TZ_.TZ4_._TMP] (Node ffff810017d47810), AE_AML_MUTEX_NOT_ACQUIRED

Looking at the latest DSDT for the HP NX6125 in bug #5534, it contains a construct where a mutex is acquired in one method and released in another. While it isn't the mutex in the error message, it may be related:

Method (C194, 0, NotSerialized)
{
    Acquire (C191, 0xFFFF)
    Store (0x55, C17C)
}

Method (C195, 0, NotSerialized)
{
    Store (0xAA, C17C)
    Release (C191)
}

Created attachment 8502 [details]
acoidump
acpidump
Created attachment 8503 [details]
warnings
warnings
This is additional information about the nx6125 system. The errors occur on kernel 2.6.18-rc1 and other git* kernels. The simple program tcatacpi and /var/log/warn are attached. Run "tcatacpi" in parallel and see the results, at least on the nx6125 and F3200.

Created attachment 8505 [details]
2.6.18-rc1 patch to fix acpi_os_get_thread_id()
please try this patch
ACPI: acpi_os_get_thread_id() returns current
While the Linux mutexes and the debug code that
reference acpi_os_get_thread_id() are happy with 0,
the AML mutexes in exmutex.c expect a unique non-zero
number for each thread - as they track this thread_id
to permit the mutex re-entrancy defined by the ACPI spec.
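
To illustrate the point of the patch description above, here is a minimal sketch of a re-entrant mutex that tracks its owner by thread ID. It is not the exmutex.c code: the type aml_mutex_t, the status names, and the functions aml_acquire() and aml_release() are hypothetical, and the sketch assumes that an owner ID of 0 is the "not acquired" sentinel. Under that assumption, a thread whose ID is reported as 0 can acquire the mutex but its release looks like a release of a mutex nobody holds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes, named after the errors seen in this report. */
typedef enum {
    ST_OK = 0,
    ST_MUTEX_NOT_ACQUIRED,   /* release without a recorded owner */
    ST_NOT_OWNER,            /* release by a thread that is not the owner */
    ST_WOULD_BLOCK           /* held by another thread */
} status_t;

/* Hypothetical model of an AML mutex; not the ACPICA data structure. */
typedef struct {
    unsigned long owner_id;  /* assumption: 0 means "not acquired" */
    unsigned int  depth;     /* re-entrant acquisition count */
} aml_mutex_t;

status_t aml_acquire(aml_mutex_t *m, unsigned long thread_id)
{
    assert(m != NULL);
    if (m->owner_id != 0 && m->owner_id == thread_id) {
        m->depth++;              /* same thread again: re-entrant acquire */
        return ST_OK;
    }
    if (m->owner_id != 0)
        return ST_WOULD_BLOCK;   /* a real implementation would wait here */
    m->owner_id = thread_id;     /* with thread_id == 0 this records nothing */
    m->depth = 1;
    return ST_OK;
}

status_t aml_release(aml_mutex_t *m, unsigned long thread_id)
{
    assert(m != NULL);
    if (m->owner_id == 0)
        return ST_MUTEX_NOT_ACQUIRED;
    if (m->owner_id != thread_id)
        return ST_NOT_OWNER;
    if (--m->depth == 0)
        m->owner_id = 0;         /* last release: mutex is free again */
    return ST_OK;
}
```

With a non-zero ID, nested acquire/release pairs balance as expected. With an ID of 0, aml_acquire() succeeds but the following aml_release() returns ST_MUTEX_NOT_ACQUIRED, which is the shape of the error reported in this bug.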
Verified on '2.6.18-rc1 + patch from comment #17'. AML mutexes work fine, but the other two error messages still exist: AE_ALREADY_EXISTS and AE_NOT_FOUND. See warnings from comment #15.

Created attachment 8507 [details]
namespace fix from upcoming release
Please check if this fix from 20060707 fixes the namespace problem.
Created attachment 8510 [details]
test
Created attachment 8511 [details]
/var/log/warn
Verified on '2.6.18-rc1 + patch #17 + patch #18'.
AE_ALREADY_EXISTS and AE_NOT_FOUND still exist. But there is progress: the
frequency of error messages is very low.
I ran 'cat /proc/acpi files' concurrently in a loop; the test script and
/var/log/warn are attached.
It is possible that the AE_ALREADY_EXISTS problem is fixed in ACPICA version 20060707, since this is a serialized method that creates namespace objects. From the release note:

Fixed a problem with Serialized control methods where the semaphore associated with the method could be over-signaled after multiple method invocations.

Created attachment 8538 [details]
Patch to move deletion of methods' namespace under its mutex
Here is a patch against 2.6.18-rc1... to move deletion of methods namespace
under mutex of this method. No more errors in dmesg on nx6125
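
For readers following the Serialized-method discussion, the over-signaling problem quoted from the 20060707 release note can be modeled with a small counting gate. This is only a sketch of the consequence, not of how ACPICA actually over-signaled (the thread does not say): method_gate_t, gate_enter(), gate_exit() and gate_extra_signal() are hypothetical names. Once the unit count exceeds 1, two threads can be inside the same method at once, and if that method creates named objects the second creation would plausibly fail with AE_ALREADY_EXISTS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counting gate guarding one Serialized control method. */
typedef struct {
    unsigned int units;    /* available units; 1 means the method is free */
    unsigned int inside;   /* threads currently executing the method */
} method_gate_t;

/* Try to enter the method: returns 1 on success, 0 if the caller must wait. */
int gate_enter(method_gate_t *g)
{
    assert(g != NULL);
    if (g->units == 0)
        return 0;
    g->units--;
    g->inside++;
    return 1;
}

/* Leave the method and signal the gate exactly once. */
void gate_exit(method_gate_t *g)
{
    assert(g != NULL && g->inside > 0);
    g->inside--;
    g->units++;
}

/* The bug class from the release note: a signal with no matching wait. */
void gate_extra_signal(method_gate_t *g)
{
    assert(g != NULL);
    g->units++;
}
```

Starting from units = 1, a second gate_enter() is refused while the first caller is inside. After one gate_extra_signal(), two consecutive gate_enter() calls both succeed, so serialization is lost.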
Problem still exists with the last patch.

RE: Patch in comment #23
The interpreter is locked during the execution of TerminateMethod, therefore another thread cannot begin execution of a method. I believe that this means that the signal/deletion order should not matter.

Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A4] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A5] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A6] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A7] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A8] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A9] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0AA] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (psargs-0355): [C0A7] Namespace lookup failure, AE_NOT_FOUND

If these do not go away, try the acpi_serialize option.

Any update on this problem, please? Thanks.

As I said already, I am for closing this bug, since it has not occurred for quite some time.

Len, Alexey, would you agree we should close this bug?