Most recent kernel where this bug did not occur: no idea
Distribution: Debian/sid
Hardware Environment: Acer TravelMate 3012WTMi
Software Environment: ???
Problem Description: The kernel prints several warning messages (not too many):
ACPI Error (exmutex-0248): Cannot release Mutex [MUT1], not acquired [20060310]
ACPI Error (psparse-0522): Method parse/execution failed [\_SB_.BAT1._BST] (Node f7c12a80), AE_AML_MUTEX_NOT_ACQUIRED
ACPI Exception (acpi_battery-0201): AE_AML_MUTEX_NOT_ACQUIRED, Evaluating _BST [20060310]
Steps to reproduce: reboot
The dmesg and acpidmp output can be found here:
dmesg: http://www.logic.at/people/preining/acer/acpi/dmesg.txt
acpidmp: http://www.logic.at/people/preining/acer/acpi/acpidmp.txt
Best wishes
Norbert
We are interested in the "Most recent kernel where this bug did not occur" field. Please try the latest stable kernel (2.6.16.20 is available now). Please also try 2.6.17-rc6-mm2 with the option ec_intr=0.
With 2.6.16.20: no error messages. With 2.6.17-rc6-mm2 and ec_intr=0: the same error messages as without ec_intr=0. More tests?
Please attach acpidump output
Created attachment 8316 [details] acpidmp output Here it is (the link was given above).
Created attachment 8317 [details] acpidmp output
Please try the latest stable kernel 2.6.17 - it will give us more info.
As with 2.6.16.20, there are no error messages with 2.6.17.
So, do you think the bug should be closed? If not, please reopen. If yes, and you want to know why 2.6.17 works, please take a look at the changelog, where you can find the individual patch.
For me it is OK to close the bug. I only thought that what is in the -mm kernels for ACPI will go into the main kernel in the next rc round, so this bug might pop up again. We will see.
Created attachment 8360 [details] Do not abort method execution if asked to release not acquired mutex
There is no reason to abort method execution if the mutex is not acquired. Printing an error message should be sufficient.
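The difference between aborting and merely reporting can be sketched with a toy model. This is not ACPICA code and not the attached patch: `ToyMutex`, `run_method`, and the `abort_on_bad_release` flag are hypothetical names used only to illustrate the behaviour described above.

```python
# Toy model only: all names here are hypothetical, not ACPICA identifiers.
class ToyMutex:
    def __init__(self, name):
        self.name = name
        self.held = False

    def acquire(self):
        self.held = True

    def release(self):
        """Return True on success, False if the mutex was not held."""
        if not self.held:
            return False
        self.held = False
        return True


def run_method(ops, abort_on_bad_release):
    """Run a list of (op, mutex) steps; return (steps_executed, errors)."""
    errors = []
    executed = 0
    for op, mutex in ops:
        if op == "acquire":
            mutex.acquire()
        elif op == "release":
            if not mutex.release():
                errors.append(f"Cannot release Mutex [{mutex.name}], not acquired")
                if abort_on_bad_release:
                    # Aborting behaviour: the rest of the method never runs.
                    return executed, errors
        # Any other op (e.g. "store") is treated as a no-op in this model.
        executed += 1
    return executed, errors
```

With a method that starts with a stray release followed by two other steps, the aborting variant executes nothing, while the lenient variant logs the same error and still runs all three steps.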
What is really going on here? In the DSDT, all the releases have a corresponding acquire with an infinite wait, so this error should not happen.
It seems that there is an error in our code that either releases an ASL mutex twice or does something similar. In either case, the correct behavior seems to be not to abort execution. The error seems to vanish in 2.6.17, so we should just apply this patch.
Alexey confirms that this regression is seen on an HP NX6125, and that it was not present in 2.6.17 (ACPICA 20060127).
ACPI Error (exmutex-0248): Cannot release Mutex [C112], not acquired [20060623]
ACPI Error (psparse-0537): Method parse/execution failed [\_TZ_.TZ4_._TMP] (Node ffff810017d47810), AE_AML_MUTEX_NOT_ACQUIRED
Looking at the latest DSDT for the HP NX6125 in bug #5534, it contains a construct where a mutex is acquired in one method and released in another. While it isn't the mutex in the error message, it may be related:
    Method (C194, 0, NotSerialized)
    {
        Acquire (C191, 0xFFFF)
        Store (0x55, C17C)
    }
    Method (C195, 0, NotSerialized)
    {
        Store (0xAA, C17C)
        Release (C191)
    }
Created attachment 8502 [details] acpidump acpidump
Created attachment 8503 [details] warnings warnings
This is additional information about the nx6125 system. The errors occur on kernel 2.6.18-rc1 and other git* kernels. The simple program tcatacpi and /var/log/warn are attached. Run "tcatacpi" in parallel and see the results, at least on nx6125 and F3200.
Created attachment 8505 [details] 2.6.18-rc1 patch to fix acpi_os_get_thread_id()
Please try this patch.
ACPI: acpi_os_get_thread_id() returns current
While the Linux mutexes and the debug code that reference acpi_os_get_thread_id() are happy with 0, the AML mutexes in exmutex.c expect a unique non-zero number for each thread - as they track this thread_id to permit the mutex re-entrancy defined by the ACPI spec.
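Why a thread id of 0 can produce "not acquired" errors is easiest to see in a small model. The sketch below is not the exmutex.c implementation; it assumes, for illustration only, that the owner field uses 0 to mean "no owner". `ToyAmlMutex` and `NO_OWNER` are hypothetical names, and the status strings are used purely as labels.

```python
NO_OWNER = 0  # assumption for this sketch: 0 doubles as the "not owned" marker


class ToyAmlMutex:
    """Toy re-entrant mutex that tracks its owner by thread id."""

    def __init__(self, name):
        self.name = name
        self.owner = NO_OWNER
        self.depth = 0

    def acquire(self, thread_id):
        # Free, or re-entered by the thread that already owns it.
        if self.owner == NO_OWNER or self.owner == thread_id:
            self.owner = thread_id
            self.depth += 1
            return "AE_OK"
        return "AE_TIME"  # toy stand-in for "another thread holds it"

    def release(self, thread_id):
        if self.owner == NO_OWNER:
            return "AE_AML_MUTEX_NOT_ACQUIRED"
        if self.owner != thread_id:
            return "AE_AML_NOT_OWNER"
        self.depth -= 1
        if self.depth == 0:
            self.owner = NO_OWNER
        return "AE_OK"
```

In this model, `acquire(0)` succeeds but leaves the owner field looking unowned, so the matching `release(0)` reports `AE_AML_MUTEX_NOT_ACQUIRED`. With any non-zero id, nested acquire and release pairs balance as expected.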
Verified on '2.6.18-rc1 + patch from comment #17'. AML mutexes work fine, but the other two error messages still exist: AE_ALREADY_EXISTS and AE_NOT_FOUND. See warnings from comment #15.
Created attachment 8507 [details] namespace fix from upcoming release Please check if this fix from 20060707 fixes the namespace problem.
Created attachment 8510 [details] test
Created attachment 8511 [details] /var/log/warn
Verified on '2.6.18-rc1 + patch #17 + patch #18'. AE_ALREADY_EXISTS and AE_NOT_FOUND still exist, but there is progress: the frequency of error messages is very low. I ran 'cat /proc/acpi files' concurrently in a loop; this test script and /var/log/warn are attached.
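For anyone who wants to reproduce this kind of load without the attached script, a concurrent reader could look roughly like the following. This is a hypothetical sketch, not the attached test: the function names, the worker and round counts, and the decision to skip a file named `event` (on the assumption that reading it can block waiting for events) are all choices made here.

```python
import os
import threading


def read_tree(root, errors, skip=("event",)):
    """Read every file under root once, recording any failures."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name in skip:  # assumption: such files may block on read
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    fh.read()
            except OSError as exc:
                errors.append((path, str(exc)))


def stress(root="/proc/acpi", workers=4, rounds=10):
    """Run several threads that repeatedly read the whole tree."""
    errors = []

    def worker():
        for _ in range(rounds):
            read_tree(root, errors)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Calling `stress()` and then checking dmesg or /var/log/warn for new ACPI errors gives a rough equivalent of running several `cat` loops in parallel.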
It is possible that the AE_ALREADY_EXISTS problem is fixed in ACPICA version 20060707, since this is a serialized method that creates namespace objects. From the release note: Fixed a problem with Serialized control methods where the semaphore associated with the method could be over-signaled after multiple method invocations.
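The effect of an over-signaled semaphore can be shown with a generic counting semaphore. This is only an illustration of the failure mode named in the release note, not ACPICA code; `callers_admitted` is a hypothetical helper.

```python
import threading


def callers_admitted(extra_signals):
    """Count how many callers a one-unit semaphore admits at once
    after one normal enter/exit cycle plus some extra signals."""
    sem = threading.Semaphore(1)       # one unit: method is serialized
    entered = sem.acquire(blocking=False)  # first invocation enters
    if entered:
        sem.release()                  # normal exit restores the unit
    for _ in range(extra_signals):
        sem.release()                  # the over-signal
    admitted = 0
    while sem.acquire(blocking=False):
        admitted += 1
    return admitted
```

With no extra signal only one caller is admitted; with one extra signal two callers get in at the same time. If both run a method that creates the same named objects, the second creation would be expected to collide, which fits an AE_ALREADY_EXISTS symptom.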
Created attachment 8538 [details] Patch to move deletion of method's namespace under its mutex
Here is a patch against 2.6.18-rc1... to move deletion of a method's namespace under the mutex of this method. No more errors in dmesg on nx6125.
Problem still exists with the last patch.
RE: Patch in comment #23 The interpreter is locked during the execution of TerminateMethod, therefore another thread cannot begin execution of a method. I believe that this means that the signal/deletion order should not matter.
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A4] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A5] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A6] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A7] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A8] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0A9] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (dsfield-0441): [C0AA] Namespace lookup failure, AE_ALREADY_EXISTS
Jul 7 21:57:14 linux kernel: ACPI Error (psargs-0355): [C0A7] Namespace lookup failure, AE_NOT_FOUND
If these do not go away, try the acpi_serialize option.
Any update on this problem, please? Thanks.
As I said already, I am for closing this bug, since it has not occurred for quite some time.
Len, Alexey, would you agree we should close this bug?