Bug 1781

Summary: AE_AML_MUTEX_NOT_ACQUIRED error messages followed by hang
Product: ACPI
Reporter: Mitch (mitch)
Component: Power-Other
Assignee: Robert Moore (Robert.Moore)
Status: REJECTED UNREPRODUCIBLE
Severity: high
CC: acpi-bugzilla, bunk
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.0
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
  kernel boot messages
  asl file from the compaq evo n600c
  Luming's global lock mutex patch
  an update
  a proposal
  update version

Description Mitch 2004-01-02 17:42:21 UTC
Distribution: Home built
Hardware Environment: Compaq Evo n600c laptop
Software Environment: Console or X
Problem Description:

When running 2.6.0, with or without the ACPI patch acpi-20031203-2.6.0.diff.bz2,
I get the following error messages after about 20 minutes of uptime:

Jan  3 00:49:32 client kernel:     ACPI-0245: *** Error: Cannot release Mutex
[_GL_], not acquired
Jan  3 00:49:32 client kernel:     ACPI-1120: *** Error: Method execution failed
[\_SB_.C03E.C053.C0D1.C11C] (Node c7f64540), AE_AML_MUTEX_NOT_ACQUIRED
Jan  3 00:49:32 client kernel:     ACPI-1120: *** Error: Method execution failed
[\_SB_.C03E.C053.C0D1.C12E] (Node c7f64320), AE_AML_MUTEX_NOT_ACQUIRED
Jan  3 00:49:32 client kernel:     ACPI-1120: *** Error: Method execution failed
[\_SB_.C03E.C053.C0D1.C134] (Node c7f642a0), AE_AML_MUTEX_NOT_ACQUIRED
Jan  3 00:49:32 client kernel:     ACPI-1120: *** Error: Method execution failed
[\_SB_.C03E.C053.C0D1.C13B] (Node c7f64240), AE_AML_MUTEX_NOT_ACQUIRED
Jan  3 00:49:32 client kernel:     ACPI-1120: *** Error: Method execution failed
[\_SB_.C13B] (Node c7f60d20), AE_AML_MUTEX_NOT_ACQUIRED
Jan  3 00:49:32 client kernel:     ACPI-1120: *** Error: Method execution failed
[\_SB_.C19F._BST] (Node c7f60be0), AE_AML_MUTEX_NOT_ACQUIRED

This hangs the keyboard and I can no longer do anything on the machine.
I can log in remotely and reboot the machine, but if it is left unattended the
fans no longer come on, so I've marked this as high severity.

I attach a kernel.log file and my Compaq-Evo_N600c.asl file. The laptop
is running the latest BIOS from HP (Compaq).

There are also other AE_AML_UNINITIALIZED_LOCAL messages in the kernel.log
file. I'm unsure if these are related. Before the hang, everything seems
to be working fine (i.e. fans turn on, thermal zone reports temperatures, etc.)

Steps to reproduce:
Enable ACPI on a n600c laptop and wait 30 minutes or so.
Comment 1 Mitch 2004-01-02 17:43:05 UTC
Created attachment 1784 [details]
kernel boot messages
Comment 2 Mitch 2004-01-02 17:44:28 UTC
Created attachment 1785 [details]
asl file from the compaq evo n600c
Comment 3 Luming Yu 2004-01-05 22:01:42 UTC
Before digging in, please try the patches filed at bug 1766 and bug 1791.
Comment 4 Mitch 2004-01-08 02:30:36 UTC
Hi Luming,

I tried both the patches in those bugs to no avail. I get exactly
the same behavior as before with the hang and the AE_AML_MUTEX_NOT_ACQUIRED
messages.

Please let me know what you'd like me to try next.
Comment 5 Luming Yu 2004-01-15 00:21:08 UTC
The first error is an obvious BIOS bug.
1.  ACPI-1120: *** Error: Method execution failed [\_SB_.C03E.C053.C0D1.C12E] 
(Node c7f64320), AE_AML_UNINITIALIZED_LOCAL

C12E returns Local0, which is not initialized on every execution path.

2. An attempt to release a mutex that was not acquired.
Please try the patch filed at bug 1669.
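
To illustrate item 1, here is a minimal C sketch of how an interpreter can detect a local that is read before it was written. It is not the ACPI CA code and not the actual contents of C12E: the type and function names (`local_slot`, `buggy_method`, the `SKETCH_` status codes) and the value 42 are hypothetical, chosen only to mirror the AE_AML_UNINITIALIZED_LOCAL message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes, named after the message in the log. */
typedef enum { SKETCH_OK, SKETCH_UNINITIALIZED_LOCAL } sketch_status;

/* A method-local slot that remembers whether it was ever written. */
typedef struct {
    bool initialized;
    int  value;
} local_slot;

static void local_store(local_slot *l, int v)
{
    l->initialized = true;
    l->value = v;
}

/* Reading an unwritten local is reported as an error, not returned as garbage. */
static sketch_status local_load(const local_slot *l, int *out)
{
    assert(out != NULL);
    if (!l->initialized)
        return SKETCH_UNINITIALIZED_LOCAL;
    *out = l->value;
    return SKETCH_OK;
}

/* Stand-in for a method that sets its local on only one branch. */
sketch_status buggy_method(bool condition, int *result)
{
    local_slot local0 = { false, 0 };

    if (condition)
        local_store(&local0, 42);
    /* No else branch: local0 stays unwritten when condition is false. */

    return local_load(&local0, result);
}
```

Calling `buggy_method(false, &r)` returns `SKETCH_UNINITIALIZED_LOCAL`, while `buggy_method(true, &r)` succeeds. The method only fails on the path that skips the assignment, which would fit an error that shows up intermittently.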

Comment 6 Mitch 2004-01-15 06:57:05 UTC
With the patch from bug 1669 I get a panic on boot. Copied by hand:

Unable to handle kernel NULL pointer dereference at virtual address 99999999
 printing eip:
c019fa93
*pde = 00000000
Oops: 0002 [#1]
CPU:0
EIP: 0060:[<c019fa93>] Not tainted
EIP is at acpi_ev_acquire_global_lock+0x3f/0x6c
eax: 00000000 ebx: d7fcdf60 ecx: 0000ffff edx:000000002
esi: c13edc00 edi: d7eedaa0 ebp: 00000005 esp: d7f79d8c
ds: 007b es: 007b ss: 0068
Process swapper (pid:1, threadinfo=d7f7800 task=d7f6f900)
Call Trace:
acpi_ex_system_acquire_mutex+0x2d/0x48
acpi_ex_acquire_mutex+0xb2/0xec
acpi_ex_opcode_2A_0T_1R+0xa8/0x140
acpi_ds_exec_end_op+0xb1/0x260
acpi_ps_parse_loop+0x57c/0x890
acpi_ut_release_mutex+0x7f/0x88
acpi_ut_release_mutex+0x7f/0x88
acpi_ut_acquire_from_cache+0x45/0xa4
acpi_create_generic_state+0x7/0x14
acpi_ps_parse_aml+0x36/0x1a8
acpi_ps_parse_aml+0x53/0x1a8
acpi_psx_execute+0x15c/0x1c0
...

Next suggestion?
Comment 7 Robert Moore 2004-01-16 14:34:05 UTC
This may be related to the global lock code problems reported earlier
Comment 8 Mitch 2004-01-18 09:33:46 UTC
Hi Luming, with the patch you sent me privately when bugzilla was down,
I don't get a panic, but unfortunately I get a complete hard hang and
the bootup never completes. Not even the sysrq key works. This happens with
or without the acpi-20031203-2.6.1 patch. The hang is quite early in the
boot process, so it is probably a mutex deadlock. The last messages seen on
the console are:

ACPI: Subsystem revision 20031203
ACPI: IRQ SCI: Edge Trigger
ACPI: IRQ 9 SCI: Edge set to Level Triggerd


(p.s. there's a typo in "Triggerd" in the kernel)

I also tried boot option pci=noacpi to disable acpi routing but no luck.

Comment 9 Mitch 2004-01-18 09:38:54 UTC
Created attachment 1895 [details]
Luming's global lock mutex patch
Comment 10 Luming Yu 2004-01-19 00:14:54 UTC
Created attachment 1903 [details]
an update

With the "m" operand constraint, gcc seems to prefer:
c01fbfa0:	a1 1c 5a 4f c0		mov    0xc04f5a1c,%eax
c01fbfa5:	89 c2			mov    %eax,%edx

rather than:
c01fbfa0:	a1 1c 5a 4f c0		mov    0xc04f5a1c,%eax
c01fbfa5:	89 c2			mov    (%eax),%edx
Comment 11 Luming Yu 2004-01-19 00:58:12 UTC
But I guess the global lock mutex patch cannot solve the mutex release error.
Comment 12 Luming Yu 2004-01-19 01:35:10 UTC
Because I didn't find any error message related to a failure to acquire _GL.
Comment 13 Mitch 2004-01-19 09:31:52 UTC
Luming, do you still want me to try the patch in comment 10?
Comment 14 Luming Yu 2004-01-19 17:14:16 UTC
Yes, please give it a try and post the test results.

Thanks,
Luming
Comment 15 Mitch 2004-01-20 11:38:22 UTC
Luming, I tried the new patch in comment 10 and although I don't get the
hang at boot, I do get the subsequent hang with the AE_AML_MUTEX_NOT_ACQUIRED
messages.
Comment 16 Luming Yu 2004-02-12 00:05:52 UTC
Hmm, the current ACPI CA has a bug in handling \_GL as a global mutex in AML
methods. It can be acquired by different threads, unlike other mutexes, which
can only be acquired by one thread.
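
As a rough illustration of the point above, the following C sketch models a lock that records an owning thread next to one that does not. It is a simplified, single-threaded model under assumed semantics, not the ACPI CA implementation: `lock_model`, `lock_acquire`, `lock_release`, the `LOCK_` status codes, and the integer thread ids are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes, named after the message in the log. */
typedef enum { LOCK_OK, LOCK_NOT_ACQUIRED } lock_status;

typedef struct {
    bool     track_owner; /* true: ordinary mutex; false: shared global lock */
    unsigned depth;       /* number of outstanding acquires */
    int      owner;       /* last acquiring thread; used only if track_owner */
} lock_model;

/* Record an acquire. Blocking is left out of this model on purpose. */
void lock_acquire(lock_model *m, int thread_id)
{
    assert(m != NULL);
    m->owner = thread_id;
    m->depth++;
}

/* Release fails if nothing is held or, when owners are tracked,
 * if the caller is not the recorded owner. */
lock_status lock_release(lock_model *m, int thread_id)
{
    assert(m != NULL);
    if (m->depth == 0)
        return LOCK_NOT_ACQUIRED;
    if (m->track_owner && m->owner != thread_id)
        return LOCK_NOT_ACQUIRED;
    m->depth--;
    return LOCK_OK;
}
```

If threads 1 and 2 both acquire a lock created with `track_owner` set, a release by thread 1 is rejected with `LOCK_NOT_ACQUIRED`, because thread 2 is the recorded owner. With `track_owner` cleared, both releases succeed and only a release with nothing held is rejected.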
Comment 17 Luming Yu 2004-02-12 00:10:34 UTC
Created attachment 2089 [details]
a proposal

This is a quick fix; please give it a try!
Comment 18 Luming Yu 2004-02-12 00:17:26 UTC
Created attachment 2090 [details]
update version

a proposal
Comment 19 Pierre Chifflier 2004-02-15 02:48:20 UTC
Since there is still no answer, I'll add a 'me too' here.
I have an evo N620C laptop, which suffers the same problem.

Luming, using your latest patch (id=2090) with vanilla 2.6.2 seems to solve the
AE_AML_MUTEX_NOT_ACQUIRED problem (no hang, no message), but there remains
another important problem: the fans are not working at all (so ACPI is
unusable for the moment).
Please have a look at bug 2083
Thanks
Comment 20 Robert Moore 2004-03-11 14:53:50 UTC
Implemented for version 20040311:


Global Lock Support:  Now allows multiple acquires and releases with any 
internal thread.  Removed concept of "owning thread" for this special mutex.


Comment 21 Mitch 2004-03-14 09:28:26 UTC
I can confirm Pierre's findings. With the new patch installed, there are no
more hangs and ACPI appears to be working, except that the fans don't come on
automatically. I can manually switch the fan on with

   echo on >/proc/acpi/fan/C1F?/state

but I can't switch them off with "echo off".

So it is still basically unusable. Back to APM.
Comment 22 Robert Moore 2006-07-21 12:55:14 UTC
Please post the acpidump for this machine, or tell me what format the ASL 
attachment is in.

I would like to look a little closer at the exact cause of the problem.
Comment 23 Mitch 2006-07-22 03:11:22 UTC
This bug was logged 2 years ago. I no longer have the laptop. Sorry.