Bug 8171

Summary: acpi_serialize locks system during boot
Product: ACPI
Reporter: Colchao (colchaodemola)
Component: ACPICA-Core
Assignee: Robert Moore (Robert.Moore)
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: blocking
CC: acpi-bugzilla
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.21-rc3
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments:
acpidump output
2.6.21-rc3 patch to revert ACPICA acpi_serialize changes
serialized bug fix

Description Colchao 2007-03-10 18:02:18 UTC
I have an HP notebook that needs acpi_serialize for its ACPI functions to work
properly. Everything worked fine on 2.6.20, but now, after trying 2.6.21-rc3,
my system locks up during boot when I pass the acpi_serialize option.
The last three lines before the lock are:

ACPI: Interpreter Enabled
ACPI: (Supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing.

Then the system locks up (I waited five minutes just to be sure).
Comment 1 salatiel.filho 2007-03-11 13:58:34 UTC
Created attachment 10697 [details]
acpidump output
Comment 2 salatiel.filho 2007-03-11 13:59:50 UTC
I have exactly the same problem here. The kernel stops during boot right after
those same messages. I attached the acpidump output, which may help trace the problem.
Comment 3 Len Brown 2007-03-11 17:55:10 UTC
Does acpi_serialize still deadlock on linux-2.6.21-rc3-git6 or later?

Please include information on why you are using the option
in the first place.  What bad things happened in 2.6.20 when
you did not use acpi_serialize?
Comment 4 salatiel.filho 2007-03-11 18:46:39 UTC
I will test the git snapshot and post the results.

In the old kernel I had a lot of these:
Jan  6 00:43:49 localhost kernel: ACPI Error (psargs-0355): [PBST]
Namespace lookup failure, AE_NOT_FOUND
Jan  6 00:43:49 localhost kernel: ACPI Error (psparse-0537): Method
parse/execution failed [\_SB_.PCI0.LPCB.BAT1._BST] (Node c1910770),
AE_NOT_FOUND
Jan  6 00:43:49 localhost kernel: ACPI Exception (acpi_battery-0207):
AE_NOT_FOUND, Evaluating _BST [20060707]
Jan  6 00:44:19 localhost kernel: ACPI Error (psargs-0355): [PBST]
Namespace lookup failure, AE_NOT_FOUND
Jan  6 00:44:19 localhost kernel: ACPI Error (psparse-0537): Method
parse/execution failed [\_SB_.PCI0.LPCB.BAT1._BST] (Node c1910770),
AE_NOT_FOUND
Jan  6 00:44:19 localhost kernel: ACPI Exception (acpi_battery-0207):
AE_NOT_FOUND, Evaluating _BST [20060707]
Jan  6 00:44:49 localhost kernel: ACPI Error (psargs-0355): [PBST]
Namespace lookup failure, AE_NOT_FOUND
Jan  6 00:44:49 localhost kernel: ACPI Error (psparse-0537): Method
parse/execution failed [\_SB_.PCI0.LPCB.BAT1._BST] (Node c1910770),
AE_NOT_FOUND
Jan  6 00:44:49 localhost kernel: ACPI Exception (acpi_battery-0207):
AE_NOT_FOUND, Evaluating _BST [20060707]


Because of this, power management did not work. Suspend did not work, and there
were a lot of related problems. acpi_serialize fixed that.

Comment 5 salatiel.filho 2007-03-11 19:05:41 UTC
I've just tested 2.6.21-rc3-git7 and it still locks up :/
Comment 6 Len Brown 2007-03-12 09:19:03 UTC
any chance you can bisect to see where the hang starts?
In particular, can you see if it starts with this commit?

1ba753acb372c2955a4843302e92e49ce82e2fea

Author: Bob Moore <robert.moore@intel.com>  2007-02-02 11:48:20
Committer: Len Brown <len.brown@intel.com>  2007-02-02 21:14:24
Parent: 95befdb398e0112ede80529f6770644ecfa5a82e (ACPICA: Create tbfadt.c to 
hold all FADT-related functions)
Child:  765ec20180fb70b4ee9d730167b2a0b76879f791 (ACPICA: Delete stale FADT 
functions outside tbfadt.c.)
Branches: origin, master
Follows: v2.6.20-rc7
Precedes: v2.6.21-rc1
Comment 7 salatiel.filho 2007-03-12 09:28:45 UTC
How can I do this?
I download 2.6.20-rc7 and then do what?

I could boot perfectly on 2.6.20 with acpi_serialize, so I think that is not the
problem.
Comment 8 salatiel.filho 2007-03-12 11:16:14 UTC
Well, tonight I will try 2.6.21-rc1-git1..7.
One question: should I apply the 2.6.21-rc1-gitXX patch on top of 2.6.20.2? Are
the patches cumulative, or must I apply them one by one?
Comment 9 Len Brown 2007-03-12 12:44:11 UTC
> Follows: v2.6.20-rc7
I'm sorry, this line was misleading; it also follows 2.6.20.
> Precedes: v2.6.21-rc1

Since git keeps track of patch dependencies, and later patches
are on top of this one it is simplest to get a git tree,
and roll the history back to the exact place that the patch occurred.

git bisect start
git bisect good v2.6.20
git bisect bad v2.6.21-rc1
git checkout -b expect-bad 1ba753acb372c2955a4843302e92e49ce82e2fea

if good: git bisect good
if bad:  git bisect bad

git checkout -b expect-good 95befdb398e0112ede80529f6770644ecfa5a82e

if good: git bisect good
if bad:  git bisect bad

git bisect visualize

The guess here, of course, is that "expect-bad" is bad,
and "expect-good" is good.
However, if this guess is wrong, you'll be able to continue
to use git bisect to look elsewhere using bisect's choice of suspects.
 
Comment 10 salatiel.filho 2007-03-12 14:48:40 UTC
Sorry, but I have never used git before.
How can I get the kernel from the repository so I can start using the commands you sent me?
Comment 11 Len Brown 2007-03-13 00:14:23 UTC
grab an RPM for git for your distro if it isn't installed already.
or get a git archive tar file: http://kernel.org/pub/software/scm/git/

If you want to track the very latest version of git, use that to get it:
git clone git://git.kernel.org/pub/scm/git/git.git

Get the current copy of Linus' kernel tree:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

Initial git tutorial lives here:
http://www.kernel.org/pub/software/scm/git/docs/tutorial.html
Comment 12 salatiel.filho 2007-03-14 04:05:21 UTC
I cannot test:
%cd /TMP/bisect/
%cd linux-2.6/
%git bisect start
%git bisect good v2.6.20
%git-bisect bad v2.6.21-rc1
Bisecting: 1614 revisions left to test after this
[574009c1a895aeeb85eaab29c235d75852b09eb8] Merge branch 'upstream' of git://ftp$
%git checkout -b expect-bad 1ba753acb372c2955a4843302e92e49ce82e2fea
%make menuconfig
%make
...
  LD      .tmp_vmlinux1
drivers/built-in.o(.text+0x2e3b0): In function `acpi_tb_parse_fadt':
: undefined reference to `acpi_tb_install_table'
drivers/built-in.o(.text+0x2e3c8): In function `acpi_tb_parse_fadt':
: undefined reference to `acpi_tb_install_table'
make: *** [.tmp_vmlinux1] Error 1
%

Comment 13 Len Brown 2007-03-15 00:29:57 UTC
rats!  Sorry I led you on a wild-goose chase to a commit
that doesn't even build -- but hey, now you know how to
use git and do a bisect:-)

I've reproduced the "acpi_serialize" hang on my t41,
so I should be able to work on it here.
Comment 14 Len Brown 2007-03-15 01:05:36 UTC
Created attachment 10774 [details]
2.6.21-rc3 patch to revert ACPICA acpi_serialize changes

Please test this patch with "acpi_serialize"
Comment 15 salatiel.filho 2007-03-15 05:04:07 UTC
Well, it is working great now. Oops, not great, because you told me
acpi_serialize is just a workaround :), so let's say that the workaround is working
great :D. Thanks.
Comment 16 Len Brown 2007-03-15 22:36:41 UTC
heh, yeah, we're back to where we started.
we still need to go figure out why it is necessary to
manually use "acpi_serialize" on this box in the first place.
Comment 17 salatiel.filho 2007-03-16 00:50:05 UTC
If there is anything else I can do to help with this, just ask :)
Comment 18 Len Brown 2007-03-23 09:18:16 UTC
patch in comment #14 shipped in linux-2.6.21-rc4-git7
so this regression vs 2.6.20 should be gone,
and this bug is closed.

yes, it is also a bug that booting with acpi_serialize
has been necessary to boot your system -- no matter what release.
Please open a new bug for that issue.

Comment 19 Colchao 2007-07-25 10:35:19 UTC
Hi, this problem has appeared again in 2.6.22. Once again I cannot boot my machine with acpi_serialize.

The system locks up here:

ACPI: INTERPRETER ENABLED
ACPI: (SUPPORT S0 S3 S4 S5)
ACPI: USING IOAPIC FOR INTERRUPT ROUTING

SO IT STOPS HERE.
Comment 20 Len Brown 2007-07-25 12:53:33 UTC
your machine booted properly with 2.6.21 and acpi_serialize,
but fails to boot with 2.6.22 and acpi_serialize?
Any chance you can bisect where it broke?

Please point me to the bug report showing why
acpi_serialize is necessary on this machine.
Comment 21 Colchao 2007-07-25 13:06:10 UTC
Same problems as in comment #4:
power management, suspend ...

I am on dial-up, so I don't think it will be possible to use git :/
The last kernel I tried was 2.6.21.5, and it worked fine.
Comment 22 Len Brown 2007-07-25 18:54:59 UTC
Salatiel --
are you having this problem again also?
Comment 23 salatiel.filho 2007-07-25 19:01:02 UTC
Sorry for the delay, I was downloading 2.6.22 ...
Yeah, same problem here. My machine also locks up.
I opened a bug today about my machine needing acpi_serialize, as soon as I saw this post :).
Everything was working so well with 2.6.21 that I had not even tried to update to 2.6.22.
Comment 24 salatiel.filho 2007-08-19 11:55:51 UTC
Any news on this?
Comment 25 Robert Moore 2007-09-10 14:36:45 UTC
Was _BST the only method failing?

If so, you might want to disassemble the DSDT, patch the _BST declaration to make it serialized, then override the DSDT. Might provide another data point.

-Method (_BST, 0, NotSerialized)
+Method (_BST, 0, Serialized)
Comment 26 salatiel.filho 2007-09-11 06:29:52 UTC
? :)
Comment 27 Robert Moore 2007-09-11 15:47:06 UTC
Please try compiling the ACPICA code with the ACPI_MUTEX_DEBUG flag defined. Actually, just the utilities/utmutex.c file will suffice.

Then, see if any error messages are produced before the system hangs.
Thanks.
Comment 28 salatiel.filho 2007-09-13 10:00:14 UTC
I do not know if I did it right; I just added the line
#define ACPI_MUTEX_DEBUG TRUE

Well, the strange part is that after that the system does not lock up anymore. But I did get a lot of errors at the point where it used to lock up.


Executed 1 _INI methods requiring 0 _STA executions (examined 63 objects)
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
* IT USED TO LOCK HERE*
ACPI Error (utmutex-0231): Mutex [ACPI_MTX_Interpreter] already acquired by this thread [DFD27A70] [20070126]
ACPI Error (exutils-0096): Could not acquire AML Interpreter mutex [20070126]
ACPI Error (utmutex-0306): Mutex [0] is not acquired, cannot release [20070126]
ACPI Error (exutils-0156): Could not release AML Interpreter mutex [20070126]
ACPI Error (utmutex-0231): Mutex [ACPI_MTX_Interpreter] already acquired by this thread [DFD27A70] [20070126]
ACPI Error (exutils-0096): Could not acquire AML Interpreter mutex [20070126]
ACPI Error (utmutex-0306): Mutex [0] is not acquired, cannot release [20070126]
ACPI Error (exutils-0156): Could not release AML Interpreter mutex [20070126]
ACPI Error (utmutex-0231): Mutex [ACPI_MTX_Interpreter] already acquired by this thread [DFD27A70] [20070126]
ACPI Error (exutils-0096): Could not acquire AML Interpreter mutex [20070126]
ACPI Error (utmutex-0306): Mutex [0] is not acquired, cannot release [20070126]
ACPI Error (exutils-0156): Could not release AML Interpreter mutex [20070126]
ACPI Error (utmutex-0231): Mutex [ACPI_MTX_Interpreter] already acquired by this thread [DFD27A70] [20070126]
ACPI Error (exutils-0096): Could not acquire AML Interpreter mutex [20070126]
ACPI Error (utmutex-0306): Mutex [0] is not acquired, cannot release [20070126]
ACPI Error (exutils-0156): Could not release AML Interpreter mutex [20070126]
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
...
Comment 29 Robert Moore 2007-09-13 14:27:52 UTC
Thanks. I have solved the problem.
Patch forthcoming.
Bob
Comment 30 Robert Moore 2007-09-20 17:21:01 UTC
Created attachment 12890 [details]
serialized bug fix

Please try this patch
Comment 31 salatiel.filho 2007-09-20 18:28:44 UTC
It worked :)
thanks.
Comment 32 salatiel.filho 2007-09-28 14:38:57 UTC
This patch has not been committed yet, has it? I tried kernel 2.6.22.9 and it still locks up.
Comment 33 Robert Moore 2007-09-28 14:41:07 UTC
Len Brown handles the actual Linux patches. I don't know if it has made it into the kernel yet.
Comment 34 Fu Michael 2007-10-20 18:05:20 UTC
update bug status to resolved.
Comment 35 Len Brown 2008-01-10 20:17:11 UTC
hrm, so it seems that...

2.6.20: okay -- baseline for this report
2.6.21: okay -- .21-rc regression fixed by patch in comment #14
2.6.22: broken -- needs patch in comment #30
2.6.23: broken -- needs patch in comment #30
2.6.24: broken -- needs patch in comment #30

added patch in comment #30 to acpi test tree

Bob, how did we break this in 2.6.22? -- we didn't update
ACPICA between 2.6.21 and 2.6.22, they're both 20070126.
Comment 36 Robert Moore 2008-01-11 07:40:13 UTC
Not sure, depends on when the changes below were integrated into Linux.

Problem was introduced here:

12 September 2006. Summary of changes for version 20060912:
Enhanced the implementation of the "serialized mode" of the interpreter 
(enabled via the AcpiGbl_AllMethodsSerialized flag.) When this mode is 
specified, instead of creating a serialization semaphore per control method, 
the interpreter lock is simply no longer released before a blocking 
operation during control method execution. This effectively makes the AML 
Interpreter single-threaded. The overhead of a semaphore per-method is 
eliminated.

Problem was solved here:

19 September 2007. Summary of changes for version 20070919:
Fixed a problem where the use of the AcpiGbl_AllMethodsSerialized flag 
(acpi_serialized option on Linux) could cause some systems to hang during 
initialization. (Bob Moore) BZ 8171
Comment 37 Len Brown 2008-01-13 23:14:56 UTC
patch in comment #30
shipped in linux-2.6.24-rc7-git5

closed.