Bug 2339

Summary: Unable to start the ACPI Interpreter -- ACPICA 20040311 regression
Product: ACPI
Reporter: Len Brown (lenb)
Component: ACPICA-Core
Assignee: Robert Moore (Robert.Moore)
Status: REJECTED INVALID
Severity: high
CC: acpi-bugzilla, andrew
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.4.26-pre5
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
  2.4.26-pre5.dmesg
  2.4.26-pre4.dmesg
  lspci -vv
  config
  Output of acpidmp command under 2.4.26-pre5
  patch to enable ACPI_DEBUG_HIGH
  Snapshot of ACPI debug output

Description Len Brown 2004-03-20 18:48:36 UTC
Andrew Clayton noticed this regression in 2.4.26-pre5 (20040311) 
as compared to 2.4.26-pre4 (20040220) 
 
Fujitsu-Siemens C-1020 laptop running Fedora Core 1 
 
$ grep ACPI 2.4.26-pre5.dmesg 
ACPI: Subsystem revision 20040311 
    ACPI-0433: *** Warning: Existing references (4) on node being deleted (c129e660) 
    ACPI-0433: *** Warning: Existing references (9) on node being deleted (c12aeda0) 
ACPI: IRQ9 SCI: Edge set to Level Trigger. 
    ACPI-0097: *** Error: Unable to initialize general purpose events, AE_NOT_FOUND 
ACPI: Unable to start the ACPI Interpreter 
    ACPI-0433: *** Warning: Existing references (65419) on node being deleted (c129e4e0) 
PCI: ACPI tables contain no PCI IRQ routing entries
Comment 1 Len Brown 2004-03-20 18:49:23 UTC
Created attachment 2376 [details]
2.4.26-pre5.dmesg
Comment 2 Len Brown 2004-03-20 18:49:45 UTC
Created attachment 2377 [details]
2.4.26-pre4.dmesg
Comment 3 Len Brown 2004-03-20 18:50:25 UTC
Created attachment 2378 [details]
lspci -vv
Comment 4 Len Brown 2004-03-20 18:50:53 UTC
Created attachment 2379 [details]
config
Comment 5 Len Brown 2004-03-20 18:56:07 UTC
This appears to be a regression in ACPICA 20040311. 
Please attach the output from acpidmp, 
available in /usr/sbin/, or in pmtools: 
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/ 
 
Comment 6 Andrew Clayton 2004-03-20 19:35:35 UTC
Created attachment 2381 [details]
Output of acpidmp command under 2.4.26-pre5
Comment 7 Len Brown 2004-03-20 20:15:43 UTC
Thanks for the acpidmp output.  I verified that 20040311 tools 
cleanly disassemble/re-assemble the AML -- so no _static_ errors. 
 
I expect that Bob may have to simulate this one, and if that fails may need 
a debugging execution trace run on the hardware.  I don't suppose you've 
got a serial port on your laptop? 
 
Comment 8 Andrew Clayton 2004-03-21 04:48:48 UTC
Yes, I do have a serial port on it. :) I can easily hook it 
up to my workstation to obtain any needed output.

Comment 9 Robert Moore 2004-03-25 09:54:06 UTC
This one looks kind of nasty.  You might try using the new "serialized" 
feature of the latest ACPI CA integration.

Bob
Comment 10 Andrew Clayton 2004-03-25 10:14:45 UTC
OK, how might I do this?
Comment 11 Robert Moore 2004-03-25 10:40:41 UTC
Looks like a PCI traversal issue, so I won't be able to reproduce it.  Need 
the trace.

ACPI: Subsystem revision 20040311
PCI: PCI BIOS revision 2.10 entry at 0xe8964, last bus=1
PCI: Using configuration type 1
    ACPI-0433: *** Warning: Existing references (4) on node being deleted (c129e660)
    ACPI-0433: *** Warning: Existing references (9) on node being deleted (c12aeda0)
ACPI: IRQ9 SCI: Edge set to Level Trigger.
    ACPI-0097: *** Error: Unable to initialize general purpose events, AE_NOT_FOUND
ACPI: Unable to start the ACPI Interpreter
    ACPI-0433: *** Warning: Existing references (65419) on node being deleted (c129e4e0)
PCI: Probing PCI hardware
PCI: ACPI tables contain no PCI IRQ routing entries
Comment 12 Len Brown 2004-03-25 13:09:16 UTC
The serialization feature is enabled with the boot flag "acpi_serialize". 
It couldn't hurt to also try "acpi_osi=" to return the AML to the old behaviour, 
though I don't recall seeing anything in your AML that would depend on it. 
 
thanks, 
-Len  
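
For reference, a minimal sketch of how one might confirm after a reboot that the two flags above actually reached the kernel. The flags themselves come from this comment; the helper name has_boot_flag and the root=/dev/hda2 placeholder are assumptions made only for illustration.

```shell
#!/bin/sh
# Example kernel command line as it might appear in the boot loader config
# (root=/dev/hda2 is a placeholder, not taken from this report):
#   ro root=/dev/hda2 acpi_serialize acpi_osi=

# has_boot_flag FLAG [CMDLINE]
# Succeeds if FLAG appears as a whole word in CMDLINE.
# CMDLINE defaults to the running kernel's /proc/cmdline.
has_boot_flag() {
    flag=$1
    cmdline=${2-$(cat /proc/cmdline)}
    # Pad with spaces so only whole, space-separated words match.
    case " $cmdline " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Typical use would be something like `has_boot_flag acpi_serialize && echo "serialized mode requested"`, run on the laptop after booting with the new flags.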
Comment 13 Len Brown 2004-03-25 14:08:00 UTC
Created attachment 2400 [details]
patch to enable ACPI_DEBUG_HIGH

Please patch your kernel with this 1-liner and build with CONFIG_ACPI_DEBUG=y
then capture the serial console of the failure.  The messages will be very
verbose, so be sure that the serial console is logged to a file.

thanks,
-Len
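
A rough sketch of the capture side, in case it helps. It assumes the laptop is booted with a serial console (for example console=ttyS0,115200n8 console=tty0) and that the workstation's end of the cable is /dev/ttyS0. The device name, the baud rate, and the helper name capture_serial are all assumptions, not something specified in this report.

```shell
#!/bin/sh
# Laptop side (assumed boot parameters, in addition to CONFIG_ACPI_DEBUG=y):
#   console=ttyS0,115200n8 console=tty0

# capture_serial DEVICE LOGFILE [BAUD]
# Workstation side: copy everything arriving on DEVICE into LOGFILE.
capture_serial() {
    dev=$1
    log=$2
    baud=${3:-115200}
    # Only a real character device has line settings to configure.
    if [ -c "$dev" ]; then
        stty -F "$dev" "$baud" raw -echo || return 1
    fi
    # Runs until interrupted with Ctrl-C (or until EOF on a regular file).
    cat "$dev" > "$log"
}
```

On the workstation this would be started before powering on the laptop, e.g. `capture_serial /dev/ttyS0 acpi-debug.log`, and stopped with Ctrl-C once the boot finishes or stalls.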
Comment 14 Len Brown 2004-03-25 15:14:39 UTC
Comment on attachment 2400 [details]
patch to enable ACPI_DEBUG_HIGH

Note: the verbose debug patch will take _hours_ to complete, so crank up that
baud rate and let it run all night! ;-)
Comment 15 Andrew Clayton 2004-03-25 16:15:38 UTC
lol, I think this file is going to be pretty big (even compressed). What should
I do with it when it's finished?


Cheers,

Andrew
Comment 16 Andrew Clayton 2004-03-26 07:02:22 UTC
heh, it's been going for nearly 15 hours and I have about 510MB 
of log file so far.

I don't know what it's doing, but it's doing a lot of it ;)

When it finishes, will the kernel just resume booting as normal?


Cheers,

Andrew
 
Comment 17 Robert Moore 2004-03-26 14:58:54 UTC
510 MB sounds like a bit much; it should be about 10 MB.  It may be in some 
awful kind of loop.
Bob
Comment 18 Andrew Clayton 2004-03-26 15:15:25 UTC
Yeah... it does seem to be just doing the same thing over and over...

23 hours and 805MB of log.


Maybe I should just stop it?


Cheers,

Andrew
Comment 19 Andrew Clayton 2004-03-26 16:05:10 UTC
OK,

After about 24 hours and 833MB of log, I stopped it.....
Comment 20 Robert Moore 2004-03-26 16:25:58 UTC
If you can open the file :-), I would examine it for some kind of looping 
behavior and maybe we can narrow it down from there.
Bob
Comment 21 Andrew Clayton 2004-03-26 17:57:06 UTC
Created attachment 2417 [details]
Snapshot of ACPI debug output

Hi,

Not sure what I'd be looking for in the file to indicate a loop,
but for instance in the full log I captured there are 1718 occurrences
of "Offset Value".

I've attached the first 15MB (bzip2'd) of the log output. Hopefully
you'll find it useful...


Cheers,

Andrew
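
One possible way to look for a loop in a log of this size is to count how often each distinct line occurs and list the most frequent ones. This is only a sketch: the helper name top_repeats and the file name in the usage note are hypothetical. Lines that embed changing addresses or values will not collapse into a single count, so the result is a starting point rather than proof of a loop.

```shell
#!/bin/sh
# top_repeats LOGFILE [N]
# Print the N most frequent lines of LOGFILE as "count<TAB>line",
# most frequent first. N defaults to 20.
top_repeats() {
    file=$1
    n=${2:-20}
    # Count identical lines in one pass, then order by count, descending.
    awk '{ count[$0]++ }
         END { for (line in count) printf "%d\t%s\n", count[line], line }' "$file" |
        sort -rn | head -n "$n"
}

# For a single known string, grep -c -F 'Offset Value' LOGFILE gives the
# number of matching lines (not the number of occurrences within a line).
```

For example, `top_repeats acpi-debug.log 10` would show the ten most repeated lines. A handful of lines with counts far above the rest would suggest the region of the trace that is being executed over and over.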
Comment 22 Andrew Clayton 2004-03-28 07:33:03 UTC
Hi,

I've just tried 2.4.26-rc1 and that seems to have fixed the problem.
(2.4.26-pre6 was the same as 2.4.26-pre5).


Cheers,

Andrew
Comment 23 Len Brown 2004-03-30 16:42:28 UTC
Never a good feeling when a failure goes away for no apparent reason...
I wonder if there was a build issue.

Any chance you can "make clean" the original failing release
and see if it is really still there?

thanks,
-Len
Comment 24 Andrew Clayton 2004-03-30 17:19:54 UTC
Yes, you are right! I built a kernel from a clean 2.4.26-pre5 tree
and it's fine.

I'll bear that in mind if something similar happens again.


Sorry for the noise :(

Cheers,

Andrew