Bug 6550 - acpi_gbl_global_list randomly gets corrupted
Summary: acpi_gbl_global_list randomly gets corrupted
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: ACPICA-Core
Hardware: i386 Linux
Importance: P2 high
Assignee: Robert Moore
URL:
Keywords:
Duplicates: 6549
Depends on:
Blocks:
 
Reported: 2006-05-14 05:50 UTC by Luming Yu
Modified: 2006-09-28 13:10 UTC

See Also:
Kernel Version: 2.6.16 and later
Subsystem:
Regression: ---
Bisected commit-id:


Attachments

Description Luming Yu 2006-05-14 05:50:11 UTC
I am seeing a weird, random NULL pointer dereference in
acpi_ut_find_allocation. My investigation shows that acpi_gbl_global_list
somehow gets corrupted between:

exconfig.c: acpi_ex_load_op: table_ptr = ACPI_MEM_ALLOCATE(table_header.length);

and

tbinstal.c: acpi_tb_init_table_descriptor: table_desc = ACPI_MEM_CALLOCATE(sizeof(struct acpi_table_desc));

For example:
The corrupted list had an internal node (dfef1200) with
	prev pointer == 0x0 (Wrong)
	next pointer == dff09600
and a node (dff09600) with
	prev pointer == 0x0 (Wrong)
	next pointer == 182 (Wrong)  XXXX: This causes the kernel NULL pointer panic.

The interesting thing is that if the table_ptr allocated for the first SSDT
is 0x200 higher than the one allocated for its successor, the corresponding
node for the first SSDT on acpi_gbl_global_list gets corrupted. Otherwise,
everything is fine. For example, when dfe72c00 was allocated for the first
SSDT and dfe72a00 for the second SSDT, I observed the node dfe72c00
corrupted. So it looks like the code executed from

exconfig.c: acpi_ex_load_op: table_ptr = ACPI_MEM_ALLOCATE(table_header.length);

to

tbinstal.c: acpi_tb_init_table_descriptor: table_desc = ACPI_MEM_CALLOCATE(sizeof(struct acpi_table_desc));

for the second SSDT unexpectedly changes the node for the first SSDT on
acpi_gbl_global_list.

But how? Or is something else going on?
I will debug further, so stay tuned.

Thanks,
Luming

PS: The following are the logs for a failed and a successful boot.

Failed Log
...
acpi_ex_load_op: table_ptr=dff0a228
SSDT located at dff0a228
Parsing all Control Methods:
Table [SSDT](id 0086) - 5 Objects with 0 Devices 3 Methods 0 Regions
table_header.length = 470 
acpi_ex_load_op: table_ptr=dff0a028
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000186
 printing eip:
c0227068
*pde = 00000000
Oops: 0000 [#1]
SMP . 

Successful Log
...
table_header.length = 422 
acpi_ex_load_op: table_ptr=c1944028
SSDT located at c1944028
Parsing all Control Methods:
Table [SSDT](id 0086) - 5 Objects with 0 Devices 3 Methods 0 Regions
table_header.length = 470 
acpi_ex_load_op: table_ptr=c198be28
SSDT located at c198be28
Parsing all Control Methods:
Table [SSDT](id 0087) - 1 Objects with 0 Devices 1 Methods 0 Regions
ACPI: CPU0 (power states: C1[C1] C2[C2])
ACPI: Processor [CPU0] (supports 8 throttling states)
table_header.length = 135 
acpi_ex_load_op: table_ptr=dfee67a8
SSDT located at dfee67a8
Parsing all Control Methods:
Table [SSDT](id 008B) - 3 Objects with 0 Devices 3 Methods 0 Regions
table_header.length = 133 
acpi_ex_load_op: table_ptr=dfee66a8
SSDT located at dfee66a8
Parsing all Control Methods:
Table [SSDT](id 008C) - 1 Objects with 0 Devices 1 Meth
Comment 1 Luming Yu 2006-05-14 05:51:44 UTC
It is due to this:

In acpi_ex_system_memory_space_handler: 
...
	case ACPI_READ:
		*value = 0;

c0217846:       c7 00 00 00 00 00       movl   $0x0,(%eax)
c021784c:       c7 40 04 00 00 00 00    movl   $0x0,0x4(%eax)


After applying the patch below, it becomes:
In acpi_ex_system_memory_space_handler: 
...
	case ACPI_READ:
		*value = 0;

c0217843:       c6 00 00                movb   $0x0,(%eax)


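Before the patch, the assignment compiles to two 32-bit stores (8 bytes in
total) because acpi_integer is a 64-bit type (u64).

With the patch, the kernel boots just fine even when the first SSDT
table_ptr == dfe57c28 and the second SSDT table_ptr == dfe57a28, a layout
that was supposed to fail according to my previous analysis.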

Thanks,
Luming

diff --git a/drivers/acpi/executer/exregion.c b/drivers/acpi/executer/exregion.c
index 6a4cfdf..c0805ef 100644
--- a/drivers/acpi/executer/exregion.c
+++ b/drivers/acpi/executer/exregion.c
@@ -48,6 +48,7 @@
 #define _COMPONENT          ACPI_EXECUTER
 ACPI_MODULE_NAME("exregion")

+extern struct acpi_table_header *debug_table_ptr;
 /*******************************************************************************
  *
  * FUNCTION:    acpi_ex_system_memory_space_handler
@@ -69,7 +70,7 @@ acpi_status
 acpi_ex_system_memory_space_handler(u32 function,
                                    acpi_physical_address address,
                                    u32 bit_width,
-                                   acpi_integer * value,
+                                   u8 * value,
                                    void *handler_context, void *region_context)
 {
        acpi_status status = AE_OK;

diff --git a/include/acpi/acinterp.h b/include/acpi/acinterp.h
index 9f22cfc..b2808f2 100644
--- a/include/acpi/acinterp.h
+++ b/include/acpi/acinterp.h
@@ -465,7 +465,7 @@ acpi_status
 acpi_ex_system_memory_space_handler(u32 function,
                                    acpi_physical_address address,
                                    u32 bit_width,
-                                   acpi_integer * value,
+                                   u8 * value,
                                    void *handler_context,
                                    void *region_context);
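
The store-width difference seen in the disassembly above can be reproduced in user space. This is only a minimal sketch, not ACPICA code: demo_integer is a hypothetical stand-in for acpi_integer on a build where it is 64 bits wide, and clear_as_integer, clear_as_u8, bytes_cleared and run_demo are made-up helper names. The 16-byte scratch buffer is deliberately large enough that the wide store stays in bounds here; in the failing kernel case, the assumption is that the destination was smaller than 8 bytes, so the same store would spill into neighbouring memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for acpi_integer on a 64-bit-integer build. */
typedef uint64_t demo_integer;

/* Mirrors "*value = 0" with the original parameter type: an 8-byte store. */
static void clear_as_integer(demo_integer *value)
{
    *value = 0;
}

/* Mirrors "*value = 0" with the patched parameter type: a 1-byte store. */
static void clear_as_u8(uint8_t *value)
{
    *value = 0;
}

/* Count how many of the first len bytes of buf are zero. */
static size_t count_zero_bytes(const unsigned char *buf, size_t len)
{
    size_t zeros = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0)
            zeros++;
    }
    return zeros;
}

/*
 * Fill a 16-byte scratch area with 0xAA, clear it through the chosen
 * pointer type, and report how many bytes the store actually zeroed.
 */
size_t bytes_cleared(int use_wide_store)
{
    /* Declared as demo_integer so the wide store is aligned and in bounds. */
    demo_integer storage[2];

    memset(storage, 0xAA, sizeof(storage));
    if (use_wide_store)
        clear_as_integer(&storage[0]);
    else
        clear_as_u8((uint8_t *)storage);

    return count_zero_bytes((const unsigned char *)storage, sizeof(storage));
}

/* Usage: the wide store touches 8 bytes, the narrow store touches 1. */
void run_demo(void)
{
    assert(bytes_cleared(1) == 8);
    assert(bytes_cleared(0) == 1);
}
```

If the caller only owns a buffer narrower than 8 bytes, the wide variant overwrites whatever sits right after it, which is consistent with a neighbouring list node ending up with zeroed pointers.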

Comment 2 Luming Yu 2006-05-14 05:54:23 UTC
*** Bug 6549 has been marked as a duplicate of this bug. ***
Comment 3 Robert Moore 2006-05-15 07:53:36 UTC
Should be fixed in ACPICA version 20060512
Comment 4 Len Brown 2006-07-05 19:13:57 UTC
ACPICA 20060512 shipped before linux-2.6.18-rc1
closed.
