Bug 4749 - Avoid loading of ACPI tables twice - fix attached
Summary: Avoid loading of ACPI tables twice - fix attached
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Tables (show other bugs)
Hardware: i386 Linux
: P2 normal
Assignee: Robert Moore
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-06-14 08:25 UTC by Thomas Renninger
Modified: 2007-03-07 22:48 UTC (History)
2 users (show)

See Also:
Kernel Version: 2.6.12-rc5
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
compare table's OEM ID and deny loading if it's already loaded into namespace (1.50 KB, patch)
2005-06-14 08:33 UTC, Thomas Renninger
Details | Diff
Make _PDC call value additive and add an option to disable second _PDC call (3.57 KB, patch)
2005-06-21 10:23 UTC, Venkatesh Pallipadi
Details | Diff
Avoid double loading of ACPI tables - also check for table revision (1.74 KB, patch)
2005-07-15 06:03 UTC, Thomas Renninger
Details | Diff
Updated patch: iterate over all tables, return AE_OK to avoid aftereffects (1.93 KB, patch)
2005-08-01 07:19 UTC, Thomas Renninger
Details | Diff
Same patch, but also check for the table's checksum (2.04 KB, patch)
2005-08-02 07:36 UTC, Thomas Renninger
Details | Diff
Move checks under MUTEXes... (3.19 KB, patch)
2005-12-08 10:53 UTC, Alexey Starikovskiy
Details | Diff

Description Thomas Renninger 2005-06-14 08:25:25 UTC
I recently posted this on the ACPI list, but no response... 
Could someone verify if this is the right way to solve this. 
Is the ACPI code (explicitly loading the SSDT in DSDT code) allowed? 
I searched the APCI spec a bit and I think it is. 
 
 
I have a machine with following DSDT code:  
  
Processor (CPU1, 0x01, 0x00000810, 0x06)  
        {  
            OperationRegion (SSDT, SystemMemory, 0x1DFD4050, 0x03CC)  
            ...  
            Method (_PDC, 1, NotSerialized)  
            {  
                ...  
                Load (SSDT, HNDL)  
                ...  
            }  
        }  
  
The machine works fine using acpi-cpufreq.  
  
If loading the speedstep-centrino module is tried, the _PDC function is invoked  
and the SSDT is loaded a second time which results in:  
  
linux kernel: ACPI: CPU0 (power states: C1[C1] C2[C2])  
linux kernel: ACPI: Processor [CPU1] (supports 8 throttling states)  
linux kernel:  dswload-0294: *** Error: Looking up [ACST] in namespace,  
AE_ALREADY_EXISTS  
linux kernel:  psparse-0601 [20] ps_parse_loop         : During name  
lookup/catalog, AE_ALREADY_EXISTS  
linux kernel:  psparse-1140: *** Error: Method execution failed  
[\_PR_.CPU1._PDC] (Node ddfcd3a8), AE_ALREADY_EXISTS  
  
and a kernel freeze with different backtraces later:  
linux kernel: Unable to handle kernel paging request at virtual address XYZ  
  
The attached patch worksarounds/fixes this problem on this machine.  
Could someone have a look at it, please.  
 
Thanks.
Comment 1 Thomas Renninger 2005-06-14 08:33:27 UTC
Created attachment 5165 [details]
compare table's OEM ID and deny loading if it's already loaded into namespace
Comment 2 Venkatesh Pallipadi 2005-06-21 10:21:26 UTC
This patch will work on today's platform. But, this will not work in future. 
We do want multiple _PDC calls to happen in future. As _PDC includes soem bits 
for P-states and some others for C-states. As P-state and C-state drivers are 
different, we wont know which driver will be loaded at any time, and each 
driver when it gets loaded has to write its own bits in _PDC.

I have another patch that solves the same issue. However, with this patch a 
speaicl boot parameter is required to limit the _PDC call to 1. I think that 
is a better approach, as we are actually trying to workaround a BIOS bug. 
Platforms with known failures, we can add some DMI checks and enable this 
single _PDC call.

Comment 3 Venkatesh Pallipadi 2005-06-21 10:23:20 UTC
Created attachment 5199 [details]
Make _PDC call value additive and add an option to disable second _PDC call
Comment 4 Thomas Renninger 2005-06-22 02:32:54 UTC
Ok, I just realized that the culprit always was the double loading of SSDTs when
invoking _PDC twice.
However this one also fails even if only trying to load the speedstep-centrino
module.
acpi_cpufreq works.

It seems as if the SSDT already has been loaded at boot time or in other code
parts in this case.

Maybe only SMP capabilities have been given to _PDC by acpi_cpufreq in the
kernel I tried and therefore it did not went through the path where the SSDT is
loaded again and acpi_cpufreq worked?


So that happens when _PDC is called twice?
   - loading cpufreq module -> _PDC function is called
   - the SSDT is loaded
   - loading another cpufreq module
   - The whole SSDT table is loaded again (Load() ACPI call in _PDC)
   - Memory is messed up somewhere due name clashes by double loading functions
     into namespace
   - Machine freezes shortly after that ...


Checking the unique OEM ID of tables whether they already were loaded into
namespace or not, sounds like a reasonable solution/check to me. 
I wonder why this should not work in future verions? 
It would make the allow multiple PDC call parameter dispensable.
And particular would work on machines where the SSDT has already been loaded at
other code parts than _PDC.
Comment 5 Thomas Renninger 2005-06-30 00:21:03 UTC
Venkatesh, could you please explain why checking whether an ACPI table has
already been loaded into namespace will break future versions?

This patch actually allows multiple _PDC calls without any restrictions.

Until now I only saw the Load(SSDT,...) call in the _PDC func.
However, in the future BIOS programmers might also use the Load() function in
other parts. Then you have to take care again that these parts are only called once.

Here is a part of the ACPI spec (3.0, p.489):
The OS can also check the OEM Table ID and Revision ID against a database for a
newer revision Definition Block of the same OEM Table ID and load it instead.

So what the patch is missing is to also compare the Revision ID, to be
absolutely sure that exactly the same table has already been loaded. Then deny
loading it.

Ok, just reread your comment again:
I think we are not workaround a BIOS bug.
Could you please explain why these BIOSes should be buggy?

I think this is definitely a kernel bug. The ACPI spec does not say anything in
the Load() function description that it is not allowed to load the same table twice.
The kernel has to assure that this will not happen.

Sorry, maybe I miss a point, but I am sure that this patch is the right solution
to get rid of the multiple _PDC call problem for ever (which is actually a
problem of loading SSDT's twice, right?) and avoid similar problems in other
BIOS parts for the future. 
Comment 6 Venkatesh Pallipadi 2005-07-13 11:42:38 UTC
Sorry. I didn't read your earlier comment and patch fully. 

I agree that checking whether the table is already loaded is the right way 
here.

Comment 7 Thomas Renninger 2005-07-15 06:03:15 UTC
Created attachment 5335 [details]
Avoid double loading of ACPI tables - also check for table revision

Same patch as before, but also checks for revision ID.
Should apply (maybe with small offset) in current (2.6.11 and above?) kernels.

I am not totally sure what is better: to return AE_OK or AE_ALREADY_EXISTS.
The latter could have some afteraffects: if return value is not AE_OK, invoking
methods might also fail, even they could be successful.
So in general I'd go for returning AE_OK. However I don't have a machine
anymore to test (was a private one) and generating the double loading SSDT case
could get nasty (adding some Load(SSDT,...) statements in the DSDT ... Could
try that if you think the patch is OK).

Patch is compile tested and tested on a "multiple _PDC call" bug in it's
original shape (see my previous attached patch). Worked fine, no need to
disable the second call to _PDC anymore (in fact, on this machine already the
first call to _PDC froze the machine as for some reason the SSDT was already
loaded before invoking _PDC).
Comment 8 Len Brown 2005-07-20 15:10:41 UTC
move state to RESOLVED since a valid patch is attached
will apply when i get back, and if it flys will change
state to CLOSED when it hits Linus' tree.
Comment 9 Len Brown 2005-07-27 18:30:57 UTC
Bob to verify that this is spec compliant (and put workaround-disable hook
around it if it is not), and to consider for ACPICA.  Note that we probably
don't want the warning on the console if this is normal operation.
Comment 10 Robert Moore 2005-07-28 15:34:50 UTC
Looks like a bug in the patch, it seems to be only looking at the first table 
in the list by virtue of this line of code within the loop:

table_desc_loaded = acpi_gbl_table_lists[table_desc->type].next;

If we agree, I'll fix it (need to rewrite for the base ACPI CA source anyway.)


Comment 11 Thomas Renninger 2005-07-29 02:31:36 UTC
Thanks a lot.
I recognised yesterday that the tables checked are always the same.
I now got a SMP machine to test that has this problem (SMP, Xeon 3,4GHz).
It loads a SSDT per CPU when loading the cpufreq driver.
I'll give it a test and report back.

Comment 12 Thomas Renninger 2005-08-01 07:19:45 UTC
Created attachment 5450 [details]
Updated patch: iterate over all tables, return AE_OK to avoid aftereffects

I tested this on a SMP Xeon machine that loads a new SSDT for each CPU during
runtime (when loading cpufreq modules).
The func needs to return AE_OK, or the whole method execution fails and bad
things may happen (on this machine it freezes when returning AE_ALREADY_EXISTS,
because the cpufreq driver thinks the _PDC call failed).
With this patch everything worked fine.
P.S.: Not sure about the warning, I'd let it in, not much machines are affected
(yet).
P.S.S.: Diff might patch with a small offset (used current SUSE kernel), should
not matter.
Comment 13 Thomas Renninger 2005-08-01 10:45:50 UTC
Please wait. I got a bugreport where five SSDTs exist that all have the same
header, but only differ in the checksum.

I have to leave immediately. I will provide another patch also checking for the
checksum tomorrow.
Comment 14 Thomas Renninger 2005-08-02 07:36:50 UTC
Created attachment 5478 [details]
Same patch, but also check for the table's checksum

IIRC the same OEM and revision ID in different table headers is forbidden by
spec?
However, I already hit a machine where the complete header of the SSDT tables
are equal, only the checksum diffs.
Therefore: also check for the checksum.

Patch is currently in SUSE stable kernel and gets extensively tested (preview
phase) and should be considered to work stable, please apply.
Comment 15 Robert Moore 2005-08-04 13:19:06 UTC
I have implemented support for the duplicate table issue in ACPI CA 20050729. 
It does not check for duplicate tables in the RSDT/XSDT, only duplicates 
during dynamic load operations. A duplicate table is simply ignored with an 
AE_OK.

As far as checking the checksum, etc: I'm starting to think that we should 
simply 1) match the signature and table length, then 2) perform a full memcmp 
on the two tables, thus we will know for sure that we have an exact duplicate. 

Bob
Comment 16 Thomas Renninger 2005-08-05 00:19:51 UTC
Thanks for having a look into this.
I put my nose into this code deeper and deeper, but it's very easy to oversee
something or put something into the wrong place ...

Putting it into ACPICA code means it will go mainline with the next bigger
ACPICA commit, probably after 2.6.13?
If so, could you point me to this change, please? SL 10.0 will be 2.6.13  based
and I really like to have this in. A lot of cheap machines don't care much of
Linux BIOS support and I got a lot bug reports for them (even Venketash's
workaround "avoid multiple _PDC calls" works for most of them I'd like to have
it fixed properly) E.g. I know one machine having the automatic SSDT load bit
set and machine freezes on the first _PDC call.
Comment 17 Robert Moore 2005-08-17 08:50:34 UTC
ACPICA version 20050815 now checks the signatures and table lengths first. If 
these match, a full memcmp is performed on the two tables in order to detect a 
duplicate.
Comment 18 Thomas Renninger 2005-09-15 09:10:27 UTC
I got a report from a P4 Hyperthreaded machine where the check seems to work 
with HT disabled, but the table is still loaded twice if HT is enabled. 
 
I only have parts of dmesg, but could it be that there is a race condition on 
SMP machines? 
 
Could it be that the the tables on multiple CPUs are loaded in parallel and 
that something like that is missing?: 
 
semaphore up 
check for already loaded tables 
set already_loaded_into_namespace flag 
semaphore down 
 
To be honest, I didn't look at the code for that right now, the DSDT/SSDTs are 
rather complicated (second SSDT is loaded from first SSDT in _PDC call for each 
processor) but from the report it sounds sane, what do you think? 
 
Comment 19 Robert Moore 2005-09-29 12:37:53 UTC
I agree that the tables should be locked whilst fussing with them.
Comment 20 Thomas Renninger 2005-10-26 04:23:51 UTC
> ACPICA version 20050815 now checks the signatures and table lengths first.  
-> Could you please point me to a function where this is done.  
   As 20050815 contains all the Lindent lines and I cannot find it.  
 
All I find is this (2.6.14-rc5) - drivers/acpi/namespace/nsload.c: 
 
table_desc = acpi_gbl_table_lists[table_type].next; 
		for (i = 0; i < acpi_gbl_table_lists[table_type].count; i++) { 
			/* 
			 * Only attempt to load table into namespace if it is 
not 
			 * already loaded! 
			 */ 
			if (!table_desc->loaded_into_namespace) { 
				status = 
				    acpi_ns_load_table(table_desc, 
						       acpi_gbl_root_node); 
				if (ACPI_FAILURE(status)) { 
					break; 
				} 
 
				table_desc->loaded_into_namespace = TRUE; 
			} 
 
			table_desc = table_desc->next; 
		} 
		break; 
 
But this does not work and should be very old. 
Has the ACPI table double load check been forgotten or do I just not find it? 
 
Thanks. 
Comment 21 Robert Moore 2005-10-26 10:11:52 UTC
The code that checks for duplicate tables appears in tbutils.c, as below:

/*
 * If the table lengths match, perform a full bytewise compare. This
 * means that we will allow tables with duplicate OemTableId(s), as
 * long as the tables are different in some way.
 *
 * Checking if the table has been loaded into the namespace means that
 * we don't check for duplicate tables during the initial installation
 * of tables within the RSDT/XSDT.
 */
if ((TableDesc->LoadedIntoNamespace) &&
    (TableDesc->Pointer->Length == NewTableDesc->Pointer->Length) &&
    (!ACPI_MEMCMP (TableDesc->Pointer, NewTableDesc->Pointer,
        NewTableDesc->Pointer->Length)))
{
    /* Match: this table is already installed */

    ACPI_DEBUG_PRINT ((ACPI_DB_TABLES,
        "Table [%4.4s] already installed: Rev %X OemTableId [%8.8s]\n",
        NewTableDesc->Pointer->Signature,
        NewTableDesc->Pointer->Revision,
        NewTableDesc->Pointer->OemTableId));

    NewTableDesc->OwnerId       = TableDesc->OwnerId;
    NewTableDesc->InstalledDesc = TableDesc;

    return_ACPI_STATUS (AE_ALREADY_EXISTS);
}

Comment 22 Venkatesh Pallipadi 2005-10-28 17:05:31 UTC
As the code is in acpica now, closing the bug.
Comment 23 Len Brown 2005-11-10 14:34:34 UTC
to be specific...
The bytewise compare shipped in ACPICA 20050815,
which shipped in Linux-2.6.14
Comment 24 Thomas Renninger 2005-12-08 09:11:03 UTC
I still have one (with some duplicates) P4-HT reports about double loaded SSDTs.
http://bugzilla.novell.com/show_bug.cgi?id=121839

There clearly is still a bug concerning locking?
What needs to be locked is the loaded_into_namespace variable from the time it
is checked whether the table is already loaded to the time the
loaded_into_namespace flag is set after loading the table.

Currently it is like:
lock (ACPI_MTX_TABLE)
if (!table already loaded into namespace)
install table
unlock (ACPI_MTX_TABLE)
load table into namespace      (potentially takes a long time -> context switch 
                               -> same table is loaded twice as loaded into 
                               namesspace flag for the table is not set, yet)
set table is loaded into namespace

and it should be:
lock (ACPI_MTX_TABLE)
if (!table already loaded into namespace)
install table
load table into namespace
set table is loaded into namespace
unlock (ACPI_MTX_TABLE)

Is that right?
I tried to fix it, but I couldn't... (see bug pointed to above).
It would be easier if the load table functions would be cleaned up a bit, like
load_table invokes load_table_by_type (function names not exactly copied),
instead of each of them having separate code?

If you think this is the way to go, please tell me and I can give it a second try...

Any help/fix from the ACPI experts is appreciated and gets tested ASAP (I can't
access such a machine myself, but it will get tested for sure sooner or later).

Adding Alexey to CC, he already watched the Novell bug and AFAIK is familiar
with SSDT loading?
Comment 25 Alexey Starikovskiy 2005-12-08 10:53:00 UTC
Created attachment 6784 [details]
Move checks under MUTEXes...

Please try this patch, it should do the trick...
Comment 26 Robert Moore 2006-01-17 12:52:39 UTC
I will integrate this as soon as I hear that it works properly.
Comment 27 Robert Moore 2006-05-12 14:40:17 UTC
This is pending, waiting for the new table manager code to be completed and 
integrated.
Comment 28 Robert Moore 2006-09-25 13:34:32 UTC
This has been integrated with the new table manager (20060823) and can be 
closed

Note You need to log in before you can comment on or make changes to this bug.