Bug 4749
Summary: | Avoid loading of ACPI tables twice - fix attached | ||
---|---|---|---|
Product: | ACPI | Reporter: | Thomas Renninger (trenn) |
Component: | Config-Tables | Assignee: | Robert Moore (Robert.Moore) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | acpi-bugzilla, astarikovskiy |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.12-rc5 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
compare table's OEM ID and deny loading if it's already loaded into namespace
Make _PDC call value additive and add an option to disable second _PDC call Avoid double loading of ACPI tables - also check for table revision Updated patch: iterate over all tables, return AE_OK to avoid aftereffects Same patch, but also check for the table's checksum Move checks under MUTEXes... |
Description
Thomas Renninger
2005-06-14 08:25:25 UTC
Created attachment 5165 [details]
compare table's OEM ID and deny loading if it's already loaded into namespace
This patch will work on today's platform. But, this will not work in future. We do want multiple _PDC calls to happen in future. As _PDC includes soem bits for P-states and some others for C-states. As P-state and C-state drivers are different, we wont know which driver will be loaded at any time, and each driver when it gets loaded has to write its own bits in _PDC. I have another patch that solves the same issue. However, with this patch a speaicl boot parameter is required to limit the _PDC call to 1. I think that is a better approach, as we are actually trying to workaround a BIOS bug. Platforms with known failures, we can add some DMI checks and enable this single _PDC call. Created attachment 5199 [details]
Make _PDC call value additive and add an option to disable second _PDC call
Ok, I just realized that the culprit always was the double loading of SSDTs when invoking _PDC twice. However this one also fails even if only trying to load the speedstep-centrino module. acpi_cpufreq works. It seems as if the SSDT already has been loaded at boot time or in other code parts in this case. Maybe only SMP capabilities have been given to _PDC by acpi_cpufreq in the kernel I tried and therefore it did not went through the path where the SSDT is loaded again and acpi_cpufreq worked? So that happens when _PDC is called twice? - loading cpufreq module -> _PDC function is called - the SSDT is loaded - loading another cpufreq module - The whole SSDT table is loaded again (Load() ACPI call in _PDC) - Memory is messed up somewhere due name clashes by double loading functions into namespace - Machine freezes shortly after that ... Checking the unique OEM ID of tables whether they already were loaded into namespace or not, sounds like a reasonable solution/check to me. I wonder why this should not work in future verions? It would make the allow multiple PDC call parameter dispensable. And particular would work on machines where the SSDT has already been loaded at other code parts than _PDC. Venkatesh, could you please explain why checking whether an ACPI table has already been loaded into namespace will break future versions? This patch actually allows multiple _PDC calls without any restrictions. Until now I only saw the Load(SSDT,...) call in the _PDC func. However, in the future BIOS programmers might also use the Load() function in other parts. Then you have to take care again that these parts are only called once. Here is a part of the ACPI spec (3.0, p.489): The OS can also check the OEM Table ID and Revision ID against a database for a newer revision Definition Block of the same OEM Table ID and load it instead. So what the patch is missing is to also compare the Revision ID, to be absolutely sure that exactly the same table has already been loaded. Then deny loading it. Ok, just reread your comment again: I think we are not workaround a BIOS bug. Could you please explain why these BIOSes should be buggy? I think this is definitely a kernel bug. The ACPI spec does not say anything in the Load() function description that it is not allowed to load the same table twice. The kernel has to assure that this will not happen. Sorry, maybe I miss a point, but I am sure that this patch is the right solution to get rid of the multiple _PDC call problem for ever (which is actually a problem of loading SSDT's twice, right?) and avoid similar problems in other BIOS parts for the future. Sorry. I didn't read your earlier comment and patch fully. I agree that checking whether the table is already loaded is the right way here. Created attachment 5335 [details]
Avoid double loading of ACPI tables - also check for table revision
Same patch as before, but also checks for revision ID.
Should apply (maybe with small offset) in current (2.6.11 and above?) kernels.
I am not totally sure what is better: to return AE_OK or AE_ALREADY_EXISTS.
The latter could have some afteraffects: if return value is not AE_OK, invoking
methods might also fail, even they could be successful.
So in general I'd go for returning AE_OK. However I don't have a machine
anymore to test (was a private one) and generating the double loading SSDT case
could get nasty (adding some Load(SSDT,...) statements in the DSDT ... Could
try that if you think the patch is OK).
Patch is compile tested and tested on a "multiple _PDC call" bug in it's
original shape (see my previous attached patch). Worked fine, no need to
disable the second call to _PDC anymore (in fact, on this machine already the
first call to _PDC froze the machine as for some reason the SSDT was already
loaded before invoking _PDC).
move state to RESOLVED since a valid patch is attached will apply when i get back, and if it flys will change state to CLOSED when it hits Linus' tree. Bob to verify that this is spec compliant (and put workaround-disable hook around it if it is not), and to consider for ACPICA. Note that we probably don't want the warning on the console if this is normal operation. Looks like a bug in the patch, it seems to be only looking at the first table in the list by virtue of this line of code within the loop: table_desc_loaded = acpi_gbl_table_lists[table_desc->type].next; If we agree, I'll fix it (need to rewrite for the base ACPI CA source anyway.) Thanks a lot. I recognised yesterday that the tables checked are always the same. I now got a SMP machine to test that has this problem (SMP, Xeon 3,4GHz). It loads a SSDT per CPU when loading the cpufreq driver. I'll give it a test and report back. Created attachment 5450 [details]
Updated patch: iterate over all tables, return AE_OK to avoid aftereffects
I tested this on a SMP Xeon machine that loads a new SSDT for each CPU during
runtime (when loading cpufreq modules).
The func needs to return AE_OK, or the whole method execution fails and bad
things may happen (on this machine it freezes when returning AE_ALREADY_EXISTS,
because the cpufreq driver thinks the _PDC call failed).
With this patch everything worked fine.
P.S.: Not sure about the warning, I'd let it in, not much machines are affected
(yet).
P.S.S.: Diff might patch with a small offset (used current SUSE kernel), should
not matter.
Please wait. I got a bugreport where five SSDTs exist that all have the same header, but only differ in the checksum. I have to leave immediately. I will provide another patch also checking for the checksum tomorrow. Created attachment 5478 [details]
Same patch, but also check for the table's checksum
IIRC the same OEM and revision ID in different table headers is forbidden by
spec?
However, I already hit a machine where the complete header of the SSDT tables
are equal, only the checksum diffs.
Therefore: also check for the checksum.
Patch is currently in SUSE stable kernel and gets extensively tested (preview
phase) and should be considered to work stable, please apply.
I have implemented support for the duplicate table issue in ACPI CA 20050729. It does not check for duplicate tables in the RSDT/XSDT, only duplicates during dynamic load operations. A duplicate table is simply ignored with an AE_OK. As far as checking the checksum, etc: I'm starting to think that we should simply 1) match the signature and table length, then 2) perform a full memcmp on the two tables, thus we will know for sure that we have an exact duplicate. Bob Thanks for having a look into this. I put my nose into this code deeper and deeper, but it's very easy to oversee something or put something into the wrong place ... Putting it into ACPICA code means it will go mainline with the next bigger ACPICA commit, probably after 2.6.13? If so, could you point me to this change, please? SL 10.0 will be 2.6.13 based and I really like to have this in. A lot of cheap machines don't care much of Linux BIOS support and I got a lot bug reports for them (even Venketash's workaround "avoid multiple _PDC calls" works for most of them I'd like to have it fixed properly) E.g. I know one machine having the automatic SSDT load bit set and machine freezes on the first _PDC call. ACPICA version 20050815 now checks the signatures and table lengths first. If these match, a full memcmp is performed on the two tables in order to detect a duplicate. I got a report from a P4 Hyperthreaded machine where the check seems to work with HT disabled, but the table is still loaded twice if HT is enabled. I only have parts of dmesg, but could it be that there is a race condition on SMP machines? Could it be that the the tables on multiple CPUs are loaded in parallel and that something like that is missing?: semaphore up check for already loaded tables set already_loaded_into_namespace flag semaphore down To be honest, I didn't look at the code for that right now, the DSDT/SSDTs are rather complicated (second SSDT is loaded from first SSDT in _PDC call for each processor) but from the report it sounds sane, what do you think? I agree that the tables should be locked whilst fussing with them. > ACPICA version 20050815 now checks the signatures and table lengths first.
-> Could you please point me to a function where this is done.
As 20050815 contains all the Lindent lines and I cannot find it.
All I find is this (2.6.14-rc5) - drivers/acpi/namespace/nsload.c:
table_desc = acpi_gbl_table_lists[table_type].next;
for (i = 0; i < acpi_gbl_table_lists[table_type].count; i++) {
/*
* Only attempt to load table into namespace if it is
not
* already loaded!
*/
if (!table_desc->loaded_into_namespace) {
status =
acpi_ns_load_table(table_desc,
acpi_gbl_root_node);
if (ACPI_FAILURE(status)) {
break;
}
table_desc->loaded_into_namespace = TRUE;
}
table_desc = table_desc->next;
}
break;
But this does not work and should be very old.
Has the ACPI table double load check been forgotten or do I just not find it?
Thanks.
The code that checks for duplicate tables appears in tbutils.c, as below: /* * If the table lengths match, perform a full bytewise compare. This * means that we will allow tables with duplicate OemTableId(s), as * long as the tables are different in some way. * * Checking if the table has been loaded into the namespace means that * we don't check for duplicate tables during the initial installation * of tables within the RSDT/XSDT. */ if ((TableDesc->LoadedIntoNamespace) && (TableDesc->Pointer->Length == NewTableDesc->Pointer->Length) && (!ACPI_MEMCMP (TableDesc->Pointer, NewTableDesc->Pointer, NewTableDesc->Pointer->Length))) { /* Match: this table is already installed */ ACPI_DEBUG_PRINT ((ACPI_DB_TABLES, "Table [%4.4s] already installed: Rev %X OemTableId [%8.8s]\n", NewTableDesc->Pointer->Signature, NewTableDesc->Pointer->Revision, NewTableDesc->Pointer->OemTableId)); NewTableDesc->OwnerId = TableDesc->OwnerId; NewTableDesc->InstalledDesc = TableDesc; return_ACPI_STATUS (AE_ALREADY_EXISTS); } As the code is in acpica now, closing the bug. to be specific... The bytewise compare shipped in ACPICA 20050815, which shipped in Linux-2.6.14 I still have one (with some duplicates) P4-HT reports about double loaded SSDTs. http://bugzilla.novell.com/show_bug.cgi?id=121839 There clearly is still a bug concerning locking? What needs to be locked is the loaded_into_namespace variable from the time it is checked whether the table is already loaded to the time the loaded_into_namespace flag is set after loading the table. Currently it is like: lock (ACPI_MTX_TABLE) if (!table already loaded into namespace) install table unlock (ACPI_MTX_TABLE) load table into namespace (potentially takes a long time -> context switch -> same table is loaded twice as loaded into namesspace flag for the table is not set, yet) set table is loaded into namespace and it should be: lock (ACPI_MTX_TABLE) if (!table already loaded into namespace) install table load table into namespace set table is loaded into namespace unlock (ACPI_MTX_TABLE) Is that right? I tried to fix it, but I couldn't... (see bug pointed to above). It would be easier if the load table functions would be cleaned up a bit, like load_table invokes load_table_by_type (function names not exactly copied), instead of each of them having separate code? If you think this is the way to go, please tell me and I can give it a second try... Any help/fix from the ACPI experts is appreciated and gets tested ASAP (I can't access such a machine myself, but it will get tested for sure sooner or later). Adding Alexey to CC, he already watched the Novell bug and AFAIK is familiar with SSDT loading? Created attachment 6784 [details]
Move checks under MUTEXes...
Please try this patch, it should do the trick...
I will integrate this as soon as I hear that it works properly. This is pending, waiting for the new table manager code to be completed and integrated. This has been integrated with the new table manager (20060823) and can be closed |