Bug 52191 - Some ACPI errors on an Acer Aspire One AO725
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: ACPICA-Core
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Lv Zheng
Reported: 2013-01-02 21:35 UTC by Francesco Muzio
Modified: 2014-03-04 08:24 UTC
CC List: 7 users

Kernel Version: 3.7.1
Regression: No

Attachments
output of "dmesg | grep -i acpi" (7.96 KB, text/plain)
  2013-01-02 21:35 UTC, Francesco Muzio
acpidump output (229.04 KB, application/octet-stream)
  2013-01-09 19:24 UTC, Francesco Muzio
dmesg grep -i acpi + acpidump (316.43 KB, application/octet-stream)
  2013-10-20 11:14 UTC, sov.info@mail.ru
dmesg of vmlinuz-3.11-2-amd64 root=UUID=3d398f1d-2af1-4c45-81a1-537ec0cdcb6d ro acpi_serialize (54.38 KB, application/octet-stream)
  2013-12-23 10:01 UTC, Francesco Muzio
ACPICA: Interpreter: Extends AcpiGbl_AllMethodsSerialized to global lock and non initialization callbacks (4.01 KB, patch)
  2014-01-17 02:19 UTC, Lv Zheng
ACPICA: Executer: Add lockless interpreter locking primitive support. (7.95 KB, patch)
  2014-01-17 13:24 UTC, Lv Zheng
boot with kernel patched with attachment 122371 (56.67 KB, application/octet-stream)
  2014-01-19 14:18 UTC, Francesco Muzio
boot with kernel patched with attachment 122401 (494.54 KB, image/jpeg)
  2014-01-19 14:19 UTC, Francesco Muzio
ACPICA: Namespace: Fix issue of returning AE_ALREADY_EXIST for field objects creation. (1.61 KB, patch)
  2014-01-20 08:04 UTC, Lv Zheng
ACPICA: Namespace: Fix issue of returning AE_ALREADY_EXISTS for field objects creation. (4.43 KB, patch)
  2014-01-20 13:43 UTC, Lv Zheng
ACPICA: Namespace: Fix issue of returning AE_ALREADY_EXISTS for field objects creation. (4.84 KB, patch)
  2014-01-20 16:26 UTC, Lv Zheng
ACPICA: Dispatcher: Add support to automatically mark named object creation method as Serialized. (12.17 KB, patch)
  2014-01-22 04:30 UTC, Lv Zheng
ACPICA: Executer: Cleanup acpi_gbl_all_method_serialized mechanism. (8.71 KB, patch)
  2014-01-22 04:32 UTC, Lv Zheng
ACPICA: Dispatcher: Cleanup ACPI_METHOD_SERIALIZED_PENDING mechanism. (4.77 KB, patch)
  2014-01-22 04:33 UTC, Lv Zheng
dmesg for solution 1 (56.42 KB, application/octet-stream)
  2014-01-25 23:07 UTC, Francesco Muzio
dmesg for solution 2 (56.26 KB, application/octet-stream)
  2014-01-25 23:07 UTC, Francesco Muzio
The minimized DSDT that contains all SSZE creation that can be found in DSDT/SSDTs (1.55 KB, application/octet-stream)
  2014-01-26 03:29 UTC, Lv Zheng
dmesg output after booting AO725 with a clean 3.13 version of the kernel (55.75 KB, application/octet-stream)
  2014-01-26 16:05 UTC, Francesco Muzio
ACPICA: Add auto-serialization support for ill-behaved control methods. (15.73 KB, patch)
  2014-01-27 01:20 UTC, Lv Zheng
ACPICA: Dispatcher: Add auto-serialization for CreateXField and XField opcodes. (1.04 KB, patch)
  2014-01-27 01:22 UTC, Lv Zheng
dmesg output after patch submitted on comment #44 (61.47 KB, application/octet-stream)
  2014-02-01 22:20 UTC, Francesco Muzio
An example ASL that might be affected by these fixes. (826 bytes, application/octet-stream)
  2014-02-12 00:44 UTC, Lv Zheng

Description Francesco Muzio 2013-01-02 21:35:14 UTC
Created attachment 90191 [details]
output of "dmesg | grep -i acpi"

I have an Acer Aspire One AO725 netbook on which everything ACPI-related seems to work perfectly:

the battery/AC adapter management works
the power button works
the lid switch works
the Fn key to switch the wireless on/off works
the Fn keys to control the backlight brightness work

But I see some ACPI errors in dmesg that I do not understand, such as these:

ACPI Warning: 0x0000000000000b00-0x0000000000000b07 SystemIO conflicts with Region \_SB_.PCI0.SMBS.SMB0 1 (20120913/utaddress-251)

[    5.375825] ACPI Error: [SSZE] Namespace lookup failure, AE_ALREADY_EXISTS (20120913/dsfield-211)
[    5.376075] ACPI Error: Method parse/execution failed [\_SB_.ACAD._PSR] (Node ffff88010a287c18), AE_ALREADY_EXISTS (20120913/psparse-536)
[    5.376315] ACPI Exception: AE_ALREADY_EXISTS, Error reading AC Adapter state (20120913/ac-122)

I have attached the "dmesg | grep -i acpi" output of my netbook.

I hope you can help me understand what these error messages mean.

Thanks
Francesco Muzio
Comment 1 Zhang Rui 2013-01-09 06:18:00 UTC
Please attach the acpidump output of your netbook.
Comment 2 Francesco Muzio 2013-01-09 19:24:10 UTC
Created attachment 90921 [details]
acpidump output

acpidump output attached
Comment 3 Zhang Rui 2013-10-14 11:03:48 UTC
            Method (_PSR, 0, NotSerialized)
            {
                  ...
                  CreateWordField (XX00, Zero, SSZE)
                  ...
            }
This seems to be an ACPICA issue to me.
Bob, for the above ASL code, will SSZE be created as a global variable, or as a static one, i.e. one that is released soon after the _PSR evaluation finishes?
BTW, what will happen if _PSR is re-entered?
Comment 4 sov.info@mail.ru 2013-10-20 11:14:56 UTC
Created attachment 111701 [details]
dmesg grep -i acpi + acpidump

My output is a little different, but it is the same problem.
Comment 5 sov.info@mail.ru 2013-10-20 11:15:41 UTC
(In reply to sov.info@mail.ru from comment #4)
> Created attachment 111701 [details]
> dmesg grep -i acpi + acpidump
> 
> my output is little different, but the same problem

if this is a problem )
Comment 6 Robert Moore 2013-12-10 16:05:29 UTC
If _PSR is reentered, it will fail on the CreateWordField with an AE_ALREADY_EXISTS.

However, in this type of case, ACPICA will dynamically mark the method as "serialized" to prevent further similar issues.
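
To make the failure mode concrete, here is a minimal sketch in C. It is a toy model, not ACPICA code: it assumes that objects created by a method live in that method's scope until the invocation finishes. The names `method_scope`, `scope_create` and `scope_cleanup` are hypothetical, and the status values are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Status codes named after the ACPICA ones; the values are illustrative. */
typedef enum {
    AE_OK = 0,
    AE_ALREADY_EXISTS,
    AE_LIMIT,
    AE_BAD_PARAMETER
} status_t;

#define MAX_NODES 8
#define NAME_LEN  4   /* ACPI name segments are four characters long */

/* Hypothetical stand-in for the namespace scope of one control method. */
typedef struct {
    char names[MAX_NODES][NAME_LEN + 1];
    int  count;
} method_scope;

/* Create a named object in the scope; fail if the name is already there. */
status_t scope_create(method_scope *s, const char *name)
{
    assert(s != NULL && name != NULL);
    if (strlen(name) != NAME_LEN)
        return AE_BAD_PARAMETER;
    for (int i = 0; i < s->count; i++) {
        if (strcmp(s->names[i], name) == 0)
            return AE_ALREADY_EXISTS;
    }
    if (s->count == MAX_NODES)
        return AE_LIMIT;
    strcpy(s->names[s->count++], name);
    return AE_OK;
}

/* Drop the temporary objects once an invocation of the method finishes. */
void scope_cleanup(method_scope *s)
{
    assert(s != NULL);
    s->count = 0;
}
```

Calling `scope_create` twice for "SSZE" without an intervening `scope_cleanup` models a re-entered _PSR: the second call reports AE_ALREADY_EXISTS, while a call made after the first invocation has finished succeeds again.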
Comment 7 Robert Moore 2013-12-10 17:18:12 UTC
This global can be set to mark all methods serialized

/*
 * Automatically serialize ALL control methods? Default is FALSE, meaning
 * to use the Serialized/NotSerialized method flags on a per method basis.
 * Only change this if the ASL code is poorly written and cannot handle
 * reentrancy even though methods are marked "NotSerialized".
 */
UINT8       ACPI_INIT_GLOBAL (AcpiGbl_AllMethodsSerialized, FALSE);

(acpi_gbl_all_methods_serialized on Linux)
Comment 8 Lv Zheng 2013-12-23 00:40:24 UTC
Currently ACPICA treats such issues as BIOS bugs, because the asynchrony of control method executions is ensured by multi-threading mutexes rather than by a stallable asynchronous execution scheduler.

Please try this global variable through the following kernel parameter: acpi_serialize.
Comment 9 Francesco Muzio 2013-12-23 10:01:19 UTC
Created attachment 119341 [details]
dmesg of vmlinuz-3.11-2-amd64 root=UUID=3d398f1d-2af1-4c45-81a1-537ec0cdcb6d ro acpi_serialize

Thanks for your support.

I have booted the latest kernel available on Debian testing with the acpi_serialize option.

Unfortunately the ACPI messages persist.

I have attached the dmesg output.
Comment 10 Lv Zheng 2014-01-14 01:36:02 UTC
Hi,

Thanks for the report.  This is a good case for learning how the ACPICA interpreter behaves.

I think the implementation of acpi_gbl_all_methods_serialized is wrong.  It relies on the interpreter lock, but there are cases where that lock is released during control method execution.  So if a control method is executed and blocked, another execution instance of the same control method can re-enter and trigger the failure message you've seen.

Bob, is that right?

Rui, please re-assign this bug to me and let me try to fix it.
Comment 11 Robert Moore 2014-01-14 15:33:40 UTC
If the acpi_gbl_all_methods_serialized is set, the interpreter is not relinquished in the blocking cases.

void
AcpiExRelinquishInterpreter (
    void)
{
    ACPI_FUNCTION_TRACE (ExRelinquishInterpreter);


    /*
     * If the global serialized flag is set, do not release the interpreter.
     * This forces the interpreter to be single threaded.
     */
    if (!AcpiGbl_AllMethodsSerialized)
    {
        AcpiExExitInterpreter ();
    }

    return_VOID;
}
Comment 12 Lv Zheng 2014-01-15 00:51:04 UTC
There are still cases where AcpiExExitInterpreter is invoked directly, for example region callbacks (before invoking setup/handler), global lock acquisition, and module code execution after "Load/LoadTable" opcodes.  They can happen during the execution of a control method.
Specific to this case, there are region accesses happening in _PSR; thus when _PSR is blocked and the interpreter lock is released, the same _PSR method can be re-entered, which triggers this error message.
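
The difference between the two exit paths can be sketched as follows. This is a simplified, single-threaded model in C, not the ACPICA implementation: the `interpreter` struct and all function names below are hypothetical, and the lock is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the interpreter lock; not the ACPICA structure. */
typedef struct {
    bool locked;                 /* true while a method holds the interpreter */
    bool all_methods_serialized; /* models acpi_gbl_all_methods_serialized */
} interpreter;

void enter_interpreter(interpreter *in)
{
    assert(in != NULL && !in->locked); /* toy model: never enter while held */
    in->locked = true;
}

void exit_interpreter(interpreter *in)
{
    in->locked = false;
}

/* Blocking opcodes such as Sleep/Acquire: this path honours the flag. */
void relinquish_interpreter(interpreter *in)
{
    if (!in->all_methods_serialized)
        exit_interpreter(in);
}

/* Region access path: exits directly, so the flag is never consulted. */
void region_access_begin(interpreter *in)
{
    exit_interpreter(in);
}

/* A second invocation of the same method can start only if the lock is free. */
bool can_reenter(const interpreter *in)
{
    return !in->locked;
}
```

With the flag set, `relinquish_interpreter` keeps the lock, but `region_access_begin` still frees it, so in this model a second _PSR invocation can start while the first one is in the middle of its region access.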
Comment 13 Francesco Muzio 2014-01-15 20:01:27 UTC
Thank you for not abandoning this bug report. The machine still has the problem described above and is running a recent kernel, version 3.12.

I'm available to test workarounds and patches, and to provide logs/dumps if requested.
Comment 14 Lv Zheng 2014-01-17 02:19:02 UTC
Created attachment 122371 [details]
ACPICA: Interpreter: Extends AcpiGbl_AllMethodsSerialized to global lock and non initialization callbacks

This is a workaround.
I tested it on my platform, it doesn't hang.

Please give it a try.
Comment 15 Lv Zheng 2014-01-17 02:22:41 UTC
Let me also say something more about this issue.

I think AcpiGbl_AllMethodsSerialized might originally have been intended only to avoid explicit interpreter exits (Sleep/Acquire opcodes).

We now see a requirement that the same control method should not be re-entered in any case (Serialized or NotSerialized).  This does not relate to "serialization", so it should be handled in a different way.

Thus we currently don't need a workaround that extends AcpiGbl_AllMethodsSerialized to implicit interpreter exits (region accesses etc.); it might be dangerous, as it might introduce regressions such as deadlocks (but I have offered such a workaround patch in the previous post, and you can test it to see whether it works on your platform).

Later we may also need your help to test another patch that offers protection against re-entrance of the same control method.
Comment 16 Lv Zheng 2014-01-17 13:24:07 UTC
Created attachment 122401 [details]
ACPICA: Executer: Add lockless interpreter locking primitive support.

Well, I am starting to think that the issue has the following cause:
We need a lockless environment for callback invocations, but the implementation simply releases the interpreter lock to achieve this, which breaks control method serialization.
This patch offers a lockless environment for callbacks with the interpreter lock still held.  I tested it in the ACPICA simulation environment and on my Linux box.
Could you give it a try?
Comment 17 Francesco Muzio 2014-01-17 18:24:45 UTC
I need clarification.
Does the second patch replace the first?
Comment 18 Robert Moore 2014-01-17 18:26:42 UTC
Please try 122371 first and report the result.
Then, please try 122401 and report the result.

Thanks.
Comment 19 Francesco Muzio 2014-01-19 14:18:31 UTC
Created attachment 122601 [details]
boot with kernel patched with attachment 122371 [details]

I have bad news.

The kernel patched with attachment 122371 boots the machine very slowly:
it stops for about 30 seconds on "hda-codec: out of range cmd 0:20:400:ffffffff" messages,
and the ACPI errors are still present.
See the attached dmesg.

The kernel patched with attachment 122401 won't boot the machine; it hangs after some "ACPI : Added _OSI.." messages.
See the attached photo.
Comment 20 Francesco Muzio 2014-01-19 14:19:49 UTC
Created attachment 122611 [details]
boot with kernel patched with attachment 122401 [details]
Comment 21 Francesco Muzio 2014-01-19 14:23:23 UTC
Booting the machine with the kernel patched with attachment 122371 also broke the analog audio device.
Comment 22 Lv Zheng 2014-01-20 00:53:52 UTC
Thanks for the report.

Patch 1 is just a workaround that makes all control methods serialized; the result shows that we do need the interpreter lock to be released for region accesses.

Patch 2 is a fix that removes the interpreter lock exit/enter sequence around region accesses even when the workaround is not specified; this leads to deadlocks.

I considered this issue again; we can find 3 possible aspects from which to address the root cause:
1. control method serialization
2. lockless environment of region accesses
3. object creations

All the solutions we've been discussing revolve around possible issues caused by 1 and 2, and none of them work.  I think it is time to talk about solutions for 3.

What if we never return AE_ALREADY_EXISTS errors for object creations?  We could just return the existing object with its reference count increased.
Let us compose a patch to achieve this for you.
Comment 23 Lv Zheng 2014-01-20 08:04:10 UTC
Created attachment 122661 [details]
ACPICA: Namespace: Fix issue of returning AE_ALREADY_EXIST for field objects creation.

This patch tries not to return AE_ALREADY_EXISTS but to wait until creation is possible. Hence it addresses the 3rd possible cause listed in comment 22.
I've booted my kernel with this patch applied. Please also give it a try.
Comment 24 Lv Zheng 2014-01-20 08:08:18 UTC
(In reply to Francesco Muzio from comment #19)
> Created attachment 122601 [details]
> boot with kernel patched with attachment 122371 [details]

I checked the dmesg; it seems you didn't specify the acpi_serialize boot parameter.  This patch only takes effect when acpi_serialize is specified; otherwise it is a no-op.

Could you please boot again with this patch applied and acpi_serialize specified?

> I have bad news
> kernel patched with attachment 122371 [details] boot the machine very slowly
> it stops for about 30 sec on "hda-codec: out of range cmd 0:20:400:ffffffff"
> messages

Since this patch should be a no-op, this seems to be caused by other issues.

> and the ACPI errors are still present.

This is reasonable.
Comment 25 Lv Zheng 2014-01-20 08:27:53 UTC
(In reply to Francesco Muzio from comment #20)
> Created attachment 122611 [details]
> boot with kernel patched with attachment 122401 [details]

I tested this patch; there was no crash, but there was a hang.
A hang is expected, but a crash is not.

This patch changes the interpreter lock into a lock that does not hold any OSPM locks.  So if OSPM locks are taken before invoking a control method and the same locks are taken again in the region/exception callbacks, this patch can ensure control method serialization without introducing a deadlock.

But this patch cannot handle the following situation:
OSPM invokes a control method inside a callback.  I checked the code and found that there are _HID/_CID/_ADR/_SEG/_BBN control method invocations in acpi_ev_pci_config_region_setup.
A deadlock can then happen again, since such invocations will try to take the interpreter lock again.

I've tried a modified patch with acpi_ex_exit_interpreter()/acpi_ex_enter_interpreter() restored for region->setup callbacks, and Linux booted successfully.

In order to use this solution, we first need to modify acpi_ev_pci_config_region_setup to avoid invoking control methods in it, and then apply the patch in attachment 122401.

I tend not to do such things if other solutions can work.
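
The hang described above can be illustrated with a small model in C. It is only a sketch under assumptions: the interpreter lock is modelled as a non-recursive flag, `try_enter` returns false where a real blocking acquire would wait forever, and all names (`interp_lock`, `region_setup_callback`, `access_region`) are hypothetical rather than taken from ACPICA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical non-recursive lock standing in for the interpreter lock. */
typedef struct { bool held; } interp_lock;

/* Returns false where a real, blocking acquire would wait forever. */
bool try_enter(interp_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

void leave(interp_lock *l)
{
    assert(l != NULL && l->held);
    l->held = false;
}

/* A region setup callback that evaluates a control method (e.g. _BBN). */
bool region_setup_callback(interp_lock *l)
{
    if (!try_enter(l))
        return false;          /* would deadlock with a blocking lock */
    leave(l);
    return true;
}

/* Run the callback from inside a control method, either with the lock
 * kept held or with the exit/enter sequence restored around it. */
bool access_region(interp_lock *l, bool keep_lock_held)
{
    bool ok;

    if (!keep_lock_held)
        leave(l);
    ok = region_setup_callback(l);
    if (!keep_lock_held) {
        bool reacquired = try_enter(l);
        assert(reacquired);
        (void)reacquired;
    }
    return ok;
}
```

In this model `access_region(&l, true)` fails because the callback cannot take the lock that its caller still holds, whereas `access_region(&l, false)` succeeds and leaves the lock held again afterwards, which matches the modified patch that restores the exit/enter pair for region->setup callbacks.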
Comment 26 Lv Zheng 2014-01-20 08:52:24 UTC
Here are my requests.

1. Please apply attachment 122661 only (do not apply the others), then perform a build/boot test and post the dmesg here.  I expect this patch to solve the issue and to be the final solution we want to upstream.

2. Please apply attachment 122371 only (do not apply the others), then build the kernel, boot it with acpi_serialize specified, and post the dmesg here.  I expect the result to be the same as that of booting a kernel with attachment 122401 applied and acpi_serialize not specified.

Thanks in advance.
Comment 27 Lv Zheng 2014-01-20 13:43:30 UTC
Created attachment 122701 [details]
ACPICA: Namespace: Fix issue of returning AE_ALREADY_EXISTS for field objects creation.

The previous version has issues.
I even forgot to delete the error message.
So if you booted the test kernel with it applied, there will surely be AE_ALREADY_EXISTS errors in the new dmesg output.
This updated patch fixes the issues, and I have marked the old one as obsolete.
Please use this new revision instead.
Comment 28 Lv Zheng 2014-01-20 16:26:08 UTC
Created attachment 122761 [details]
ACPICA: Namespace: Fix issue of returning AE_ALREADY_EXISTS for field objects creation.

Sorry for the noise.  The buggy one is obsoleted.
Comment 29 Francesco Muzio 2014-01-20 22:16:04 UTC
I'm a bit confused.
Can you repeat how many tests with a patched kernel I need to run?

Please give me an ordered list of the tests and, for each, tell me which patch I should use.

Thanks in advance
Comment 30 Lv Zheng 2014-01-21 02:06:32 UTC
1. Please apply attachment 122371 only (do not apply the others), then build the kernel, boot it __with__ acpi_serialize specified, and post the dmesg here.

We need to confirm that this workaround does not work.

2. Please apply attachment 122761 only (do not apply the others), then build the kernel, boot it __without__ acpi_serialize specified, and post the dmesg here.

I need to confirm whether this workaround works or not.
Comment 31 Lv Zheng 2014-01-22 04:30:42 UTC
Created attachment 122981 [details]
ACPICA: Dispatcher: Add support to automatically mark named object creation method as Serialized.

This is the new solution, suggested by Rafael, authorized by Bob and tested by me.
I'll list the test requests later, as there are 2 more related patches.
Comment 32 Lv Zheng 2014-01-22 04:32:53 UTC
Created attachment 122991 [details]
ACPICA: Executer: Cleanup acpi_gbl_all_method_serialized mechanism.

We are going to deprecate the acpi_gbl_all_methods_serialized option, so the test result of attachment 122371 no longer needs to be confirmed.
Let me list the latest test requests after uploading all of the patches.
Comment 33 Lv Zheng 2014-01-22 04:33:54 UTC
Created attachment 123001 [details]
ACPICA: Dispatcher: Cleanup ACPI_METHOD_SERIALIZED_PENDING mechanism.

The old marking mechanism is no longer needed.
Comment 34 Lv Zheng 2014-01-22 04:44:58 UTC
Here are the test requests.  We now only need to confirm the following:

SOLUTION 1. Apply the following patches, build and boot the kernel, and post the dmesg here:
   attachment 122991
   attachment 123001
   attachment 122981
SOLUTION 2. Apply the following patches, build and boot the kernel, and post the dmesg here:
   attachment 122991
   attachment 123001
   attachment 122761

Sorry for confusing you so much.  Please give them a try.  Thanks.
Comment 35 Francesco Muzio 2014-01-25 23:07:20 UTC
Created attachment 123321 [details]
dmesg for solution 1

I have patched the source code of the latest kernel available (3.13).

I had to patch include/acpi/acpixf.h manually, because patch 122991 could not find "extern u8 acpi_gbl_all_methods_serialized" at line 74.
Which kernel version is this patch based on?


For both solutions I booted the machine without the acpi_serialize parameter; I hope I did nothing wrong.

Neither solution fixes the problem (the ACPI errors are still present), and an "hda-codec: out of range cmd 0:20:400:ffffffff" message is printed many times.

Also, the 2nd solution boots the machine completely only one time out of three; the other two times a black screen is shown after init enters runlevel 2.
Comment 36 Francesco Muzio 2014-01-25 23:07:42 UTC
Created attachment 123331 [details]
dmesg for solution 2
Comment 37 Lv Zheng 2014-01-26 03:29:58 UTC
Created attachment 123361 [details]
The minimized DSDT that contains all SSZE creation that can be found in DSDT/SSDTs

I checked the results.
1. In the solution 1 result, AE_ALREADY_EXISTS appears 2 times for the \_SB.ACAD._PSR evaluation.
2. In the solution 2 result, AE_ALREADY_EXISTS appears 1 time for the \_SB.ACAD._PSR evaluation.

I searched all of the DSDT/SSDTs and collected all SSZE creations in the attached minimized DSDT for us to discuss.
Comment 38 Lv Zheng 2014-01-26 07:12:05 UTC
Hi, let me post some investigation results here.

For solution 1, I think I know why it doesn't work.  This solution only changes control methods that contain "Name()" opcodes to "Serialized", but not control methods that contain "CreateXField()" opcodes (root cause 1), thus you can still see the AE_ALREADY_EXISTS error message twice.
My further investigation shows that if we also marked control methods that contain "CreateXField()" opcodes as "Serialized", the platforms would fail to boot!  Many drivers hung, just like what you've seen for hda-codec on your platform.  This means that many control methods really need to be executed in parallel and we cannot simply change them to "Serialized" (root cause 2).  This is also why solution 2 cannot work and we can still see the AE_ALREADY_EXISTS error message once (because of timeouts).

Let's figure out which control methods on your platform are affected by "root cause 2".  I'll collect a list of the marked control methods and ask an hda expert for help.

We also want to know: if you build and boot the same kernel without any of the patches posted above and without specifying "acpi_serialize", do you still suffer from the hda-codec issue?
Comment 39 Lv Zheng 2014-01-26 07:25:32 UTC
> I had to patch  manually include/acpi/acpixf.h because the patch 122991
> could not find in the line 74 "extern u8 acpi_gbl_all_methods_serialized"
> What version of the kernel have you based this patch?

Did you revert solution 1 patches before trying solution 2?
Comment 40 Francesco Muzio 2014-01-26 16:05:45 UTC
Created attachment 123391 [details]
dmesg output after booting AO725 with a clean 3.13 version of the kernel

> We also want to know if you built and booted the same kernel without all of
> the above posted patches applied and did not specify "acpi_serilized", would
> you still suffer from the hda-codec issue?

It's true.
With a non-patched kernel (v3.13) the hda-codec issue also happens; see the attached dmesg.

>Did you revert solution 1 patches before trying solution 2?

Yes, but the first two patches are the same, so these are the steps:
I downloaded and extracted the source of kernel v3.13
I put the .config file shipped with the standard Debian kernel into the resulting directory
I applied attachment 122991 as a patch
I applied attachment 123001 as a patch
I applied attachment 122981 as a patch
I built the kernel for the first solution
I reverted the patch from attachment 122981
I applied attachment 122761 as a patch
I built the kernel for the second solution
Comment 41 Lv Zheng 2014-01-27 01:19:12 UTC
> yes, but the fist two patches are the same, so these are the steps:
> I have downloaded and extracted the source of kernel v3.13

I was working on the following branches:
1. torvalds/linux/master
2. rafael/linux-pm/linux-next
3. internal branch containing ACPICA patches that haven't been upstreamed.
You are right, attachment 122991 needs to be rebased for v3.13.  Since you have corrected it, I won't upload a separate one for v3.13.

> it's true, 
> With an non-patched kernel (v3.13) the hda-codec issue also happens,
> see the dmesg attached 

OK, so we have to forget "root cause 2".
For the "root cause 1", I'll upload patches of solution 1 for you to try again.
Comment 42 Lv Zheng 2014-01-27 01:20:56 UTC
Created attachment 123441 [details]
ACPICA: Add auto-serialization support for ill-behaved control methods.

The patch extracted from ACPICA upstream.
This is the updated solution 1 patch.
Comment 43 Lv Zheng 2014-01-27 01:22:02 UTC
Created attachment 123451 [details]
ACPICA: Dispatcher: Add auto-serialization for CreateXField and XField opcodes.

This is a fix for root cause 1.
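
Root cause 1 from comment 38 and the idea behind this fix can be sketched in C as follows. This is an illustration only, not the patch itself: the opcode tags, the `method` struct, and `auto_serialize` are hypothetical, and the `check_create_field` switch merely models the difference between checking only Name() and also checking the CreateXField() family.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative opcode tags; these are not the real AML opcode values. */
typedef enum {
    OP_STORE,
    OP_NAME,               /* Name() */
    OP_CREATE_WORD_FIELD,  /* one member of the CreateXField() family */
    OP_RETURN
} opcode;

typedef struct {
    const opcode *body;
    size_t        length;
    bool          serialized;
} method;

/* Does this opcode create a named object in the method's scope? */
static bool creates_named_object(opcode op, bool check_create_field)
{
    if (op == OP_NAME)
        return true;
    return check_create_field && op == OP_CREATE_WORD_FIELD;
}

/* Mark the method Serialized if its body creates any named object.
 * Returns true when the method ends up serialized. */
bool auto_serialize(method *m, bool check_create_field)
{
    assert(m != NULL && (m->body != NULL || m->length == 0));
    for (size_t i = 0; i < m->length; i++) {
        if (creates_named_object(m->body[i], check_create_field)) {
            m->serialized = true;
            break;
        }
    }
    return m->serialized;
}
```

A method whose only object creation is a CreateWordField, like the _PSR quoted in comment 3, stays NotSerialized when only Name() is checked and becomes Serialized once the field-creation opcodes are checked as well.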
Comment 44 Lv Zheng 2014-01-27 01:28:31 UTC
Here is the test request:

1. Apply the following patches:
   attachment 122991 (you need to rebase it for v3.13)
   attachment 123001
   attachment 123441
   attachment 123451
2. Build the kernel with "CONFIG_ACPI_DEBUG" enabled.
   This enables the logging of ACPI_DEBUG_PRINT messages.
3. Boot the built kernel with "acpi.debug_layer=0x00000040".
   This enables the ACPI_DISPATCHER component; we want to see "Method serialized ..." messages in the kernel log.
4. Post the dmesg output here.

Thanks.
Comment 45 Francesco Muzio 2014-02-01 22:20:36 UTC
Created attachment 124141 [details]
dmesg output after patch submitted on comment #44

Very good: after the patches I see only two "ACPI Exception" messages, which are probably not related to this bug.

See the dmesg.
Comment 46 Lv Zheng 2014-02-08 01:54:38 UTC
Hi,

Sorry for the delayed response.  We had the New Year holidays here.
Thanks for the testing.

I can see one risk in your previous test result:
---
Also the 2nd solution boot completely the machine one time on three, the others two times a black screen is showed after init is entering on runlevel 2.
---

So I need your confirmation:
Do you face the same problem when booting the same kernel without applying any of the patches in this thread?
And if you don't, do you face the same problem when booting the same kernel after applying the 1st solution patches listed in comment 44?
Comment 47 Francesco Muzio 2014-02-08 13:01:40 UTC
The unpatched kernel and the kernel patched with the attachments listed in comment 44 both boot my machine normally, without the issue described in comment 35.
Comment 48 Lv Zheng 2014-02-10 02:54:40 UTC
OK.
The required patches of solution 1 are upstreamed in the acpica/master branch.
Let's close this bug.
Thanks for the reporting and testing.
Comment 49 Francesco Muzio 2014-02-10 19:52:39 UTC
Well, thank you.

Just one last question, out of curiosity:
does this solution only silence an error, or does it bring any improvement?
Should I now notice any enhancement related to the ACPI layer?
Comment 50 Lv Zheng 2014-02-11 00:14:04 UTC
ACPICA originally contained code that marks methods as "Serialized" when AE_ALREADY_EXISTS is encountered; thus for your platform, the answer is "yes, it only removes 1 error message".

But we also noticed that the problem not only leads to an error message, but also to a failure of the very first execution of such a control method.  So if that execution is among the driver initialization steps, it can actually lead to OSPM malfunctioning.

Note that ACPICA has made an assumption that "BIOSes will not write ASL to allow same control method to be re-entered from multiple threads".  If this assumption is correct, this solution improves the ACPICA interpreter; if it is not correct, we may see deadlocks triggered by this solution on NotSerialized control methods that are meant to be re-entered from multiple threads and whose execution instances block waiting for each other.  Without knowing how the other de-facto standard interpreter behaves, this will have to be tested in the real world.
Comment 51 Robert Moore 2014-02-11 17:49:51 UTC
ACPICA has made an assumption that "BIOSes will not write ASL to allow same control method to be re-entered from multiple threads".

This is not true, ACPICA makes no such assumption. We only mark methods "serialized" if they create named objects.
Comment 52 Lv Zheng 2014-02-12 00:44:23 UTC
Created attachment 125671 [details]
An example ASL that might be affected by these fixes.

Hi, Bob

You are right.  Thanks for pointing out my mistake.
It should be reworded as:
The solutions in this thread assume there is no control method in the real world allowing multi-threading reentrance while still creating named objects.

The ASL that would be penalized may look like the attached example.
Comment 53 Robert Moore 2014-02-12 16:02:29 UTC
"The solutions in this thread assume there is no control method in the real world allowing multi-threading reentrance while still creating named objects."

This is becoming an off-topic discussion, however:

This is still not correct.

If anything, by marking methods that create named objects "serialized", ACPICA is making an assumption that there are no AML control methods that _require_ multiple threads to enter it in order to function properly.
Comment 54 Lv Zheng 2014-02-13 03:24:43 UTC
Hi, Bob

I just worry about the way the de-facto standard interpreter implements object creation; the worry is probably unnecessary:

For the following creation conflict cases:
1. A named object conflicts with an existing named object created by the same control method;
2. A field object conflicts with an existing field object created to reference the same global region/buffer using the same parameters (offset/length).
The object to be created and the existing object can be deemed the same object.
Such object creation opcodes can also be moved out of the control method.

If the interpreter implements an object creation facility that looks up the existing objects to find the _same_ object using the conditions mentioned above and increases the reference count of the existing one, the method does not need to be changed from "NotSerialized" to "Serialized".

Well, this concern is just for academic rigour in approaching this problem.
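
The lookup-and-reference idea can be sketched in C as below. It is a hypothetical illustration, not how ACPICA or any other interpreter is known to work: `field_object`, `field_table` and `create_or_reference` are invented names, and "same object" is taken to mean same name, same source region/buffer, same offset and same length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 8

/* Hypothetical field object: a named window into a region or buffer. */
typedef struct {
    char        name[5];     /* four-character ACPI name segment */
    const void *source;      /* region/buffer the field refers to */
    unsigned    offset;
    unsigned    length;
    unsigned    refcount;
} field_object;

typedef struct {
    field_object fields[MAX_FIELDS];
    size_t       count;
} field_table;

/* Return the existing object with one more reference when the request
 * describes the same field; return NULL on a genuine conflict, on a
 * malformed name, or when the table is full. */
field_object *create_or_reference(field_table *t, const char *name,
                                  const void *source,
                                  unsigned offset, unsigned length)
{
    assert(t != NULL && name != NULL);
    if (strlen(name) != 4)
        return NULL;
    for (size_t i = 0; i < t->count; i++) {
        field_object *f = &t->fields[i];
        if (strcmp(f->name, name) != 0)
            continue;
        if (f->source == source && f->offset == offset &&
            f->length == length) {
            f->refcount++;
            return f;
        }
        return NULL;         /* same name, different definition */
    }
    if (t->count == MAX_FIELDS)
        return NULL;
    field_object *f = &t->fields[t->count++];
    strcpy(f->name, name);
    f->source = source;
    f->offset = offset;
    f->length = length;
    f->refcount = 1;
    return f;
}
```

Under these assumptions, two overlapping invocations that both create SSZE over the same buffer with the same offset and length share one object, and only a request with different parameters is treated as a conflict.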
Comment 55 Lv Zheng 2014-03-04 08:24:45 UTC
Closing this bug because the code has been shipped in the linux-pm/linux-next branch.
