Bug 90351 - \_SB_.PCI0:_OSC invalid UUID on Samsung NP530U3C _OSC request data:1 1f 0
Summary: \_SB_.PCI0:_OSC invalid UUID on Samsung NP530U3C _OSC request data:1 1f 0
Status: CLOSED DUPLICATE of bug 36932
Alias: None
Product: ACPI
Classification: Unclassified
Component: BIOS
Hardware: All Linux
Importance: P1 normal
Assignee: Lv Zheng
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-26 20:13 UTC by Reimundo Heluani
Modified: 2015-10-02 20:47 UTC
CC List: 5 users

See Also:
Kernel Version: 3.14
Subsystem:
Regression: No
Bisected commit-id:


Attachments
DSDT Table for Samsung NP530u3c (334.17 KB, application/octet-stream)
2014-12-26 20:14 UTC, Reimundo Heluani
dmesg of last boot (59.10 KB, text/plain)
2014-12-26 20:15 UTC, Reimundo Heluani
acpidump XPS13 9333 (452.31 KB, application/octet-stream)
2014-12-29 17:56 UTC, Gabriele Mazzotta
dmesg XPS13 9333 (53.30 KB, text/plain)
2014-12-29 18:23 UTC, Gabriele Mazzotta

Description Reimundo Heluani 2014-12-26 20:13:38 UTC
On dmesg I see 

[    0.152457] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3e])
[    0.152466] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[    0.152613] \_SB_.PCI0:_OSC invalid UUID
[    0.152614] _OSC request data:1 1f 0 
[    0.152619] acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM


This is very similar to #36932. In that case, however, the problem was explained as a BIOS bug (https://bugzilla.kernel.org/show_bug.cgi?id=36932#c26): a certain variable (NEXP) was not initialized in any of the DSDT/SSDT tables.

In this case the implementation of _OSC in the DSDT is similar to #36932 (with an extra check), but the variable NEXP is declared in the following operation region:


  OperationRegion (GNVS, SystemMemory, 0xDAF7CE18, 0x01C8)
    Field (GNVS, AnyAcc, Lock, Preserve)
    {
....
     Offset (0xE1), 
        OSCC,   8, 
        NEXP,   8, 
        SBV1,   8, 
....


That region of memory is reported as ACPI NVS:

[    0.000000] BIOS-e820: [mem 0x00000000bd4b5000-0x00000000ce3eefff] usable
[    0.000000] BIOS-e820: [mem 0x00000000ce3ef000-0x00000000daeeefff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000daeef000-0x00000000daf9efff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000daf9f000-0x00000000daffefff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000dafff000-0x00000000daffffff] usable
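
As a rough cross-check of the two listings above, the address arithmetic can be written down in a few lines of C. This is only a userspace sketch: the table is copied by hand from the BIOS-e820 lines, and the helper names (e820_type_of, nexp_address) are made up for illustration. It assumes that Offset (0xE1) puts OSCC at byte 0xE1 of GNVS, so NEXP sits at byte 0xE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One firmware memory-map entry, copied from the BIOS-e820 lines above. */
struct e820_range {
    uint64_t start;
    uint64_t end;           /* inclusive, as printed by the kernel */
    const char *type;
};

static const struct e820_range e820_map[] = {
    { 0xbd4b5000ULL, 0xce3eefffULL, "usable"    },
    { 0xce3ef000ULL, 0xdaeeefffULL, "reserved"  },
    { 0xdaeef000ULL, 0xdaf9efffULL, "ACPI NVS"  },
    { 0xdaf9f000ULL, 0xdaffefffULL, "ACPI data" },
    { 0xdafff000ULL, 0xdaffffffULL, "usable"    },
};
#define E820_COUNT (sizeof(e820_map) / sizeof(e820_map[0]))

/* GNVS base and length from the OperationRegion declaration. */
#define GNVS_BASE   0xDAF7CE18ULL
#define GNVS_LEN    0x01C8ULL
/* Assumption: Offset (0xE1) places OSCC at 0xE1, so NEXP follows at 0xE2. */
#define NEXP_OFFSET 0xE2ULL

/* Return the type string of the range containing addr, or NULL if none. */
static const char *e820_type_of(const struct e820_range *map, size_t n,
                                uint64_t addr)
{
    assert(map != NULL);
    for (size_t i = 0; i < n; i++) {
        if (addr >= map[i].start && addr <= map[i].end)
            return map[i].type;
    }
    return NULL;
}

/* Physical address of the NEXP byte inside GNVS. */
static uint64_t nexp_address(void)
{
    return GNVS_BASE + NEXP_OFFSET;
}
```

Calling e820_type_of(e820_map, E820_COUNT, nexp_address()) looks up address 0xDAF7CEFA, and the last byte of the region, GNVS_BASE + GNVS_LEN - 1, can be checked the same way.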

So it does not get overwritten by the OS on boot. I don't know anything about ACPI and only started reading about it today, but according to the ACPI specification the _OSC parameters are a UUID in Arg0, a Revision ID in Arg1 and a parameter count in Arg2, so the line in dmesg:

[    0.152614] _OSC request data:1 1f 0 

looks to me as if the kernel is passing 0 parameters in Arg3.

The first check in the definition of _OSC in DSDT that may be the cause of this is:

 Method (_OSC, 4, Serialized)  // _OSC: Operating System Capabilities
            {
                Store (Arg3, Local0)
                CreateDWordField (Local0, Zero, CDW1)
                CreateDWordField (Local0, 0x04, CDW2)
                CreateDWordField (Local0, 0x08, CDW3)
                If (^XHC.CUID (Arg0))
                {
                    Return (^XHC.POSC (Arg1, Arg2, Arg3))
                }
                Else
                {
                    If (_OSI ("Windows 2012"))
                    {
                        If (LEqual (XCNT, Zero))
                        {
                            ^XHC.XSEL ()
                            Increment (XCNT)
                        }
                    }
                }
Comment 1 Reimundo Heluani 2014-12-26 20:14:33 UTC
Created attachment 161911
DSDT Table for Samsung NP530u3c
Comment 2 Reimundo Heluani 2014-12-26 20:15:28 UTC
Created attachment 161921
dmesg of last boot
Comment 3 Gabriele Mazzotta 2014-12-29 17:56:43 UTC
Created attachment 162101
acpidump XPS13 9333

I have the exact same problem on my Dell XPS13 9333. As on the NP530u3c, NEXP is defined in ACPI NVS, and it is the cause of the invalid UUID error.

The acpidump is attached.
Comment 4 Gabriele Mazzotta 2014-12-29 18:23:34 UTC
Created attachment 162111
dmesg XPS13 9333
Comment 5 Lv Zheng 2014-12-30 07:55:54 UTC
Hi,

(In reply to rheluani from comment #0)
> On dmesg I see 
> 
> [    0.152457] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3e])
> [    0.152466] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
> [    0.152613] \_SB_.PCI0:_OSC invalid UUID
> [    0.152614] _OSC request data:1 1f 0 
> [    0.152619] acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM
> 
> 
> which is very similar to #36932 however the explanation in that case was as
> a BIOS bug in https://bugzilla.kernel.org/show_bug.cgi?id=36932#c26 given by
> the fact that certain variable (NEXP) was not initialized in any of the
> DSDT/SSDT.

That comment was written by me.
Let me explain what I did for 36932.
Please correct me if I am wrong.

I decompiled attachment 62002.
The _OSC method looks like what comment https://bugzilla.kernel.org/show_bug.cgi?id=36932#c26 describes.

And the _OSC request data is:
[    0.796762] \_SB_.PCI0:_OSC invalid UUID
[    0.796764] _OSC request data:1 1f 1f 
Note that acpi_osc_context.cap contains the input value and acpi_osc_context.ret contains the output value.
Yes, it might not make sense that ACPICA needs two buffers here, but that is a separate issue from the _OSC one.
This log entry dumps acpi_osc_context.cap, so it only shows the input values:
CDW1=1 (IN=query type)
CDW2=SUPP=1f (IN)
CDW3=CTRL=1f
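
To make the mapping from the log line to CDW1/CDW2/CDW3 concrete, here is a small userspace sketch that parses the hex words following "_OSC request data:". It assumes the kernel prints the input buffer as space-separated hex DWORDs in order; the function name parse_osc_request is hypothetical and not a kernel API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define OSC_MAX_DWORDS 3

/*
 * Parse the hex DWORDs that follow "_OSC request data:" in dmesg.
 * out[0] is CDW1 (query flag), out[1] is CDW2 (SUPP), out[2] is CDW3 (CTRL).
 * Returns the number of DWORDs stored, at most OSC_MAX_DWORDS.
 */
static int parse_osc_request(const char *text, uint32_t out[OSC_MAX_DWORDS])
{
    int n = 0;

    assert(text != NULL && out != NULL);
    while (n < OSC_MAX_DWORDS) {
        char *end;
        unsigned long v = strtoul(text, &end, 16);

        if (end == text)        /* no further hex digits on the line */
            break;
        out[n++] = (uint32_t)v;
        text = end;
    }
    return n;
}
```

Fed with "1 1f 1f" (bug 36932) it yields CDW1=0x1, CDW2=0x1f, CDW3=0x1f; fed with "1 1f 0" (this bug) the only difference is CDW3=0.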

When CDW1 is used as an input value, it is 1.
When CDW1 is used as an output value, it is 0x04 (the "_OSC invalid UUID" message in the log tells us so, since it is printed for exactly this status bit).

The 0x04 is only returned by this line:
                Else
                {
                    Or (CDW1, 0x04, CDW1)
I also executed the AML in acpiexec, the simulation environment built on the same AML interpreter that the kernel uses.
Step-by-step execution showed that the GUID matched.
So I concluded that the only possible cause of the 0x04 return value was NEXP.
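
For reference, the status bits that _OSC can set in the returned CDW1 can be decoded as in the sketch below. The bit values follow the _OSC definition in the ACPI specification and the constants are named after those in the kernel's include/linux/acpi.h; the helper osc_describe_errors is a hypothetical userspace function, not kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define OSC_QUERY_ENABLE            0x01u  /* input flag: query only */
#define OSC_REQUEST_ERROR           0x02u  /* _OSC failure */
#define OSC_INVALID_UUID_ERROR      0x04u  /* unrecognized UUID */
#define OSC_INVALID_REVISION_ERROR  0x08u  /* unrecognized revision */
#define OSC_CAPABILITIES_MASK_ERROR 0x10u  /* capabilities masked */

/*
 * Write a comma-separated list of the error bits set in the returned CDW1
 * into buf and return how many were set. Bit 0 is not an error and is skipped.
 */
static int osc_describe_errors(uint32_t cdw1_out, char *buf, size_t len)
{
    static const struct { uint32_t bit; const char *name; } errs[] = {
        { OSC_REQUEST_ERROR,           "request failed"      },
        { OSC_INVALID_UUID_ERROR,      "invalid UUID"        },
        { OSC_INVALID_REVISION_ERROR,  "invalid revision"    },
        { OSC_CAPABILITIES_MASK_ERROR, "capabilities masked" },
    };
    int count = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(errs) / sizeof(errs[0]); i++) {
        size_t used;

        if (!(cdw1_out & errs[i].bit))
            continue;
        used = strlen(buf);
        /* snprintf truncates safely if buf is too small */
        snprintf(buf + used, len - used, "%s%s",
                 count ? ", " : "", errs[i].name);
        count++;
    }
    return count;
}
```

With an output CDW1 of 0x04 this reports a single error, "invalid UUID", which is the case discussed here.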

> In this case the implementation of _OSC in DSDT is similar to #36932 (with
> an extra check) but the variable NEXP is declared in 
> 
> 
>   OperationRegion (GNVS, SystemMemory, 0xDAF7CE18, 0x01C8)
>     Field (GNVS, AnyAcc, Lock, Preserve)
>     {
> ....
>      Offset (0xE1), 
>         OSCC,   8, 
>         NEXP,   8, 
>         SBV1,   8, 
> ....

The acpidump seems to be different from attachment 62002.

> And that region of memory is reported as ACPI NVS
> 
> [    0.000000] BIOS-e820: [mem 0x00000000bd4b5000-0x00000000ce3eefff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000ce3ef000-0x00000000daeeefff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000daeef000-0x00000000daf9efff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x00000000daf9f000-0x00000000daffefff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x00000000dafff000-0x00000000daffffff] usable
> 
> So it does not get overwritten by the OS on boot.

It is not overwritten, but that doesn't mean its value is not 0 by default.
As far as I know, many BIOSes have shipped with NEXP hard-wired to 0.
We can confirm this with a DSDT override.
The problem is that I don't know whether Linux should rely on what _OSC has returned.
I just don't have time to discuss with the right people what should be done in Linux for _OSC here: a proper quirk for NEXP=0 BIOSes (pcie_aspm=force doesn't work here) or something else.

> I don't know anything
> about ACPI and just started reading today, but from the ACPI specification
> the _OSC parameters are a UUID in Arg0, a Revision ID in Arg1 and a
> parameter count in Arg2 so the line in dmesg:
> 
> [    0.152614] _OSC request data:1 1f 0 
> 
> I see as if the kernel is passing 0 parameters in Arg3.

Arg3 is actually the capability buffer.
It consists of CDW1/2/3 as stated above.
CTRL=0 is also an input value.
It seems the error code is still 0x04 on this platform:
[    0.152613] \_SB_.PCI0:_OSC invalid UUID
[    0.152614] _OSC request data:1 1f 0 
So this might still mean an NEXP=0 failure.

You might be able to validate this by backporting the Linux acpi_pci_query_osc() function to acpiexec and executing the AML extracted from the acpidump output.

<cut>

Thanks and best regards
-Lv
Comment 6 Lv Zheng 2015-02-11 02:18:15 UTC
Closing it due to no feedback.
You can reopen it, or if you find a different issue, you can open a new one.

Thanks
-Lv
Comment 7 Gabriele Mazzotta 2015-02-11 09:40:01 UTC
I'm sorry, I didn't realize you were waiting for feedback.
Before I reported anything here, I evaluated NEXP with acpi_evaluate_integer() through a simple kernel module and also verified that the UUID in my DSDT matches pci_osc_uuid_str.
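
One way such a UUID check could be done by hand is sketched below: convert the UUID string to the 16-byte layout that the ASL ToUUID macro produces and compare it with the GUID buffer in the decompiled DSDT. This is a userspace illustration only. The string is the value of pci_osc_uuid_str in the kernel source (drivers/acpi/pci_root.c), which is not quoted in this report, and uuid_str_to_acpi and hex_val are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* UUID the kernel passes to _OSC for PCI host bridges. */
static const char pci_osc_uuid_str[] = "33DB4D5B-1FF7-401C-9657-7441C03DD766";

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Convert a 36-character UUID string to the ACPI buffer layout: the first
 * three fields are stored byte-swapped, the last two in string order.
 * Returns 0 on success, -1 on a malformed string.
 */
static int uuid_str_to_acpi(const char *str, uint8_t out[16])
{
    /* String offset of the hex pair that supplies each output byte. */
    static const int pos[16] = {
        6, 4, 2, 0, 11, 9, 16, 14, 19, 21, 24, 26, 28, 30, 32, 34
    };

    assert(str != NULL && out != NULL);
    if (strlen(str) != 36 || str[8] != '-' || str[13] != '-' ||
        str[18] != '-' || str[23] != '-')
        return -1;

    for (int i = 0; i < 16; i++) {
        int hi = hex_val(str[pos[i]]);
        int lo = hex_val(str[pos[i] + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}
```

The resulting bytes start with 0x5B, 0x4D, 0xDB, 0x33 and can be compared one by one with the Buffer that the DSDT names GUID.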

To be more specific, this is how I evaluated NEXP:

  status = acpi_evaluate_integer(handle, "\\NEXP", NULL, &value);

The returned value was 0, so I assumed this was the NEXP=0 problem.

If this is not enough, I can try something else.

I can't reopen this report since I'm not the original reporter.
Comment 8 Reimundo Heluani 2015-02-11 10:13:42 UTC
Reopening due to feedback from Gabriele Mazzotta. I haven't given any feedback because I couldn't get any new information on the problem after comment #5 by Lv Zheng.
Comment 9 Lv Zheng 2015-02-16 07:39:45 UTC
Then this is a duplicate of the NEXP=0 bug.
If you want a quirk for this machine, we need to discuss with the BIOS guys what kind of quirk is appropriate here.

*** This bug has been marked as a duplicate of bug 36932 ***
Comment 10 Gabriele Mazzotta 2015-02-22 22:36:59 UTC
Isn't it better to keep one of the two bugs open, even if it's not really the kernel's fault that things do not work as expected?

Anyway, looking in the commit history I found that Apple's laptops suffer from the same problem and that a quirk is already in place.

commit 7bc5a2bad0b8d9d1ac9f7b8b33150e4ddf197334
    ACPI: Support _OSI("Darwin") correctly

Looking at the commit message the problem seems to be the same.
Comment 11 Gabriele Mazzotta 2015-02-22 23:03:52 UTC
I'm sorry, I read the commit more carefully and it says that _OSC does nothing on Apple's machines, so I looked for acpidumps to understand things better.
I found that NHPG and NPME are no-ops on MacBooks, while they do something on my laptop. On my laptop those two methods are also called from _WAK depending on the value of NEXP, so I guess things are a bit more complicated here.
Comment 12 Lv Zheng 2015-02-25 08:16:25 UTC
Hi,

(In reply to Gabriele Mazzotta from comment #10)
> Isn't it better to keep one of the two bug open, even if it's not really the
> kernel's fault that things do not work as expected?

I didn't say it worked as expected, I just said that we didn't know what the BIOS expects here. :-)

Thanks and best regards
-Lv
Comment 13 Lv Zheng 2015-02-25 08:17:19 UTC
(In reply to Gabriele Mazzotta from comment #11)
> I'm sorry, I read better the commit and it says that _OSC does nothing on
> Apple's machines, so I've looked for acpidumps to understand things better.
> I found that NHPG and NPME are nops on Macbooks, while they do something on
> my laptop. Those two methods are also called from _WAK depending on the
> value of NEXP on my laptop, so I guess that here things are a bit more
> complicated.

Yes, maybe.
If you know what the BIOS expects here, please let us know.

Thanks and best regards
-Lv
Comment 14 Gabriele Mazzotta 2015-02-25 11:47:09 UTC
(In reply to Lv Zheng from comment #13)
> Yes, maybe.
> If you know what's the expectation of BIOS here, please let us know.
> 
> Thanks and best regards
> -Lv

I'm afraid I don't know, but if you have some directions to find that out, I'll do my best.

I have a question. Looking at my DSDT I've noticed that _OSC will do something else if the UUID passed is "7c9512a9-1705-4cb4-af7d-506a2423ab71".

Passing this UUID will make _OSC call POSC which, if I'm not wrong, should call XSEL. _OSC should then return Arg2 instead of 0x04.
If the usual UUID is passed, XSEL will be called if _OSI is "at least" "Windows 2012", which is the case. Everything else is skipped because NEXP is zero.

I'm not sure of this, I simply looked at the decompiled DSDT, so I could be wrong.

Things are a bit different on Reimundo's system, judging by the DSDT portion reported here, because _OSI must be exactly equal to "Windows 2012" (since 3.15 _OSI should be "Windows 2013"), but I guess it's the same apart from this detail.

I looked for more info about this UUID, but I couldn't find much. Do you know anything about it?

Thanks
Comment 15 Gabriele Mazzotta 2015-03-26 19:27:44 UTC
(In reply to Lv Zheng from comment #13)
> Yes, maybe.
> If you know what's the expectation of BIOS here, please let us know.
> 
> Thanks and best regards
> -Lv

I took another look at my DSDT. I noticed that most of the code depends on CDW3, which is not supplied by the kernel (so I assume it's zero). If CDW3 is zero, it doesn't matter whether NEXP is zero or not: _OSC does nothing.

I therefore decided to ignore "invalid UUID" errors, since I know the UUID is correct, to see what happens, and got the following:

acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]


I also got the acpidump of another system and found a very similar implementation of _OSC, but it doesn't depend on NEXP.

   Method (_OSC, 4, Serialized)  // _OSC: Operating System Capabilities
    {
        Local0 = Arg3
        CreateDWordField (Local0, Zero, CDW1)
        CreateDWordField (Local0, 0x04, CDW2)
        CreateDWordField (Local0, 0x08, CDW3)
        If (^XHC.CUID (Arg0))
        {
            Return (^XHC.POSC (Arg1, Arg2, Arg3))
        }
        Else
        {
            If ((OSYS >= 0x07DC))
            {
                If ((XCNT == Zero))
                {
                    ^XHC.XSEL ()
                    XCNT++
                }
            }
        }

        If ((Arg0 == GUID))
        {
            SUPP = CDW2 /* \_SB_.PCI0._OSC.CDW2 */
            CTRL = CDW3 /* \_SB_.PCI0._OSC.CDW3 */
            If ((NEXP == Zero))
            {
                CTRL &= 0xFFFFFFF8
            }

            If (NEXP)
            {
                If (~(CDW1 & One))
                {
                    If ((CTRL & One))
                    {
                        NHPG ()
                    }

                    If ((CTRL & 0x04))
                    {
                        NPME ()
                    }
                }
            }

            If ((Arg1 != One))
            {
                CDW1 |= 0x08
            }

            If ((CDW3 != CTRL))
            {
                CDW1 |= 0x10
            }

            CDW3 = CTRL /* \_SB_.PCI0.CTRL */
            OSCC = CTRL /* \_SB_.PCI0.CTRL */
            Return (Local0)
        }
        Else
        {
            CDW1 |= 0x04
            Return (Local0)
        }
    }
Comment 16 Bjorn Helgaas 2015-07-08 16:11:48 UTC
Reimundo, Gabriele, can I trouble you to try the patch here: https://bugzilla.kernel.org/show_bug.cgi?id=94661#c3 and attach the results there?

I know this bug is closed as a duplicate of bug 36932, and I know 36932 is closed as "will not fix" because it's a BIOS issue, but if we still emit a cryptic message like "invalid UUID", I don't think that's satisfactory from a user's point of view.
