Bug 94661
Summary: | \_SB_.PCI0:_OSC invalid UUID | ||
---|---|---|---|
Product: | Drivers | Reporter: | Bjorn Helgaas (bjorn) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | NEW --- | ||
Severity: | normal | CC: | bugzilla, deprez.maarten, gabriele.mzt, luto |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | v3.19 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
_OSC debug info without CONFIG_HOTPLUG_PCI_ACPI
_OSC debug info with CONFIG_HOTPLUG_PCI_ACPI
_OSC debug patch
Dell XPS13 9333 - dmesg
Dell XPS13 9333 - dmesg with patch
debug patch for https://bugzilla.kernel.org/show_bug.cgi?id=94661
Winbook TW100 4.3.0-0.rc0.git11.2.fc22.i686.txt |
Description
Bjorn Helgaas
2015-03-10 15:33:57 UTC
Created attachment 170811 [details]
_OSC debug info without CONFIG_HOTPLUG_PCI_ACPI
As requested in bug #93561, boot log with _OSC debug patch applied without CONFIG_HOTPLUG_PCI_ACPI set...
Created attachment 170821 [details]
_OSC debug info with CONFIG_HOTPLUG_PCI_ACPI
... and with CONFIG_HOTPLUG_PCI_ACPI set.
Created attachment 182181 [details]
_OSC debug patch
If you see messages like:
\_SB_.PCI0:_OSC invalid UUID
_OSC request data:1 1f 0
in your dmesg log, please attach the dmesg log here, then apply this debug patch and also attach the new dmesg log.
This patch is based on v4.2-rc1, but should work on older kernels, too.
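For anyone reading the "_OSC request data" line by hand, here is a rough Python sketch of how the three hex DWORDs can be decoded. It assumes the usual PCI host bridge _OSC layout (query/status, support, control) and uses the support-bit names that Linux prints in its "OS supports [...]" message; the helper name decode_osc_request is made up for illustration and is not part of the debug patch.

```python
# Support-field bits for the PCI host bridge _OSC, named after the labels
# Linux prints in "OS supports [...]". Treat the mapping as an assumption.
SUPPORT_BITS = {
    0x01: "ExtendedConfig",
    0x02: "ASPM",
    0x04: "ClockPM",
    0x08: "Segments",
    0x10: "MSI",
}


def decode_osc_request(line):
    """Parse a '_OSC request data:1 1f 0' line into its three hex DWORDs."""
    _, _, payload = line.partition("data:")
    query, support, control = (int(tok, 16) for tok in payload.split())
    names = [name for bit, name in SUPPORT_BITS.items() if support & bit]
    return {
        "query": bool(query & 0x1),  # bit 0 of the first DWORD: query flag
        "support": names,            # features the OS says it supports
        "control": control,          # features the OS asks to control
    }


if __name__ == "__main__":
    print(decode_osc_request("_OSC request data:1 1f 0"))
```

Under those assumptions, "1 1f 0" reads as a query in which the OS advertises all five support bits and requests control of nothing.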
Created attachment 182201 [details]
Dell XPS13 9333 - dmesg
Dell XPS13 9333, dmesg without the patch applied.
Created attachment 182211 [details]
Dell XPS13 9333 - dmesg with patch
Dell XPS13 9333, dmesg with the patch applied.
Created attachment 187251 [details]
debug patch for https://bugzilla.kernel.org/show_bug.cgi?id=94661
Created attachment 187431 [details]
Winbook TW100 4.3.0-0.rc0.git11.2.fc22.i686.txt
I'm pretty sure this is a DSDT bug combined with poor handling on our part. With improved debugging (and I'll send the patch out to the list shortly), I see:

[ 0.236336] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-fe])
[ 0.236342] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[ 0.236390] \_SB_.PCI0 (33DB4D5B-1FF7-401C-9657-7441C03DD766): _OSC invalid UUID
[ 0.236391] _OSC request data: 1 1f 0
[ 0.236395] acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM
[ 0.237093] PCI host bridge to bus 0000:00

This appears to match both the spec and my DSDT's expectation. *However*, my DSDT (Dell XPS 13 9350) actually checks:

If (((Arg0 == GUID) && NEXP)) { success; } else { fail (and return "invalid UUID"); }

Can we *please* include the evaluate-any-ACPI-method patches upstream so I can just read NEXP directly from the command line? Pretty please? It's okay if it's behind a debug option.

Anyway, we should probably respond by just disabling ASPM and such for the part of the hierarchy behind the bridge that failed the request -- from my reading of the spec, this error should not be considered a global problem.

Meanwhile, if anyone has a Dell contact, we should consider asking them to change their error code. I'm going to try to figure out what NEXP is.

See also bug 36932. This is a longstanding issue on Dell laptops, apparently. We may want to add a quirk if we can figure out what's going on.

Uh, WTF?

OperationRegion (GNVS, SystemMemory, 0x37718000, 0x05F5)

vs.

BIOS-e820: [mem 0x0000000000100000-0x000000007829dfff] usable

Unless I'm missing something, this is dangerously wrong. /proc/iomem does *not* have a reservation for this opregion. This could easily cause data corruption if ACPI writes to GNVS, and it could cause screwups (maybe like the one here) when ACPI reads it. Why aren't we throwing a giant warning at boot here?

I can't tell yet whether this is a GRUB bug (why does anyone still use GRUB?), a Linux bug, or a firmware bug.
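To make the GNVS concern concrete, here is a small Python sketch of the arithmetic: it checks whether the opregion (base 0x37718000, length 0x05F5) lies inside the e820 range reported as usable. The numbers are copied from the comment above; the helper name overlaps is only illustrative, and this says nothing about what the kernel itself checks at boot.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if the inclusive ranges [start_a, end_a] and [start_b, end_b] intersect."""
    return start_a <= end_b and start_b <= end_a


# OperationRegion (GNVS, SystemMemory, 0x37718000, 0x05F5)
gnvs_base, gnvs_len = 0x37718000, 0x05F5
gnvs_end = gnvs_base + gnvs_len - 1  # last byte of the opregion

# BIOS-e820: [mem 0x0000000000100000-0x000000007829dfff] usable
usable_start, usable_end = 0x00100000, 0x7829DFFF

# The opregion is fully contained in the usable range if both ends fall inside it.
inside_usable = usable_start <= gnvs_base and gnvs_end <= usable_end

if __name__ == "__main__":
    print(hex(gnvs_base), hex(gnvs_end))
    print("overlaps usable RAM:", overlaps(gnvs_base, gnvs_end, usable_start, usable_end))
    print("entirely inside usable RAM:", inside_usable)
```

With these values the opregion sits entirely inside memory that e820 describes as usable, which is the situation described above.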
(In reply to Andy Lutomirski from comment #8)
> I'm pretty sure this is a DSDT bug combined with poor handling on our part.

> my DSDT (Dell XPS 13 9350) actually checks:
>
> If (((Arg0 == GUID) && NEXP)) { success; } else { fail (and return "invalid
> UUID"; }

I'm not a firmware guy, but I think NEXP is a "native PCIe support" flag in a global ACPI memory region ("GNVS"), called "NPCE" here:

http://review.coreboot.org/gitweb?p=coreboot.git;a=blob;f=src/southbridge/intel/bd82x6x/acpi/globalnvs.asl

Many BIOSes from Dell, Fujitsu, HP, Intel, Lenovo, Apple, Panasonic, etc., have a similar flag. It might be just a bug that got copied everywhere.

We claim we're "disabling ASPM", but I think that really means "Linux won't touch ASPM configuration". So if the BIOS enabled ASPM, we'll leave it enabled, and ASPM "works". But I don't think we'll enable ASPM on hot-added devices. Maybe nobody tested that part.
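As a rough illustration of why the message is misleading, here is a toy Python model of the quoted DSDT check. It is not the actual firmware logic: firmware_osc_status is a hypothetical name, and the only real-world facts assumed are the PCI host bridge _OSC UUID seen in the log earlier in this bug and that bit 2 of the returned status DWORD means "unrecognized UUID".

```python
import uuid

# UUID reported next to \_SB_.PCI0 in the dmesg output earlier in this bug.
PCI_HOST_BRIDGE_OSC_UUID = uuid.UUID("33DB4D5B-1FF7-401C-9657-7441C03DD766")

# Bit 2 of the _OSC status DWORD: "unrecognized UUID" (assumed from the ACPI spec).
OSC_INVALID_UUID_ERROR = 0x04


def firmware_osc_status(arg0_uuid, nexp):
    """Toy model of 'If ((Arg0 == GUID) && NEXP)': returns status error bits."""
    if arg0_uuid == PCI_HOST_BRIDGE_OSC_UUID and nexp:
        return 0  # success path
    # Every other case is reported as an invalid UUID, even when the
    # UUID matched and only the NEXP flag was clear.
    return OSC_INVALID_UUID_ERROR


if __name__ == "__main__":
    # Correct UUID but NEXP == 0 still yields the "invalid UUID" bit.
    print(hex(firmware_osc_status(PCI_HOST_BRIDGE_OSC_UUID, nexp=0)))
```

In this model the OS passes the right UUID and still gets the "invalid UUID" status whenever the flag is clear, which would match the symptom reported here.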