Bug 15480

Summary: [regression] Fails to boot properly unless given pci=nocrs
Product: ACPI
Reporter: Yanko Kaneti (yaneti)
Component: Config-Other
Assignee: Bjorn Helgaas (bjorn.helgaas)
Status: CLOSED CODE_FIX
Severity: normal
CC: acpi-bugzilla, bjorn.helgaas, lenb, maciej.rutecki, rjw, rui.zhang
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.34-rc1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 15310
Attachments: boot log without using pci=nocrs
lspci -vvnn
debug patch (trim window to remove conflicts)
/proc/iomem
boot log latest mainline git + att. 25460
boot log latest mainline git + att. 25460
2.6.34-0.11.rc1.git1.fc13.x86_64 + patches dmesg
GA-MA78GM-S2H rev 1.0 F11 - acpidump
GA-MA78GM-S2H rev 1.0 F11 - DSDT
2.6.34-0.17.rc2.git1.fc14.x86_64 + workaround + att. 25523 debug
Windows GA-MA78GM-S2H PCI bus resources
truncate _CRS windows with _LEN > _MAX - _MIN + 1
2.6.34-0.17.1.rc2.git1.fc14.x86_64 + attachment 25691 log
Windows _MIN/_MAX/_LEN parsing

Description Yanko Kaneti 2010-03-09 01:24:48 UTC
Created attachment 25416 [details]
boot log without using pci=nocrs

Since commit 7bc5e3f2be32ae6fb0c74cd0f707f986b3a01a26
x86/PCI: use host bridge _CRS info by default on 2008 and newer machines
my system fails to boot properly.

GigaByte GA-MA78GM-S2H rev.1.0 with BIOS - F11, 09/16/2009

The attached log ends at the point where I decided to reboot
Comment 1 Yanko Kaneti 2010-03-09 01:25:47 UTC
Created attachment 25417 [details]
lspci -vvnn
Comment 2 Bjorn Helgaas 2010-03-09 20:56:45 UTC
http://bugzilla.kernel.org/show_bug.cgi?id=15480

This is a regression since 2.6.33.  Thanks a lot for the report!

  pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7]
  pci_root PNP0A03:00: host bridge window [io  0x0d00-0xffff]
  pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff]
  pci_root PNP0A03:00: host bridge window [mem 0x000c0000-0x000dffff]
  pci_root PNP0A03:00: host bridge window [mem 0xfed40000-0xfed44fff]
  pci_root PNP0A03:00: can't allocate host bridge window [mem 0xcff00000-0x10ed0ffff]

The last window completely encloses the previous one, which is fine,
so the problem must be that something overlaps *part* of that last
window.
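
The difference between a window that fully encloses another range and one that only partly overlaps it can be sketched in userspace C. This is an illustration only: `struct range`, `enum overlap`, and `classify` are hypothetical names, not kernel interfaces, and ranges are treated as inclusive, as in the log lines above.

```c
#include <stdint.h>

/* Hypothetical helper type (not kernel API): an inclusive address range. */
struct range {
	uint64_t start;
	uint64_t end;
};

enum overlap { DISJOINT, A_ENCLOSES_B, B_ENCLOSES_A, PARTIAL };

/* Classify how two inclusive ranges relate to each other. */
enum overlap classify(struct range a, struct range b)
{
	/* No address in common at all. */
	if (a.end < b.start || b.end < a.start)
		return DISJOINT;
	/* One range sits entirely inside the other: a clean parent/child fit. */
	if (a.start <= b.start && b.end <= a.end)
		return A_ENCLOSES_B;
	if (b.start <= a.start && a.end <= b.end)
		return B_ENCLOSES_A;
	/* Shared addresses, but neither contains the other. */
	return PARTIAL;
}
```

For the windows in the log, [mem 0xcff00000-0x10ed0ffff] against [mem 0xfed40000-0xfed44fff] classifies as A_ENCLOSES_B. A range that starts inside the big window and ends beyond it (for example, an assumed one starting at 0x100000000) classifies as PARTIAL, which is the case suspected here.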

Please attach /proc/iomem (booted with "pci=nocrs"), and try the
attached patch to find out what region conflicts.

If you happen to have Windows on this machine, I'd also like to know
what the Device Manager reports about these host bridge resources.
Comment 3 Bjorn Helgaas 2010-03-10 20:51:11 UTC
On Tuesday 09 March 2010 01:56:02 pm Bjorn Helgaas wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=15480
> 
> This is a regression since 2.6.33.  Thanks a lot for the report!
> 
>   pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7]
>   pci_root PNP0A03:00: host bridge window [io  0x0d00-0xffff]
>   pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff]
>   pci_root PNP0A03:00: host bridge window [mem 0x000c0000-0x000dffff]
>   pci_root PNP0A03:00: host bridge window [mem 0xfed40000-0xfed44fff]
>   pci_root PNP0A03:00: can't allocate host bridge window [mem 0xcff00000-0x10ed0ffff]
> 
> The last window completely encloses the previous one, which is fine,
> so the problem must be that something overlaps *part* of that last
> window.

My guess is that the conflict is with a System RAM area, possibly one
starting at 0x100000000 like the one here:
  http://fixunix.com/debian/514784-bug-492865-installation-report-mostly-good-some-gripes-about-partitioning-installer-error-mesgs.html

That feels like a BIOS bug in the host bridge description, since
accesses to the conflict area will probably go to RAM, not to PCI.

Can you try the attached debug patch and report the dmesg output?

If this makes your system boot, we'll have to think about whether
this is the right workaround, and whether and how we'd want to get
the conflicting resource out of kernel/resource.c.  There are no
other interfaces there that return the conflict resource, so maybe
there's a reason for keeping them internal.

Bjorn



commit db86a01c1dd7d0a6c18e1b9edd479c1e6a08de93
Author: Bjorn Helgaas <bjorn.helgaas@hp.com>
Date:   Tue Mar 9 11:43:17 2010 -0700

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 6e22454..42d8f01 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -118,7 +118,7 @@ static acpi_status
 setup_resource(struct acpi_resource *acpi_res, void *data)
 {
 	struct pci_root_info *info = data;
-	struct resource *res;
+	struct resource *res, *conflict;
 	struct acpi_resource_address64 addr;
 	acpi_status status;
 	unsigned long flags;
@@ -157,21 +157,39 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
 		return AE_OK;
 	}
 
-	if (insert_resource(root, res)) {
+	for (;;) {
+		conflict = insert_resource_conflict(root, res);
+		if (!conflict)
+			break;
+
 		dev_err(&info->bridge->dev,
-			"can't allocate host bridge window %pR\n", res);
-	} else {
-		pci_bus_add_resource(info->bus, res, 0);
-		info->res_num++;
-		if (addr.translation_offset)
-			dev_info(&info->bridge->dev, "host bridge window %pR "
-				 "(PCI address [%#llx-%#llx])\n",
-				 res, res->start - addr.translation_offset,
-				 res->end - addr.translation_offset);
-		else
-			dev_info(&info->bridge->dev,
-				 "host bridge window %pR\n", res);
+		        "host bridge window %pR conflicts with %pR\n",
+			res, conflict);
+		if (res->start < conflict->end && conflict->end < res->end)
+			res->start = conflict->end + 1;
+		if (res->start < conflict->start && conflict->start < res->end)
+			res->end = conflict->start - 1;
+
+		if (res->start >= res->end) {
+			dev_err(&info->bridge->dev,
+				"can't allocate host bridge window\n");
+			return AE_OK;
+		}
+
+		dev_info(&info->bridge->dev,
+			 "host bridge window trimmed to %pR\n", res);
 	}
+
+	pci_bus_add_resource(info->bus, res, 0);
+	info->res_num++;
+	if (addr.translation_offset)
+		dev_info(&info->bridge->dev, "host bridge window %pR "
+			 "(PCI address [%#llx-%#llx])\n",
+			 res, res->start - addr.translation_offset,
+			 res->end - addr.translation_offset);
+	else
+		dev_info(&info->bridge->dev,
+			 "host bridge window %pR\n", res);
 	return AE_OK;
 }
 
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index dda9841..9f88526 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -117,6 +117,7 @@ extern void reserve_region_with_split(struct resource *root,
 			     resource_size_t start, resource_size_t end,
 			     const char *name);
 extern int insert_resource(struct resource *parent, struct resource *new);
+extern struct resource *insert_resource_conflict(struct resource *parent, struct resource *new);
 extern void insert_resource_expand_to_fit(struct resource *root, struct resource *new);
 extern int allocate_resource(struct resource *root, struct resource *new,
 			     resource_size_t size, resource_size_t min,
diff --git a/kernel/resource.c b/kernel/resource.c
index 2d5be5d..8ec71a2 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -496,6 +496,16 @@ int insert_resource(struct resource *parent, struct resource *new)
 	return conflict ? -EBUSY : 0;
 }
 
+struct resource *insert_resource_conflict(struct resource *parent, struct resource *new)
+{
+	struct resource *conflict;
+
+	write_lock(&resource_lock);
+	conflict = __insert_resource(parent, new);
+	write_unlock(&resource_lock);
+	return conflict;
+}
+
 /**
  * insert_resource_expand_to_fit - Insert a resource into the resource tree
  * @root: root resource descriptor
Comment 4 Bjorn Helgaas 2010-03-10 20:53:09 UTC
Created attachment 25460 [details]
debug patch (trim window to remove conflicts)

same patch as included inline above
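
For readers who want to see what the trimming rules in this debug patch do to the failing window, here is a minimal userspace sketch of one pass of the loop. `struct range` and `trim_window` are hypothetical stand-ins for `struct resource` and the loop body, and the conflicting System RAM range in the usage note is an assumption (its real extent is in the /proc/iomem attachment, not reproduced here).

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for struct resource (inclusive range). */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Apply the two trimming rules from the debug patch once.
 * Returns 1 if a usable window remains, 0 if it collapsed
 * (the patch gives up when start >= end).
 */
int trim_window(struct range *res, const struct range *conflict)
{
	/* Conflict ends inside the window: move the start above it. */
	if (res->start < conflict->end && conflict->end < res->end)
		res->start = conflict->end + 1;
	/* Conflict starts inside the window: move the end below it. */
	if (res->start < conflict->start && conflict->start < res->end)
		res->end = conflict->start - 1;
	return res->start < res->end;
}
```

Starting from [0xcff00000-0x10ed0ffff] with an assumed conflict beginning at 0x100000000 and ending beyond the window, one pass leaves [0xcff00000-0xffffffff].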
Comment 5 Yanko Kaneti 2010-03-10 22:20:30 UTC
Created attachment 25463 [details]
/proc/iomem

Sorry for the delay. Here is /proc/iomem.
I'll try to find the time to test with the patches tomorrow.
Comment 6 Yanko Kaneti 2010-03-11 11:42:51 UTC
Created attachment 25471 [details]
boot log latest mainline git + att. 25460

Yes, with attachment 25460 applied, the machine seems to boot OK without any pci= tweaking. Fedora-like kernel config. Log attached.
Comment 7 Yanko Kaneti 2010-03-11 11:50:47 UTC
Created attachment 25473 [details]
 boot log latest mainline git + att. 25460

Something got munged in the previous attachment
Comment 8 Yanko Kaneti 2010-03-12 14:44:41 UTC
Created attachment 25484 [details]
2.6.34-0.11.rc1.git1.fc13.x86_64 + patches dmesg

Latest Fedora rawhide kernel + the three-patch series from
http://lkml.org/lkml/2010/3/11/512
Boots and works fine so far. dmesg attached.
Thanks.
Comment 9 Rafael J. Wysocki 2010-03-18 21:48:34 UTC
Handled-By : Bjorn Helgaas <bjorn.helgaas@hp.com>
Comment 10 Rafael J. Wysocki 2010-03-21 20:11:28 UTC
Patch : http://lkml.org/lkml/2010/3/11/512
Comment 11 Bjorn Helgaas 2010-03-23 19:03:38 UTC
I tried to reproduce this by tweaking the SeaBIOS DSDT to report a similar overlap and booting Windows via qemu.  Windows stops with this error: http://support.microsoft.com/kb/314830, so I'm concerned that there's still some _CRS-parsing subtlety we're missing.

Yanko, would you mind attaching an acpidump (see http://kernel.org/pub/linux/kernel/people/helgaas/debug)?  Also, if you can apply the patch here: https://bugzilla.kernel.org/show_bug.cgi?id=15533#c5 and attach the resulting dmesg, that would also be useful.  Thanks very much.
Comment 12 Yanko Kaneti 2010-03-24 07:39:34 UTC
Created attachment 25673 [details]
GA-MA78GM-S2H rev 1.0 F11 - acpidump
Comment 13 Yanko Kaneti 2010-03-24 07:40:39 UTC
Created attachment 25674 [details]
GA-MA78GM-S2H rev 1.0 F11 - DSDT
Comment 14 Yanko Kaneti 2010-03-24 09:14:15 UTC
Created attachment 25679 [details]
2.6.34-0.17.rc2.git1.fc14.x86_64 + workaround + att. 25523  debug
Comment 15 Bjorn Helgaas 2010-03-24 22:06:09 UTC
Created attachment 25690 [details]
Windows GA-MA78GM-S2H PCI bus resources

From the DSDT in comment 13, we can see that the BIOS starts with this template:

    DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, ...
      0x00100000,         // Range Minimum
      0xFEBFFFFF,         // Range Maximum
      0xFFF00000,         // Length

and fills in the starting address, probably based on the system memory size.  What we see in Linux (from comment 14) is this:

    [07] 32-Bit DWORD Address Space Resource
              Resource Type : Memory Range
         Min Relocatability : MinFixed
         Max Relocatability : MaxFixed
            Address Minimum : CFF00000  (_MIN)
            Address Maximum : FEBFFFFF  (_MAX)
             Address Length : 3EE10000  (_LEN)

Per ACPI spec, _LEN must be (_MAX - _MIN + 1), but 3EE10000 != FEBFFFFF - CFF00000 + 1, so this looks like a BIOS defect.

But Windows deals with it, and Linux should, too.
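
The mismatch can be checked with a few lines of arithmetic. The sketch below is illustrative only; `expected_len` and `end_from_len` are hypothetical helper names, and the values in the usage note are the _MIN, _MAX, and _LEN quoted above.

```c
#include <stdint.h>

/* Length a fixed window should report: _MAX - _MIN + 1. */
uint64_t expected_len(uint64_t min, uint64_t max)
{
	return max - min + 1;
}

/* Last address implied by _MIN and _LEN alone, ignoring _MAX. */
uint64_t end_from_len(uint64_t min, uint64_t len)
{
	return min + len - 1;
}
```

With _MIN = 0xCFF00000 and _MAX = 0xFEBFFFFF, `expected_len` gives 0x2ED00000, not the reported 0x3EE10000. Taking the reported _LEN at face value, `end_from_len` gives 0x10ED0FFFF, which is the end of the window Linux failed to allocate in the original boot log.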

purana@gmail.com went far out of his way to collect the attached Windows Device Manager screenshot from a GA-MA78GM-S2H.  The resources shown there match what Linux found, except for this "end-of-memory to FEBFFFFF" region.  There, Windows appears to have trimmed the _LEN so it fits between _MIN and _MAX.

I think it will be much better for Linux to enforce this "_LEN <= _MAX - _MIN + 1" constraint than to trim it based on other resources that conflict.  This way, we'll end up with [mem 0xcff00000-0xfebfffff] rather than [mem 0xcff00000-0xffffffff], which should match Windows exactly and will remove the possibility of placing a device at 0xfec00000, where it probably won't work.
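
A minimal sketch of that constraint, assuming a window described only by _MIN, _MAX, and _LEN with no translation offset. `struct window` and `crs_window` are hypothetical names for illustration; this is not the code from the eventual patch.

```c
#include <stdint.h>

/* Hypothetical result type: an inclusive host bridge window. */
struct window {
	uint64_t start;
	uint64_t end;
};

/*
 * Build a window from _MIN/_MAX/_LEN, clamping _LEN so the window
 * never extends past _MAX. Assumes min <= max, len > 0, and that the
 * range does not span the entire 64-bit space (max - min + 1 would
 * wrap to 0 in that case).
 */
struct window crs_window(uint64_t min, uint64_t max, uint64_t len)
{
	uint64_t max_len = max - min + 1;
	struct window w;

	if (len > max_len)
		len = max_len;	/* truncate an oversized _LEN */

	w.start = min;
	w.end = min + len - 1;
	return w;
}
```

Calling `crs_window(0xCFF00000, 0xFEBFFFFF, 0x3EE10000)` yields [0xcff00000-0xfebfffff]. A consistent descriptor passes through unchanged, so the clamp only affects firmware that reports an oversized _LEN.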
Comment 16 Bjorn Helgaas 2010-03-24 23:03:11 UTC
Created attachment 25691 [details]
truncate _CRS windows with _LEN > _MAX - _MIN + 1

Yanko, can you test this patch, please?  You should only need this patch on top of 2.6.34-0.17.rc2.git1.fc14.x86_64 (or whatever recent upstream kernel you like).  We should see [mem 0xcff00000-0xfebfffff] rather than [mem 0xcff00000-0xffffffff], which I think is more accurate.
Comment 17 Yanko Kaneti 2010-03-25 01:44:19 UTC
Created attachment 25692 [details]
2.6.34-0.17.1.rc2.git1.fc14.x86_64 + attachment 25691 log

Works ok so far.
Comment 18 Len Brown 2010-04-04 04:42:55 UTC
commit d558b483d5a73f5718705b270cb2090f66ea48c8
Author: Bjorn Helgaas <bjorn.helgaas@hp.com>
Date:   Thu Mar 25 09:28:30 2010 -0600

    x86/PCI: truncate _CRS windows with _LEN > _MAX - _MIN + 1

shipped (via the PCI tree) in Linux-2.6.34-rc3

commit b049fdf93dd1925aea02210e5e8fcedcc607c05c
Author: Bjorn Helgaas <bjorn.helgaas@hp.com>
Date:   Thu Mar 25 10:32:49 2010 -0600

    PNPACPI: truncate _CRS windows with _LEN > _MAX - _MIN + 1

is in the acpi tree.
Comment 19 Rafael J. Wysocki 2010-04-08 19:51:37 UTC
On Thursday 08 April 2010, Bjorn Helgaas wrote:
> On Wednesday 07 April 2010 03:08:37 pm Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a summary report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.33.  Please verify if it still should be listed and let the
> > tracking team know (either way).
> >
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=15480
> > Subject     : [regression] Fails to boot properly unless given pci=nocrs
> > Submitter   : Yanko Kaneti <yaneti@declera.com>
> > Date        : 2010-03-09 01:24 (30 days old)
> > Handled-By  : Bjorn Helgaas <bjorn.helgaas@hp.com>
> > Patch       : http://lkml.org/lkml/2010/3/11/512
>
> This should be closed.  The fix is in Linus' tree:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d558b483d5a73f5718705b270cb2090f66ea48c8
Comment 20 Bjorn Helgaas 2010-04-26 20:59:09 UTC
Created attachment 26152 [details]
Windows _MIN/_MAX/_LEN parsing

This experiment used the same QEMU/SeaBIOS environment as https://bugzilla.kernel.org/show_bug.cgi?id=15817

The normal host bridge _CRS contains this:

  DWordMemory (...,  0xE0000000,         // Address Range Minimum
                     0xFEBFFFFF,         // Address Range Maximum
                     0x1EC00000,         // Address Length

where 0xFEBFFFFF == 0xE0000000 + 0x1EC00000 - 1.  I replaced the _MAX with 0xF2123456, booted Windows, and collected this screenshot.  It appears that Windows ignored _LEN (0x1EC00000) and merely used [_MIN to _MAX].
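
The two possible readings of such a descriptor can be compared side by side. This is a sketch under stated assumptions: `enum end_policy` and `window_end` are hypothetical names, and the "END_FROM_MAX" policy only models what the screenshot suggests Windows did, not any documented Windows behavior.

```c
#include <stdint.h>

/* Hypothetical policies for choosing the last address of a window. */
enum end_policy {
	END_FROM_LEN,	/* trust _LEN: end = _MIN + _LEN - 1 */
	END_FROM_MAX	/* ignore _LEN: end = _MAX */
};

/* Compute the inclusive end address of a window under a given policy. */
uint64_t window_end(uint64_t min, uint64_t max, uint64_t len,
		    enum end_policy policy)
{
	return policy == END_FROM_MAX ? max : min + len - 1;
}
```

For the unmodified descriptor both policies give 0xFEBFFFFF. With _MAX replaced by 0xF2123456, END_FROM_LEN still gives 0xFEBFFFFF while END_FROM_MAX gives 0xF2123456. Because _LEN here exceeds _MAX - _MIN + 1 (0x12123457), truncating _LEN to fit would also end the window at 0xF2123456.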