Bug 14437 (BAR1) - 4GB not working with a 32bit northbridge with an error of BAR1.
Status: CLOSED OBSOLETE
Alias: BAR1
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL: http://www.codingfriends.com/index.ph...
Reported: 2009-10-18 18:53 UTC by Ian
Modified: 2012-06-13 20:26 UTC

Kernel Version: 2.6.31-14 Ubuntu
Regression: No


Attachments
4GB 2GB dmesg and proc/iomem files (26.27 KB, application/octet-stream)
2009-10-18 18:53 UTC, Ian

Description Ian 2009-10-18 18:53:37 UTC
Created attachment 23462
4GB 2GB dmesg and proc/iomem files

I have an Acer Aspire 9815 which, from what I have read, has a 32-bit
northbridge. When I fit 4GB of RAM alongside the nVidia card with its
256MB of "virtual RAM", I get a base address register (BAR 1) error.
It works fine with 2GB. The problem seems to be in the way the Linux
kernel allocates the memory associated with the devices: it places
them out of range.

Here is my lspci for the PCI host bridge

Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and
945GT Express Memory Controller Hub (rev 03)
PCI bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and
945GT Express PCI Express Root Port (rev 03)

From this website it appears that someone was trying to update the
PCI IOMEM part of the kernel:

http://tjworld.net/wiki/Linux/PCIDynamicResourceAllocationManagement

It says the work would be in the .30 / .31 kernel, but I still cannot
use the 4GB of RAM. I am running 2.6.31-10-generic
#35-Ubuntu SMP Tue Sep 22 17:33:42 UTC 2009 i686 GNU/Linux from a
Kubuntu 9.10 setup and was hoping the fix had already been applied.

Since I am using the K/Ubuntu kernel, I was wondering: does the "real"
kernel have the PCI IOMEM upgrade? Or do I have to pass some kernel
parameters in GRUB for this to work correctly? Or has there been
another upgrade to the PCI structure that will come in a later
release?

There was also some advice to update the BIOS on some laptops and set
the upper memory limit. Does that put a virtual ceiling on the PC's
memory, so that the OS will not see the rest of the RAM? Or does the
upper memory limit only affect the PCI setup, so that the rest of the
OS can still see and use the rest of the RAM?

Any advice would be great.

As requested, I have attached the dmesg and /proc/iomem files for
three boots: 2GB, 4GB, and 4GB with the pci=use_crs kernel parameter.
Comment 1 Bjorn Helgaas 2009-10-19 17:42:43 UTC
Correct me if I'm wrong, but I think your complaint is that the nVidia
card works when you have only 2GB of memory, but stops working when you
have 4GB of memory.  (With 4GB, it probably still works in plain VGA
mode, but not in modes that use the 256MB frame buffer.)

The nVidia device is at 0000:01:00.0 and requires these resources:

  BAR 0 (0x10): mem  16MB (32-bit BAR)
  BAR 1 (0x14): mem 256MB (64-bit BAR, prefetchable (frame buffer))
  BAR 3 (0x1c): mem  16MB (64-bit BAR)
  BAR 5 (0x24): io  128 ports
  ROM   (0x30): mem 128KB (32-bit BAR)
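
For reference, the "32-bit", "64-bit" and "prefetchable" attributes in the list above come from the low bits of each BAR register. Here is a minimal sketch of that decoding, assuming the standard PCI BAR layout (bit 0 selects I/O or memory space, bits 2:1 give the memory type, bit 3 is the prefetchable flag). The function names and the sample register value in the note below are made up for illustration and were not read from this machine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a BAR: 1 = I/O port space, 0 = memory space. */
static inline bool bar_is_io(uint32_t bar)
{
	return (bar & 0x1) != 0;
}

/*
 * Memory BARs only: bits 2:1 == 10b means the BAR is 64 bits wide,
 * so it also consumes the next BAR slot for the upper address half.
 */
static inline bool bar_is_mem64(uint32_t bar)
{
	return !bar_is_io(bar) && ((bar >> 1) & 0x3) == 0x2;
}

/* Memory BARs only: bit 3 marks the region as prefetchable. */
static inline bool bar_is_prefetchable(uint32_t bar)
{
	return !bar_is_io(bar) && (bar & 0x8) != 0;
}
```

With a hypothetical raw value of 0xb000000c, bar_is_mem64() and bar_is_prefetchable() both return true, which matches how BAR 1 is described. It also explains why the list has no BAR 2 or BAR 4: those slots hold the upper halves of the 64-bit BARs 1 and 3.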

With 2GB of RAM, you have memory from 0-0x80000000, and the 1.5GB region
from 0x80000000-0xE0000000 is available for PCI devices.  With 4GB of
RAM, you have memory from 0-0xC0000000, and only the 512MB region from
0xC0000000-0xE0000000 is left for PCI.
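
The two window sizes can be checked with a few lines of arithmetic. This is only a sketch; range_mib() is a helper name invented here, and it treats the ranges as half-open, the way they are written in the paragraph above.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Size in MiB of the half-open address range [start, end). */
static inline uint64_t range_mib(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return (end - start) / MiB;
}
```

range_mib(0x80000000, 0xE0000000) gives 1536 (1.5GB) for the 2GB configuration, and range_mib(0xC0000000, 0xE0000000) gives 512 for the 4GB configuration.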

Even 512MB should be enough to accommodate the devices you have, but
the BIOS divided it up in such a way that only 32MB is actually routed
to bus 0000:01, and I don't think Linux is smart enough to redistribute
things and fix that.  The dynamic allocation work you mentioned might
be able to do it, but it is not in the mainline kernel, and I haven't
seen any discussion about it.
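
As a rough illustration of why 32MB is not enough while 512MB would be, the sketch below adds up the memory resources listed for 0000:01:00.0 and compares the total against a window size. It is a lower bound only: it ignores alignment and the split between prefetchable and non-prefetchable apertures, and the names (total_mem_bytes, fits_in_window, nvidia_mem_sizes) are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KiB 1024ULL
#define MiB (1024ULL * KiB)

/* Memory resources of 0000:01:00.0 as listed above (I/O BAR 5 excluded). */
static const uint64_t nvidia_mem_sizes[] = {
	16 * MiB,	/* BAR 0 */
	256 * MiB,	/* BAR 1, frame buffer */
	16 * MiB,	/* BAR 3 */
	128 * KiB,	/* expansion ROM */
};
#define NVIDIA_MEM_COUNT \
	(sizeof(nvidia_mem_sizes) / sizeof(nvidia_mem_sizes[0]))

/* Sum the sizes of a device's memory resources, in bytes. */
static inline uint64_t total_mem_bytes(const uint64_t *sizes, size_t count)
{
	uint64_t total = 0;

	assert(sizes != NULL || count == 0);
	for (size_t i = 0; i < count; i++)
		total += sizes[i];
	return total;
}

/* Lower-bound check only: alignment and aperture types are ignored. */
static inline bool fits_in_window(uint64_t needed, uint64_t window)
{
	return needed <= window;
}
```

The total is 288MB plus 128KB, so fits_in_window() is false for a 32MB window and true for the full 512MB region.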

The frame buffer is a 64-bit BAR, and you seem to be running the
32-bit Ubuntu kernel.  I don't know whether the 64-bit kernel would
make a difference here or not.  It might be worth trying a live CD
or something.

Bjorn
Comment 2 Anonymous Emailer 2009-10-20 10:21:25 UTC
Reply-To: ianporter1976@googlemail.com

Hi Bjorn,

Yes, your analysis is correct. I also forgot to say that there is a
patch from nvidia for an older kernel:

sudo apt-get build-dep linux-image-2.6.24-17-rt
sudo apt-get source linux-image-2.6.24-17-rt

# Install Kernel Modules Sources
sudo apt-get build-dep linux-ubuntu-modules-2.6.24-17-rt
sudo apt-get source linux-ubuntu-modules-2.6.24-17-rt

# Apply NVRM patch (download the patch first!)
sudo patch -p0 < NVRM_512M_fix.txt

# Build debs for linux-image & linux-headers
cd linux-2.6.24/
sudo cp /boot/config-2.6.24-17-rt debian/config/amd64/config.rt
sudo CONCURRENCY_LEVEL=2 AUTOBUILD=1 NOEXTRAS=1 fakeroot debian/rules custom-binary-rt
cd ..

# Build Kernel Modules
cd linux-ubuntu-modules-2.6.24-2.6.24/
sudo CONCURRENCY_LEVEL=2 fakeroot debian/rules binary-debs
cd ..

and this is the patch, which hard-codes an allocation of about 268MB
(0xc0000000 to 0xd0000000):

diff -Naur linux-2.6.24.orig/arch/x86/pci/i386.c linux-2.6.24/arch/x86/pci/i386.c
--- linux-2.6.24.orig/arch/x86/pci/i386.c	2008-06-03 20:24:26.000000000 -0400
+++ linux-2.6.24/arch/x86/pci/i386.c	2008-06-03 20:25:40.000000000 -0400
@@ -122,6 +122,10 @@
 				r = &dev->resource[idx];
 				if (!r->flags)
 					continue;
+				if ((r->start == 0xbdf00000) && (r->end == 0xddefffff)) {
+					r->start = 0xc0000000;
+					r->end = 0xd0000000;
+				}
 				pr = pci_find_parent_resource(dev, r);
 				if (!r->start || !pr ||
 				    request_resource(pr, r) < 0) {

But I cannot find this code in the current 2.6.31. I can find similar
lines, but not the same ones:

                                r = &dev->resource[idx];
                                if (!r->flags)
                                        continue;
                                if (!r->start ||
                                    pci_claim_resource(dev, idx) < 0) {
                                        dev_warn(&dev->dev,
                                                 "BAR %d: can't allocate resource\n",
                                                 idx);
                                        /*
                                         * Something is wrong with the region.
                                         * Invalidate the resource to prevent
                                         * child resource allocations in this
                                         * range.
                                         */
                                        r->flags = 0;
                                }

As a guess, could I do this instead?

                                r = &dev->resource[idx];
                                if (!r->flags)
                                        continue;
                                if ((r->start == 0xbdf00000) && (r->end == 0xddefffff)) {
                                        r->start = 0xc0000000;
                                        r->end = 0xd0000000;
                                }
                                pr = pci_find_parent_resource(dev, r);
                                if (!r->start ||
                                    pci_claim_resource(dev, idx) < 0) {
                                        dev_warn(&dev->dev,
                                                 "BAR %d: can't allocate resource\n",
                                                 idx);
                                        /*
                                         * Something is wrong with the region.
                                         * Invalidate the resource to prevent
                                         * child resource allocations in this
                                         * range.
                                         */
                                        r->flags = 0;
                                }

Regards
Ian

Comment 3 Bjorn Helgaas 2009-11-03 15:28:08 UTC
Here's the situation with 2GB of RAM:

  00100000-7fe8ffff : System RAM
  80000000-febfffff : PCI Bus 0000:00 ("pci=use_crs" would show this)
    b0000000-bfffffff : PCI Bus 0000:01 (prefetchable mem aperture)
      b0000000-bfffffff : 0000:01:00.0  nVidia 256MB BAR 1
    d0000000-d1ffffff : PCI Bus 0000:01 (non-prefetchable mem aperture)
      d0000000-d0ffffff : 0000:01:00.0  nVidia 16MB BAR 0
      d1000000-d1ffffff : 0000:01:00.0  nVidia 16MB BAR 3

But with 4GB of RAM, we have much less PCI MMIO space:

  00100000-bfe8ffff : System RAM
  c0000000-febfffff : PCI Bus 0000:00
    d0000000-d1ffffff : PCI Bus 0000:01 (non-prefetchable mem aperture)
      d0000000-d0ffffff : 0000:01:00.0  nVidia 16MB BAR 0
      d1000000-d1ffffff : 0000:01:00.0  nVidia 16MB BAR 3

BIOS still put the bridge prefetchable mem aperture and the nVidia frame buffer at 0xb0000000-0xbfffffff, but I think this is a BIOS bug because this overlaps the System RAM region.

Linux noticed this overlap and disabled the prefetchable mem aperture.  Then the nVidia frame buffer alloc failed because the aperture was closed:

  pci 0000:00:01.0: bridge 64bit mmio pref: [0xb0000000-0xbfffffff]
  pci 0000:01:00.0: reg 14 64bit mmio: [0xb0000000-0xbfffffff]
  pci 0000:00:01.0: BAR 15: address space collision on of bridge [0xb0000000-0xbfffffff]
  pci 0000:00:01.0: BAR 15: can't allocate resource
  pci 0000:01:00.0: BAR 1: no parent found for of device [0xb0000000-0xbfffffff]
  pci 0000:01:00.0: BAR 1: can't allocate resource
  pci 0000:00:01.0: BAR 15: can't allocate mem resource [0x100000000-0xffffffff]
  pci 0000:01:00.0: BAR 1: can't allocate mem resource [0xe0000000-0xd1ffffff]

[I have some patches to make these messages more sensible.]
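
The collision itself is easy to reproduce on paper with the ranges from the two listings. The sketch below is a simplified model, not the kernel's actual code path (which walks the resource tree); ranges_overlap() is a name made up here, and the ends are inclusive, as in /proc/iomem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the inclusive ranges [a_start, a_end] and [b_start, b_end]
 * share at least one address.
 */
static inline bool ranges_overlap(uint64_t a_start, uint64_t a_end,
				  uint64_t b_start, uint64_t b_end)
{
	assert(a_start <= a_end && b_start <= b_end);
	return a_start <= b_end && b_start <= a_end;
}
```

With 4GB fitted, System RAM (0x00100000-0xbfe8ffff) overlaps the aperture at 0xb0000000-0xbfffffff, so the check returns true. With 2GB, System RAM ends at 0x7fe8ffff and the same check returns false.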

I don't see an easy way to fix this.  There *is* still enough MMIO space to make everything fit below 4GB, but that would require moving other devices around, and I don't think we're smart enough to do this yet.

If the host bridge supports it, we could move that prefetchable aperture above 4GB; then everything would fit easily.  But Ian suggests that the hardware doesn't support that.

The NVRM patch is a gross hack that just clobbers the resources of a specific device.  You can always do that and make a particular machine work (something similar could be done for this 4GB config), but it could never go upstream.

I think right now, we should just treat this as a BIOS bug that Linux isn't smart enough to work around.  The BIOS *should* have set up all the devices so it would just work, but it didn't.
Comment 4 Anonymous Emailer 2009-11-04 11:39:49 UTC
Reply-To: ianporter1976@googlemail.com

Hi Bjorn,

Thanks very much for your help and information. So basically it is
down to a problem in the Acer BIOS, and their latest version does not
fix this issue either.

When I do upgrade this laptop, I shall make sure that it does not have
this problem.

Kind regards
Ian

Comment 5 Ian 2010-01-07 14:40:26 UTC
I have come up with a very basic hack. Bjorn pointed me in the right direction, thanks again.

With 4GB fitted, the frame buffer's old location (0xb0000000 - 0xbfffffff) is now covered by system memory, so the nvidia nvnews hack tries to move the device up past it, to 0xc0000000 - 0xcfffffff. That failed because other resources were claiming addresses in that range first. As a small hack / test, I stopped the other resources from claiming that space before the nvidia device tried to claim 0xc0000000 - 0xcfffffff, and it worked. I have explained more here:

http://www.codingfriends.com/index.php/2010/01/07/bar-15-bar-1-no-parent-nvidia-graphics-card-does-not-work/

That has fixed my problem; thanks again to Bjorn.

I am going to see whether I can write code that does this hack / test automatically, instead of hard-coding the addresses into the kernel.

Here is the code, in case it helps.

if ((r->start >= 0xc0000000) && (r->end <= 0xcfffffff)) {
	dev_info(&dev->dev,
		 " not allocating resource 0xc - 0xcf %pR\n",
		 r);
	/*
	 * Stop any resource from gaining the 0xc0000000 - 0xcfffffff
	 * region; the Linux kernel will re-place them.
	 */
	r->flags = 0;
}

/* Where the nvidia is going: move it into the region freed above. */
if ((r->start == 0xb0000000) && (r->end == 0xbfffffff)) {
	r->start = 0xc0000000;
	r->end = 0xcfffffff;
}
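
One detail worth spelling out is that resource ends are inclusive, so a 256MB region starting at 0xc0000000 ends at 0xcfffffff. The sketch below shows a move that keeps the size instead of hard-coding the end address. struct toy_resource and both helpers are stand-ins invented for this example; they are not the kernel's struct resource or its API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a kernel resource; end is inclusive. */
struct toy_resource {
	uint64_t start;
	uint64_t end;
};

/* Number of bytes covered by an inclusive range. */
static inline uint64_t toy_resource_size(const struct toy_resource *r)
{
	assert(r->end >= r->start);
	return r->end - r->start + 1;
}

/* Move a resource to new_start, keeping its size unchanged. */
static inline void toy_resource_move(struct toy_resource *r,
				     uint64_t new_start)
{
	uint64_t size = toy_resource_size(r);

	r->start = new_start;
	r->end = new_start + size - 1;
}
```

Moving 0xb0000000 - 0xbfffffff to 0xc0000000 this way yields 0xc0000000 - 0xcfffffff, the same values as the hack above. An end of 0xd0000000, as in the older NVRM patch, would describe one byte more than 256MB.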
