Kernel Bug Tracker – Bug 7917
"PCI: Failed to allocate mem resource" for PCI E-to-PCI Bridge
Last modified: 2009-03-23 10:01:35 UTC
Distribution: slackware 11.0
Hardware Environment: Supermicro motherboard X6DH8-G
booting up the machine displays an error :
PCI: Failed to allocate mem resource #8:100000@dd200000 for 0000:01:00.0
this is followed by :
PCI: Bridge: 0000:01:00.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: df200000-df7fffff
lspci output for the device in question (i hope :) ) :
lspci -s 01:00.0 -v
01:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09) (prog-if 00
Flags: bus master, fast devsel, latency 0
Bus: primary=01, secondary=02, subordinate=02, sec-latency=48
Prefetchable memory behind bridge: 00000000df200000-00000000df700000
Capabilities:  Express PCI/PCI-X Bridge IRQ 0
Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
Capabilities: [6c] Power Management version 2
Capabilities: [d8] PCI-X bridge device
Can you please provide a boot trace?
Does the system boot up? And have you tried the latest kernel (2.6.22-rc7)?
Created attachment 11976
i guess 'boot trace' is dmesg output right ? if so, one from 188.8.131.52 is attached (which is the latest running on the machine).
yes, machine boots up successfully.
Created attachment 12198
output of lspci -vvxxx
attaching output of 'lspci -vvxxx' and contents of /proc/iomem, as requested by email.
Created attachment 12199
forgot to add : both of these are when running 184.108.40.206
It _looks_ like the BIOS may have configured both devices
00:1c.0 PCI bridge (bridging to bus #6)
01:00.0 PCI bridge (bridging to bus #2)
with the same window:
Memory behind bridge: dd200000-dd2fffff
and the first bridge (00:1c) gets it, and then the second bridge (01:00)
quite reasonably gets a resource allocation error.
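To make the conflict concrete, here is a minimal sketch (not the kernel's actual resource-tree code) that decodes the "size@start" notation from the error line and checks it against the window quoted above. The helper names parse_resource and overlaps are made up for illustration, and the first-come-first-served model is an assumption that only mirrors the description in this comment.

```python
def parse_resource(spec):
    """Turn a 'size@start' string (both hex) into an inclusive (start, end) range."""
    size_hex, start_hex = spec.split("@")
    start = int(start_hex, 16)
    return start, start + int(size_hex, 16) - 1


def overlaps(a, b):
    """True if two inclusive (start, end) ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


# Window quoted above as "Memory behind bridge: dd200000-dd2fffff",
# assumed here to be already claimed by the first bridge (00:1c.0).
claimed = (0xDD200000, 0xDD2FFFFF)

# Resource from the error line for 0000:01:00.0.
requested = parse_resource("100000@dd200000")

print(f"{requested[0]:08x}-{requested[1]:08x}", overlaps(claimed, requested))
```

The requested range decodes to exactly the window the first bridge already holds, which is why the second claim is refused.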
Now, the thing is, that the 00:1c device is *not* a bridge for the 01:xx
bus that the 01:00.0 bridge is on (it's a bridge to the 06:xx bus), so that
BIOS allocation really wasn't right, afaik. The 01:00.0 PCI bridge is
actually behind the 00:02.0 PCI bridge.
That's quite confusing, and it appears to be made much worse by the fact
that those 00:xx bridges are probably transparent PCI bridges (which is
quite normal for Intel core bridges) even though they say "normal decode".
So I think the BIOS made a mess of it, and Linux complains a bit, but
everything is likely to work. It _is_ confusing, but I don't think we can
fix it up any better without re-programming all the bridges (which is
actually pretty hard, and likely to fail more often than fix anything up,
since PCI bridges almost always have hidden regions that they bridge etc).
IOW, the only thing I can think of is to remove the warning message, but
at the same time, that warning message actually can be very useful for the
cases where the end result really _is_ so messed up that something breaks.
Rich - it looks like everything actually works well, no? All the devices
do actually get resource allocations, both the MegaRAID controllers behind
that confusing bus #2, _and_ the e1000 ethernet behind bus #6.
So I would suggest we close this as "apparently confused BIOS, but Linux
yes, the system appears to be working just fine (though we haven't tried populating all slots or other things).
would it make sense to report this to the bios vendor ?
Greg, could you please consider removing that message, or rephrasing it
in some manner so that it is less alarming?
The message is now gone in 2.6.23. Closing this bug.
I don't want to break your fun, but the messages
are still (again?) visible in 2.6.24.
This was observed with the development version of Ubuntu (Hardy) using package version 2.6.24-1-generic (on i386).
See the corresponding launchpad entry (https://bugs.launchpad.net/ubuntu/+source/linux-meta/+bug/159241) and the output of lspci and dmesg right after bootup (that I will attach shortly).
Unfortunately I can't reopen this bug, maybe someone else can.
Created attachment 14020
Output of dmesg after bootup.
Created attachment 14021
Output of lspci on 2.6.24-1
Greg, the message is still there in arch/x86/pci/acpi.c.
Since there is a theoretical chance that a resource conflict might have real consequences (which has never happened in my experience so far), how about making this printk debug level?
Downstream lowered the printk to warning.
See also the commits that actually lowered the messages:
Thanks for the pointer, Stephan. I'll see if we can't do something smarter here, otherwise we'll clean up the warnings.
You're welcome. Cleaning up the warnings would be the quick fix for this and prevent a lot of people from complaining. But it won't repair the problem itself. The thing is that this allocation thing takes up roughly 10 seconds before the booting process can continue.
Just look at this (http://img126.imageshack.us/img126/7234/hardy200804011ek6.png) bootchart. There are about 10 seconds before busybox actually starts. This produces a very awkward silence (if the warnings are "hidden").
Are you sure it's the resource allocation taking so long? Can you try booting with 'initcall_debug'?
Created attachment 16010
dmesg output after bootup with the 2.6.24-16 ubuntu generic kernel
I've created a log right after bootup with initcall_debug set, as you requested. Please let me know if you need anything else.
Wow, yeah look at that, pci_init is taking >7s... that's bad.
[ 35.169739] Calling initcall 0xc021ef50: pci_init+0x0/0x30()
[ 43.162816] 0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
[ 43.162952] Boot video device is 0000:01:00.0
[ 43.162968] PCI: Firmware left 0000:08:08.0 e100 interrupts enabled, disabling
[ 43.162981] initcall 0xc021ef50: pci_init+0x0/0x30() returned 0.
[ 43.162989] initcall 0xc021ef50 ran for 7627 msecs: pci_init+0x0/0x30()
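For anyone else collecting initcall_debug logs, a small sketch along these lines can pull the slow initcalls out of a dmesg dump. It assumes the "ran for N msecs" line format seen in the excerpt above (other kernel versions may print it differently), and slow_initcalls is a name made up for this example.

```python
import re

# Matches lines like:
#   initcall 0xc021ef50 ran for 7627 msecs: pci_init+0x0/0x30()
RAN_FOR = re.compile(r"initcall \S+ ran for (\d+) msecs: (\S+)")


def slow_initcalls(dmesg_text, threshold_ms=1000):
    """Return (msecs, name) pairs at or above threshold_ms, slowest first."""
    hits = []
    for line in dmesg_text.splitlines():
        match = RAN_FOR.search(line)
        if match and int(match.group(1)) >= threshold_ms:
            hits.append((int(match.group(1)), match.group(2)))
    return sorted(hits, reverse=True)


sample = """\
[ 43.162981] initcall 0xc021ef50: pci_init+0x0/0x30() returned 0.
[ 43.162989] initcall 0xc021ef50 ran for 7627 msecs: pci_init+0x0/0x30()"""

print(slow_initcalls(sample))
```

Run against a full dmesg capture, this would show whether pci_init is the only initcall above the threshold or just the largest of several.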
I noticed that when I boot grub into Fedora Core 8 with kernel-220.127.116.11-27.fc8, I don't generate that pci-allocation message, however, when I boot into my custom kernel on the same machine, tried 18.104.22.168 and 22.214.171.124, I do get the allocation failure message. IMHO, that suggests something in the config, not the hardware or the kernel code. Now all I have to do is diff the config files to find the config line that generates the pci mem allocation failure. Fedora seems to use a config with all of the modules enabled, so the search will take a while. At least I've narrowed it down to a compile option.
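A rough sketch of how the config comparison could be automated, instead of eyeballing a raw diff. It assumes both files use the usual .config syntax ("CONFIG_X=value" and "# CONFIG_X is not set"). The two sample fragments are invented for illustration and are not taken from either kernel discussed here, and load_config and diff_configs are hypothetical helper names.

```python
def load_config(text):
    """Parse .config text into {option: value}; 'is not set' lines become 'n'."""
    suffix = " is not set"
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            opts[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            opts[key] = value
    return opts


def diff_configs(a, b):
    """Return {option: (value_in_a, value_in_b)} for every option that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Invented fragments, only to show the output shape.
distro = load_config("CONFIG_PCI=y\nCONFIG_PCI_MSI=y\n# CONFIG_PCI_DEBUG is not set\n")
custom = load_config("CONFIG_PCI=y\n# CONFIG_PCI_MSI is not set\n# CONFIG_PCI_DEBUG is not set\n")

print(diff_configs(distro, custom))
```

With real files, read each one with open(path).read() and pass the text to load_config. Filtering the result to keys containing "PCI" or "ACPI" would shorten the list, though that filter is only a guess at where the relevant option lives.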
126.96.36.199 on the original system for this report still shows the message.
henry, aren't there also some patches to that fedora kernel that could impact the message ?