Bug 7917

Summary: "PCI: Failed to allocate mem resource" for PCI E-to-PCI Bridge
Product: Drivers
Reporter: richlv
Component: PCI
Assignee: Jesse Barnes (jbarnes)
Status: REJECTED INSUFFICIENT_DATA
Severity: normal
CC: akpm, alan, garyhade, greg, protasnb, stephan.klein
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.19.2
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: dmesg output
output of lspci -vvxxx
/proc/iomem
Output of dmesg after bootup.
Output of lspci on 2.6.24-1
dmesg output after bootup with the 2.6.24-16 ubuntu generic kernel

Description richlv 2007-02-01 02:14:51 UTC
Distribution: slackware 11.0
Hardware Environment: Supermicro motherboard X6DH8-G
Problem Description:

booting up the machine displays an error :
PCI: Failed to allocate mem resource #8:100000@dd200000 for 0000:01:00.0

this is followed by :

PCI: Bridge: 0000:01:00.0
  IO window: disabled.
  MEM window: disabled.
  PREFETCH window: df200000-df7fffff
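
For readers decoding the failure line quoted above: the part after `#` appears to follow the layout index:size@start, with size and start in hex. The short Python sketch below is an illustration only, not kernel code; the regular expression and the `decode_failure` helper are assumptions derived from the single message quoted in this report. It turns the message into the inclusive address range the bridge was asking for.

```python
import re

# Assumed layout, inferred from the message quoted in this report:
#   PCI: Failed to allocate <type> resource #<index>:<size>@<start> for <device>
# <size> and <start> are hex numbers without a 0x prefix.
PATTERN = re.compile(
    r"PCI: Failed to allocate (?P<kind>\S+) resource "
    r"#(?P<index>\d+):(?P<size>[0-9a-f]+)@(?P<start>[0-9a-f]+) "
    r"for (?P<device>\S+)"
)


def decode_failure(line):
    """Return (kind, index, start, end, device) for a failure line, or None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    size = int(m.group("size"), 16)
    start = int(m.group("start"), 16)
    end = start + size - 1  # inclusive end of the requested window
    return m.group("kind"), int(m.group("index")), start, end, m.group("device")


line = "PCI: Failed to allocate mem resource #8:100000@dd200000 for 0000:01:00.0"
kind, index, start, end, device = decode_failure(line)
print(f"{device}: {kind} resource #{index} wants {start:08x}-{end:08x}")
# prints "0000:01:00.0: mem resource #8 wants dd200000-dd2fffff"
```

In other words, the bridge asked for a 1 MiB memory window at dd200000-dd2fffff and did not get it, which fits the "MEM window: disabled." line that follows.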

lspci output for the device in question (i hope :) ) :

lspci -s 01:00.0 -v
01:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=01, secondary=02, subordinate=02, sec-latency=48
        Prefetchable memory behind bridge: 00000000df200000-00000000df700000
        Capabilities: [44] Express PCI/PCI-X Bridge IRQ 0
        Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [6c] Power Management version 2
        Capabilities: [d8] PCI-X bridge device
Comment 1 Natalie Protasevich 2007-07-06 17:48:52 UTC
Rich,
Can you please provide a boot trace?
Does the system boot up? And have you tried latest kernel (2.6.22-rc7)?
Thanks.
Comment 2 richlv 2007-07-09 00:32:07 UTC
Created attachment 11976 [details]
dmesg output

i guess 'boot trace' is dmesg output right ? if so, one from 2.6.20.3 is attached (which is the latest running on the machine).

yes, machine boots up successfully.
Comment 3 richlv 2007-07-30 01:49:16 UTC
Created attachment 12198 [details]
output of lspci -vvxxx

attaching output of 'lspci -vvxxx' and contents of /proc/iomem, as requested by email.
Comment 4 richlv 2007-07-30 01:50:06 UTC
Created attachment 12199 [details]
/proc/iomem

forgot to add : both of these are when running 2.6.22.1
Comment 5 Linus Torvalds 2007-07-30 08:30:11 UTC
It _looks_ like the BIOS may have configured both devices

        00:1c.0 PCI bridge (bridging to bus #6)

and

        01:00.0 PCI bridge (bridging to bus #2)

to have

        Memory behind bridge: dd200000-dd2fffff

and the first bridge (00:1c) gets it, and then the second bridge (01:00)
quite reasonably gets a resource allocation error.
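
The conflict described above can be sketched in a few lines of Python. This is a deliberately simplified model, not how the kernel manages resources: the flat first-come-first-served list and the `claim_in_order` helper are assumptions made for illustration, while the bridge names and the window come from this comment.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


def claim_in_order(requests):
    """Grant each (name, start, end) request unless it overlaps an earlier grant.

    Simplified model: the real kernel keeps a hierarchical resource tree;
    this flat list only illustrates why the second claim fails.
    """
    granted, failed = [], []
    for name, start, end in requests:
        if any(overlaps((start, end), (s, e)) for _, s, e in granted):
            failed.append(name)
        else:
            granted.append((name, start, end))
    return [name for name, _, _ in granted], failed


# Both bridges were configured with the same memory window.
requests = [
    ("00:1c.0", 0xDD200000, 0xDD2FFFFF),
    ("01:00.0", 0xDD200000, 0xDD2FFFFF),
]
granted, failed = claim_in_order(requests)
print("granted:", granted)  # prints "granted: ['00:1c.0']"
print("failed:", failed)    # prints "failed: ['01:00.0']"
```

Whichever bridge is processed first keeps the window, so the error lands on 01:00.0 only because of ordering.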

Now, the thing is, that the 00:1c device is *not* a bridge for the 01:xx
bus that the 01:00.0 bridge is on (it's a bridge to the 06:xx bus), so that
BIOS allocation really wasn't right, afaik. The 01:00.0 PCI bridge is
actually behind the 00:02.0 PCI bridge.

That's quite confusing, and it appears to be made much worse by the fact
that those 00:xx bridges are probably transparent PCI bridges (which is
quite normal for Intel core bridges) even though they say "normal decode".

So I think the BIOS made a mess of it, and Linux complains a bit, but
everything is likely to work. It _is_ confusing, but I don't think we can
fix it up any better without re-programming all the bridges (which is
actually pretty hard, and likely to fail more often than fix anything up,
since PCI bridges almost always have hidden regions that they bridge etc).

IOW, the only thing I can think of is to remove the warning message, but
at the same time, that warning message actually can be very useful for the
cases where the end result really _is_ so messed up that something breaks.

Rich - it looks like everything actually works well, no? All the devices
do actually get resource allocations, both the MegaRAID controllers behind
that confusing bus #2, _and_ the e1000 ethernet behind bus #6.

So I would suggest we close this as "apparently confused BIOS, but Linux
works".
Comment 6 richlv 2007-07-30 08:40:58 UTC
yes, the system appears to be working just fine (though we haven't tried populating all slots or other things).

would it make sense to report this to the bios vendor ?
Comment 7 Andrew Morton 2007-08-02 15:40:09 UTC
Greg, could you please consider removing that message, or rephrasing it
in some manner so that it is less alarming?
Comment 8 Greg Kroah-Hartman 2007-10-03 16:03:29 UTC
The message is now gone in 2.6.23.  Closing this bug.
Comment 9 Stephan Klein 2007-12-13 23:19:13 UTC
I don't want to break your fun, but the messages
are still (again?) visible in 2.6.24.
This was observed with the development version of Ubuntu (Hardy) using package version 2.6.24-1-generic (on i386).
See the corresponding launchpad entry (https://bugs.launchpad.net/ubuntu/+source/linux-meta/+bug/159241) and the output of lspci and dmesg right after bootup (that I will attach shortly).

Unfortunately I can't reopen this bug, maybe someone else can.
Comment 10 Stephan Klein 2007-12-13 23:19:55 UTC
Created attachment 14020 [details]
Output of dmesg after bootup.
Comment 11 Stephan Klein 2007-12-13 23:20:31 UTC
Created attachment 14021 [details]
Output of lspci on 2.6.24-1
Comment 12 Natalie Protasevich 2008-02-14 21:20:43 UTC
Greg, the message is still there in arch/x86/pci/acpi.c.
Since there is a theoretical chance that a resource conflict might have real consequences (which has never happened in my experience so far), how about demoting this printk to debug level?
Comment 13 Stephan Klein 2008-02-15 00:11:06 UTC
Downstream lowered the printk to warning.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/159241
Comment 15 Jesse Barnes 2008-05-01 17:42:12 UTC
Thanks for the pointer, Stephan.  I'll see if we can't do something smarter here, otherwise we'll clean up the warnings.
Comment 16 Stephan Klein 2008-05-02 04:20:55 UTC
You're welcome. Cleaning up the warnings would be the quick fix for this and prevent a lot of people from complaining. But it won't repair the problem itself. The thing is that this allocation thing takes up roughly 10 seconds before the booting process can continue.
Just look at this (http://img126.imageshack.us/img126/7234/hardy200804011ek6.png) bootchart. There are about 10 seconds before busybox actually starts. This produces a very awkward silence (if the warnings are "hidden").
Comment 17 Jesse Barnes 2008-05-02 14:03:42 UTC
Are you sure it's the resource allocation taking so long?  Can you try booting with 'initcall_debug'?
Comment 18 Stephan Klein 2008-05-02 14:40:34 UTC
Created attachment 16010 [details]
dmesg output after bootup with the 2.6.24-16 ubuntu generic kernel

I've created a log right after bootup with initcall_debug set, as you requested. Please let me know if you need anything else.
Comment 19 Jesse Barnes 2008-05-02 14:48:02 UTC
Wow, yeah look at that, pci_init is taking >7s...  that's bad.

[   35.169739] Calling initcall 0xc021ef50: pci_init+0x0/0x30()
[   43.162816] 0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
[   43.162952] Boot video device is 0000:01:00.0
[   43.162968] PCI: Firmware left 0000:08:08.0 e100 interrupts enabled, disabling
[   43.162981] initcall 0xc021ef50: pci_init+0x0/0x30() returned 0.
[   43.162989] initcall 0xc021ef50 ran for 7627 msecs: pci_init+0x0/0x30()
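
A log like this can be scanned for slow initcalls mechanically. The Python sketch below is one way to do it, assuming the "ran for N msecs" line format shown in the excerpt above; other kernel versions may print the timing differently, and the `slow_initcalls` helper and its threshold are hypothetical names chosen for this example.

```python
import re

# Matches the "ran for N msecs" lines in the format quoted above.
# Assumption: this format is specific to the 2.6.24-era log in this report.
RAN_FOR = re.compile(
    r"initcall \S+ ran for (?P<msecs>\d+) msecs: (?P<func>[^+\s]+)"
)


def slow_initcalls(log_text, threshold_msecs=1000):
    """Return (function, msecs) pairs at or above the threshold, slowest first."""
    results = []
    for line in log_text.splitlines():
        m = RAN_FOR.search(line)
        if m and int(m.group("msecs")) >= threshold_msecs:
            results.append((m.group("func"), int(m.group("msecs"))))
    return sorted(results, key=lambda item: item[1], reverse=True)


sample = """\
[   35.169739] Calling initcall 0xc021ef50: pci_init+0x0/0x30()
[   43.162981] initcall 0xc021ef50: pci_init+0x0/0x30() returned 0.
[   43.162989] initcall 0xc021ef50 ran for 7627 msecs: pci_init+0x0/0x30()
"""
for func, msecs in slow_initcalls(sample):
    print(f"{func}: {msecs} ms")  # prints "pci_init: 7627 ms"
```

To use it on a full boot log, pass the saved dmesg text in place of `sample`.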
Comment 20 Henry Pfeil 2008-06-28 07:13:09 UTC
I noticed that when I boot grub into Fedora Core 8 with kernel-2.6.25.6-27.fc8, I don't get that PCI allocation message. However, when I boot into my custom kernel on the same machine (tried 2.6.25.6 and 2.6.25.8), I do get the allocation failure message. IMHO, that suggests something in the config, not the hardware or the kernel code. Now all I have to do is diff the config files to find the config line that generates the PCI mem allocation failure. Fedora seems to use a config with all of the modules enabled, so the search will take a while. At least I've narrowed it down to a compile option.
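
Comparing two kernel configs option by option is usually easier than a plain line diff, because ordering and comments differ between builds. The Python sketch below shows one way to do that. The `parse_config` and `config_diff` helpers are hypothetical, and the sample option values are made up for illustration; they are not taken from either kernel discussed here and are not a guess at the culprit.

```python
import re

SET_LINE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_LINE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name to value; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = SET_LINE.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = UNSET_LINE.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def config_diff(a_text, b_text):
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    Assumption: an option missing from one file is treated as 'n'.
    """
    a, b = parse_config(a_text), parse_config(b_text)
    diff = {}
    for name in sorted(set(a) | set(b)):
        va, vb = a.get(name, "n"), b.get(name, "n")
        if va != vb:
            diff[name] = (va, vb)
    return diff


# Made-up fragments standing in for the distribution and custom configs.
distro = "CONFIG_PCI=y\nCONFIG_PCI_MSI=y\nCONFIG_HOTPLUG_PCI=m\n"
custom = "CONFIG_PCI=y\n# CONFIG_PCI_MSI is not set\nCONFIG_HOTPLUG_PCI=y\n"
for name, (va, vb) in config_diff(distro, custom).items():
    print(f"{name}: {va} -> {vb}")
```

Filtering the result to names containing PCI or ACPI would be a reasonable first cut, though distribution patches can also change behaviour without any config difference.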
Comment 21 richlv 2008-07-03 02:31:15 UTC
2.6.25.10 on the original system for this report still shows the message.
henry, aren't there also some patches to that fedora kernel that could impact the message ?