pcie_capability_read_dword(dev, PCI_EXP_SLTCAP, ...) can read a non-zero
value even for bridges that do not have a slot. This can make us falsely
believe the bridge supports hotplug and needs resources assigned to it.

On the system below, it reads the value 0x00040060, which indicates the
bridge is Hot-Plug Capable (0x40), supports Hot-Plug Surprise (0x20), and
has No Command Completed Support (0x40000). But the Slot Implemented bit
in the PCI Express Capabilities register is not set, so we should ignore
the Slot Capabilities register altogether.

On this BL460c, this error leads us to set dev->is_hotplug_bridge when we
shouldn't. That in turn causes us to attempt to allocate resources for
the bridge, even though there's no slot below it.

  DMI: HP ProLiant BL460c Gen8, BIOS I31 09/08/2013

  00:1c.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Bus: primary=00, secondary=08, subordinate=08, sec-latency=0
    Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
      PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [40] Express (v2) Root Port (Slot-), MSI 00

  # lspci -s00:1c.0 -xxx
  00:1c.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express Root Port 1 (rev b5)
  00: 86 80 10 1d 47 01 10 00 b5 00 04 06 10 00 81 00
  10: 00 00 00 00 00 00 00 00 00 08 08 00 10 10 00 00
  20: 20 f0 30 f0 41 f0 51 f0 00 00 00 00 00 00 00 00
  30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 03 00
  40: 10 80 42 00 00 80 00 00 06 00 10 00 42 4c 11 01
  50: 10 00 01 18 60 00 04 00 00 00 40 00 06 00 00 00
  60: 00 00 00 00 16 00 00 00 00 00 00 00 00 00 00 00
  70: 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
  80: 05 90 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  90: 0d a0 00 00 3c 10 a9 18 00 00 00 00 00 00 00 00
  a0: 01 00 02 c8 00 00 00 00 00 00 00 00 00 00 00 00
  b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  d0: 00 00 00 01 02 0b 00 00 00 80 11 01 00 00 00 00
  e0: 00 3f 00 00 00 00 00 00 01 00 00 00 00 00 00 00
  f0: 00 00 00 00 00 00 00 00 87 0f 07 08 00 00 00 00
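The register values quoted in the description can be cross-checked against the config-space dump. The following is a minimal user-space sketch, not kernel code: it copies rows 0x40 and 0x50 of the dump (the PCIe capability starts at offset 0x40) and decodes the two registers by hand. The helper names le16() and le32() and the pcie_cap array are made up for illustration; the offsets and bit masks are assumed to match include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rows 0x40 and 0x50 of the lspci -xxx dump: the PCIe capability
 * (capability ID 0x10) begins at config-space offset 0x40. */
static const uint8_t pcie_cap[32] = {
	0x10, 0x80, 0x42, 0x00, 0x00, 0x80, 0x00, 0x00,
	0x06, 0x00, 0x10, 0x00, 0x42, 0x4c, 0x11, 0x01,
	0x10, 0x00, 0x01, 0x18, 0x60, 0x00, 0x04, 0x00,
	0x00, 0x00, 0x40, 0x00, 0x06, 0x00, 0x00, 0x00,
};

/* Offsets are relative to the start of the PCIe capability. */
#define PCI_EXP_FLAGS		0x02		/* PCI Express Capabilities */
#define PCI_EXP_FLAGS_SLOT	0x0100		/* Slot Implemented */
#define PCI_EXP_SLTCAP		0x14		/* Slot Capabilities */
#define PCI_EXP_SLTCAP_HPS	0x00000020	/* Hot-Plug Surprise */
#define PCI_EXP_SLTCAP_HPC	0x00000040	/* Hot-Plug Capable */
#define PCI_EXP_SLTCAP_NCCS	0x00040000	/* No Command Completed Support */

/* Hypothetical helper: read a little-endian 16-bit register. */
static uint16_t le16(const uint8_t *buf, size_t len, size_t off)
{
	assert(off + 2 <= len);
	return (uint16_t)(buf[off] | (buf[off + 1] << 8));
}

/* Hypothetical helper: read a little-endian 32-bit register. */
static uint32_t le32(const uint8_t *buf, size_t len, size_t off)
{
	assert(off + 4 <= len);
	return (uint32_t)buf[off] |
	       ((uint32_t)buf[off + 1] << 8) |
	       ((uint32_t)buf[off + 2] << 16) |
	       ((uint32_t)buf[off + 3] << 24);
}
```

Reading PCI_EXP_SLTCAP this way gives the three hotplug-related bits set, while PCI_EXP_FLAGS has PCI_EXP_FLAGS_SLOT clear, which is the mismatch described above.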
This was fixed in v3.12 by the following two changes:

  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c8b303d0206b28c4ff3aecada47108d1655ae00f
  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6d3a1741f1e648cfbd5a0cc94477a0d5004c6f5e

After these two changes, we read the Slot Capabilities register only if
the device is either a Switch Downstream Port or a Root Port and the Port
implements a slot. This corresponds to the language in the PCIe spec
r3.0, sec. 7.8.

Previously we relied on the spec statement that unimplemented registers
in v2 capabilities must be hardwired to zero. We could use that to argue
that the BL460c is out of spec, but it seems better to pay attention to
the Slot Implemented bit in the PCIe Capabilities Register and not even
bother reading the slot registers if there's no slot. lspci already
behaves this way.
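The gating rule described above can be sketched in user-space C as follows. This is an illustration under stated assumptions, not the code from the two referenced commits: struct fake_port, port_has_slot(), read_sltcap() and is_hotplug_bridge() are hypothetical names, and the register constants are assumed to match include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_EXP_FLAGS_TYPE	0x00f0	/* Device/Port type field */
#define PCI_EXP_FLAGS_SLOT	0x0100	/* Slot Implemented */
#define PCI_EXP_TYPE_ROOT_PORT	0x4
#define PCI_EXP_TYPE_DOWNSTREAM	0x6
#define PCI_EXP_SLTCAP_HPC	0x00000040	/* Hot-Plug Capable */

/* Hypothetical stand-in for the parts of a PCI device used here. */
struct fake_port {
	uint16_t pcie_flags;	/* PCI Express Capabilities register */
	uint32_t raw_sltcap;	/* what the hardware returns for Slot Capabilities */
};

/* True only for a Root Port or Switch Downstream Port that implements a slot. */
static bool port_has_slot(const struct fake_port *p)
{
	unsigned int type = (p->pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4;

	return (type == PCI_EXP_TYPE_ROOT_PORT ||
		type == PCI_EXP_TYPE_DOWNSTREAM) &&
	       (p->pcie_flags & PCI_EXP_FLAGS_SLOT) != 0;
}

/* Report Slot Capabilities as zero unless the port implements a slot. */
static void read_sltcap(const struct fake_port *p, uint32_t *val)
{
	assert(p != NULL && val != NULL);
	*val = port_has_slot(p) ? p->raw_sltcap : 0;
}

/* Hotplug decision based on the gated value rather than the raw register. */
static bool is_hotplug_bridge(const struct fake_port *p)
{
	uint32_t sltcap;

	read_sltcap(p, &sltcap);
	return (sltcap & PCI_EXP_SLTCAP_HPC) != 0;
}
```

With the BL460c values (pcie_flags 0x0042, raw_sltcap 0x00040060), is_hotplug_bridge() returns false because the Slot Implemented bit is clear, even though the raw register has Hot-Plug Capable set.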
This issue was reported and diagnosed by Myron Stowe <myron.stowe@redhat.com>.