Bug 65211 - Incorrect PCIe Slot Capabilities returned on HP BL460c
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL: http://lkml.kernel.org/r/CAL-B5D2WtAL...
Keywords:
Depends on:
Blocks:
 
Reported: 2013-11-19 20:47 UTC by Bjorn Helgaas
Modified: 2013-11-19 21:29 UTC

See Also:
Kernel Version: 3.7
Subsystem:
Regression: No
Bisected commit-id:



Description Bjorn Helgaas 2013-11-19 20:47:01 UTC
pcie_capability_read_dword(dev, PCI_EXP_SLTCAP, ...) can read a non-zero value even for bridges that do not have a slot.  This can lead us to believe, incorrectly, that the bridge supports hotplug and needs resources assigned to it.

On the system below, it reads the value 0x00040060, which indicates that the bridge is Hot-Plug Capable (0x40), supports Hot-Plug Surprise (0x20), and has No Command Completed Support (0x40000).  But the Slot Implemented bit (0x0100) in the PCI Express Capabilities register is not set, so we should ignore the Slot Capabilities register altogether.

On this BL460c, this error leads us to set dev->is_hotplug_bridge when we shouldn't.  That in turn causes us to attempt to allocate resources for the bridge, even though there's no slot below it.


  DMI: HP ProLiant BL460c Gen8, BIOS I31 09/08/2013

  00:1c.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
          Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
                   ParErr+ Stepping- SERR+ FastB2B- DisINTx-
          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
                  <TAbort- <MAbort- >SERR- <PERR- INTx-
          Latency: 0, Cache Line Size: 64 bytes
          Bus: primary=00, secondary=08, subordinate=08, sec-latency=0
          Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
                            <TAbort- <MAbort+ <SERR- <PERR-
          BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
                  PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
          Capabilities: [40] Express (v2) Root Port (Slot-), MSI 00

  # lspci -s00:1c.0 -xxx
      00:1c.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI
      Express Root Port 1 (rev b5)
      00: 86 80 10 1d 47 01 10 00 b5 00 04 06 10 00 81 00
      10: 00 00 00 00 00 00 00 00 00 08 08 00 10 10 00 00
      20: 20 f0 30 f0 41 f0 51 f0 00 00 00 00 00 00 00 00
      30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 03 00
      40: 10 80 42 00 00 80 00 00 06 00 10 00 42 4c 11 01
      50: 10 00 01 18 60 00 04 00 00 00 40 00 06 00 00 00
      60: 00 00 00 00 16 00 00 00 00 00 00 00 00 00 00 00
      70: 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
      80: 05 90 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      90: 0d a0 00 00 3c 10 a9 18 00 00 00 00 00 00 00 00
      a0: 01 00 02 c8 00 00 00 00 00 00 00 00 00 00 00 00
      b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      d0: 00 00 00 01 02 0b 00 00 00 80 11 01 00 00 00 00
      e0: 00 3f 00 00 00 00 00 00 01 00 00 00 00 00 00 00
      f0: 00 00 00 00 00 00 00 00 87 0f 07 08 00 00 00 00
Comment 1 Bjorn Helgaas 2013-11-19 20:56:54 UTC
This was fixed in v3.12 by the following two changes:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c8b303d0206b28c4ff3aecada47108d1655ae00f

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6d3a1741f1e648cfbd5a0cc94477a0d5004c6f5e

After these two changes, we read the Slot Capabilities register only if the device is either a Switch Downstream Port or a Root Port and the Port implements a slot.  This corresponds to the language in the PCIe spec r3.0, sec. 7.8.

Previously we relied on the spec statement that unimplemented registers in v2 capabilities must be hardwired to zero.  We could use that to argue that the BL460c is out of spec, but it seems better to pay attention to the Slot Implemented bit in the PCIe Capabilities Register and not even bother reading the slot registers if there's no slot.  lspci already behaves this way.
Comment 2 Bjorn Helgaas 2013-11-19 21:29:52 UTC
This issue was reported and diagnosed by Myron Stowe <myron.stowe@redhat.com>.
