Created attachment 142301 [details] dmesg log from a RHEL based system (issue verified to occur with upstream kernel also) Seeing the following 'pciehp' messages during boot: ... pciehp 0000:82:04.0:pcie24: Device 0000:83:00.0 already exists at 0000:83:00, cannot hot-add pciehp 0000:82:04.0:pcie24: Cannot add device at 0000:83:00 ... These are occurring during normal booting, no forced hot-plug activity is in play. The corresponding thread on the linux-pci list is https://lkml.org/lkml/2014/6/17/531
From the 'dmesg' log we can isolate the hierarchy involved ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-fe]) acpi PNP0A08:01: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI] acpi PNP0A08:01: PCIe AER handled by firmware acpi PNP0A08:01: _OSC: OS now controls [PCIeHotplug PME PCIeCapability] PCI host bridge to bus 0000:80 pci_bus 0000:80: root bus resource [bus 80-fe] pci_bus 0000:80: root bus resource [io 0x8000-0xffff] pci_bus 0000:80: root bus resource [mem 0xc8000000-0xfbffbfff] pci_bus 0000:80: root bus resource [mem 0x34000000000-0x37fffffffff] pci 0000:80:03.0: PCI bridge to [bus 81-86] pci 0000:80:03.0: bridge window [mem 0xc8000000-0xc87fffff] pci 0000:81:00.0: PCI bridge to [bus 82-86] pci 0000:81:00.0: bridge window [mem 0xc8000000-0xc86fffff] pci 0000:82:04.0: [10b5:8734] type 01 class 0x060400 pci 0000:82:04.0: PCI bridge to [bus 83] pci 0000:82:04.0: bridge window [mem 0xc8600000-0xc86fffff] pciehp 0000:82:04.0:pcie24: Slot #4 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl- LLActRep+ pci 0000:83:00.0: [144d:a820] type 00 class 0x010802 pci 0000:83:00.0: reg 0x10: [mem 0xc8600000-0xc8603fff 64bit] The device is a Samsung Electronics Co Ltd z820 NVMe SSD Controller (171X) residing under a PLX Technology, Inc PEX 8734 32-line, 8-Port PCIe Gen 3 Switch's downstream port. The messages occur during boot and come from 'board_added' so the logical path to follow is pciehp_probe pciehp_enable_slot board_added We get into 'pciehp_enable_slot' via a work queue based "ENABLE_REQ" event. pciehp_ctrl.c::'interrupt_event_handler' calls handlers for various events: INT_BUTTON_PRESS handle_button_press_event() INT_POWER_FAULT pciehp_set_attention_status(), pciehp_green_led_off() INT_PRESENCE_{ON,OFF} handle_surprise_event() INT_LINK_{UP,DOWN} handle_link_event() Looking at the slot capabilities of the switch's downstream port leading to the SSD device we see pciehp 0000:82:04.0:pcie24: Slot #4 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl- LLActRep+ So the switch's downstream port does not support any "slot control" except LLActRep (Data Link Layer Link Active Reporting Capable) [7.8.10 Slot Control Register (offset 18h)]. Putting everything together we have pciehp_probe // Entry point for setting up PCIe Hot-Plug pcie_init if (NO_CMD_CMPL(ctrl) || !(POWER_CTRL(ctrl) | ATTN_LED(ctrl) | PWR_LED(ctrl) | EMI(ctrl))) ctrl->no_cmd_complete = 1; // Intel switch HACK /* Check if Data Link Layer Link Active Reporting is implemented */ pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, &link_cap); if (link_cap & PCI_EXP_LNKCAP_DLLLARC) { ctrl_dbg(ctrl, "Link Active Reporting supported\n"); ctrl->link_active_reporting = 1; } /* Clear all remaining event bits in Slot Status register */ pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD | PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_CC); Note: The "Data Link Layer State Changed" bit is *not* cleared! [7.8.11 Slot Status Register (Offset 1Ah)] init_slot pcie_init_notification // Put event handlers in place pciehp_request_irq request_irq(..., pcie_isr, ... // PCIe hot-plug ISR: 'pcie_isr' pcie_enable_notification // Enable events pcie_write_cmd(PCI_EXP_SLTCTL_DLLSCE | PCI_EXP_SLTCTL_PDCE | PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE) We likely get a LLActRep interrupt at this point which puts us into 'pcie_isr': pcie_isr wake_up(&ctrl->queue) // ?? Expect occurs ... pciehp_handle_linkstate_change pciehp_check_link_active queue_interrupt_event(p_slot, INT_LINK_UP) interrupt_event_handler // INT_LINK_UP handle_link_event(p_slot, INT_LINK_UP) info->req = event == INT_LINK_UP ? ENABLE_REQ : DISABLE_REQ; INIT_WORK(&info->work, pciehp_power_thread); case ENABLE_REQ: pciehp_enable_slot(p_slot); So we likely are getting a LLActRep. I believe this is occurring as a result of commit e48f1b67f668 *and* the BIOS not clearing DLLSC before handing off to the OS. Adding PCI_EXP_SLTSTA_DLLSC to the 'pcie_capability_write_word' of 'pcie_init' to clear out any outstanding event bits in the slot status register fixes the issue indicating that the above analysis is correct. I do not know if BIOS should, or should not, be clearing DLLSC before handoff to the OS. Seems as if it should but perhaps there is a reason for not doing such I'm unaware of. In either case, adding DLLSC to the cleared out events during initialization solves this issue.
Created attachment 142501 [details] dmesg log from a RHEL based system (issue verified to occur with upstream kernel also)
Fixed by 0d25d35c987d ("PCI: pciehp: Clear Data Link Layer State Changed during init") which appeared in v3.17-rc1.