Bug 215143
Summary: | PCI: pciehp: clear cmd_busy bit when Command Completed in polling mode | ||
---|---|---|---|
Product: | Drivers | Reporter: | Liguang (zhangliguang) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | NEW --- | ||
Severity: | normal | CC: | lukas |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.10.23 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | hotplug timeout log |
#lspci -tv -+-[0000:fc]-+-00.0 Device 1ded:8001 | \-0a.0 Device 1ded:8003 +-[0000:b0]---00.0-[b1-b2]--+-00.0 Mellanox Technologies MT27800 Family [ConnectX-5] | \-00.1 Mellanox Technologies MT27800 Family [ConnectX-5] +-[0000:98]-+-00.0-[99]-- | +-01.0-[9a]-- | +-02.0-[9b]-- | \-03.0-[9c]-- +-[0000:80]---00.0-[81]-- +-[0000:7c]-+-00.0 Device 1ded:8001 | \-0a.0 Device 1ded:8003 +-[0000:30]-+-00.0-[31]-- | +-01.0-[32]-- | +-02.0-[33]-- | \-03.0-[34]-- +-[0000:10]---00.0-[11]-- \-[0000:00]-+-00.0-[01-02]----00.0-[02]----00.0 ASPEED Technology, Inc. ASPEED Graphics Family +-01.0-[03]----00.0 Renesas Technology Corp. uPD720201 USB 3.0 Host Controller +-02.0-[04]----00.0 Marvell Technology Group Ltd. 88SE9235 PCIe 2.0 x2 4-port SATA 6 Gb/s Controller \-03.0-[05]-- [ 36.384101] pcieport 0000:00:03.0: Adding to iommu group 4 [ 36.392032] pcieport 0000:00:03.0: PME: Signaling with IRQ 87 [ 36.399782] pcieport 0000:00:03.0: pciehp: Slot Capabilities : 0x0022005a [ 36.409054] pcieport 0000:00:03.0: pciehp: Slot Status : 0x0000 [ 36.417986] pcieport 0000:00:03.0: pciehp: Slot Control : 0x01ca [ 36.761412] pcieport 0000:00:03.0: pciehp: Slot #4 AttnBtn- PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise- Interlock+ NoCompl- IbPresDis- LLActRep+ [ 36.779026] pcieport 0000:00:03.0: pciehp: pciehp_get_power_status: SLOTCTRL 88 value read 1ca [ 36.789833] pcieport 0000:00:03.0: pciehp: pciehp_check_link_active: lnk_status = 1011 [ 36.857413] pcieport 0000:00:03.0: pciehp: pcie_disable_notification: SLOTCTRL 88 write cmd 0 [ 38.745412] pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x01c0 (issued 1948 msec ago) [ 38.817411] pcieport 0000:00:03.0: pciehp: pciehp_power_off_slot: SLOTCTRL 88 write cmd 400 [ 40.705412] pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x05c0 (issued 1948 msec ago) [ 40.718993] pcieport 0000:00:03.0: pciehp: pcie_enable_notification: SLOTCTRL 88 write cmd 1038 [ 40.730110] pcieport 0000:00:03.0: pciehp: pciehp_check_link_active: lnk_status = 1011 The vendor ID 1ded (Alibaba) and the device IDs 8000 (Root Port), 8001 and 8003 (Co-processor) are currently not registered in the PCI IDs database. You may want to add them there: https://pci-ids.ucw.cz/ thanks. |
Created attachment 299725 [details] hotplug timeout log This patch fixes this problem that on driver probe from system startup, pciehp checks the Presence Detect State bit in the Slot Status register to bring up an occupied slot or bring down an unoccupied slot. If empty slot's power status is on, turn power off. The Hot-Plug interrupt isn't requested yet, so avoid triggering a notification by calling pcie_disable_notification(). Both the CCIE and HPIE bits are masked in pcie_disable_notification(), when we issue a hotplug command, pcie_wait_cmd() will polling the Command Completed bit instead of waiting for an interrupt. But cmd_busy bit was not cleared when Command Completed which results in timeouts like this in pciehp_power_off_slot() and pcie_init_notification(): pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x01c0 (issued 2264 msec ago) pcieport 0000:00:03.0: pciehp: Timeout on hotplug command 0x05c0 (issued 2288 msec ago) After some debug, the first timeout occurs: pciehp_power_off_slot pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_PWR_OFF, PCI_EXP_SLTCTL_PCC) pcie_do_write_cmd /* * Always wait for any previous command that might still be in progress */ pcie_wait_cmd(ctrl); // at the beginning if (!ctrl->cmd_busy) // cmd_busy was not zero return; Why cmd_busy was not flase, because in function pcie_disable_notification cmd_busy was not setting correctly: pcie_disable_notification // Both the CCIE and HPIE bits are masked pcie_write_cmd pcie_do_write_cmd ctrl->cmd_busy = 1 // cmd_busy was setting to 1 pcie_wait_cmd pcie_poll_cmd // pcie_wait_cmd() will polling the Command Completed bit instead of waiting for an interrupt After debug we found Command Completed can be signaled without cmd_busy was setting to 0. If we use interrupt mode pciehp_isr: pciehp_isr if (events & PCI_EXP_SLTSTA_CC) { ctrl->cmd_busy = 0; // cmd_busy was setting to zero, this was left in pcie_poll_cmd.