Bug 71691
Summary: | pciehp surprise removal broken by INTx enable | ||
---|---|---|---|
Product: | Drivers | Reporter: | Bjorn Helgaas (bjorn) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | andreas.noever |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
URL: | http://lkml.kernel.org/r/CAMxnaaXd9-VbxSYa9q1YVT+fwfDFYKGG2Hq-1jkc-awLX-HwtQ@mail.gmail.com | ||
Kernel Version: | v3.14-rc4 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
dmesg
lscpi -vv lspci -vvv -xxxx -s 06:03 (after boot, cold-plugged adapter) lspci -vvv -xxxx -s 06:03 (unplugged adapter (first time, worked)) lspci -vvv -xxxx -s 06:03 (hotplugged adapter) lspci -vvv -xxxx -s 06:03 (unplugged adapter (second time, failed)) |
Description
Bjorn Helgaas
2014-03-07 16:39:32 UTC
Created attachment 128501 [details]
dmesg
Created attachment 128511 [details]
lscpi -vv
The hp bridge in question is the Thunderbolt downstream port (06:03). Note that the MacBook firmware only initializes cold-plugged thunderbolt devices. Scenario 3 requires my driver: https://lkml.org/lkml/2013/11/28/492 I observed the following: Scenario 1 (as expected): - boot with coldplugged Thunderbolt adapter - DisINTx is not set - load pcieport driver - DisINTx is set (by pci_enable_msi) - unplug adapter pciehp 0000:06:03.0:pcie24: Card not present on Slot(3-1) pciehp 0000:09:00.0:pcie24: unloading service driver pciehp pci_bus 0000:0a: busn_res: [bus 0a] is released pci_bus 0000:09: busn_res: [bus 09-0a] is released Scenario 2 (verify that DisINTx- breaks notifications): - boot with cold-plugged Thunderbolt adapter - DisINTx is not set - load pcieport driver - DisINTx is set (by pci_enable_msi) - clear DisINTx: setpci -s 06:03 COMMAND=0007 - unplug adapter - pciehp does nothing, lspci -vv now shows INTx+ (the bridge is asserting the intx interrupt) - set DisINTx again: setpci -s 06:03 COMMAND=0407 - pciehp finally reacts, as in Scenario 1 Scenario 3 (verify that 1f42db786b14 clears DisINTx): - boot with cold-plugged Thunderbolt adapter - unplug, pciehp removes the device - load thunderbolt driver, hotplug adapter - pciehp detects the adapter - do_pci_enable_device get called for 06:03, clears DisINTx - unplug - pciehp does nothing, lspci -vv now shows INTx+ (the bridge is asserting the intx interrupt) - set DisINTx again: setpci -s 06:03 COMMAND=0407 - pciehp finally reacts, as in Scenario 1 Without 1f42db786b14 Scenario 3 works as expected (the devices are removed when unplugged). It would be interesting to see whether that problem is specific to the thunderbolt controller or whether it also appears on other bridges. The PCI Express Base Spec v3 states in 6.7.3.4 Software Notification of Hot-Plug Events: If the Port is enabled for level-triggered interrupt signaling using the INTx messages, the virtual INTx wire must be asserted whenever and as long as the following conditions are satisfied: ... and If the Port is enabled for edge-triggered interrupt signaling using MSI or MSI-X, an interrupt message must be sent every time the logical AND of the following conditions transitions from FALSE to TRUE: ... Since we enable MSI on the bridge I would say that the device is at fault here. Still the msi code is very careful and always calls pci_intx_for_msi (which calls pci_intx, unless the device is blacklisted) whenever we enable/disable msi (see for example: http://lxr.free-electrons.com/source/drivers/pci/msi.c#L627 ). Since the original patch was addressing a problem where the BIOS set the flag I would suggest that we only clear it if it was indeed set by the BIOS and leave it alone otherwise. Created attachment 128521 [details]
lspci -vvv -xxxx -s 06:03 (after boot, cold-plugged adapter)
Created attachment 128531 [details]
lspci -vvv -xxxx -s 06:03 (unplugged adapter (first time, worked))
Created attachment 128541 [details]
lspci -vvv -xxxx -s 06:03 (hotplugged adapter)
Created attachment 128551 [details]
lspci -vvv -xxxx -s 06:03 (unplugged adapter (second time, failed))
This should be fixed by http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3cdeb713dc66057b50682048c151eae07b186c42 which appeared in v3.14-rc7. Andreas, let me know if that is not the case. |