Created attachment 300308 [details]
While testing the latest upstream kernel (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/), we noticed that, starting with the merge commit (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0a231f01e5b25bacd23e6edc7c979a18a517b2b),
hotplug and hot-unplug of NVMe drives stopped working.
Rescanning the PCI bus does not help:
echo "1" > /sys/bus/pci/rescan
The issue does not reproduce on a kernel built from an earlier commit (88db8458086b1dcf20b56682504bdb34d2bca0e2).
During hot-remove, the device does not disappear. However, when we try to do I/O on the disk, an I/O error occurs and the device then disappears.
Before the I/O, no messages about the disk appeared in dmesg; the entries below appeared only after the I/O:
[ 177.943703] nvme nvme5: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
[ 177.971661] nvme 10000:0b:00.0: can't change power state from D3cold to D0 (config space inaccessible)
[ 177.981121] pcieport 10000:00:02.0: can't derive routing for PCI INT A
[ 177.987749] nvme 10000:0b:00.0: PCI INT A: no GSI
[ 177.992633] nvme nvme5: Removing after probe failure status: -19
[ 178.004633] nvme5n1: detected capacity change from 83984375 to 0
[ 178.004677] I/O error, dev nvme5n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
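For triage, the first log line above can be checked mechanically. The all-ones values (CSTS=0xffffffff, PCI_STATUS=0xffff) are what a PCI read returns when the device no longer responds. A minimal sketch, assuming the message format shown above; the helper name `device_looks_gone` is hypothetical and not part of any kernel tooling:

```python
import re

# Matches the register dump in the "controller is down" message shown above.
_NVME_DOWN = re.compile(r"CSTS=(0x[0-9a-fA-F]+), PCI_STATUS=(0x[0-9a-fA-F]+)")


def device_looks_gone(line: str) -> bool:
    """Return True if a dmesg line reports all-ones CSTS and PCI_STATUS."""
    match = _NVME_DOWN.search(line)
    if match is None:
        return False
    csts = int(match.group(1), 16)
    pci_status = int(match.group(2), 16)
    # CSTS is a 32-bit register, PCI_STATUS a 16-bit one.
    return csts == 0xFFFFFFFF and pci_status == 0xFFFF
```

This only recognizes the symptom after the fact; it does not explain why the removal event was never delivered.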
OS: RHEL 8.4 GA
Platform: Intel Purley
The logs were collected on an older upstream kernel, but the issue also occurs on the newest upstream kernel (dd81e1c7d5fb126e5fbc5c9e334d7b3ec29a16a0).
Created attachment 300338 [details]
dmesg when hotplug does not work; booted with pci=nommconf, dyndbg added.
Created attachment 300339 [details]
dmesg when hotplug works; booted without pci=nommconf, dyndbg added.
Summary from the email thread at https://lore.kernel.org/r/20220124214635.GA1553164@bhelgaas:
The issue only occurs when booting with "pci=nommconf" and is related to 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features").
Prior to 04b12ef163d1, when booting with "pci=nommconf", pciehp hotplug did not work for devices in general because Linux requires support for extended config before it requests control of PCIe features [2, 3, 4]. However, hotplug *did* work for devices below a VMD because VMD acts like a host bridge and inherited the default "native_pcie_hotplug" setting, which is "OS owns pciehp".
After 04b12ef163d1, VMD inherits the real ACPI host bridge "native_pcie_hotplug" setting, which is "platform owns pciehp" since Linux didn't ask for control of it. Therefore, pciehp hotplug doesn't work for *any* devices.
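The behaviour before and after 04b12ef163d1 described above can be condensed into a small truth table. The following is a toy Python model of that summary, not kernel code. The function name and parameters are hypothetical, and it assumes the platform grants pciehp control whenever Linux requests it:

```python
def pciehp_works(nommconf: bool, below_vmd: bool, has_04b12ef163d1: bool) -> bool:
    """Toy model: does pciehp hotplug work for a device, per the summary above?"""
    # Linux only requests control of PCIe features when extended config
    # is available, which "pci=nommconf" disables.
    os_owns_host_bridge_hotplug = not nommconf

    if not below_vmd:
        # Ordinary devices follow the ACPI host bridge setting.
        return os_owns_host_bridge_hotplug

    if has_04b12ef163d1:
        # VMD now inherits the real ACPI host bridge setting.
        return os_owns_host_bridge_hotplug

    # Before the commit, VMD kept the default "OS owns pciehp".
    return True
```

The only combination whose result changes with the commit is `nommconf=True, below_vmd=True`, which matches the reported regression.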
We can avoid the issue by omitting the "pci=nommconf" parameter. In that case, Linux *will* request control of pciehp, and hotplug will work for all devices, including those below a VMD.
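To confirm whether a given boot is affected, the kernel command line can be checked for the option. A minimal sketch, assuming the command line is available as a string (on Linux it can be read from /proc/cmdline); the helper name `has_pci_option` is hypothetical:

```python
def has_pci_option(cmdline: str, option: str = "nommconf") -> bool:
    """Return True if any pci= parameter on the command line lists `option`."""
    for token in cmdline.split():
        if token.startswith("pci="):
            # pci= takes a comma-separated list, e.g. "pci=nommconf,noaer".
            if option in token[len("pci="):].split(","):
                return True
    return False


# Example use on a running system:
#   with open("/proc/cmdline") as f:
#       affected = has_pci_option(f.read())
```

If this returns True on a kernel containing 04b12ef163d1, removing the parameter and rebooting is the workaround described above.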