Bug 192441

Summary: PCIEASPM breaks PCIe on Marvell Armada 385 machine
Product: Drivers Reporter: Uwe Kleine-König (uwe+kernel)
Component: PCIAssignee: drivers_pci (drivers_pci)
Status: NEW ---    
Severity: normal CC: bjorn, pali, uwe+kernel
Priority: P1    
Hardware: ARM   
OS: Linux   
Kernel Version: 4.7 and 4.9 Subsystem:
Regression: No Bisected commit-id:
Attachments: output of lspci -vv taken on (broken) 4.7.8
dmesg (4.9, broken)
dmesg (4.9, without PCIEASPM)
Patch adding output to mvebu_pcie_{wr,rd}_conf
dmesg (4.9, broken) with debug.patch
dmesg (4.9, without PCIEASPM) with debug patch
lspci -vv in good state (kernel cmdline += pcie_aspm=off)
dmesg (4.9, broken) with debug.patch
dmesg (4.9, without PCIEASPM) with debug patch
Patch adding output to mvebu_pcie_{wr,rd}_conf
device-not-found and instrumentation patches
dmesg (4.9, broken) with bjorn's debug.patch
dmesg (4.9, without PCIEASPM) with bjorn's debug.patch

Description Uwe Kleine-König 2017-01-12 13:09:11 UTC
Created attachment 251321 [details]
output of lspci -vv taken on (broken) 4.7.8

The Debian kernels have PCIEASPM enabled and when booting these (tested with kernels 4.7.8-1 and 4.9-1) the ath10k driver fails to probe:

# dmesg | grep ath10
[    8.352179] ath10k_pci 0000:02:00.0: Refused to change power state, currently in D3
[    8.391059] ath10k_pci 0000:02:00.0: failed to wake up device : -110
[    8.407552] ath10k_pci: probe of 0000:02:00.0 failed with error -110

It doesn't matter if PCIEASPM_DEFAULT is selected (as is in the Debian kernel) or PCIEASPM_PERFORMANCE. When disabling PCIEASPM the driver probes correctly (and also works).
Comment 1 Uwe Kleine-König 2017-01-12 13:16:28 UTC
Created attachment 251341 [details]
dmesg (4.9, broken)
Comment 2 Uwe Kleine-König 2017-01-12 13:18:30 UTC
lspci -vv on 4.9 (broken) doesn't differ from lspci -vv on 4.7 (broken)
Comment 3 Uwe Kleine-König 2017-01-13 07:43:19 UTC
Created attachment 251441 [details]
dmesg (4.9, without PCIEASPM)
Comment 4 Uwe Kleine-König 2017-01-13 20:28:58 UTC
Created attachment 251481 [details]
Patch adding output to mvebu_pcie_{wr,rd}_conf
Comment 5 Uwe Kleine-König 2017-01-13 20:29:31 UTC
Created attachment 251491 [details]
dmesg (4.9, broken) with debug.patch
Comment 6 Uwe Kleine-König 2017-01-13 20:29:54 UTC
Created attachment 251511 [details]
dmesg (4.9, without PCIEASPM) with debug patch
Comment 7 Uwe Kleine-König 2017-01-13 21:08:35 UTC
Created attachment 251521 [details]
lspci -vv in good state (kernel cmdline += pcie_aspm=off)
Comment 8 Uwe Kleine-König 2017-01-14 19:47:46 UTC
Created attachment 251571 [details]
dmesg (4.9, broken) with debug.patch
Comment 9 Uwe Kleine-König 2017-01-14 19:48:18 UTC
Created attachment 251581 [details]
dmesg (4.9, without PCIEASPM) with debug patch
Comment 10 Uwe Kleine-König 2017-01-14 19:49:08 UTC
Created attachment 251591 [details]
Patch adding output to mvebu_pcie_{wr,rd}_conf
Comment 11 Bjorn Helgaas 2017-01-14 21:33:53 UTC
Created attachment 251621 [details]
device-not-found and instrumentation patches

Hi Uwe, would you mind trying this patch?  It gets rid of the PCIBIOS_DEVICE_NOT_FOUND cases, which I think is a bug (or at least a difference from how other arches handle them) and I don't know how that will affect the higher-level code.  It also instruments the config accessors.  It should be a little less verbose and also catch a case or two we missed before (the root bridge reads).

BTW, have you tried any devices other than the Atheros that support ASPM?  I'm just hoping to verify that ASPM on the Marvell end is known to work.
Comment 12 Uwe Kleine-König 2017-01-15 09:54:28 UTC
Created attachment 251691 [details]
dmesg (4.9, broken) with bjorn's debug.patch

The patch didn't compile, so I applied this on top:

diff --git a/drivers/pci/host/pci-mvebu.c b/drivers/pci/host/pci-mvebu.c
index 91c7dc2ba710..623782cd9f8a 100644
--- a/drivers/pci/host/pci-mvebu.c
+++ b/drivers/pci/host/pci-mvebu.c
@@ -621,8 +621,7 @@ static int mvebu_sw_pci_bridge_read(struct mvebu_pcie_port *port,
                 * normally.
                 */
                *value = 0;
-               printk("pci 0000:%02x:%02x.%d: rd where=%#05x size=%d val=%#x (sw reserved)\n",
-                      bus->number, PCI_SLOT(devfn), PCI_FUNC(devfn),
+               printk("pci 0000:??:??.?: rd where=%#05x size=%d val=%#x (sw reserved)\n",
                       where, size, *value);
                return PCIBIOS_SUCCESSFUL;
        }
@@ -632,8 +631,7 @@ static int mvebu_sw_pci_bridge_read(struct mvebu_pcie_port *port,
        else if (size == 1)
                *value = (*value >> (8 * (where & 3))) & 0xff;
 
-       printk("pci 0000:%02x:%02x.%d: rd where=%#05x size=%d val=%#x (sw)\n",
-              bus->number, PCI_SLOT(devfn), PCI_FUNC(devfn),
+       printk("pci 0000:??:??.?: rd where=%#05x size=%d val=%#x (sw)\n",
               where, size, *value);
        return PCIBIOS_SUCCESSFUL;
 }
@@ -655,9 +653,8 @@ static int mvebu_sw_pci_bridge_write(struct mvebu_pcie_port *port,
        else
                return PCIBIOS_BAD_REGISTER_NUMBER;
 
-       printk("pci 0000:%02x:%02x.%d: wr where=%#05x size=%d val=%#x (sw)\n",
-              bus->number, PCI_SLOT(devfn), PCI_FUNC(devfn),
-              where, size, val);
+       printk("pci 0000:??:??.?: wr where=%#05x size=%d val=%#x (sw)\n",
+              where, size, value);
 
        err = mvebu_sw_pci_bridge_read(port, where & ~3, 4, &reg);
        if (err)

This obviously destroys some information, but I don't know where to get it from.
Comment 13 Uwe Kleine-König 2017-01-15 09:55:19 UTC
Created attachment 251701 [details]
dmesg (4.9, without PCIEASPM) with bjorn's debug.patch

counterpart to dmesg-bad-bjorn-debug with PCIEASPM disabled
Comment 14 Pali Rohár 2021-04-17 18:23:32 UTC
I bet that this is same issue as in bug #209833.

Uwe, could you try following patch?
https://lore.kernel.org/linux-pci/20210326124326.21163-1-pali@kernel.org/T/#u