Created attachment 287907 [details] dmesg with s2idle
Created attachment 287909 [details] lspci -vvnn
Created attachment 287911 [details] lspci -t
Created attachment 287913 [details] proposed patch to use current link speed
Created attachment 287915 [details] dmesg with the patch
Can you please add some description of the issue and the steps to reproduce? Also please enable CONFIG_PCI_DEBUG=y in your .config, reproduce, and attach dmesg so we can hopefully see the timings logged and maybe figure out where the time is spent.
Ok, I forgot to copy the description here, sorry. The issue is that it takes a long time (> 1 s) to runtime resume or system resume Thunderbolt bridges, since msleep(1100) is called in pci_bridge_wait_for_secondary_bus() -> pcie_wait_for_link_delay().
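For readers not familiar with that path, here is a rough hand-written sketch of the behaviour described above (simplified, with a hypothetical helper name, not copied from the tree): when the port has no Data Link Layer Link Active reporting there is nothing to poll, so the code sleeps for the worst case.

#include <linux/delay.h>
#include <linux/pci.h>

/* Hypothetical helper name; simplified sketch of the behaviour above. */
static bool tbt_bridge_wait_for_link(struct pci_dev *pdev)
{
	/*
	 * No DLL Link Active reporting: nothing to poll, so assume the
	 * worst case of 1000 ms link training plus 100 ms before the
	 * first config access -- hence the > 1 s resume latency.
	 */
	if (!pdev->link_active_reporting) {
		msleep(1100);
		return true;
	}

	/* Otherwise PCI_EXP_LNKSTA_DLLLA is polled until the link is up. */
	return true;
}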
Created attachment 287917 [details] dmesg with debug message

Runtime resume via lspci:
[   49.291962] pcieport 0000:04:00.0: !pdev->link_active_reporting, msleep(1100)
[   49.292495] pcieport 0000:04:02.0: !pdev->link_active_reporting, msleep(1100)

System resume:
[  122.407100] pcieport 0000:04:00.0: !pdev->link_active_reporting, msleep(1100)
[  122.407102] pcieport 0000:04:02.0: !pdev->link_active_reporting, msleep(1100)
Created attachment 287919 [details] another lspci -vvv

This was reproduced on another CML-H + Titan Ridge system, because the original system is unavailable now.
If we use the current link speed to determine the speed, i.e.:

	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linksta);
	speed = pcie_link_speed[linksta & PCI_EXP_LNKSTA_CLS];

we can see that the speed is less than PCIE_SPEED_5_0GT, so we don't need to wait that long.
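Just to make the idea concrete, a minimal sketch of where such a check could live (the helper name is mine, purely illustrative):

#include <linux/pci.h>

/* pcie_link_speed[] is declared in drivers/pci/pci.h */

/* Hypothetical helper illustrating the proposal above. */
static bool link_is_slow(struct pci_dev *dev)
{
	u16 linksta;
	enum pci_bus_speed speed;

	pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linksta);
	speed = pcie_link_speed[linksta & PCI_EXP_LNKSTA_CLS];

	/* <= 5 GT/s: the long 1100 ms wait should not be needed. */
	return speed <= PCIE_SPEED_5_0GT;
}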
Hmm, if we look at the first Titan Ridge downstream port here:

04:00.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] (rev 06) (prog-if 00 [Normal decode])
...
	LnkCap:	Port #0, Speed 2.5GT/s, Width x4, ASPM L1, Exit Latency L1 <1us

dmesg says:

[   49.291959] pcieport 0000:04:00.0: waiting 100 ms for downstream link, after activation
[   49.291962] pcieport 0000:04:00.0: !pdev->link_active_reporting, msleep(1100)

How can it go to the path where it calls pcie_wait_for_link_delay() when the cap says it is <= 5 GT/s? I would expect it to say only:

[   49.291959] pcieport 0000:04:00.0: waiting 100 ms for downstream link

I'm surprised because I specifically tested this on CFL-H, which is similar to CML-H with Titan Ridge, and I did not see this happening.
This is because pciutils only uses PCI_EXP_LNKCAP, but pcie_get_speed_cap() prioritizes PCI_EXP_LNKCAP2 over PCI_EXP_LNKCAP. The PCI_EXP_LNKCAP2 read reports 8 GT/s in this case. We also need to update pciutils to use PCI_EXP_LNKCAP2.
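For reference, the priority roughly looks like this (a condensed sketch of the pcie_get_speed_cap() logic, omitting the 16/32 GT/s cases; not the verbatim function):

#include <linux/pci.h>

/* Condensed sketch of the LNKCAP2-first logic in pcie_get_speed_cap(). */
static enum pci_bus_speed speed_cap_sketch(struct pci_dev *dev)
{
	u32 lnkcap, lnkcap2;

	/* PCIe r3.0+ devices report supported speeds in Link Capabilities 2. */
	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);
	if (lnkcap2) {
		if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
			return PCIE_SPEED_8_0GT;
		if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
			return PCIE_SPEED_5_0GT;
		if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
			return PCIE_SPEED_2_5GT;
	}

	/* Fall back to the older Link Capabilities field (what lspci shows). */
	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
	if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB)
		return PCIE_SPEED_5_0GT;

	return PCIE_SPEED_2_5GT;
}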
Filed a request against pciutils: https://github.com/pciutils/pciutils/issues/38
Can you dump the raw PCI_EXP_LNKCAP2 of these ports? It is weird that it claims 8 GT/s because then it must implement active link reporting according to spec. We cannot use the current link speed here because the link is not yet trained. One option is to use target link speed instead (that should reflect the LNKCAP max supported).
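Something along these lines, i.e. reading the Target Link Speed field from Link Control 2 (a sketch only; the helper name is made up):

#include <linux/pci.h>

/* Sketch: check the Target Link Speed in Link Control 2. */
static bool target_speed_is_5gt_or_less(struct pci_dev *dev)
{
	u16 lnkctl2;

	pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);

	/* Target Link Speed uses the same encoding as Link Status CLS. */
	return (lnkctl2 & PCI_EXP_LNKCTL2_TLS) <= PCI_EXP_LNKCTL2_TLS_5_0GT;
}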
Created attachment 287947 [details] dmesg with speed cap dumped

The diff:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d828ca835a98..c7a795569f6d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4619,6 +4619,7 @@ static bool pcie_wait_for_link_delay(struct pci_dev *pdev, bool active,
 	 * case, we wait for 1000 + 100 ms.
 	 */
 	if (!pdev->link_active_reporting) {
+		pci_info(pdev, "!pdev->link_active_reporting, msleep(1100)\n");
 		msleep(1100);
 		return true;
 	}
@@ -5783,7 +5784,9 @@ enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev)
 	 * should use the Supported Link Speeds field in Link Capabilities,
 	 * where only 2.5 GT/s and 5.0 GT/s speeds were defined.
 	 */
+	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
 	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);
+	pci_info(dev, "%s: cap: 0x%0x cap2: 0x%0x\n", __func__, lnkcap, lnkcap2);
 	if (lnkcap2) { /* PCIe r3.0-compliant */
 		if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_32_0GB)
 			return PCIE_SPEED_32_0GT;
It indeed announces 8 GT/s as a supported speed (0xe >> 1 == 0x7). That is weird, because you would expect it to have the same value in both fields. Also, for > 5 GT/s it must support Data Link Layer Active reporting. I'm wondering if this could be some sort of firmware issue? What NVM version does the card report?
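For the record, decoding that: the Supported Link Speeds Vector sits in bits 7:1 of LNKCAP2, so taking the quoted 0xe as the low bits of the register (a worked example, not a dump of the actual full register value):

	u32 lnkcap2 = 0xe;	/* low bits of LNKCAP2 as quoted above */

	/* SLS vector: bit 1 = 2.5 GT/s, bit 2 = 5.0 GT/s, bit 3 = 8.0 GT/s */
	bool gen1 = lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB;	/* 0x2 -> set */
	bool gen2 = lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB;	/* 0x4 -> set */
	bool gen3 = lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB;	/* 0x8 -> set */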
The nvm_version is 55.0.
Sorry, that's a typo. The nvm_version is 50.0.
Created attachment 287973 [details] Use link activation to determine when to start waiting

I realized that speeds > 5 GT/s and support for the active link reporting capability go hand in hand (it is mandatory in that case for downstream ports). So I'm wondering whether we should just check for that instead of trying to figure this out based on speed. Can you try the attached patch?
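In other words, key the wait off DLL Link Active reporting rather than the speed. The shape of the idea, roughly (the attached patch is the authoritative version; this is only a sketch with a made-up function name):

#include <linux/delay.h>
#include <linux/pci.h>

/* Sketch of the idea in the attached patch, not the patch itself. */
static void bridge_wait_sketch(struct pci_dev *bridge, int delay_ms)
{
	/*
	 * Ports limited to <= 5 GT/s may omit DLL Link Active reporting;
	 * for those, just wait the standard delay (100 ms) and carry on.
	 */
	if (!bridge->link_active_reporting) {
		msleep(delay_ms);
		return;
	}

	/*
	 * Ports that do report DLL Link Active (mandatory for > 5 GT/s)
	 * can be polled, so wait for PCI_EXP_LNKSTA_DLLLA instead of
	 * sleeping for the worst case.
	 */
}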
Yes, it solves the issue. Thanks a lot! Please collect my Tested-by tag:

Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Created attachment 288997 [details] lspci -vvxxxx
Created attachment 288999 [details] lspci -vvxxxx with linkcap2 patch applied
The above two lspci outputs were collected from another system, which exhibits the same slow TBT resume symptom.
commit ec411e02b7a2e785a4ed9ed283207cd14f48699d
Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date:   Thu May 14 16:30:43 2020 +0300

    PCI/PM: Assume ports without DLL Link Active train links in 100 ms
(In reply to Kai-Heng Feng from comment #24)
> commit ec411e02b7a2e785a4ed9ed283207cd14f48699d
> Author: Mika Westerberg <mika.westerberg@linux.intel.com>
> Date:   Thu May 14 16:30:43 2020 +0300
>
>     PCI/PM: Assume ports without DLL Link Active train links in 100 ms

This commit regresses runtime PM with Nouveau on one of my systems. Bug filed here:
https://bugzilla.kernel.org/show_bug.cgi?id=208597