Bug 199671

Summary: pcie_aspm: inconsistent configuration on probe
Product: Drivers Reporter: Frederick Lawler (fred)
Component: PCIAssignee: drivers_pci (drivers_pci)
Status: NEW ---    
Severity: normal CC: bjorn, gilles.buloz, okaya
Priority: P1    
Hardware: ARM   
OS: Linux   
Kernel Version: 4.17-rc1 Subsystem:
Regression: No Bisected commit-id:
Attachments: Kernel 4.1.35 and 4.17-rc1 dmesg and lspci output
Kernel 4.1.35 and 4.17-rc1 dmesg and lspci output (fixed archive)
pcie_aspm_init_link_state() with some debug
Kernel .config files used for testing on 4.1.35, 4.16.8, and 4.17-rc1
With earlydump, when all devices are detected
With earlydump, when we have missing devices
test patch to disable common clock configuration
Boot messages with earlydump and patch disabling common clock cfg (comment #15)

Description Frederick Lawler 2018-05-10 02:41:50 UTC
Created attachment 275891 [details]
Kernel 4.1.35 and 4.17-rc1 dmesg and lspci output

Reported by Gilles Buloz <Gilles.Buloz@kontron.com> at https://lkml.kernel.org/r/5AF2E91E.3030804@kontron.com :

[snip]
Gilles Buloz wrote:
> Info : with kernel 4.17-rc1, it turns out I need pcie_aspm=off to
> have the PMC devices behind the PCI-to-PCIe bridge of the PMC safely
> detected/configured. But this is not caused by the patch.
> Without pcie_aspm=off I saw this at one boot :
>    "pci 0000:02:0e.0: ASPM: Could not configure common clock" for this
>    bridge, but devices correctly detected/configured 
> but at most boots I get:
>    no ASPM message but "pci 0000:04:02.0: bridge configuration invalid ([bus
>    ff-ff]), reconfiguring "
> instead, and some devices are missing. Also lspci show "rev ff" for some
> devices.
> I don't see this problem on 4.1.35 with the same backported patch.

Here are full dmesg + lspci -vv + ... for the following cases :
- 4.17-rc1_probe_failed_without_pcieaspmoff.txt : I get a broken PCI config at most boots, without pcie_aspm=off

- 4.17-rc1_probe_ok_without_pcieaspmoff.txt : I get a correct PCI config at some boots, even without pcie_aspm=off

- 4.17-rc1_probe_always_ok_with_pcieaspmoff.txt : I get a correct PCI config at all boots when using pcie_aspm=off

- 4.1.35_probe_always_ok_without_pcieaspmoff.txt : I get a correct PCI config at all boots when using kernel 4.1.35-rt41 (from NXP Yocto BSP), even without pcie_aspm=off
 
I noticed these strange things :
- with 4.17-rc1, without pcie_aspm=off, the kernel is hanging during ~1 second :
   - before line "ASPM: Could not configure common clock" when PCI config is OK
   - or before line "bridge configuration invalid ([bus ff-ff]), reconfiguring" when PCI config is NOT OK
     (the FFs in "[bus ff-ff]" seems side effects of read returning ~0 because device is off)
- no such hang when pcie_aspm=off is used
- applying reverse patch for "PCI/ASPM: Don't warn if already in common clock mode"
   (commit 04875177dbe034055f23960981ecf6fb8ea1d638) does not give more message
Comment 1 Gilles Buloz 2018-05-11 09:09:40 UTC
Created attachment 275915 [details]
Kernel 4.1.35 and 4.17-rc1 dmesg and lspci output (fixed archive)

This archive is a replacement to the first one (log-output.tar had a missing log file and extra hidden files)
Comment 2 Gilles Buloz 2018-05-11 14:08:20 UTC
Created attachment 275917 [details]
pcie_aspm_init_link_state() with some debug

With some pci_info() added to pcie_aspm_init_link_state() (drivers/pci/pcie/aspm.c) we can see some delays at timestamps 0.715598-1.112375, 1.130030-1.701657, 1.711809-2.018397.
Comment 3 Gilles Buloz 2018-05-11 14:10:44 UTC
Context for attachment 275917 [details] :
- kernel 4.17-rc1 + patch "PCI: Check whether bridges allow access to extended config space" (needed to boot on my hardware, see LKML link above)
- hardware topology :
  LS1043A PCIe root
    -> PEX8112 PCIe-to-PCI bridge (not supporting ext cfg on PCI side)
      -> PMC slot connector for legacy PMC modules
  With the following devices on the PMC module (TEWS TPMC-885) :
    -> PCI-to-PCIe bridge (PCI-XPI7C9X130)
      -> PCIe switch (4 ports)
        -> 4 PCIe devices (one on each port)

Things seem to go wrong when pcie_aspm_init_link_state() is run on the PCI-to-PCIe bridge of the PMC
Comment 4 Gilles Buloz 2018-05-11 16:50:18 UTC
Some good news :

- this is not a 4.17 regression : also the same with kernel 4.16.8 + patch "PCI: Check whether bridges allow access to extended config space"

- this is not a regression due to patch "PCI: Check whether bridges allow access to extended config space" : also the same with kernel 4.16.8 without this patch. Instead of this patch, I set PCIE_ABSERR_SETTING to 0x9400 instead of 0x9401 in drivers/pci/dwc/pci-layerscape.c to disable error on UR received by root bridge to be able to boot on my hardware.
Comment 5 Gilles Buloz 2018-05-11 17:03:41 UTC
Created attachment 275923 [details]
Kernel .config files used for testing on 4.1.35, 4.16.8, and 4.17-rc1
Comment 6 Sinan Kaya 2018-05-15 13:26:36 UTC
Bjorn,

I was on vacation this week. I returned to office this week. What's your take on this issue?

Do you think that it is caused by my patch or something else is going on.

Sinan
Comment 7 Bjorn Helgaas 2018-05-15 14:14:21 UTC
I haven't looked at this at all yet, so I really don't have any ideas.
Comment 8 Bjorn Helgaas 2018-05-22 21:32:37 UTC
Here's the topology we expect to see (just domain 0000):

  00:00.0 Freescale Root Port to [bus 01-ff]
  01:00.0 PEX8112 PCIe-to-PCI bridge to [bus 02-08]
  02:09.0 Kontron bridge
  02:0e.0 PCI-XPI7C9X130 PCI-to-PCIe bridge  to [bus 03-08]
  03:00.0 IDT Upstream Port   to [bus 04-08]
  04:02.0 IDT Downstream Port to [bus 05]
  04:03.0 IDT Downstream Port to [bus 06]
  04:04.0 IDT Downstream Port to [bus 07]
  04:05.0 IDT Downstream Port to [bus 08]
  05:00.0 Intel NIC
  06:00.0 Intel NIC
  07:00.0 Intel NIC
  08:00.0 Intel NIC

Here are the PCIe links, where we could potentially enable ASPM.  Both ends of each link claim some ASPM support:

  00:00.0 -- 01:00.0
  02:0e.0 -- 03:00.0
  04:02.0 -- 05:00.0
  04:03.0 -- 06:00.0
  04:04.0 -- 07:00.0
  04:05.0 -- 08:00.0

The problem is that we don't enumerate 04:03.0, 04:04.0, 04:05.0, or the NICs below them.  The problem happens most of the time (but not always) with 4.17-rc1.  It never happens with 4.1.35 or when booting 4.17-rc1 with "pcie_aspm=off".

When the problem happens, this is in the dmesg log:

  pcieport 0000:03:00.0: Refused to change power state, currently in D3 
  pcieport 0000:04:02.0: buffer not found in pci_save_pcie_state 

and lspci shows:

  0000:03:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] Device 803e (rev ff) (prog-if ff) 
   !!! Unknown header type 7f 

This looks like 03:00.0 is powered off.  I don't know why that would be related to ASPM.

You could try bisecting; since it seems to be ASPM-related, you might be able to bisect just those changes, which should go pretty fast, e.g.,

  git bisect start -- drivers/pci/pcie/aspm.c
  git bisect bad v4.17-rc1
  git bisect good v4.1.35

If that doesn't work or you'd rather brute-force it, you could try doing something like "pci=earlydump".  This is currently only implemented for x86, so it would be some work to implement something similar for arm64.  It would be a nice upstream addition if you *did* implement it.

In the earlydump output I would be looking for clues about the power state of 03:00.0 to try to figure out if the bootloader left it powered off or if Linux is doing something to turn it off.
Comment 9 Gilles Buloz 2018-06-01 17:31:47 UTC
Bisecting with the parameters above was not easy because of several issues (crash or pci probe looping forever) with kernel below 4.14. However at 4.14.0-rc5 (commit e4dca7b7aa08b22893c45485d222b5807c1375ae) the ASPM issue is already present.

Then I bissected on all commits (quite long) to determine the first release booting correctly. This was 4.13.0-rc3 ([a36deff6d13da8e146ad02cf38c0b81752efb305] PCI: layerscape: Call dw_pcie_setup_rc() from ls_pcie_host_init()) that fixed the pci probe looping forever. The ASPM issue is already present with this release.

Then applying the patch of this commit to older releases, I've been able to bisect again on aspm.c and see that 4.11.0-rc1 (3bd7db63a841e8c5297bb18ad801df67d5e38ad2) also has the same ASPM problem.

So this ASPM issue is not a regression made in the last kernel releases.

For older releases I get a crash (kernel NULL pointer dereference in ls_pcie_probe). Also drivers/pci/dwc/pci-layerscape.c does not exist but the code is in drivers/pci/dwc/pci-layerscape.c

On my design, I use kernel 4.1.35 but with addons from NXP (supplied as a Yocto BSP) so it can not be compared directly to a mainstream kernel.

For the state of the devices under bootloader (uboot), all PCIe devices can be accessed and return valid data (so not powered-off).
Comment 10 Bjorn Helgaas 2018-06-02 06:04:35 UTC
Wow, you've done a lot of work.  It doesn't sound like you've found a commit that is related to the ASPM problems you're seeing.

I guess the plan B would be to try out Sinan's earlydump patch (https://lkml.kernel.org/r/1527879620-22720-1-git-send-email-okaya@codeaurora.org).  That should show valid data for all the devices, since they're all powered on under uboot.

Then we can try to figure out where things are going wrong.  I'd probably add printks in the config read/write functions.  That'll generate a lot of data and it might be a pain to correlate it back to the function, but we should be able to tell when devices get powered off and stop responding to config reads (this should turn into ~0 data back to the CPU).
Comment 11 Gilles Buloz 2018-06-04 14:12:13 UTC
Created attachment 276319 [details]
With earlydump, when all devices are detected
Comment 12 Gilles Buloz 2018-06-04 14:15:20 UTC
Created attachment 276321 [details]
With earlydump, when we have missing devices

It seems 0000:04:02.0 goes off while we are dumping its config space
Comment 13 Sinan Kaya 2018-06-05 02:42:31 UTC
can you also add this patch on top and boot with pci=safemode argument.

https://lkml.org/lkml/2018/5/29/1333

This is going to minimize the PCI configuration in linux.
Comment 14 Sinan Kaya 2018-06-05 02:44:47 UTC
Nevermind, my patch will also disable ASPM. You already found that turning off ASPM is helping.
Comment 15 Bjorn Helgaas 2018-06-15 21:48:08 UTC
Created attachment 276577 [details]
test patch to disable common clock configuration

Can you test this patch, please?

For the link between 02:0e.0 and 03:00.0, both ends have "Slot Clock Configuration" set.  In that case, pcie_aspm_configure_common_clock() normally sets the "Common Clock Configuration" bit. 

The earlydumps from comment #11 and comment #12 show that Common Clock Configuration is initially clear (02:0e.0 LnkCtl at 0xc0 and 03:00.0 LnkCtl at 0x50 are both 0x0000).

In comment #11, where we detect all devices, we try to set Common Clock Configuration, but it fails:

  pci 0000:02:0e.0: ASPM: Could not configure common clock

In comment #12, where some devices are missing, I think we probably do set Common Clock Configuration, but there's no message.  The 4.17-rc1_probe_failed_without_pcieaspmoff.txt log from comment #1 does show Common Clock Configuration being set.

I don't know why this would cause a problem, but maybe we can at least figure out if this is related to what you're seeing.
Comment 16 Gilles Buloz 2018-06-18 12:04:00 UTC
Created attachment 276647 [details]
Boot messages with earlydump and patch disabling common clock cfg (comment #15)

With Bjorn's patch from comment #15, all devices are correctly detected.