Bug 199581 - Resource assignment failure
Summary: Resource assignment failure
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-01 20:04 UTC by Bjorn Helgaas
Modified: 2019-09-14 05:30 UTC
CC List: 3 users

See Also:
Kernel Version: v4.16
Subsystem:
Regression: No
Bisected commit-id:


Attachments
proposed patch (10.98 KB, patch)
2018-05-01 20:05 UTC, Bjorn Helgaas
dmesg (showing problem) (179.90 KB, text/plain)
2018-05-01 20:08 UTC, Bjorn Helgaas
dmesg (with proposed patch) (173.02 KB, text/plain)
2018-05-01 20:08 UTC, Bjorn Helgaas
lspci (before proposed patch) (245.69 KB, text/plain)
2018-05-01 20:09 UTC, Bjorn Helgaas
lspci (with proposed patch) (249.39 KB, text/plain)
2018-05-01 20:09 UTC, Bjorn Helgaas

Description Bjorn Helgaas 2018-05-01 20:04:53 UTC
Reported by Mika Westerberg <mika.westerberg@linux.intel.com> at https://lkml.kernel.org/r/20180416103453.46232-3-mika.westerberg@linux.intel.com :

When hot-adding a PCIe switch, the way we currently distribute resources
does not always work well because devices connected to the switch might
need to have their MMIO resources aligned to something other than the
default 1 MB boundary. For example, the Intel Gigabit ET2 quad port server
adapter includes a PCIe switch leading to 4 x GbE NIC devices that want
to have their MMIO resources aligned to a 2 MB boundary instead.

The current resource distribution code does not take this alignment into
account and might try to assign too much space to the extension
hotplug bridge(s). The resulting bridge window is too big, which makes
the resource assignment operation fail, and we are left with a bridge
window with only the minimal amount (1 MB) of MMIO space.

Here is what happens when an Intel Gigabit ET2 quad port server adapter
is hot-added:

  pci 0000:39:00.0: BAR 14: assigned [mem 0x53300000-0x6a0fffff]
                                          ^^^^^^^^^^
  pci 0000:3a:01.0: BAR 14: assigned [mem 0x53400000-0x547fffff]
                                          ^^^^^^^^^^
The above shows that the downstream bridge (3a:01.0) window is aligned
to 2 MB, whereas the upstream bridge (39:00.0) window is aligned to 1 MB.
The remaining MMIO space (0x15a00000) is then assigned to the hotplug
bridge (3a:04.0), but the assignment fails:

  pci 0000:3a:04.0: BAR 14: no space for [mem size 0x15a00000]
  pci 0000:3a:04.0: BAR 14: failed to assign [mem size 0x15a00000]
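
The numbers in the log lines above can be checked with a little arithmetic. The following is only a sketch of one reading of the failure, not the kernel's actual allocation code: it assumes the hotplug bridge window has to fit after the end of the 3a:01.0 window and that nothing else lives in the upstream window. The helper `align_up` and all variable names are made up for illustration.

```python
MB = 1 << 20


def align_up(addr, align):
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


# Values taken from the dmesg lines above (end addresses are inclusive).
upstream_start, upstream_end = 0x53300000, 0x6A0FFFFF      # 39:00.0
downstream_start, downstream_end = 0x53400000, 0x547FFFFF  # 3a:01.0

upstream_size = upstream_end - upstream_start + 1
downstream_size = downstream_end - downstream_start + 1

# What a size-only calculation leaves for the hotplug bridge (3a:04.0).
naive_remaining = upstream_size - downstream_size

# Space skipped because 3a:01.0 had to start on a 2 MB boundary.
padding = align_up(upstream_start, 2 * MB) - upstream_start

# Space actually left after the end of the 3a:01.0 window (assumption:
# the hotplug bridge window is placed there).
actual_remaining = upstream_end - downstream_end

print(hex(naive_remaining))   # 0x15a00000
print(hex(padding))           # 0x100000
print(hex(actual_remaining))  # 0x15900000
```

Under these assumptions the requested size (0x15a00000) matches the size-only remainder, while only 0x15900000 is left once the 1 MB of alignment padding in front of 3a:01.0 is accounted for, which is consistent with the "no space" message.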
Comment 1 Bjorn Helgaas 2018-05-01 20:05:57 UTC
Created attachment 275697
proposed patch
Comment 2 Bjorn Helgaas 2018-05-01 20:08:09 UTC
Created attachment 275699
dmesg (showing problem)
Comment 3 Bjorn Helgaas 2018-05-01 20:08:30 UTC
Created attachment 275701
dmesg (with proposed patch)
Comment 4 Bjorn Helgaas 2018-05-01 20:09:00 UTC
Created attachment 275703
lspci (before proposed patch)
Comment 5 Bjorn Helgaas 2018-05-01 20:09:19 UTC
Created attachment 275705
lspci (with proposed patch)
Comment 6 Nicholas Johnson 2019-02-01 16:03:18 UTC
Heads up for anybody interested: this bug is in the process of being solved with this patch: https://lkml.org/lkml/2019/2/1/391

Mika Westerberg tested the exact same setup with the patch and it worked fine.

I added this bug URL to the latest revision of the patch, which I emailed a few minutes ago.
Comment 7 Benoît 2019-09-08 08:41:46 UTC
I have had an issue since this patch: I need to add pci=nommconf to the boot configuration. For more information:
- https://bugzilla.kernel.org/show_bug.cgi?id=203617
- https://github.com/dhedlund/kernel-patch-lg-gram-17
Comment 8 Nicholas Johnson 2019-09-14 05:30:20 UTC
@Benoit, which patch is causing the issue? As far as I know, none of these have been applied to the kernel. Mika Westerberg's proposed patches do not solve the issue (the function parameters do not convey enough information for that to be possible), and mine has never been accepted into mainline.

I had a brief read of the links you sent and did not find the answer. None of these patches relate to ACPI, which appears to be the issue in the links.

If you are not daisy chaining Thunderbolt devices with native enumeration, then the bug on this page and the proposed patches have no effect. They only matter when hot-adding Thunderbolt devices with >1M alignment at the end of a daisy chain. Adding the device straight to the Thunderbolt 3 port of the computer works regardless.

Cheers
