Bug 191921 - PCI Express memory fails to allocate in complex topology

Summary: PCI Express memory fails to allocate in complex topology

Status:	NEEDINFO

Alias:	None

Product:	Drivers
Classification:	Unclassified
Component:	PCI
Hardware:	Intel Linux

Importance:	P1 normal
Assignee:	drivers_pci@kernel-bugs.osdl.org

URL:	https://lkml.kernel.org/r/B56F52C5-72...
Keywords:

Depends on:
Blocks:

Reported:	2017-01-04 16:06 UTC by Harry Mallon
Modified:	2017-03-21 15:00 UTC
CC List:	1 user

See Also:
Kernel Version:	4.9.0
Subsystem:
Regression:	No
Bisected commit-id:

Attachments
lspci 4.9.0 pci=realloc (122.62 KB, text/plain) 2017-01-04 16:06 UTC, Harry Mallon	Details
zipped messages log 4.9.0 pci=realloc (268.75 KB, application/zip) 2017-01-04 16:07 UTC, Harry Mallon	Details
dmesg log (incomplete) (132.87 KB, text/plain) 2017-01-11 16:49 UTC, Bjorn Helgaas	Details
Full DMESG (136.28 KB, text/plain) 2017-03-21 13:52 UTC, Harry Mallon	Details
ZIP of kernel patches (6.17 KB, application/zip) 2017-03-21 14:32 UTC, Harry Mallon	Details
DMESG after patches (133.34 KB, text/plain) 2017-03-21 15:00 UTC, Harry Mallon	Details

Description Harry Mallon 2017-01-04 16:06:09 UTC

Created attachment 250241 [details]
lspci 4.9.0 pci=realloc

On a machine with complex PCI topology I am getting failures to assign BAR memory.

This is incredibly machine-specific, so feel free to close. I attach the output of lspci -vvv and a copy of /var/log/messages.

Topology:
-[0000:00]-+-00.0
           +-01.0-[01-10]----00.0-[02-10]--+-08.0-[03-08]----00.0-[04-08]--+-00.0-[05]--+-00.0
           |                               |                               |            \-00.1
           |                               |                               +-09.0-[06]--
           |                               |                               +-10.0-[07]----00.0
           |                               |                               \-11.0-[08]--
           |                               +-09.0-[09]----00.0
           |                               +-10.0-[0a]----00.0
           |                               \-11.0-[0b-10]----00.0-[0c-10]--+-01.0-[0d]--
           |                                                               +-02.0-[0e]--
           |                                                               +-03.0-[0f]--
           |                                                               \-04.0-[10]--
           +-02.0
           +-14.0
           +-16.0
           +-16.3
           +-19.0
           +-1a.0
           +-1b.0
           +-1c.0-[11]--
           +-1c.4-[12]----00.0
           +-1c.6-[13]----00.0
           +-1c.7-[14-17]----00.0-[15-17]--+-01.0-[16]--
           |                               \-02.0-[17]----00.0
           +-1d.0
           +-1f.0
           +-1f.2
           \-1f.3

Comment 1 Harry Mallon 2017-01-04 16:07:27 UTC

Created attachment 250251 [details]
zipped messages log 4.9.0 pci=realloc

Comment 2 Bjorn Helgaas 2017-01-11 16:49:28 UTC

Created attachment 251231 [details]
dmesg log (incomplete)

Hi Harry, I'm attaching the last boot, which I extracted from your zipped messages log.  It's actually not complete because it doesn't include the PCI enumeration information about individual devices.  This would tell us about any initial resource assignments from the firmware.  The output of the "dmesg" command or /var/log/dmesg should contain that.

This topology might be more complicated than some, but I do consider it a bug if complete assignment is theoretically possible but Linux doesn't do it.

I think it's also a bug if firmware gave us a working but incomplete assignment, and Linux makes it worse.  It looks like that might be the case here: 05:00.0 probably had working BARs from firmware, but it looks like we broke it.

In your email you mentioned 05:00.0 differences (valid config space but disabled BARs vs. "unknown header type 7f").  That looks like a different problem, to be teased out separately.  In the "unknown header" case we may be reading 0xff from its config space, possibly due to bridge misconfiguration or some other issue.
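
As a rough illustration of why an all-ones read would show up as "unknown header type 7f": the Header Type register sits at config offset 0x0E, bits 6:0 select the header layout and bit 7 is the multi-function flag. The sketch below is only a decoding aid; `decode_header_type` and the constant names are hypothetical and are not taken from the kernel or pciutils.

```python
# Hypothetical helper for illustration; not kernel or pciutils code.
HEADER_TYPE_OFFSET = 0x0E   # Header Type register offset in config space
HEADER_LAYOUT_MASK = 0x7F   # bits 6:0 select the header layout
MULTIFUNCTION_BIT = 0x80    # bit 7 marks a multi-function device

KNOWN_LAYOUTS = {
    0x00: "endpoint",
    0x01: "PCI-to-PCI bridge",
    0x02: "CardBus bridge",
}


def decode_header_type(raw):
    """Split a raw Header Type byte into (layout description, multifunction flag)."""
    layout = raw & HEADER_LAYOUT_MASK
    name = KNOWN_LAYOUTS.get(layout, "unknown header type %02x" % layout)
    return name, bool(raw & MULTIFUNCTION_BIT)


if __name__ == "__main__":
    # A failed config read typically returns all ones, i.e. 0xff for one byte.
    print(decode_header_type(0xFF))
    # A healthy bridge header for comparison.
    print(decode_header_type(0x01))
```

So a device reporting layout 7f is consistent with every config read coming back as 0xff, which points at the path to the device rather than at the device's own registers.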

You also mentioned hotplug, which might be something we can look at separately.

The patches you use to get things working correctly would also be helpful.  Even if they're hacky, they would give a clue as to what's going wrong.

As you say, it's pretty hard for us to debug issues in older kernels like CentOS 3.10.0, but if you can reproduce them on v4.9 as you did here, there's a chance we can make some progress.

Comment 3 Bjorn Helgaas 2017-03-03 03:10:08 UTC

Hi Harry, if you have a chance to collect and attach the complete dmesg log, that would be great.  It should include lines like this that are missing from the one you attached:

  pci 0000:00:00.0: [8086:xxxx] type 00 class ...
  pci 0000:00:01.0: [8086:xxxx] type 01 class ...
  pci 0000:00:02.0: [8086:xxxx] type 00 class ...
  pci 0000:00:02.0: reg 0x10: [mem 0xf6c00000-0xf6ffffff 64bit]
  pci 0000:00:02.0: reg 0x18: [mem 0xd0000000-0xdfffffff 64bit pref]
  pci 0000:00:02.0: reg 0x20: [io  0xf000-0xf03f]

This shows us what the BIOS initially assigned, so we can see where Linux went wrong.
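
For anyone wanting to pull those initial assignments out of a full dmesg, here is a minimal sketch that collects the "reg 0x..:" lines per device. It assumes the line format shown above; the `firmware_bars` function, the `BAR_LINE` pattern, and the dictionary layout are hypothetical names made up for this example, and the sample text is just the three reg lines from this comment.

```python
import re

# Matches lines such as:
#   pci 0000:00:02.0: reg 0x10: [mem 0xf6c00000-0xf6ffffff 64bit]
# The "0x" on the range end is treated as optional to tolerate hand-typed lines.
BAR_LINE = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"reg (?P<reg>0x[0-9a-f]+): "
    r"\[(?P<kind>mem|io)\s+(?P<start>0x[0-9a-f]+)-(?P<end>(?:0x)?[0-9a-f]+)"
    r"(?P<flags>[^\]]*)\]"
)


def firmware_bars(dmesg_text):
    """Return {device: [bar, ...]} for every 'reg' line found in dmesg_text."""
    bars = {}
    for m in BAR_LINE.finditer(dmesg_text):
        start = int(m.group("start"), 16)
        end = int(m.group("end"), 16)
        bars.setdefault(m.group("dev"), []).append({
            "reg": int(m.group("reg"), 16),      # config-space offset of the BAR
            "kind": m.group("kind"),             # "mem" or "io"
            "start": start,
            "size": end - start + 1,             # range is inclusive
            "flags": m.group("flags").split(),   # e.g. ["64bit", "pref"]
        })
    return bars


SAMPLE = """\
pci 0000:00:02.0: reg 0x10: [mem 0xf6c00000-0xf6ffffff 64bit]
pci 0000:00:02.0: reg 0x18: [mem 0xd0000000-0xdfffffff 64bit pref]
pci 0000:00:02.0: reg 0x20: [io  0xf000-0xf03f]
"""

if __name__ == "__main__":
    for dev, regs in firmware_bars(SAMPLE).items():
        for bar in regs:
            print(dev, hex(bar["reg"]), bar["kind"], hex(bar["size"]), bar["flags"])
```

A device that appears in the topology but has no entries in the result is a hint that the log is missing the enumeration section, or that firmware left its BARs unassigned.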

Comment 4 Harry Mallon 2017-03-21 13:52:27 UTC

Created attachment 255401 [details]
Full DMESG

Comment 5 Harry Mallon 2017-03-21 14:31:05 UTC

I also attach a collection of our patches that fix this. The one with changes to setup-bus.c is VERY hardware specific. The one that changes pciehp_hpc.c will not apply to the tip of the kernel tree.

Comment 6 Harry Mallon 2017-03-21 14:32:49 UTC

Created attachment 255403 [details]
ZIP of kernel patches

Comment 7 Harry Mallon 2017-03-21 15:00:33 UTC

Created attachment 255405 [details]
DMESG after patches
