Bug 218050 - ixgbe: can't reserve BAR, mis-reserved memory
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: Intel Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
Duplicates: 218107
Reported: 2023-10-27 15:10 UTC by Tomasz Pala
Modified: 2023-11-23 05:31 UTC
Regression: Yes


Attachments
ixgbe failing (77.90 KB, text/plain)
2023-10-27 15:10 UTC, Tomasz Pala
ixgbe after disabling mentioned part of the code (79.61 KB, text/plain)
2023-10-27 15:12 UTC, Tomasz Pala
MCFG debug (10.51 KB, patch)
2023-11-09 18:41 UTC, Bjorn Helgaas
/proc/iomem contents (4.15 KB, text/plain)
2023-11-18 12:26 UTC, Tomasz Pala
acpidump output (136.19 KB, application/octet-stream)
2023-11-18 12:30 UTC, Tomasz Pala
dmesg output after patching with provided debug code (68.26 KB, text/plain)
2023-11-18 12:34 UTC, Tomasz Pala
patch to workaround lack of PNP0C02 for ECAM (28.85 KB, patch)
2023-11-20 16:49 UTC, Bjorn Helgaas
dmesg with patch.ecam applied (76.06 KB, text/plain)
2023-11-21 12:51 UTC, Tomasz Pala
patch with workaround only, excluding subsequent cleanup (3.89 KB, patch)
2023-11-22 15:19 UTC, Bjorn Helgaas

Description Tomasz Pala 2023-10-27 15:10:22 UTC
Created attachment 305301 [details]
ixgbe failing

I'm having a problem initializing ixgbe NICs with a pristine 6.5.7 kernel. The same issue was probably reported in:

https://forum.proxmox.com/threads/proxmox-8-kernel-6-2-16-4-pve-ixgbe-driver-fails-to-load-due-to-pci-device-probing-failure.131203/


I'm attaching the full dmesg for the mailing list thread:

https://lore.kernel.org/linux-pci/20231026205319.GA1832508@bhelgaas/T/

and repeating the most important parts of the logs for search engine indexing.


efi: Remove mem63: MMIO range=[0x80000000-0x8fffffff] (256MB) from e820 map
[...]
[mem 0x7f800000-0xfed1bfff] available for PCI devices
[...]
PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[Firmware Info]: PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved as EfiMemoryMappedIO
[...]
ixgbe 0000:02:00.0: enabling device (0140 -> 0142)
ixgbe 0000:02:00.0: BAR 0: can't reserve [mem 0x80000000-0x8007ffff 64bit]
ixgbe 0000:02:00.0: pci_request_selected_regions failed 0xfffffff0
ixgbe: probe of 0000:02:00.0 failed with error -16
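
For anyone matching up the last two log lines: the 0xfffffff0 printed for pci_request_selected_regions is the same value as the -16 in the probe failure, read as an unsigned 32-bit number, and 16 is EBUSY on Linux, i.e. the requested region is already claimed. A minimal Python check (the helper name is hypothetical, and this is only an illustration, not part of the original logs):

```python
import errno

def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

err = to_signed32(0xFFFFFFF0)
print(err)                            # -16
print(errno.errorcode.get(-err))      # EBUSY on Linux
```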


After disabling the code that causes this, by making the condition always false:
		if (size >= 256*1024 && 0) {
in the hunk from:

https://lore.kernel.org/lkml/20221208190341.1560157-2-helgaas@kernel.org/

the BAR starts at 0x90000000 (not 0x80000000):

efi: Not removing mem63: MMIO range=[0x80000000-0x8fffffff] (262144KB) from e820 map
[...]
[mem 0x90000000-0xfed1bfff] available for PCI devices
[...]
PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved as E820 entry

and everything seems to work again.
Comment 1 Tomasz Pala 2023-10-27 15:12:59 UTC
Created attachment 305302 [details]
ixgbe after disabling mentioned part of the code
Comment 2 Bjorn Helgaas 2023-11-09 18:41:17 UTC
Created attachment 305381 [details]
MCFG debug

Can you please attach the output of the "acpidump" command?

If it's practical for you to build a kernel, please apply this debug patch and attach the complete dmesg log after booting.

I think the problem is that the MCFG table (which should be dumped by acpidump) says [mem 0x80000000-0x8fffffff] is the ECAM space for [bus 00-ff], but it is not mentioned in a PNP0C02 _CRS method, and it is included in the host bridge window:

  PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
  PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
  pci_bus 0000:00: root bus resource [mem 0x80000000-0xfbffffff window]

So Linux thinks [mem 0x80000000-0x8fffffff] is available for use and assigns it to the ixgbe devices:

  pci 0000:02:00.0: BAR 0: assigned [mem 0x80000000-0x8007ffff 64bit]

I think the lack of a PNP0C02 _CRS that reserves the region is a BIOS defect, but I want to figure out if there's a way we can work around it.
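
To make the overlap concrete, here is a minimal Python sketch, assuming the standard ECAM layout in which each function gets 4 KB of config space at base + (bus << 20 | device << 15 | function << 12). Under that layout [bus 00-ff] needs 256 MB, which matches the MMCONFIG range in the log. The addresses are copied from the log lines above; the helper names are hypothetical.

```python
ECAM_BASE = 0x80000000            # from the MMCONFIG line in the log
BUS_START, BUS_END = 0x00, 0xFF   # [bus 00-ff]

def ecam_size(bus_start: int, bus_end: int) -> int:
    """Each bus covers 32 devices x 8 functions x 4 KB = 1 MB of config space."""
    return (bus_end - bus_start + 1) << 20

def ecam_addr(base: int, bus: int, dev: int, fn: int, offset: int = 0) -> int:
    """Address of a config register for bus:dev.fn, with base describing bus 0."""
    return base + ((bus << 20) | (dev << 15) | (fn << 12) | offset)

def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if the inclusive ranges [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end

ecam_end = ECAM_BASE + ecam_size(BUS_START, BUS_END) - 1
print(hex(ecam_end))                               # 0x8fffffff
print(hex(ecam_addr(ECAM_BASE, 0x02, 0, 0)))       # 0x80200000, config space of 02:00.0
# BAR 0 as assigned in the log: [mem 0x80000000-0x8007ffff 64bit]
print(overlaps(0x80000000, 0x8007FFFF, ECAM_BASE, ecam_end))   # True
```

So the BAR was placed directly on top of the ECAM region, which is what the "can't reserve" message reflects.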
Comment 3 Tomasz Pala 2023-11-18 12:26:18 UTC
Created attachment 305419 [details]
/proc/iomem contents

Sorry for the delay, I had to reassemble this box. I've also confirmed the problem on another NIC and updated the kernel to 6.6.0.
Comment 4 Tomasz Pala 2023-11-18 12:30:32 UTC
Created attachment 305420 [details]
acpidump output
Comment 5 Tomasz Pala 2023-11-18 12:34:51 UTC
Created attachment 305421 [details]
dmesg output after patching with provided debug code
Comment 6 Bjorn Helgaas 2023-11-20 16:49:28 UTC
Created attachment 305441 [details]
patch to workaround lack of PNP0C02 for ECAM

Could you try this patch, please, and attach the dmesg log?  It's based on v6.7-rc1.
Comment 7 Bjorn Helgaas 2023-11-20 16:52:17 UTC
*** Bug 218107 has been marked as a duplicate of this bug. ***
Comment 8 Tomasz Pala 2023-11-21 12:51:10 UTC
Created attachment 305450 [details]
dmesg with patch.ecam applied

ixgbe detected, with one new issue:

memremap attempted on mixed range 0x0000000000000000 size: 0x8000
WARNING: CPU: 0 PID: 1 at kernel/iomem.c:78 memremap+0x154/0x170
Comment 9 Bjorn Helgaas 2023-11-21 18:15:56 UTC
Thanks a lot for testing this, Tomasz!  From your comment #8 log:

  PCI: ECAM for domain 0000 [bus 00-ff] at [mem 0xe0000000-0xefffffff] (base 0xe0000000)
  PCI: [Firmware Info]: ECAM at [mem 0xe0000000-0xefffffff] not reserved in ACPI motherboard resources
  PCI: ECAM at [mem 0xe0000000-0xefffffff] reserved as EfiMemoryMappedIO
+ PCI: ECAM [mem 0xe0000000-0xefffffff] reserved to work around lack of ACPI motherboard _CRS
  pci_bus 0000:00: root bus resource [mem 0xe0000000-0xfbffffff window]
  pci 0000:05:00.0: BAR 7: no space for [mem size 0x00100000 64bit]
  pci_bus 0000:00: No. 2 try to assign unassigned res
  pci 0000:05:00.0: BAR 0: assigned [mem 0xf0000000-0xf007ffff 64bit]
  pci 0000:05:00.0: BAR 7: assigned [mem 0xf0204000-0xf0303fff 64bit]

A few things changed since your https://bugzilla.kernel.org/attachment.cgi?id=305301 log (ECAM addr moved from 0x80000000-0x8fffffff to 0xe0000000-0xefffffff, ixgbe moved from 02:00.0 to 05:00.0), but the situation is the same: MCFG describes ECAM space that is included in the PNP0A03 host bridge window but not reserved by an ACPI motherboard device.

But the patch did add a reservation to work around that, so we assigned non-ECAM space to ixgbe, and as far as I can tell, the NICs do work as expected.
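
The effect of the added reservation can be modeled with a toy first-fit allocator. This is only a sketch under simplifying assumptions (an otherwise empty window, a single 512 KB BAR, no bridge windows), not the kernel's actual resource code; `first_fit` is a hypothetical helper and the addresses come from the log above.

```python
def first_fit(window, size, reserved):
    """Lowest size-aligned range in `window` that overlaps no reserved range."""
    start, end = window
    addr = (start + size - 1) & ~(size - 1)            # align up to the BAR size
    while addr + size - 1 <= end:
        hit = next((r for r in reserved
                    if addr <= r[1] and r[0] <= addr + size - 1), None)
        if hit is None:
            return (addr, addr + size - 1)
        # Skip past the conflicting reservation and realign.
        addr = (hit[1] + 1 + size - 1) & ~(size - 1)
    return None                                        # no space left in the window

WINDOW = (0xE0000000, 0xFBFFFFFF)   # root bus window from the log
ECAM = (0xE0000000, 0xEFFFFFFF)     # ECAM range from the log
BAR_SIZE = 0x80000                  # 512 KB, the size of BAR 0

without_fix = first_fit(WINDOW, BAR_SIZE, reserved=[])
with_fix = first_fit(WINDOW, BAR_SIZE, reserved=[ECAM])
print([hex(x) for x in without_fix])   # ['0xe0000000', '0xe007ffff']
print([hex(x) for x in with_fix])      # ['0xf0000000', '0xf007ffff']
```

With the ECAM range treated as reserved, the first free slot is 0xf0000000-0xf007ffff, which agrees with the "BAR 0: assigned" line in the log.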

The "memremap attempted on mixed range" message is related to booting with "efi=debug" and is unrelated to the ixgbe issue.

I plan to apply the patches from comment #6 for v6.8.
Comment 10 Sebastian Manciulea 2023-11-22 03:58:19 UTC
Hello Bjorn,

I tried to apply the patch from comment #6 but it failed on v6.2.19. Not sure if you want to spend time porting it to 6.2.x; Proxmox will upgrade to 6.5 at some point, and I have the workaround (WA) of setting the 64b BAR.

Regards,
Sebastian
Comment 11 Sebastian Manciulea 2023-11-22 05:08:38 UTC
Hello Bjorn,

The patch fails with Proxmox kernel 6.5.11 as well.

Regards,
Sebastian
Comment 12 Tomasz Pala 2023-11-22 10:48:06 UTC
(In reply to Bjorn Helgaas from comment #9)

Indeed, I've moved the card from PCIe slot 3 to slot 1, so the DEVPATH is different now. In the meantime I've also installed a SAS controller in slot 3 and was investigating some link speed/bifurcation problems with a different (ixgbe...) 10GbE NIC in slot 1.

The cleanest diff I've run is between:
https://bugzilla.kernel.org/attachment.cgi?id=305421
https://bugzilla.kernel.org/attachment.cgi?id=305450

and as far as I can tell now it LGTM.

> I plan to apply the patches from comment #6 for v6.8.

Thanks for looking into the issue! These NICs are very fussy and one cannot rely on Intel support (actually, they've deleted the entire bug tracker of e1000 at SF a few/dozens months ago...)

Tested-by: Tomasz Pala <gotar@polanet.pl>


Sebastian: you should report the backport request to Proxmox. As 6.2 is not even an LTS line of kernels, I'm sure they do some maintenance themselves. Anyway, 6.6 is now officially LTS (and the patch applies cleanly), so I honestly don't see any rationale in their choices:

"The 6.2 kernel is derived from Ubuntu 23.04.

Proxmox VE 8.1 (2023/Q4) will be based on the 6.5 kernel, derived from Ubuntu 23.10."

but it gives one more opportunity: to request a backport in Ubuntu.
Comment 13 Bjorn Helgaas 2023-11-22 15:13:02 UTC
(In reply to Tomasz Pala from comment #12)

> Thanks for looking into the issue! These NICs are very fussy and one cannot
> rely on Intel support (actually, they've deleted the entire bug tracker of
> e1000 at SF a few/dozens months ago...)

Nothing here is the fault of e1000 or the Intel driver.  Purely a BIOS issue (I think) that doesn't seem to affect Windows either because (a) Windows treats MCFG entries as a resource reservation (unlikely IMO) or (b) Windows allocates BAR space from the top down, while Linux does it from the bottom up (which is where the ECAM space is).
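
Hypothesis (b) can be illustrated with a toy comparison of the two allocation directions. This is a sketch under strong simplifying assumptions (an empty host bridge window and a single 512 KB BAR); it does not model what Windows or Linux actually do, and all helper names are hypothetical. The window and ECAM range are taken from the comment #2 log.

```python
WINDOW = (0x80000000, 0xFBFFFFFF)   # root bus window from the comment #2 log
ECAM = (0x80000000, 0x8FFFFFFF)     # MMCONFIG/ECAM range from the same log
BAR_SIZE = 0x80000                  # size of BAR 0 [mem 0x80000000-0x8007ffff]

def bottom_up(window, size):
    """Lowest size-aligned range inside the window."""
    start = (window[0] + size - 1) & ~(size - 1)
    if start + size - 1 > window[1]:
        raise ValueError("window too small")
    return (start, start + size - 1)

def top_down(window, size):
    """Highest size-aligned range inside the window."""
    start = (window[1] + 1 - size) & ~(size - 1)
    if start < window[0]:
        raise ValueError("window too small")
    return (start, start + size - 1)

def overlaps(a, b):
    """True if the inclusive ranges a and b intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

low = bottom_up(WINDOW, BAR_SIZE)
high = top_down(WINDOW, BAR_SIZE)
print([hex(x) for x in low], overlaps(low, ECAM))     # ['0x80000000', '0x8007ffff'] True
print([hex(x) for x in high], overlaps(high, ECAM))   # ['0xfbf80000', '0xfbffffff'] False
```

In this model only the bottom-up choice lands in the unreserved ECAM range, since the ECAM space sits at the bottom of the window.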

> Tested-by: Tomasz Pala <gotar@polanet.pl>

Thank you!  I added this to each patch, since I think you tested the entire series on v6.6.
Comment 14 Bjorn Helgaas 2023-11-22 15:19:21 UTC
Created attachment 305460 [details]
patch with workaround only, excluding subsequent cleanup

Sebastian, this patch (the first one from the series at https://bugzilla.kernel.org/attachment.cgi?id=305441&action=edit) applies cleanly to v6.2.  I assume it will also apply cleanly to v6.5.

This should work around the problem.  The rest of the patches are just cleanup and debugging aids.
Comment 15 Sebastian Manciulea 2023-11-23 05:31:44 UTC
Hello Bjorn,

The patch applies cleanly to 6.2 and 6.5. Thank you. Tested successfully on 6.2.

Regards,
Sebastian
