Bug 218107 - ixgbe driver fails to load due to PCI device probing failure
Status: RESOLVED DUPLICATE of bug 218050
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: Intel Linux
Importance: P3 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
Reported: 2023-11-06 06:13 UTC by Sebastian Manciulea
Modified: 2023-11-20 16:52 UTC

See Also:
Kernel Version:
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
pci_request_selected_regions fails for ixgbe (80.80 KB, text/plain)
2023-11-06 06:13 UTC, Sebastian Manciulea
MCFG debug patch (10.51 KB, patch)
2023-11-09 18:37 UTC, Bjorn Helgaas
acpidump (158.31 KB, text/plain)
2023-11-19 06:47 UTC, Sebastian Manciulea
dmesg with PCI debug (92.99 KB, text/plain)
2023-11-19 06:47 UTC, Sebastian Manciulea

Description Sebastian Manciulea 2023-11-06 06:13:55 UTC
Created attachment 305372
pci_request_selected_regions fails for ixgbe

Hello,

I have installed an Intel X520-DA2 NIC in an HP ProLiant ML30 Gen9 server. With the default Proxmox 8 kernel, 6.2.16-4-pve, the ixgbe driver fails to load with the following errors:

[ 1.709221] ixgbe 0000:06:00.0: enabling device (0140 -> 0142)
[ 1.709498] ixgbe 0000:06:00.0: BAR 0: can't reserve [mem 0x80100000-0x8017ffff 64bit]
[ 1.709500] ixgbe 0000:06:00.0: pci_request_selected_regions failed 0xfffffff0
[ 1.709895] ixgbe: probe of 0000:06:00.0 failed with error -16

[ 1.710247] ixgbe 0000:06:00.1: enabling device (0140 -> 0142)
[ 1.710306] ixgbe 0000:06:00.1: BAR 0: can't reserve [mem 0x80180000-0x801fffff 64bit]
[ 1.710308] ixgbe 0000:06:00.1: pci_request_selected_regions failed 0xfffffff0
[ 1.710384] ixgbe: probe of 0000:06:00.1 failed with error -16

Attached is the full dmesg output.
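
For readers decoding the log lines above, here is a minimal sketch, assuming the value printed after "pci_request_selected_regions failed" is a 32-bit two's-complement error code. The helper name decode_kernel_errno is hypothetical and is not a kernel or library function. It suggests that 0xfffffff0 and the "error -16" in the probe message are the same value, which corresponds to EBUSY on Linux.

```python
import errno


def decode_kernel_errno(raw: int, bits: int = 32) -> int:
    """Interpret an unsigned value as a two's-complement signed integer.

    Hypothetical helper for illustration; 32-bit width is an assumption.
    """
    if raw >= 1 << (bits - 1):
        raw -= 1 << bits
    return raw


# Value taken from the dmesg lines quoted in this report.
err = decode_kernel_errno(0xfffffff0)

# errno.errorcode maps positive errno numbers to their symbolic names.
name = errno.errorcode[-err]

print(err, name)
```

EBUSY here is consistent with the "can't reserve" message: the requested memory region was already claimed by something else.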

I understand that this report is not for a self-compiled kernel; however, following my bug report on the Proxmox forum, Bjorn Helgaas asked me to open a similar bug report here for further investigation.

Regards,
Sebastian
Comment 1 Bjorn Helgaas 2023-11-09 18:37:35 UTC
Created attachment 305380
MCFG debug patch

Can you please attach the output of the "acpidump" command?

If it's practical for you to build a kernel, please apply this debug patch and attach the complete dmesg log after booting.

I think the problem is that the MCFG table (which should be dumped by acpidump) says [mem 0x80000000-0x8fffffff] is the ECAM space for [bus 00-ff], but that range is not reserved by a PNP0C02 _CRS method, and it is also included in the host bridge window:

  PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
  PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] not reserved in ACPI motherboard resources
  pci_bus 0000:00: root bus resource [mem 0x80000000-0xfeafffff window]

So Linux thinks [mem 0x80000000-0x8fffffff] is available for use and assigns it to the ixgbe devices:

  pci 0000:06:00.0: BAR 0: assigned [mem 0x80100000-0x8017ffff]

I think the lack of a PNP0C02 _CRS that reserves the region is a BIOS defect, but I want to figure out if there's a way we can work around it.
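
The overlap described above can be checked with a short sketch. The address ranges are copied from the log lines quoted in this report; the function name overlaps and the variable names are hypothetical, chosen only for illustration. The size check assumes the standard ECAM layout of 4 KiB of configuration space per function, 8 functions per device, and 32 devices per bus, i.e. 1 MiB per bus.

```python
def overlaps(a, b):
    """Return True if two inclusive (start, end) address ranges intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


# Ranges taken from the dmesg lines quoted in this report (inclusive ends).
ecam = (0x80000000, 0x8FFFFFFF)          # MMCONFIG for [bus 00-ff]
host_window = (0x80000000, 0xFEAFFFFF)   # root bus resource window
bar0_fn0 = (0x80100000, 0x8017FFFF)      # ixgbe 0000:06:00.0 BAR 0
bar0_fn1 = (0x80180000, 0x801FFFFF)      # ixgbe 0000:06:00.1 BAR 0

# ECAM needs 4 KiB per function * 8 functions * 32 devices = 1 MiB per bus.
bytes_per_bus = 4096 * 8 * 32
ecam_size = ecam[1] - ecam[0] + 1
buses_covered = ecam_size // bytes_per_bus

# The ECAM range sits inside the host bridge window, so the kernel treats
# it as assignable, and both ixgbe BARs were placed inside it.
ecam_in_window = overlaps(ecam, host_window)
bars_in_ecam = overlaps(bar0_fn0, ecam) and overlaps(bar0_fn1, ecam)

print(hex(ecam_size), buses_covered, ecam_in_window, bars_in_ecam)
```

Under those assumptions the ECAM range covers exactly 256 buses, matching [bus 00-ff], and both BAR 0 assignments fall inside it, which fits the "can't reserve" failures in the original report.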
Comment 2 Sebastian Manciulea 2023-11-19 06:45:54 UTC
Hello Bjorn,

I have attached the requested debug information. Hopefully I applied your kernel patch correctly; it's quite an adventure to compile the Proxmox kernel ...

Regards,
Sebastian
Comment 3 Sebastian Manciulea 2023-11-19 06:47:04 UTC
Created attachment 305423
acpidump
Comment 4 Sebastian Manciulea 2023-11-19 06:47:34 UTC
Created attachment 305424
dmesg with PCI debug
Comment 5 Bjorn Helgaas 2023-11-20 16:52:17 UTC
Thanks very much, Sebastian!  I'm marking this as a duplicate of Bug 218050, and I attached a patch there that I think will work around this defect.  If you can test it, please attach the dmesg log to 218050.

*** This bug has been marked as a duplicate of bug 218050 ***
