Created attachment 72121 [details]
dmesg log from kernel 188.8.131.52 (Knoppix 6.2.1 based on Debian squeeze/sid)
I observed a problem with a PCI Express card in a server PC whose mainboard has 8 PCI Express slots. There are no PCI slots other than the PCIe ones, and all slots are empty except the one holding the card under test, which requires a legacy I/O address range for base address 0.
In some slots the card works properly, but in other slots on the same
machine the card doesn't work because the associated kernel
driver cannot access it. When it doesn't work, lspci -v reports for I/O base address 0:
Region 0: I/O ports at <ignored>
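For anyone who wants to check several slots quickly, here is a minimal Python sketch that scans saved lspci output for regions reported as <ignored>. It assumes region lines have the form shown above. The function name and the sample memory line are made up for illustration and do not come from the attached logs.

```python
import re

# Matches region lines such as "Region 0: I/O ports at <ignored>".
REGION_RE = re.compile(r"Region (\d+): (I/O ports|Memory) at (\S+)")

def ignored_regions(lspci_text):
    """Return (region_number, kind) for every region reported as <ignored>."""
    hits = []
    for line in lspci_text.splitlines():
        m = REGION_RE.search(line)
        if m and m.group(3) == "<ignored>":
            hits.append((int(m.group(1)), m.group(2)))
    return hits

# Hypothetical sample: the first line is from this report, the second is invented.
sample = (
    "\tRegion 0: I/O ports at <ignored>\n"
    "\tRegion 1: Memory at fbe00000 (32-bit, non-prefetchable)\n"
)
print(ignored_regions(sample))
```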
I found that this problem did *not* occur with kernels 184.108.40.206 and earlier, but *does* occur with kernels 220.127.116.11 and later, up to at least 3.4.1, on the same hardware.
I haven't yet tried kernel versions between 2.6.32 and 2.6.36 to identify exactly which version introduced the problem.
Appending dmesg output from kernels 18.104.22.168 (now) and 3.1.4 (later) as requested by Bjorn.
Created attachment 72122 [details]
dmesg log from kernel 3.1.4 (Ubuntu 10.10)
Another dmesg log from a more recent kernel where no I/O address range is assigned.
Created attachment 72171 [details]
dmesg log from kernel 2.6.32-36-generic (Ubuntu 10.10)
dmesg log from a 2.6.32 kernel where the problem doesn't occur.
Replaces an earlier log which was incomplete and thus didn't contain all the required information.
Created attachment 72176 [details]
dmesg log from a 3.1.4 kernel with the card in a slot which works
Created attachment 72180 [details]
Windows resource info (from AIDA64)
We have two host bridges; both report the same [io 0xf000-0xffff] region:
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
pci_root PNP0A08:00: host bridge window [io 0x0d00-0xefff]
pci_root PNP0A08:00: host bridge window [io 0xf000-0xffff]
ACPI: PCI Root Bridge [BR50] (domain 0000 [bus 80-fb])
pci_root PNP0A08:01: host bridge window [io 0xf000-0xffff]
When the card is below the PCI0 bridge, BIOS puts it in the [io 0x0d00-0xefff] range, which works (3.1.4 dmesg log from attachment #72176 [details]):
pci 0000:00:07.0: PCI bridge to [bus 04-04]
pci 0000:00:07.0: bridge window [io 0xd000-0xdfff]
pci 0000:04:00.0: reg 10: [io 0xdc00-0xdcff]
When the card is below BR50, BIOS puts it in the [io 0xf000-0xffff] range, which Linux believes is assigned to PCI0 and unavailable for BR50 (3.1.4 dmesg log from attachment #72122 [details]):
pci 0000:80:07.0: PCI bridge to [bus 85-85]
pci 0000:80:07.0: bridge window [io 0xf000-0xffff]
pci 0000:80:07.0: address space collision: [io 0xf000-0xffff] conflicts with PCI Bus 0000:00 [io 0xf000-0xffff]
pci 0000:85:00.0: reg 10: [io 0xfc00-0xfcff]
pci 0000:85:00.0: no compatible bridge window for [io 0xfc00-0xfcff]
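To illustrate the collision, here is a rough Python sketch of a "first claimant wins" check using the three host bridge windows from the log above. It is a simplified flat model and not the kernel's actual resource tree code. The helper names (parse_io, overlaps, claim_windows) are made up for this example.

```python
import re

WINDOW_RE = re.compile(r"\[io\s+0x([0-9a-f]+)-0x([0-9a-f]+)\]")

def parse_io(text):
    """Turn '[io 0xf000-0xffff]' into an inclusive (start, end) tuple."""
    m = WINDOW_RE.search(text)
    return int(m.group(1), 16), int(m.group(2), 16)

def overlaps(a, b):
    """True if two inclusive ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]

def claim_windows(requests):
    """Accept windows in order; reject any that overlaps an earlier claim."""
    claimed = []   # (owner, window) pairs accepted so far
    rejected = []  # (owner, window, owner of the conflicting window)
    for owner, window in requests:
        holder = next((o for o, w in claimed if overlaps(window, w)), None)
        if holder is None:
            claimed.append((owner, window))
        else:
            rejected.append((owner, window, holder))
    return claimed, rejected

# Host bridge windows in the order they appear in the dmesg log.
requests = [
    ("PCI0", parse_io("[io 0x0d00-0xefff]")),
    ("PCI0", parse_io("[io 0xf000-0xffff]")),
    ("BR50", parse_io("[io 0xf000-0xffff]")),
]
claimed, rejected = claim_windows(requests)
for owner, (start, end), holder in rejected:
    print(f"{owner} [io {start:#06x}-{end:#06x}] conflicts with {holder}")
```

In this model PCI0 claims [io 0xf000-0xffff] first, so the identical BR50 window is rejected and nothing below BR50 can be placed in it.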
The attached AIDA64 dump shows that Windows Server 2008 accepts the BR50 configuration (device at 85:00.0, with ports 0xfc00-0xfcff), and it works.
I think the PCI0 [io 0xf000-0xffff] _CRS descriptor is likely a BIOS bug. No devices or bridges below PCI0 use that range.
We could consider relaxing the address space collision check, at least at the host bridge level.
Booting with "pci=nocrs" is a workaround.
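To confirm that the option actually reached the kernel, something like the following Python sketch could be used. It assumes that pci= options may be given as a comma-separated list, and it works on a sample string; on a live system the string would come from /proc/cmdline. The function names and the sample command line are made up for illustration.

```python
def pci_options(cmdline):
    """Collect the comma-separated values of every pci= parameter."""
    opts = []
    for token in cmdline.split():
        if token.startswith("pci="):
            opts.extend(o for o in token[len("pci="):].split(",") if o)
    return opts

def nocrs_enabled(cmdline):
    """True if 'nocrs' appears among the pci= options."""
    return "nocrs" in pci_options(cmdline)

# Hypothetical command line; read /proc/cmdline to get the real one.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet pci=nocrs"
print("pci=nocrs active:", nocrs_enabled(sample))
```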
Is there anything else I could or should do?
Was this ever resolved, or was it left with the workaround?
I don't know, since there was no further reply to my latest comment. :-(
Can you check whether there are any BIOS updates available? I know that's not an ideal solution, and theoretically if Windows works, Linux should work too. But I'm not sure how we could work around this in a generic way.
Bjorn, just a quick note right now:
I'm currently on vacation, so I'm unable to check this. I'll come back to this when I'm back at the office after June 16.
This looks like a BIOS bug on Supermicro X8DTH-i/6/iF/6F/X8DTH, BIOS 2.0a 09/29/2010.
If this is still an issue, we might consider some kind of DMI-based quirk to remove [io 0xf000-0xffff] from the PCI0 _CRS.
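As a rough illustration of what a DMI-based match would look at, here is a Python sketch that compares DMI strings against the hardware line quoted above. The split into vendor, product name, and BIOS version is an assumption derived from that line, not a confirmed set of quirk strings. The sketch uses substring matching and reads the standard sysfs DMI directory; QUIRK, read_dmi, and quirk_matches are made-up names.

```python
from pathlib import Path

# Assumed match strings, derived from the hardware line in this report.
QUIRK = {
    "sys_vendor": "Supermicro",
    "product_name": "X8DTH-i/6/iF/6F",
    "bios_version": "2.0a",
}

def read_dmi(dmi_dir="/sys/class/dmi/id"):
    """Read the DMI fields named in QUIRK; missing files yield ''."""
    values = {}
    for field in QUIRK:
        try:
            values[field] = (Path(dmi_dir) / field).read_text().strip()
        except OSError:
            values[field] = ""
    return values

def quirk_matches(dmi, quirk=QUIRK):
    """True if every quirk string occurs as a substring of its DMI field."""
    return all(want in dmi.get(field, "") for field, want in quirk.items())

# Sample values as they would presumably appear on the affected board.
sample = {
    "sys_vendor": "Supermicro",
    "product_name": "X8DTH-i/6/iF/6F",
    "bios_version": "2.0a",
}
print(quirk_matches(sample))
```

On the affected machine, quirk_matches(read_dmi()) should show whether the assumed strings line up with what the firmware reports.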
The workaround you mentioned in comment 4, booting with "pci=nocrs", really helped to get the PCI cards working again.
I'm not sure if a workaround for such a specific BIOS bug should generally be added to the Linux kernel, as long as "pci=nocrs" is still supported by the kernel to fix this locally, if required.
So I think this issue can be closed.
Thanks for your help in figuring this out.
Created attachment 248861 [details]
quirk to ignore _CRS on Supermicro X8DTH-i/6/iF/6F
Hi Martin, here's a patch that basically turns on "pci=nocrs" automatically on this system. If you have time to test it, that'd be great.
The comment #12 patch is in Linus' tree and appeared in v4.10-rc5, so I'm going to close this as resolved. Please reopen it if you still need to boot with "pci=nocrs"; it's possible that I got the DMI strings wrong. If that's the case, please attach the output of "sudo dmidecode" so I can fix the quirk strings.