Bug 90831

Summary: can't claim BAR 0: no compatible bridge window NVIDIA with pci=use_crs
Product: Drivers
Reporter: gob_iron
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: RESOLVED INVALID
Severity: normal
CC: bjorn, yinghai
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 3.18 low-latency compile
Subsystem:
Regression: No
Bisected commit-id:
Attachments: kern.log, lspci -vvn output

Description gob_iron 2015-01-05 21:08:58 UTC
Created attachment 162561
kern.log

Hi, I've just fitted an Nvidia 750 GTX to a 2006 Mac Pro 1,1. The card is not detected in PCIe slot 1 (the 4-lane 16x slot). It is detected in slots 2 and 3, but only at 16x with a single lane, so it has really low bandwidth.

A recent bug report, https://bugzilla.kernel.org/show_bug.cgi?id=85491, describes the "can't claim BAR 0" message as causing a problem for a Radeon card. A patch was submitted there, and it seems a bisect pointed to a kernel patch as the cause.
As this is an Nvidia card (on the proprietary driver), I am submitting this as a separate bug report. Also, kern.log reports:

PCI: Ignoring host bridge windows from ACPI; if necessary, use "pci=use_crs" and report a bug

The kernel only reports that it can't claim BAR 0 when I put the above parameter (pci=use_crs) on the GRUB boot line.

In either case, the machine detects my old Nvidia GTX 8800 with no problem. Both graphics cards are unflashed PC cards, not Apple versions.

I note that there is a stack trace listed at the end of each boot which might relate to the Nvidia driver.
Comment 1 gob_iron 2015-01-06 13:16:30 UTC
Created attachment 162581
lspci -vvn output
Comment 2 gob_iron 2015-01-06 13:38:33 UTC
I'm going to attempt to debug this myself, as I'm reasonably capable of reading code and know how to add debug symbols etc.  I don't (as yet) know how to make patches with diff, but I'll probably work it out.

There is a patch suggested in bug 85941, which I'm going to start with as an entry point into understanding the code, along with the effect of the grub commandline parameter pci=use_crs.  If I find anything of any use I'll post it here.
Comment 3 gob_iron 2015-01-06 17:19:47 UTC
Correction: the bug number in my comment above points to the wrong bug. The bug I meant is https://bugzilla.kernel.org/show_bug.cgi?id=85491.
Comment 4 gob_iron 2015-01-06 17:20:26 UTC
If you have any advice or suggestions, I'd be very grateful.  Many thanks.
Comment 5 Yinghai Lu 2015-01-06 22:47:38 UTC
The BIOS does not allocate that BAR within the parent range at all.

With pci=use_crs and pci=realloc, all BARs get assigned resources.

The trace is not related to PCI.

Could this be a driver problem?
Comment 6 Bjorn Helgaas 2015-03-09 22:38:03 UTC
This message:

  PCI: Ignoring host bridge windows from ACPI; if necessary, use "pci=use_crs" and report a bug

is normal.  By default, we ignore ACPI host bridge information if the BIOS date is older than 2008 (see pci_acpi_crs_quirks()).  You can override that default by booting with "pci=use_crs".  On your system, that results in:

  Command line: ... pci=use_crs ...
  pci_bus 0000:00: root bus resource [mem 0x80000000-0xfe000000]
  pci 0000:00:08.0: can't claim BAR 0 [mem 0xfe700000-0xfe7003ff 64bit]: no compatible bridge window
  pci 0000:00:08.0: BAR 0: assigned [mem 0x80e00000-0x80e003ff 64bit]

The "no compatible bridge window" message means that 0xfe700000 isn't inside the host bridge window of [mem 0x80000000-0xfe000000], so we have to assume that accesses to 0xfe700000 will not reach the device.  Therefore, Linux reassigns BAR 0 to be at [mem 0x80e00000-0x80e003ff] instead.  This apparently works fine.
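The containment test described above can be checked by hand with a short script. This is only an illustrative sketch, not the kernel's code: the kernel does this in C against its resource tree, and the helper names `parse_mem` and `window_contains` are made up for this example. It assumes the ranges are inclusive at both ends, as they are printed in the log lines.

```python
import re

# Matches the "[mem 0xSTART-0xEND ...]" notation used in the log lines above.
_RES = re.compile(r"\[mem 0x([0-9a-f]+)-0x([0-9a-f]+)")


def parse_mem(text):
    """Return (start, end) as integers from a '[mem 0x...-0x...]' string."""
    m = _RES.search(text)
    if m is None:
        raise ValueError("no [mem ...] range in: " + text)
    return int(m.group(1), 16), int(m.group(2), 16)


def window_contains(window, bar):
    """True if the BAR range lies entirely inside the window (ends inclusive)."""
    return window[0] <= bar[0] and bar[1] <= window[1]


# Values copied from the log excerpt above.
window = parse_mem("[mem 0x80000000-0xfe000000]")
bios_bar = parse_mem("[mem 0xfe700000-0xfe7003ff 64bit]")
linux_bar = parse_mem("[mem 0x80e00000-0x80e003ff 64bit]")

print(window_contains(window, bios_bar))   # address assigned by the BIOS
print(window_contains(window, linux_bar))  # address reassigned by Linux
```

The first check fails because 0xfe700000 lies above the window's end of 0xfe000000; the second passes because the reassigned range sits inside the window.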

This warning:

  WARNING: CPU: 0 PID: 2125 at drivers/gpu/drm/drm_ioctl.c:143 drm_setversion+0x17e/0x190 [drm]()
  No drm_driver.set_busid() implementation provided by nv_drm_driver [nvidia]. Use drm_dev_set_unique() to set the unique name explicitly.

is unrelated to the BAR 0 messages above and is not a PCI issue.

That looks like something related to the proprietary Nvidia driver.  Since we can't fix that, I'm closing this as invalid.