Bug 202207 - GeForce GT 710: "BAR 1: failed to assign"
Summary: GeForce GT 710: "BAR 1: failed to assign"
Status: NEEDINFO
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL: https://devtalk.nvidia.com/default/to...
Keywords:
Depends on:
Blocks:
 
Reported: 2019-01-09 23:20 UTC by alan.aversa
Modified: 2019-01-12 21:20 UTC
CC List: 1 user

See Also:
Kernel Version: 4.19.13
Subsystem:
Regression: No
Bisected commit-id:


Attachments
bug report (without pci=nocrs) (54.83 KB, application/gzip)
2019-01-11 03:53 UTC, alan.aversa
bug report (with pci=nocrs) (1.01 MB, application/gzip)
2019-01-11 03:54 UTC, alan.aversa

Description alan.aversa 2019-01-09 23:20:47 UTC
pci 0000:41:00.0: BAR 1: no space for [mem size 0x08000000 64bit pref]
pci 0000:41:00.0: BAR 1: trying firmware assignment [mem 0x98000000-0x9fffffff 64bit pref]
pci 0000:41:00.0: BAR 1: [mem 0x98000000-0x9fffffff 64bit pref] conflicts with PCI Bus 0000:40 [mem 0x9c000000-0xa95fffff window]
pci 0000:41:00.0: BAR 1: failed to assign [mem size 0x08000000 64bit pref]

pci=nocrs fixes the problem, but then the video card results in a blank or green screen over HDMI, despite working with the VESA driver.
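
For reference, a minimal C sketch of one way to confirm what the kernel ended up assigning: read the sysfs resource file for the device (the address 0000:41:00.0 is taken from the log above; the name bar_check.c is just illustrative). Each line of that file is one resource as start, end, and flags, all hex; an unassigned BAR reads back as all zeros.

  /* bar_check.c: print the kernel's view of each BAR of 0000:41:00.0.
   * A sketch only; adjust the device address for your system.
   */
  #include <stdio.h>
  #include <inttypes.h>

  int main(void)
  {
      const char *path = "/sys/bus/pci/devices/0000:41:00.0/resource";
      FILE *f = fopen(path, "r");
      uint64_t start, end, flags;
      int i = 0;

      if (!f) {
          perror(path);
          return 1;
      }
      /* Each line: <start> <end> <flags>; BAR 1 is line 1. */
      while (fscanf(f, "%" SCNx64 " %" SCNx64 " %" SCNx64,
                    &start, &end, &flags) == 3) {
          printf("resource %d: 0x%" PRIx64 "-0x%" PRIx64
                 " flags 0x%" PRIx64 "%s\n",
                 i++, start, end, flags,
                 (start == 0 && end == 0) ? " (unassigned)" : "");
      }
      fclose(f);
      return 0;
  }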

more info and bug reports:
https://devtalk.nvidia.com/default/topic/1046009/linux/blank-green-screen-w-geforce-gt-710-driver-415-25-amp-kernel-4-19-13
Comment 1 Bjorn Helgaas 2019-01-09 23:44:00 UTC
Can you please attach the complete dmesg log here so this bugzilla doesn't depend on the Nvidia forum?

This looks like a system BIOS bug.  It claims the host bridge window doesn't cover the entire space used by 41:00.0:

  ACPI: PCI Root Bridge [S0D2] (domain 0000 [bus 40-5f])
  pci_bus 0000:40: root bus resource [mem 0x9c000000-0xa95fffff window]
  pci 0000:40:01.3: PCI bridge to [bus 41]
  pci 0000:40:01.3:   bridge window [mem 0x98000000-0xa1ffffff 64bit pref]
  pci 0000:41:00.0: reg 0x14: [mem 0x98000000-0x9fffffff 64bit pref]

The device is programmed to use 0x98000000-0x9fffffff, but according to the ACPI host bridge description, the 0x98000000-0x9bffffff portion of that range is not routed to the device.
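
For illustration, the arithmetic behind that conclusion as a tiny C sketch, using only the addresses quoted above (window_check.c is an illustrative name):

  /* window_check.c: split BAR 1 into the part inside and the part
   * outside the host bridge window, using the addresses from the log.
   */
  #include <stdio.h>

  int main(void)
  {
      unsigned long bar_start = 0x98000000UL, bar_end = 0x9fffffffUL;
      unsigned long win_start = 0x9c000000UL, win_end = 0xa95fffffUL;

      if (bar_start < win_start)   /* portion below the window */
          printf("unrouted: 0x%lx-0x%lx\n", bar_start,
                 bar_end < win_start ? bar_end : win_start - 1);
      if (bar_start <= win_end && bar_end >= win_start)   /* overlap */
          printf("routed:   0x%lx-0x%lx\n",
                 bar_start > win_start ? bar_start : win_start,
                 bar_end < win_end ? bar_end : win_end);
      return 0;
  }

This prints "unrouted: 0x98000000-0x9bffffff" and "routed: 0x9c000000-0x9fffffff", i.e. the low 64MB of the 128MB BAR falls outside the window the BIOS described.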

Can you check to see whether there's a BIOS update available for your system?

This resource issue is independent of whatever driver you use.  Can you also attach a complete dmesg log when using the VESA driver?  Maybe that driver just doesn't use BAR 1?
Comment 2 alan.aversa 2019-01-11 03:53:34 UTC
Created attachment 280397
bug report (without pci=nocrs)

without any extra kernel parameters
Comment 3 alan.aversa 2019-01-11 03:54:21 UTC
Created attachment 280399
bug report (with pci=nocrs)

When I use pci=nocrs, the Nvidia modules load because my GeForce GT 710 is found, but starting X results in either a blank screen or a green screen, depending on the resolution settings or the monitor connected (the green screen occurs over HDMI when connected to an LG 24UD58-B 4K monitor).
Comment 4 Bjorn Helgaas 2019-01-12 01:52:39 UTC
I looked at attachment 280397 and attachment 280399, but neither mentions VESA or a framebuffer.  Can you attach a dmesg log when using VESA?  If we knew where the framebuffer was, we could tell if it uses BAR 1.  If it doesn't use BAR 1, that would explain why VESA works even though we have a problem with its address.

BAR 1 looks like it's likely a framebuffer, so it probably is read/write memory as opposed to registers.  When booting with "pci=nocrs", can you use http://cmp.felk.cvut.cz/~pisa/linux/rdwrmem.c to try reading/writing a few bytes at 0x98000000 and a few more bytes at 0x9c000000?  Per the ACPI host bridge description, the first might not be accessible (you might read all 0xff bytes, even if you try to write something else to it), while the second *should* be accessible.
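
If that tool isn't handy, here is a minimal sketch of the same idea (this is not the linked rdwrmem.c, just an illustration): mmap /dev/mem and dump a few bytes at a given physical address.  It needs root, and depending on CONFIG_STRICT_DEVMEM the kernel may refuse some ranges; writes would additionally need O_RDWR and PROT_WRITE.

  /* peek.c: dump 16 bytes at a physical address via /dev/mem.
   * Usage: ./peek 0x98000000   (illustrative sketch, run as root)
   */
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int main(int argc, char **argv)
  {
      if (argc != 2) {
          fprintf(stderr, "usage: %s <phys-addr-hex>\n", argv[0]);
          return 1;
      }
      unsigned long addr = strtoul(argv[1], NULL, 16);
      long page = sysconf(_SC_PAGESIZE);
      int fd = open("/dev/mem", O_RDONLY);
      if (fd < 0) { perror("/dev/mem"); return 1; }

      /* mmap offsets must be page aligned; map the containing page. */
      unsigned char *map = mmap(NULL, page, PROT_READ, MAP_SHARED,
                                fd, addr & ~(page - 1));
      if (map == MAP_FAILED) { perror("mmap"); return 1; }

      unsigned char *p = map + (addr & (page - 1));
      for (int i = 0; i < 16; i++)
          printf("%02x ", p[i]);
      printf("\n");

      munmap(map, page);
      close(fd);
      return 0;
  }

Per the expectation above, the bytes at 0x98000000 would read as 0xff if that region really isn't routed, while 0x9c000000 should behave like normal framebuffer memory.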
Comment 5 alan.aversa 2019-01-12 21:20:22 UTC
I can't tell anymore, because the framebuffer couldn't be found with the GPU in PCI slot #1.  I had to put the GPU in slot #3, and now it works, so I'm not sure whether slot #1 was going bad or this was really a BIOS issue.  (Asus recommends slot #1 for GPUs…)

Also, when I upgraded from 4.19.13 to 4.19.14, the dmesg error indicated that the GPU's memory size was "0M".
