Kernel Bug Tracker – Bug 20432
NVIDIA GPU doesn't work if its driver is loaded last, after all other device drivers have already been loaded
Last modified: 2013-04-09 06:23:26 UTC
Created attachment 33782
My 2.6.36-rc8 .config (kernel configuration) and miscellaneous info
I haven't run an NVIDIA GPU on this computer for almost a year, but I do remember that this option wasn't necessary the last time I used this video accelerator.
Without this option, my GPU isn't assigned any interrupts, and I get this message when trying to start the X server:
(EE) NVIDIA(0): The NVIDIA kernel module does not appear to be receiving
(EE) NVIDIA(0): interrupts generated by the NVIDIA graphics device
(EE) NVIDIA(0): PCI:5:0:0. Please see Chapter 8: Common Problems in the
(EE) NVIDIA(0): README for additional information.
(EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device!
Inspecting /proc/interrupts confirms that the NVIDIA device is indeed not listed there.
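The /proc/interrupts check described above can be scripted. What follows is only a minimal sketch: the function name `irqs_for_driver` is hypothetical, and it assumes the usual layout in which each line starts with an IRQ label followed by a colon and ends with the comma-separated names of the registered handlers.

```python
import re


def irqs_for_driver(interrupts_text, driver="nvidia"):
    """Return the IRQ labels whose line lists `driver` as a handler name.

    `interrupts_text` is the full content of /proc/interrupts.
    """
    irqs = []
    for line in interrupts_text.splitlines()[1:]:  # skip the CPU header row
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an "IRQ: counts ... handlers" line
        # Handler names sit at the end of the line, separated by commas.
        tokens = re.split(r"[,\s]+", rest.strip())
        if driver in tokens:
            irqs.append(label.strip())
    return irqs
```

On the affected machine, passing the content of /proc/interrupts to this function would be expected to return an empty list, which matches the symptom reported here.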
Created attachment 33832
dmesg with pci=biosirq where I cannot use the NVIDIA GPU
I can reproduce the problem, seemingly at random, even with the pci=biosirq option.
I'm going to try running with irqpoll.
Created attachment 33842
dmesg with no options and NVIDIA GPU *working* normally
It doesn't really matter whether I use the pci=biosirq or irqpoll option.
The only thing that matters is the order in which modules and devices are initialized.
If the NVIDIA GPU is initialized last, it will certainly fail. If the NVIDIA module loads before some other devices, it runs OK without any special options.
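One way to check the load order on a running system is to look at /proc/modules. The sketch below is an assumption-laden illustration, not a definitive test: the function name `loaded_after` is hypothetical, it assumes /proc/modules lists the most recently loaded module first, and it only sees loadable modules, not drivers built into the kernel.

```python
def loaded_after(modules_text, module="nvidia"):
    """Return the names of modules listed before `module` in /proc/modules.

    Assuming newest-first ordering, these are the modules loaded later
    than `module`. Returns None if `module` is not loaded at all.
    """
    # The first whitespace-separated field of each line is the module name.
    names = [line.split()[0] for line in modules_text.splitlines() if line.strip()]
    if module not in names:
        return None
    return names[: names.index(module)]
```

An empty list would mean that nothing was loaded after nvidia, that is, the failing case described above.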
This issue is better reported to NVIDIA developers.
Why do I always find this bug when I search for ACPI bugs?
And it shows that this is still filed as an ACPI config bug...
(In reply to comment #3)
> This issue is better reported to NVIDIA developers.
I'm not sure they will respond (positively):
Here's what their documentation says on this matter:
My X server fails to start, and my X log file contains the error:
(EE) NVIDIA(0): The NVIDIA kernel module does not appear to
(EE) NVIDIA(0): be receiving interrupts generated by the NVIDIA graphics
(EE) NVIDIA(0): device PCI:x:x:x. Please see the COMMON PROBLEMS
(EE) NVIDIA(0): section in the README for additional information.
This can be caused by a variety of problems, such as PCI IRQ routing errors,
I/O APIC problems or conflicts with other devices sharing the IRQ (or their
drivers).
If possible, configure your system such that your graphics card does not share
its IRQ with other devices (try moving the graphics card to another slot if
applicable, unload/disable the driver(s) for the device(s) sharing the card's
IRQ, or remove/disable the device(s)).
Depending on the nature of the problem, one of (or a combination of) these
kernel parameters might also help:
pci=noacpi     don't use ACPI for PCI IRQ routing
pci=biosirq    use PCI BIOS calls to retrieve the IRQ routing table
noapic         don't use I/O APICs present in the system
acpi=off       disable ACPI
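When trying these parameters one at a time, it helps to confirm which of them the running kernel actually booted with. A minimal sketch, assuming the command line is read from /proc/cmdline; the names `WORKAROUNDS` and `active_workarounds` are hypothetical.

```python
# The four parameters listed in the README excerpt above.
WORKAROUNDS = ("pci=noacpi", "pci=biosirq", "noapic", "acpi=off")


def active_workarounds(cmdline):
    """Return the listed parameters that appear on a kernel command line."""
    seen = set()
    for param in cmdline.split():
        key, sep, value = param.partition("=")
        if key == "pci" and sep:
            # pci= accepts a comma-separated list, e.g. pci=noacpi,biosirq
            seen.update(f"pci={v}" for v in value.split(","))
        else:
            seen.add(param)
    return [p for p in WORKAROUNDS if p in seen]
```

For example, `active_workarounds(open("/proc/cmdline").read())` would report which of the four workarounds are in effect for the current boot.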
In fact, none of the options listed as possible solutions help.
I tend to think it's a Linux kernel problem, because when I load the nvidia.ko module, it doesn't get listed in /proc/interrupts.
It's a non-free driver; only they have the source to all the parts, so only they can debug it.
Aaron Plattner said I should just disable MSI, as it's "problematic": "I talked to our kernel guy and he said that MSI is notoriously problematic throughout the hardware and software stack, and he recommended that you just stick with traditional interrupts."
OK, it seems like I won't get any help from either side of this bug. Sharing interrupts probably isn't such a bad idea in 2010.