Kernel Bug Tracker – Bug 16443
Reproducible boot failures on NVidia MCP55-based desktop
Last modified: 2010-07-23 16:27:27 UTC
This is a regression, although not a new one, and it is also a timing issue related to the PCI quirks. Namely, my MCP55-based desktop occasionally (in fact, quite often) fails to boot with recent kernels.
I was able to track the issue down to the final PCI quirks called on that box by pci_do_fixups(). More specifically, this most likely is related to the NVidia-specific MSI vs HT quirks. However, adding debug printks to that code changes the behavior in such a way that the problem becomes less reproducible.
I wonder, though, if the HT/MSI quirks are really necessary even for devices that are not going to use MSI.
Created attachment 27221 [details]
dmesg log from the affected machine
dmesg log from a fresh boot with 2.6.35-rc6 and one patch (adding some debug printks) on top.
Created attachment 27222 [details]
Output of lspci -vvv from the affected machine
Created attachment 27223 [details]
Patch adding debug printks to the PCI quirks code
This patch introduces the additional debug printks present in the dmesg output from comment #1.
Created attachment 27224 [details]
This patch prevents us from enabling HT MSI mappings for the NV SATA2 controller(s) on MCP55.
Applied on top of the previous patch, it appears to fix the problem on this box, but it doesn't help when used alone.
Created attachment 27225 [details]
PCI: Do not run NVidia-specific MSI quirks for pci=nomsi
Running the NVidia-specific MSI quirks doesn't make sense for pci=nomsi, so make __nv_msi_ht_cap_quirk() do nothing in that case.
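For readers without the attachment at hand, the idea can be sketched roughly as below. This is a userspace mock and not the patch itself: every mock_* name is hypothetical and exists only for illustration. In the kernel the check would presumably be made with pci_msi_enabled() at the top of __nv_msi_ht_cap_quirk(), but that is an assumption about the attachment's contents.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's MSI state; pci=nomsi would clear it. */
static bool mock_msi_enabled = true;

/* Counts how often the quirk body actually ran (illustration only). */
static int mock_quirk_runs;

/* Mock query; in the kernel the equivalent check is pci_msi_enabled(). */
static bool mock_pci_msi_enabled(void)
{
	return mock_msi_enabled;
}

/* Placeholder for the HT MSI mapping work done by the real quirk. */
static void mock_adjust_ht_msi_mappings(void)
{
	mock_quirk_runs++;
}

/* Sketch of the guarded quirk: return early when MSI is disabled. */
static void mock_nv_msi_ht_cap_quirk(void)
{
	if (!mock_pci_msi_enabled())
		return;

	mock_adjust_ht_msi_mappings();
}
```

With mock_msi_enabled cleared, calling mock_nv_msi_ht_cap_quirk() leaves mock_quirk_runs unchanged, which mirrors the intended behavior: with pci=nomsi the HT MSI mapping code is never touched.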
OK, this patch at least allows the affected box to boot with pci=nomsi, so the problem is resolved for me. I'll submit the patch for upstream inclusion later today.