Bug 16443

Summary: Reproducible boot failures on NVidia MCP55-based desktop
Product: Drivers
Reporter: Rafael J. Wysocki (rjw)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: jbarnes, yinghai
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.34
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: dmesg log from the affected machine
Output of lspci -vvv from the affected machine
Patch adding debug printks to the PCI quirks code
Debug patch
PCI: Do not run NVidia-specific MSI quirks for pci=nomsi

Description Rafael J. Wysocki 2010-07-23 15:39:25 UTC
This is a regression, although not a new one, and it is also a timing issue related to the PCI quirks.  Namely, my MCP55-based box occasionally (in fact, quite often) fails to boot with recent kernels.

I was able to track the issue down to the final PCI quirks called on that box by pci_do_fixups().  More specifically, this most likely is related to the NVidia-specific MSI vs HT quirks.  However, adding debug printks to that code changes the behavior in such a way that the problem becomes less reproducible.

I wonder, though, if the HT/MSI quirks are really necessary even for devices that are not going to use MSI.
Comment 1 Rafael J. Wysocki 2010-07-23 15:48:24 UTC
Created attachment 27221 [details]
dmesg log from the affected machine

dmesg log from a fresh boot with 2.6.35-rc6 and one patch (adding some debug printks) on top.
Comment 2 Rafael J. Wysocki 2010-07-23 15:49:39 UTC
Created attachment 27222 [details]
Output of lspci -vvv from the affected machine
Comment 3 Rafael J. Wysocki 2010-07-23 15:53:06 UTC
Created attachment 27223 [details]
Patch adding debug printks to the PCI quirks code

This patch introduces the additional debug printks present in the dmesg output from comment #1.
Comment 4 Rafael J. Wysocki 2010-07-23 15:55:35 UTC
Created attachment 27224 [details]
Debug patch

This patch prevents us from enabling HT MSI mappings for the NV SATA2 controller(s) on MCP55.

On top of the previous patch it appears to fix the problem on this box, but it doesn't help when used alone.
Comment 5 Rafael J. Wysocki 2010-07-23 16:27:05 UTC
Created attachment 27225 [details]
PCI: Do not run NVidia-specific MSI quirks for pci=nomsi

Running the NVidia-specific MSI quirks doesn't make sense for pci=nomsi, so make __nv_msi_ht_cap_quirk() do nothing in that case.
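For readers without the attachment at hand, the idea can be sketched in user space as follows.  This is only an illustration of the early-return guard, not the attached patch: struct fake_pci_dev, msi_enabled, fake_pci_msi_enabled() and fake_nv_msi_ht_cap_quirk() are all hypothetical names made up for this sketch.  The kernel does provide pci_msi_enabled() for this kind of check, but whether the patch uses exactly that helper is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; NOT the kernel's struct pci_dev. */
struct fake_pci_dev {
	bool ht_msi_mapping_enabled;	/* state the quirk would change */
	int quirk_runs;			/* how many times the quirk body ran */
};

/* Models the boot option: false corresponds to booting with pci=nomsi. */
static bool msi_enabled = true;

/* Hypothetical helper playing the role of an "is MSI enabled?" check. */
static bool fake_pci_msi_enabled(void)
{
	return msi_enabled;
}

/* Sketch of the quirk entry point with the guard added at the top. */
static void fake_nv_msi_ht_cap_quirk(struct fake_pci_dev *dev)
{
	assert(dev != NULL);

	/* With MSI disabled the HT MSI mapping is irrelevant: do nothing. */
	if (!fake_pci_msi_enabled())
		return;

	/* Placeholder for the real HT MSI mapping handling. */
	dev->ht_msi_mapping_enabled = true;
	dev->quirk_runs++;
}
```

With msi_enabled set to false, calling fake_nv_msi_ht_cap_quirk() leaves the device untouched, which mirrors the intended behavior for pci=nomsi: the quirk code that triggers the boot failure is never reached.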


OK, this patch at least allows the affected box to boot with pci=nomsi, so the problem is resolved for me.  I'll submit the patch for upstream inclusion later today.