Bug 16443 - Reproducible boot failures on NVidia MCP55-based desktop
Summary: Reproducible boot failures on NVidia MCP55-based desktop
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-07-23 15:39 UTC by Rafael J. Wysocki
Modified: 2010-07-23 16:27 UTC

See Also:
Kernel Version: 2.6.34
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg log from the affected machine (86.89 KB, text/plain)
2010-07-23 15:48 UTC, Rafael J. Wysocki
Output of lspci -vvv from the affected machine (31.68 KB, text/plain)
2010-07-23 15:49 UTC, Rafael J. Wysocki
Patch adding debug printks to the PCI quirks code (2.48 KB, patch)
2010-07-23 15:53 UTC, Rafael J. Wysocki
Debug patch (628 bytes, patch)
2010-07-23 15:55 UTC, Rafael J. Wysocki
PCI: Do not run NVidia-specific MSI quirks for pci=nomsi (482 bytes, patch)
2010-07-23 16:27 UTC, Rafael J. Wysocki

Description Rafael J. Wysocki 2010-07-23 15:39:25 UTC
This is a regression, although not a new one, and it is also a timing issue related to the PCI quirks.  Namely, my MCP55-based desktop occasionally (in fact, quite often) fails to boot with recent kernels.

I was able to track the issue down to the final PCI quirks called on that box by pci_do_fixups().  More specifically, this most likely is related to the NVidia-specific MSI vs HT quirks.  However, adding debug printks to that code changes the behavior in such a way that the problem becomes less reproducible.

I wonder, though, if the HT/MSI quirks are really necessary even for devices that are not going to use MSI.
Comment 1 Rafael J. Wysocki 2010-07-23 15:48:24 UTC
Created attachment 27221
dmesg log from the affected machine

dmesg log from a fresh boot with 2.6.35-rc6 and one patch (adding some debug printks) on top.
Comment 2 Rafael J. Wysocki 2010-07-23 15:49:39 UTC
Created attachment 27222
Output of lspci -vvv from the affected machine
Comment 3 Rafael J. Wysocki 2010-07-23 15:53:06 UTC
Created attachment 27223
Patch adding debug printks to the PCI quirks code

This patch introduces the additional debug printks present in the dmesg output from comment #1.
Comment 4 Rafael J. Wysocki 2010-07-23 15:55:35 UTC
Created attachment 27224
Debug patch

This patch prevents us from enabling HT MSI mappings for the NV SATA2 controller(s) on MCP55.

On top of the previous patch it appears to fix the problem on this box, but it doesn't help when used alone.
Comment 5 Rafael J. Wysocki 2010-07-23 16:27:05 UTC
Created attachment 27225
PCI: Do not run NVidia-specific MSI quirks for pci=nomsi

Running the NVidia-specific MSI quirks doesn't make sense for pci=nomsi, so make __nv_msi_ht_cap_quirk() do nothing in that case.


OK, this patch at least allows the affected box to boot with pci=nomsi, so the problem is resolved for me.  I'll submit the patch for upstream inclusion later today.
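
For illustration, the idea of making the quirk a no-op when MSI is disabled can be sketched in user-space C.  This is only a model of the approach, not the attached patch and not kernel code: mock_pci_dev, msi_enabled, quirk_runs, mock_pci_msi_enabled() and mock_nv_msi_ht_cap_quirk() are hypothetical stand-ins, and the real __nv_msi_ht_cap_quirk() does considerably more than set a flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's global MSI state.
 * Booting with pci=nomsi would correspond to this being false. */
static bool msi_enabled = true;

/* Hypothetical, heavily simplified stand-in for a PCI device. */
struct mock_pci_dev {
    const char *name;
    bool ht_msi_mapping_enabled;
};

/* Counts how many times the quirk body actually ran. */
static int quirk_runs;

/* Assumed helper: reports whether MSI may be used at all. */
static bool mock_pci_msi_enabled(void)
{
    return msi_enabled;
}

/* Model of the quirk: bail out early if MSI is disabled, so no
 * HT MSI mapping state is touched for devices that cannot use MSI. */
static void mock_nv_msi_ht_cap_quirk(struct mock_pci_dev *dev)
{
    if (!mock_pci_msi_enabled())
        return;

    quirk_runs++;
    dev->ht_msi_mapping_enabled = true;
}
```

With msi_enabled set to false, calling mock_nv_msi_ht_cap_quirk() leaves the device untouched; with it set to true, the mapping flag gets set as before.  Placing the check at the top of the quirk keeps the change small and leaves the MSI-enabled path unchanged.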
