Bug 15626

Summary: Radeon Xpress 200m needs pci quirk to fix or disable MSI.
Product: Drivers
Reporter: Tom Stellard (tstellar)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: RESOLVED CODE_FIX
Severity: normal
CC: akpm, alan, alexdeucher, andyrtr, daniel.blueman
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.34-rc2
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: lspci output for the Radeon xpress 200m graphics card
lspci -t
lspci -xxxx
/proc/interrupts without kms
/proc/interrupts with kms
lspci -vnn
check if gfx msi is enabled
updated patch
fix
fix

Description Tom Stellard 2010-03-25 02:11:01 UTC
Created attachment 25694 [details]
lspci output for the Radeon xpress 200m graphics card

With KMS enabled, running glxgears causes my machine to lock up.  This bug was introduced by commit 3e5cb98dfe87cc61d0a1119dd8aa2b1e4cfab424, which enables MSI for this card.  Booting with pci=nomsi prevents the problem.

This is probably also the cause of this bug: https://bugzilla.kernel.org/show_bug.cgi?id=14801

Here is the original discussion of this bug from the freedesktop bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25662
Comment 1 Tom Stellard 2010-03-25 02:11:40 UTC
Created attachment 25695 [details]
lspci -t
Comment 2 Tom Stellard 2010-03-25 02:12:01 UTC
Created attachment 25696 [details]
lspci -xxxx
Comment 3 Tom Stellard 2010-03-25 02:13:48 UTC
Created attachment 25697 [details]
/proc/interrupts without kms
Comment 4 Tom Stellard 2010-03-25 02:20:20 UTC
Created attachment 25698 [details]
/proc/interrupts with kms
Comment 5 Alex Deucher 2010-03-25 16:18:54 UTC
Can you attach the output of lspci -vnn and /proc/interrupts?
Comment 6 Tom Stellard 2010-03-25 16:28:35 UTC
Created attachment 25712 [details]
lspci -vnn
Comment 7 Alex Deucher 2010-03-25 18:25:59 UTC
Created attachment 25715 [details]
check if gfx msi is enabled

This patch should fix it, but it requires that the patch referenced below be applied first (it should already be in Linus' tree), as it's really the same issue.

http://marc.info/?l=dri-devel&m=126926011226719&w=2
Comment 8 Alex Deucher 2010-03-25 18:32:01 UTC
Created attachment 25716 [details]
updated patch

Whoops, missed an ID in the last patch.  Same conditions apply.
Comment 9 Tom Stellard 2010-03-25 21:31:47 UTC
I tested out the patch, but I am still experiencing the same problem.  

I added a printk at line 2503 of drivers/pci/quirks.c to check the value of nb_cntl, and its value is 0x78f.  I am not sure what this value means, but it causes the if statement immediately following to be skipped, so I don't think MSI is being disabled.
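
For reference, here is a minimal sketch of how the reported nb_cntl value breaks down into individual bits. The helper name set_bits is hypothetical, and which bit the quirk's if statement actually tests is not stated in this thread, so no meaning is assigned to any bit here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store the indices of the set bits of `value` in
 * `out` (lowest bit first) and return how many were found.
 * `out` must have room for 32 entries. */
int set_bits(uint32_t value, int out[32])
{
	int n = 0;

	assert(out != NULL);
	for (int bit = 0; bit < 32; bit++)
		if (value & (UINT32_C(1) << bit))
			out[n++] = bit;
	return n;
}
```

Calling set_bits(0x78f, bits) reports eight set bits: 0, 1, 2, 3, 7, 8, 9 and 10. Comparing that list against the mask used in the if statement would show why the branch is skipped.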
Comment 10 Alex Deucher 2010-03-29 16:49:01 UTC
Created attachment 25757 [details]
fix

Does this patch do the trick?
Comment 11 Tom Stellard 2010-03-30 02:43:02 UTC
(In reply to comment #10)
> Created an attachment (id=25757) [details]
> fix
> 
> Does this patch do the trick?

This patch fixes the problem.
Thank you!
Comment 12 Andrew Morton 2010-03-31 19:18:23 UTC
Alex, please ensure that this patch has Cc:<stable@kernel.org> in the changelog so that we don't forget to backport it.
Comment 13 Alex Deucher 2010-04-01 17:27:46 UTC
Created attachment 25803 [details]
fix

Updated patch with stable cc.
Comment 14 Daniel J Blueman 2010-04-05 23:28:25 UTC
Alex - disabling MSI for only this device (and its subordinate buses via the existing patches) is insufficient, right?

If sufficient, this may be a better solution than disabling MSI globally:

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 81d19d5..603cd97 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2215,6 +2215,16 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA,
 			PCI_DEVICE_ID_NVIDIA_NVENET_15,
 			nvenet_msi_disable);
 
+/* The Radeon Xpress 200M/RS480 has problems with delivering MSI interrupts
+ * correctly, causing crashing; disable MSI for this device.
+ */
+static void __devinit radeon_msi_disable(struct pci_dev *dev)
+{
+	dev_info(&dev->dev, "Disabling MSI for ATI Radeon\n");
+	dev->no_msi = 1;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_ATI, 0x5a3f, radeon_msi_disable);
+
 static int __devinit ht_check_msi_mapping(struct pci_dev *dev)
 {
 	int pos, ttl = 48;
Comment 15 Alex Deucher 2010-04-06 01:05:47 UTC
(In reply to comment #14)
> Alex - disabling MSI for only this device (and its subordinate buses via the
> existing patches) is insufficient, right?

My patch is correct as is.  It doesn't disable MSIs globally; it only disables them for the internal gfx bridge and its subordinate devices.  MSIs work fine on other bridges on the chipset.
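
The difference between a per-device flag (dev->no_msi, as in comment 14) and a bridge-level quirk can be illustrated with a small userspace model. This is only a sketch: the attached patch is not reproduced in this thread, the model_* types and functions are hypothetical stand-ins for the kernel's struct pci_dev and struct pci_bus, and it assumes that MSI is refused for a device when any bus above it carries the PCI_BUS_FLAGS_NO_MSI flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct pci_bus and struct pci_dev;
 * only the fields relevant to MSI gating are modelled. */
struct model_bus {
	struct model_bus *parent; /* NULL for the root bus */
	bool no_msi;              /* models PCI_BUS_FLAGS_NO_MSI in bus_flags */
};

struct model_dev {
	struct model_bus *bus;         /* bus the device sits on */
	struct model_bus *subordinate; /* bus behind a bridge, else NULL */
	bool no_msi;                   /* models dev->no_msi */
};

/* Bridge-level quirk: flag the bus behind the bridge, so every device
 * below it loses MSI while sibling bridges are left untouched. */
void model_quirk_disable_msi(struct model_dev *bridge)
{
	assert(bridge != NULL);
	if (bridge->subordinate)
		bridge->subordinate->no_msi = true;
}

/* A device may use MSI only if neither it nor any bus above it is flagged. */
bool model_msi_allowed(const struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->no_msi)
		return false;
	for (const struct model_bus *b = dev->bus; b; b = b->parent)
		if (b->no_msi)
			return false;
	return true;
}
```

In this model, applying the quirk to the internal gfx bridge disables MSI for the GPU behind it, while a device behind any other bridge on the same root bus still reports MSI as allowed.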
Comment 16 Tom Stellard 2010-04-20 15:05:21 UTC
I haven't seen this patch in any of the 2.6.34-rc* releases.  Is there any chance this fix makes it into 2.6.34?
Comment 17 Sylvia 2010-05-18 09:23:54 UTC
possibly related to this bug as well?
https://bugzilla.kernel.org/show_bug.cgi?id=15982

Xpress 200m, conflict with ALSA when KMS is enabled
Comment 18 Sylvia 2010-05-18 09:36:44 UTC
(In reply to comment #17)
> possibly related to this bug as well?
> https://bugzilla.kernel.org/show_bug.cgi?id=15982
> 
> Xpress 200m, conflict with ALSA when KMS is enabled


Booting with pci=noirq,nomsi doesn't help with sound.
Comment 19 Alex Deucher 2010-05-18 14:13:30 UTC
(In reply to comment #17)
> possibly related to this bug as well?
> https://bugzilla.kernel.org/show_bug.cgi?id=15982

If pci=nomsi doesn't help, then it's not related.  MSIs are already disabled on other rs4xx chipsets.
Comment 20 Alex Deucher 2010-05-18 14:47:59 UTC
I've sent a rebased patch to Jesse and linux-pci.
Comment 21 Tom Stellard 2010-07-23 01:58:05 UTC
This bug was fixed in kernel version 2.6.34 by this commit:

commit c414a117c6094c3f86b533f97beaf45ef9075f03
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Tue Mar 30 17:22:32 2010 -0400

    drm/radeon/kms: disable MSI on IGP chips
    
    Doesn't seem to work reliably and the pci quirks don't
    always work.
    
    Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>


This commit is different from the proposed fix that was attached to this bug.  The patch attached to this bug was committed as 9313ff450400e6a2ab10fe6b9bdb12a828329410 after the 2.6.34 release.

Both of these commits cause the audio on my laptop to stop working.  However, if I boot with pci=noacpi, audio works fine.  As far as I am concerned this bug is fixed; I'll file a different bug about the audio problem.