Bug 15287

Summary: RadeonKMS segfaults kdm on mobility radeon x700 pcie
Product: Drivers
Reporter: Jan Kreuzer (kontrollator)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Severity: high
CC: alexdeucher, jarryson, maciej.rutecki, matthew, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.34-rc1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 14885
Attachments:
        messages drm.debug=5 from nonfunctionating startup
        /proc/interrupts nonkms
        /proc/interrupts kms
        lspci -t
        lspci -xxxx
        Disable MSI for devices below a VIA K8T890

Description Jan Kreuzer 2010-02-12 17:58:00 UTC
Using the KMS option of the radeon driver makes the kernel segfault and reboot when starting a KDE session. This happens on Dave's radeon-testing tree with the latest mesa-git and libdrm-git, and also on the latest Linus tree. The same combination works great on the non-KMS code path.

Hardware is a Mobility Radeon X700 PCIe.

lspci -vn:
02:00.0 0300: 1002:5653 (prog-if 00 [VGA controller])
        Subsystem: 1734:1093
        Flags: bus master, fast devsel, latency 0, IRQ 24
        Memory at b0000000 (64-bit, prefetchable) [size=128M]
        Memory at feaf0000 (64-bit, non-prefetchable) [size=64K]
        I/O ports at b000 [size=256]
        Expansion ROM at feac0000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Express Endpoint, MSI 00
        Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Advanced Error Reporting
Comment 1 Jan Kreuzer 2010-02-12 17:59:42 UTC
Created attachment 25010 [details]
messages drm.debug=5 from nonfunctionating startup

This is from latest drm-radeon-testing booted with drm.debug=5.
Comment 2 Jan Kreuzer 2010-02-13 17:50:36 UTC
Created attachment 25031 [details]
/proc/interrupts nonkms
Comment 3 Jan Kreuzer 2010-02-13 17:51:59 UTC
Created attachment 25032 [details]
/proc/interrupts kms

In the KMS case the interrupt type is PCI-MSI-edge. It seems the interrupts don't get caught.
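A quick way to spot this symptom in a /proc/interrupts dump is to look for MSI lines whose per-CPU counters are all zero. The following is a minimal sketch, not taken from the attachments: the function names and the sample text are made up for illustration, and the parser assumes the usual layout of a CPU header row followed by `label: counts... description` rows.

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count over all CPUs, description)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an "IRQ: ..." row
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break  # rows like ERR/MIS may have fewer counters
            counts.append(int(field))
        table[label.strip()] = (sum(counts), " ".join(fields[len(counts):]))
    return table


def silent_msi_irqs(table):
    """Return labels of MSI interrupts that have never fired."""
    return [irq for irq, (total, desc) in table.items()
            if total == 0 and "msi" in desc.lower()]


# Hypothetical sample, NOT the content of attachment 25032.
SAMPLE = """\
           CPU0       CPU1
  0:      41230          0   IO-APIC-edge      timer
 24:          0          0   PCI-MSI-edge      radeon
ERR:          0
"""

if __name__ == "__main__":
    print(silent_msi_irqs(parse_interrupts(SAMPLE)))
```

On a live system the same functions can be fed the text of /proc/interrupts. A zero count alone does not prove that MSI delivery is broken, since the device may simply not have raised an interrupt yet.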
Comment 4 Jan Kreuzer 2010-02-13 18:03:18 UTC
Disabling PCI MSI with the boot option pci=nomsi,routeirq fixes the problem: Xorg starts, there is no crash, and interrupts are counted properly.
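To confirm that a running kernel was actually booted with this workaround, the `pci=` options can be read back from the kernel command line. This is a small sketch under the assumption that the options appear as one or more comma-separated `pci=` tokens; the helper names and the example command line are hypothetical.

```python
def pci_options(cmdline):
    """Collect all comma-separated values given via pci=... tokens."""
    opts = []
    for token in cmdline.split():
        if token.startswith("pci="):
            opts.extend(o for o in token[len("pci="):].split(",") if o)
    return opts


def msi_disabled(cmdline):
    """True if the command line asks the PCI core to avoid MSI."""
    return "nomsi" in pci_options(cmdline)


if __name__ == "__main__":
    # Hypothetical command line; on a real system read /proc/cmdline.
    example = "root=/dev/sda1 ro pci=nomsi,routeirq"
    print(pci_options(example), msi_disabled(example))
```

On the affected machine the input would come from `open("/proc/cmdline").read()`.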
Comment 5 Alex Deucher 2010-02-13 19:26:13 UTC
Broken MSIs are a motherboard chipset issue rather than a radeon bug.  Your motherboard chipset may need a quirk, or MSIs may need to be disabled altogether.
Comment 6 Jan Kreuzer 2010-02-13 19:44:33 UTC
OK, so this bug needs to be reassigned to PCI? How do I change it? Bugzilla says I have no right to change the assignment to PCI.
Comment 7 Alex Deucher 2010-02-13 20:05:43 UTC
Correct.  I'm not sure how to change it.  It wouldn't let me change it either.
Comment 8 Matthew Wilcox 2010-02-15 21:42:18 UTC
We need lspci -t and lspci -xxxx from the faulty machine so we can make a credible blacklist entry for this device.
Comment 9 Jan Kreuzer 2010-02-20 01:01:28 UTC
Created attachment 25122 [details]
lspci -t
Comment 10 Jan Kreuzer 2010-02-20 01:02:32 UTC
Created attachment 25123 [details]
lspci -xxxx
Comment 11 Jan Kreuzer 2010-02-27 09:01:29 UTC
Also happens with 2.6.33 final. The crucial part seems to be the nomsi option; routeirq is not needed.
Comment 12 Jan Kreuzer 2010-03-12 13:19:44 UTC
With today's git (commit 522dba7134d6b2e5821d3457f7941ec34f668e6d) it crashes when KMS activates. It works with the nomsi option.
Comment 13 Matthew Wilcox 2010-03-12 14:06:16 UTC
Created attachment 25483 [details]
Disable MSI for devices below a VIA K8T890

Sorry I missed your response to this bug earlier.  This patch should fix your problem; please verify.
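The attachment itself is not reproduced in this report, so the following is only a toy model of the idea in its title: once a bridge is marked as unable to forward MSIs, every device on a bus below it falls back to legacy interrupts. It assumes the restriction is inherited by walking up the bus hierarchy, which is how the kernel's PCI_BUS_FLAGS_NO_MSI bus flag is understood to behave. The `Bus` class, the `no_msi` attribute, and the bus numbers are all illustrative, not kernel code.

```python
class Bus:
    """Toy stand-in for a PCI bus; 'no_msi' models a per-bus quirk flag."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.no_msi = False


def msi_allowed(bus):
    """MSI is allowed only if no bus on the path to the root forbids it."""
    while bus is not None:
        if bus.no_msi:
            return False
        bus = bus.parent
    return True


# Hypothetical topology, not taken from the lspci -t attachment.
root = Bus("00")
behind_bridge = Bus("02", parent=root)   # bus holding the graphics device
other = Bus("01", parent=root)

# The quirk marks the secondary bus of the problematic bridge.
behind_bridge.no_msi = True

if __name__ == "__main__":
    for bus in (root, other, behind_bridge):
        print(bus.name, msi_allowed(bus))
```

Unlike the global pci=nomsi boot option, a per-bridge flag of this kind leaves MSI available to devices elsewhere in the system.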
Comment 14 Jan Kreuzer 2010-03-14 19:35:31 UTC
Works great, many thanks. I will close the bug as soon as the fix is committed.
Thank you very much.
Comment 15 Rafael J. Wysocki 2010-03-14 19:40:46 UTC
Handled-By : Matthew Wilcox <matthew@wil.cx>
Patch : http://bugzilla.kernel.org/attachment.cgi?id=25483
Comment 16 lh 2010-03-23 08:17:34 UTC
Interesting, I have exactly the same hardware as you, and it seems I have hit a very similar but much more serious problem.

I cannot get any logs because the system reboots immediately. After a cold start KMS works, but the system reboots when gdm starts. After a warm start it reboots as soon as the radeon module is loaded.

Hardware: VIA motherboard, Mobility Radeon X700, AMD Turion MT34
Comment 17 lh 2010-03-23 08:22:13 UTC
Kernel 2.6.31 works very well here, but 2.6.33 and 2.6.34-rc2 don't.

I will try the patch later.
Comment 18 lh 2010-03-23 12:44:50 UTC
It works! Confirmed here. Thanks very much! I can use the new kernel now!

BTW: when will it be merged into the kernel?
Comment 19 Jan Kreuzer 2010-03-27 10:23:04 UTC
It's merged, commit 134b345081534235dbf228b1005c14590e0570ba

Thank you

Jan Kreuzer
Comment 20 Rafael J. Wysocki 2010-03-27 23:24:53 UTC
On Saturday 27 March 2010, Jan Kreuzer wrote:
> Fixed in commit 134b345081534235dbf228b1005c14590e0570ba
> Thank you
> Jan Kreuzer
> On 21.03.2010 21:30, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.32 and 2.6.33.
> > 
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.32 and 2.6.33.  Please verify if it still should
> > be listed and let the tracking team know (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=15287
> > Subject             : RadeonKMS segfaults kdm on mobility radeon x700 pcie
> > Submitter   : Jan Kreuzer <kontrollator@gmx.de>
> > Date                : 2010-02-12 17:58 (38 days old)
> > Handled-By  : Matthew Wilcox <matthew@wil.cx>
> > Patch               : http://bugzilla.kernel.org/attachment.cgi?id=25483