Bug 42162

Summary: [bisected] continuous gpu resets on radeon
Product: Drivers
Reporter: Niels Ole Salscheider (niels_ole)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: CLOSED CODE_FIX
Severity: normal
CC: alan, florian, jbarnes, jdmason, Nicolas.Mailhot, nissarin
Priority: P1
Hardware: All
OS: Linux
Kernel Version: b03e7495a862b028294f59fc87286d6d78ee7fa1
Tree: Mainline
Regression: Yes
Attachments:
  dmesg output
  output of lspci -vvvxxx
  fix
  v2
  Remove MRRS modification from PCI code
  Remove MRRS modification from PCI code, version 2

Description Niels Ole Salscheider 2011-09-01 15:05:05 UTC
Since commit b03e7495a862b028294f59fc87286d6d78ee7fa1, I have been experiencing continuous GPU resets on my Radeon HD6870 (see the attached dmesg output).
They happen during normal desktop use, but more often (almost immediately) when playing a game.

Booting with pci=pcie_bus_safe does not help.
Comment 1 Niels Ole Salscheider 2011-09-01 15:06:26 UTC
Created attachment 71112
dmesg output
Comment 2 Niels Ole Salscheider 2011-09-01 15:11:33 UTC
Created attachment 71122
output of lspci -vvvxxx
Comment 3 Alex Deucher 2011-09-01 17:12:04 UTC
Created attachment 71142
fix

Possible fix.
Comment 4 Alex Deucher 2011-09-01 18:40:15 UTC
Created attachment 71182
v2

Slightly better version.
Comment 5 Jon Mason 2011-09-01 20:30:30 UTC
It might be better to simply rip out the MRRS tweaking code from pcie_bus_configure_set.
Comment 6 Jon Mason 2011-09-01 21:52:50 UTC
Created attachment 71222
Remove MRRS modification from PCI code

Do not modify the value of MRRS when determining the MPS value.
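
For readers unfamiliar with the two settings: MPS (Max_Payload_Size) and MRRS (Max_Read_Request_Size) are separate fields of the PCI Express Device Control register, at bits 7:5 and bits 14:12 respectively, and each encodes a size of 128 << n bytes. The following is a minimal C sketch of decoding both fields from a raw register value. The helper names are hypothetical and are not kernel functions; the mask values match PCI_EXP_DEVCTL_PAYLOAD and PCI_EXP_DEVCTL_READRQ from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Field masks in the PCIe Device Control register. */
#define DEVCTL_PAYLOAD_MASK 0x00e0u /* bits 7:5, Max_Payload_Size */
#define DEVCTL_READRQ_MASK  0x7000u /* bits 14:12, Max_Read_Request_Size */

/* Hypothetical helper: convert a 3-bit size encoding to bytes (128 << n).
 * Encodings 6 and 7 are reserved by the spec; this helper does not reject
 * them, it only does the arithmetic. */
static unsigned int pcie_field_to_bytes(unsigned int field)
{
    assert(field <= 7);
    return 128u << field;
}

/* Hypothetical helper: MPS in bytes from a Device Control value. */
static unsigned int devctl_mps_bytes(uint16_t devctl)
{
    return pcie_field_to_bytes((devctl & DEVCTL_PAYLOAD_MASK) >> 5);
}

/* Hypothetical helper: MRRS in bytes from a Device Control value. */
static unsigned int devctl_mrrs_bytes(uint16_t devctl)
{
    return pcie_field_to_bytes((devctl & DEVCTL_READRQ_MASK) >> 12);
}
```

Because the two fields are independent, code that chooses an MPS value can leave the MRRS bits alone, which is the behavior the patch description above asks for.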
Comment 7 Jon Mason 2011-09-01 21:53:32 UTC
Please try the patch from comment #6 and verify that it resolves your issues.
Comment 8 Niels Ole Salscheider 2011-09-02 06:57:33 UTC
Yes, this patch solves my issue, too.
Comment 9 Jon Mason 2011-09-02 13:29:03 UTC
Thanks, I'll push the patch shortly (and give you some "Tested-by" credit).
Comment 10 nissarin 2011-09-03 01:18:38 UTC
*** Bug 42172 has been marked as a duplicate of this bug. ***
Comment 11 Nicolas Mailhot 2011-09-04 13:52:06 UTC
(In reply to comment #7)
> Please try the patch from comment #6 and verify that it resolves your issues.

Fixes the radeon problems here too:
https://bugzilla.redhat.com/show_bug.cgi?id=734201
http://koji.fedoraproject.org/koji/taskinfo?taskID=3323239
Comment 12 Jon Mason 2011-09-07 22:01:07 UTC
Created attachment 71932
Remove MRRS modification from PCI code, version 2

Updated version of the patch which uses the MPS "safe" method by default (as the "performance" method was causing issues on some systems).
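
As background, the "safe" method is generally described as giving every device the largest MPS that all devices in the hierarchy support, i.e. the minimum of the per-device maximums. A minimal C sketch of that selection follows. The function name `safe_mps` and the flat array of per-device limits are hypothetical simplifications; this is not the kernel's code, which walks the bus tree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: supported[i] is the largest payload size, in bytes,
 * that device i can handle. Returns the largest MPS usable by all of them.
 * With no devices, fall back to 128 bytes, the smallest PCIe payload size. */
static unsigned int safe_mps(const unsigned int *supported, size_t n)
{
    unsigned int mps = 4096; /* largest payload size the encoding allows */
    size_t i;

    assert(supported != NULL || n == 0);
    if (n == 0)
        return 128;

    for (i = 0; i < n; i++) {
        if (supported[i] < mps)
            mps = supported[i];
    }
    return mps;
}
```

With per-device limits of 512, 256 and 1024 bytes, this picks 256 bytes for the whole hierarchy.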
Comment 13 Jon Mason 2011-09-07 22:39:35 UTC
If it's not too much trouble, I'd appreciate the updated patch being tested.
Comment 14 Florian Mickler 2011-09-08 10:15:23 UTC
A patch referencing this bug report has been merged in Linux v3.1-rc5:

commit d054ac16eeb658bccadb06b12c39cee22243b10f
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Thu Sep 1 17:46:15 2011 +0000

    drm/radeon/kms: make sure pci max read request size is valid on evergreen+ (v2)
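
To illustrate what "making sure the max read request size is valid" can look like, here is a small user-space C sketch that operates on a raw Device Control value. Which encodings count as invalid (0, i.e. 128 bytes, plus the reserved values 6 and 7) and the 512-byte fallback are assumptions for illustration; they are not stated in this report, and the function name `fix_max_read_req` is hypothetical.

```c
#include <stdint.h>

#define DEVCTL_READRQ_MASK  0x7000u /* bits 14:12, Max_Read_Request_Size */
#define DEVCTL_READRQ_SHIFT 12

/* Hypothetical helper: return devctl with the MRRS field replaced by the
 * 512-byte encoding (2) when the current encoding is one this sketch
 * treats as invalid. All other bits are left untouched. */
static uint16_t fix_max_read_req(uint16_t devctl)
{
    unsigned int v = (devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT;

    /* Assumption: 0 and the reserved encodings 6 and 7 are rejected. */
    if (v == 0 || v == 6 || v == 7) {
        devctl = (uint16_t)(devctl & ~DEVCTL_READRQ_MASK);
        devctl = (uint16_t)(devctl | (2u << DEVCTL_READRQ_SHIFT));
    }
    return devctl;
}
```

In a real driver the value would be read from and written back to PCI configuration space rather than passed as an argument.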
Comment 15 Florian Mickler 2012-01-12 21:25:17 UTC
A patch referencing this bug report has been merged in Linux v3.1-rc10:

commit ed2888e906b56769b4ffabb9c577190438aa68b8
Author: Jon Mason <mason@myri.com>
Date:   Thu Sep 8 16:41:18 2011 -0500

    PCI: Remove MRRS modification from MPS setting code