Bug 86891
Summary: | AMD/ATI Tahiti XT 7970 - long lags/stutters in games | ||
---|---|---|---|
Product: | Drivers | Reporter: | Michael Mair-Keimberger (mmk+bugs) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | RESOLVED CODE_FIX | ||
Severity: | high | CC: | adf.lists, alan, alexdeucher, curaga, Dieter |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | >=3.17 until 3.18rc1 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
Kernel config of 3.18rc1
bisec.tar.gz
dmesg output
picture
another picture
picture with VRAM and GTT usage
vally screenshot |
Description
Michael Mair-Keimberger
2014-10-25 15:28:22 UTC
Likely a duplicate of:
https://bugs.freedesktop.org/show_bug.cgi?id=84662
and
https://bugs.freedesktop.org/show_bug.cgi?id=84570

Can you bisect?

(In reply to Alex Deucher from comment #1)
> Likely a duplicate of:
> https://bugs.freedesktop.org/show_bug.cgi?id=84662
> and
> https://bugs.freedesktop.org/show_bug.cgi?id=84570
> Can you bisect?

Sure, give me a few days - I'll try to bisect the problem. :) I guess it will take some time, since there were many commits between 3.16 and 3.17rc1. I'll use the kernel git tree for the bisect (git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git) and bisect between 3.16rc8 and 3.17rc1. As soon as I have some results I'll update this bug.

I thought it would take more time, but I have already finished bisecting :) The result:

59bc1d89d6a4d67c94a9b70fa81bda1d5b04f0cb is the first bad commit
commit 59bc1d89d6a4d67c94a9b70fa81bda1d5b04f0cb
Author: Lauri Kasanen <cand@gmx.com>
Date: Sun Apr 20 20:29:33 2014 +0300

    drm/radeon: Inline r100_mm_rreg, -wreg, v3

    This was originally un-inlined by Andi Kleen in 2011 citing size concerns. Indeed, a first attempt at inlining it grew radeon.ko by 7%.

    However, 2% of cpu is spent in this function. Simply inlining it gave 1% more fps in Urban Terror.

    v2: We know the minimum MMIO size. Adding it to the if allows the compiler to optimize the branch out, improving both performance and size.

    The v2 patch decreases radeon.ko size by 2%. I didn't re-benchmark, but common sense says perf is now more than 1% better.

    v3: Also change _wreg, make the threshold a define. Inlining _wreg increased the size a bit compared to v2, so now radeon.ko is only 1% smaller.

    Signed-off-by: Lauri Kasanen <cand@gmx.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 91cde817761a93a06d21855ec896d22f03685665 e7de121e74c415308e8266c26ae7ad518d0e8530 M drivers

This is the bad commit.
asterix linux # git bisect log
git bisect start
# bad: [7d1311b93e58ed55f3a31cc8f94c4b8fe988a2b9] Linux 3.17-rc1
git bisect bad 7d1311b93e58ed55f3a31cc8f94c4b8fe988a2b9
# good: [64aa90f26c06e1cb2aacfb98a7d0eccfbd6c1a91] Linux 3.16-rc7
git bisect good 64aa90f26c06e1cb2aacfb98a7d0eccfbd6c1a91
# good: [ae045e2455429c418a418a3376301a9e5753a0a8] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect good ae045e2455429c418a418a3376301a9e5753a0a8
# bad: [44c916d58b9ef1f2c4aec2def57fa8289c716a60] Merge tag 'cleanup-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect bad 44c916d58b9ef1f2c4aec2def57fa8289c716a60
# good: [e669830526a0abaf301bf408df69cde33901ac63] Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
git bisect good e669830526a0abaf301bf408df69cde33901ac63
# bad: [7963e9db1b1f842fdc53309baa8714d38e9f5681] Revert "drm: drop redundant drm_file->is_master"
git bisect bad 7963e9db1b1f842fdc53309baa8714d38e9f5681
# good: [8a105aaa25f4504d26ca828f12d709d2213a230e] Merge branch 'drm-armada-devel' of git://ftp.arm.linux.org.uk/~rmk/linux-arm into drm-next
git bisect good 8a105aaa25f4504d26ca828f12d709d2213a230e
# good: [a2fe6cdc03d7a9b0d048a7f32f9d8827e06c67fa] drm/msm/hdmi: fix HDMI_MUX_EN gpio request typo
git bisect good a2fe6cdc03d7a9b0d048a7f32f9d8827e06c67fa
# bad: [e7e31600d3e2f8b7726b0521149fc55c62a90467] drm/radeon: remove taking mclk_lock from radeon_bo_unref
git bisect bad e7e31600d3e2f8b7726b0521149fc55c62a90467
# bad: [c748990b7b1c320c626c758379d50748588c6ed6] drm/radeon: Use correct value for unknown audio/video latency
git bisect bad c748990b7b1c320c626c758379d50748588c6ed6
# good: [96b1b9711031a1e95e3cf15d830802aed38479a6] Merge branch 'drm_kms_for_next-v8' of git://git.linaro.org/people/benjamin.gaignard/kernel into drm-next
git bisect good 96b1b9711031a1e95e3cf15d830802aed38479a6
# good: [636e2582658742b94e7620becce58f939996c961] drm/radeon/dpm: add support for SVI2 voltage for SI
git bisect good 636e2582658742b94e7620becce58f939996c961
# good: [f2c6b0f452c3804496f55655fda28c2809e1a58b] drm/radeon/cik: Add support for new ucode format (v5)
git bisect good f2c6b0f452c3804496f55655fda28c2809e1a58b
# good: [da9976206c15178eeae1b4445c9266125bf35b0a] drm/radeon: enable display scaling on all connectors (v2)
git bisect good da9976206c15178eeae1b4445c9266125bf35b0a
# bad: [59bc1d89d6a4d67c94a9b70fa81bda1d5b04f0cb] drm/radeon: Inline r100_mm_rreg, -wreg, v3
git bisect bad 59bc1d89d6a4d67c94a9b70fa81bda1d5b04f0cb
# good: [3e22920fbd0005927bc41f71daeb056a0f4def82] drm/radeon: consolidate vga and dvi get_modes functions (v2)
git bisect good 3e22920fbd0005927bc41f71daeb056a0f4def82
# first bad commit: [59bc1d89d6a4d67c94a9b70fa81bda1d5b04f0cb] drm/radeon: Inline r100_mm_rreg, -wreg, v3

Created attachment 155491 [details]
bisec.tar.gz
In addition, here are all the benchmarks I made for every bisected kernel. The first one is the 3.16rc7 benchmark, the second the 3.17rc1 one. The others are from the bisected kernels.
You'll see in the benchmarks that there is always a difference of about 100 points (good vs. bad), which is about a 10% performance difference.
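For readers who want to reproduce this kind of comparison, here is a minimal Python sketch of the relative-difference arithmetic. The function name `relative_change` is hypothetical, and the sample inputs are the drm-next scores reported later in this thread (599 without the revert, 801 with it), not the per-bisect-step scores from the attachment.

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Return the change from baseline to candidate as a percentage of baseline."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return (candidate - baseline) / baseline * 100.0


# Sample inputs taken from the drm-next benchmark posted later in this thread.
score_without_revert = 599
score_with_revert = 801

change = relative_change(score_without_revert, score_with_revert)
print(f"Score change with the revert: {change:+.1f}%")
```

Comparing against the "bad" score as the baseline shows how much the revert gains; swapping the arguments shows how much the regression costs relative to the "good" kernel.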
Can you please test with one of kernel git | 3.18-rc2 | drm-next together with
git revert 59bc1d8?

(In reply to Dieter Nützel from comment #5)
> Can you please test with one of kernel git | 3.18-rc2 | drm-next together
> with
> git revert 59bc1d8?

I've tried it with kernel git 3.18rc2, pulled a few minutes ago, and with `git revert 59bc1d8`. The result looks promising. I've made two benchmarks:

          1st    2nd
FPS:      18.3   18.1
Score:    765    757
Min FPS:  5.9    6.3
Max FPS:  32.4   31.9

I didn't include drm-next simply because I don't know how to do that. :) If you want me to test drm-next as well, please point me to some documentation on how to include it. :)

Does Mesa 10.3.2 work better, specifically commit 64c2bdc334ba472603b1e7cd2c3046cfbce285b6?

(In reply to Michael Mair-Keimberger from comment #6)
> I've tried it with kernel git 3.18rc2, pulled a few minutes ago, and with
> `git revert 59bc1d8`. The result looks promising. I've made two benchmarks:
>
>           1st    2nd
> FPS:      18.3   18.1
> Score:    765    757
> Min FPS:  5.9    6.3
> Max FPS:  32.4   31.9

Yes, looks much better, but the code shouldn't touch any relevant (radeonsi/r600g) code paths. - Michel?

On r600g (RV730 AGP) I do NOT see any (real) change with this revert...
Maybe we do not hit the real BAD commit.

> I didn't include drm-next simply because I don't know how to do that. :) If
> you want me to test drm-next as well, please point me to some documentation
> on how to include it. :)

Alex's drm-next-3.19-wip (it shows 3.17-rc5 ;-) for example:
git clone git://people.freedesktop.org/~agd5f/linux/ drm-next-3.19-wip

If you have it already, get it or change it to another tree:
cd drm-next-3.19-wip
git checkout -b drm-next-3.19-wip remotes/origin/drm-next-3.19-wip

Sometimes you need this:
git fetch origin
git reset --hard origin/drm-next-3.19-wip

(In reply to Dieter Nützel from comment #8)
> Yes, looks much better, but the code shouldn't touch any relevant
> (radeonsi/r600g) code paths. - Michel?
>
> On r600g (RV730 AGP) I do NOT see any (real) change with this revert...
> Maybe we do not hit the real BAD commit.

What issue are you seeing and what makes you think it has anything to do with this bug?

(In reply to Michel Dänzer from comment #7)
> Does Mesa 10.3.2 work better, specifically commit
> 64c2bdc334ba472603b1e7cd2c3046cfbce285b6?

I get slightly better results with 10.3.2 (with 3.18rc1):

FPS:      16.6
Score:    694
Min FPS:  3.7
Max FPS:  33.1

But honestly, watching the demo feels like it got even worse. Still very long lags, especially at the beginning of new scenes (before it starts to render).

(In reply to Dieter Nützel from comment #8)
> Alex's drm-next-3.19-wip (it shows 3.17-rc5 ;-) for example:
> git clone git://people.freedesktop.org/~agd5f/linux/ drm-next-3.19-wip
>
> If you have it already, get it or change it to another tree:
> cd drm-next-3.19-wip
> git checkout -b drm-next-3.19-wip remotes/origin/drm-next-3.19-wip
>
> Sometimes you need this:
> git fetch origin
> git reset --hard origin/drm-next-3.19-wip

I've just started cloning drm-next-3.19, but freedesktop seems to be quite slow - looks like I can start testing it tomorrow :/

Created attachment 155841 [details]
dmesg output
I don't know if this helps, but I just saw that I got strange (?) output in dmesg. It showed up about 20 minutes after I ran the benchmark mentioned before (mesa-10.3.2/kernel-3.18rc1). As far as I can remember I didn't do anything specific at that moment - just internet surfing.
(In reply to Michael Mair-Keimberger from comment #10)
> I get slightly better results with 10.3.2 (with 3.18rc1): [...]
> But honestly, watching the demo feels like it got even worse. Still very
> long lags, especially at the beginning of new scenes (before it starts to
> render).

Weird, it seemed to help a lot for myself and many others. Any chance you could try current Mesa Git master?

Can you create a screenshot from running with GALLIUM_HUD and showing the graphs corresponding to a lag, such as in https://bugs.freedesktop.org/show_bug.cgi?id=84570 ?

OK, today I ran more benchmarks with drm-next and mesa-10.3.2. First without any changes, second with `git revert 59bc1d8`:

          without any changes   with `git revert 59bc1d8`
FPS:      14.3                  19.1
Score:    599                   801
Min FPS:  2.1                   12.2
Max FPS:  30.7                  30.7

Honestly, the difference is incredible. I can't believe such a small change has such a big impact. It even seems that with the commit the performance gets worse over time - I never had under 600 points before.

@Michel: mesa-10.3.2 does indeed help. I already had minor lags/stutters in the past (pre-3.17) - that was "normal" for me, but now I have ZERO lags. The complete benchmark ran without a single major lag. AWESOME :)

If anyone is interested I can also create videos, so you can see the differences :)

Created attachment 155911 [details]
picture
I've run a few more benchmarks and tried to take some screenshots. Unfortunately Valley doesn't include the GALLIUM_HUD overlay when I take screenshots. As a workaround I've taken photos with my mobile. Hope that's ok :)
On a side note: I got other kernel messages with the unchanged kernel. I don't know if they're relevant, but they look like this:
[ 2915.586345] radeon 0000:01:00.0: GPU fault detected: 146 0x00139004
[ 2915.586350] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00004B00
[ 2915.586353] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x13090004
[ 2915.586357] VM fault (0x04, vmid 9) at page 19200, write from CB (144)
[ 2915.586362] radeon 0000:01:00.0: GPU fault detected: 146 0x00339004
[ 2915.586365] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00004B07
[ 2915.586367] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x13090004
[ 2915.586370] VM fault (0x04, vmid 9) at page 19207, write from CB (144)
[ 2915.586374] radeon 0000:01:00.0: GPU fault detected: 146 0x00539004
[ 2915.586377] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00004B02
[ 2915.586379] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x13090004
[ 2915.586382] VM fault (0x04, vmid 9) at page 19202, write from CB (144)
[ 2915.586386] radeon 0000:01:00.0: GPU fault detected: 146 0x00739004
.
.
.
I don't know if I would get them with the changed kernel as well, but with my stable kernel (3.16) I never see such messages.
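The decoded "VM fault" lines in the excerpt above follow directly from the two register values printed before them. As a rough illustration, the Python sketch below recomputes the page number, vmid, client ID, and read/write flag from the raw hex values. The bit layout (protections in bits 0-3, client ID in bits 12-19, write flag in bit 24, vmid in bits 25-28) is an assumption chosen because it reproduces the decoded lines in this log; it is not taken from the driver source, and the helper names are hypothetical.

```python
import re

# Assumed bit layout of VM_CONTEXT1_PROTECTION_FAULT_STATUS (inferred from
# the log above, not verified against the radeon driver headers).
PROTECTIONS_MASK = 0xF
CLIENT_ID_SHIFT, CLIENT_ID_MASK = 12, 0xFF
WRITE_SHIFT = 24
VMID_SHIFT, VMID_MASK = 25, 0xF

REGISTER_RE = re.compile(
    r"VM_CONTEXT1_PROTECTION_FAULT_(ADDR|STATUS)\s+0x([0-9A-Fa-f]+)"
)


def decode_status(status: int) -> dict:
    """Split a fault status word into its assumed fields."""
    return {
        "protections": status & PROTECTIONS_MASK,
        "client_id": (status >> CLIENT_ID_SHIFT) & CLIENT_ID_MASK,
        "write": bool((status >> WRITE_SHIFT) & 1),
        "vmid": (status >> VMID_SHIFT) & VMID_MASK,
    }


def parse_faults(log_text: str) -> list:
    """Pair each FAULT_ADDR line with the FAULT_STATUS line that follows it."""
    faults, page = [], None
    for kind, value in REGISTER_RE.findall(log_text):
        number = int(value, 16)
        if kind == "ADDR":
            page = number  # the log prints this same value in decimal as "page"
        elif page is not None:
            fault = decode_status(number)
            fault["page"] = page
            faults.append(fault)
            page = None
    return faults


SAMPLE = """\
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00004B00
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x13090004
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00004B07
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x13090004
"""

for fault in parse_faults(SAMPLE):
    access = "write" if fault["write"] else "read"
    print(f"page {fault['page']}, vmid {fault['vmid']}, "
          f"{access} from client {fault['client_id']}")
```

Feeding a full dmesg dump through `parse_faults` makes it easy to see whether the faulting pages cluster in one range, which is the kind of detail that can help whoever triages the fault.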
Created attachment 155921 [details]
another picture
another picture
(In reply to Michael Mair-Keimberger from comment #14)
> Honestly, the difference is incredible. Can't believe such a small change
> has such a big impact.

Yeah, it's really weird. Looking at the change, the only way I could imagine it possibly having any negative impact would be if it somehow caused the indirect register access method to be used even when it's not necessary. But I'm not sure how that could happen.

(In reply to Michael Mair-Keimberger from comment #15)
> Unfortunately Valley doesn't include the GALLIUM_HUD overlay when I take
> screenshots. As a workaround I've taken photos with my mobile. Hope that's
> ok :)

That's fine, but we also need to see the VRAM and GTT graphs.

Created attachment 156051 [details]
picture with VRAM and GTT usage

(In reply to Michel Dänzer from comment #17)
> That's fine, but we also need to see the VRAM and GTT graphs.

Sorry, I completely forgot about that.
Another picture with VRAM and GTT usage. I've used `GALLIUM_HUD=fps,requested-VRAM+VRAM-usage,requested-GTT+GTT` to start the benchmark.

(In reply to Michael Mair-Keimberger from comment #18)
> Another picture with VRAM and GTT usage. I've used
> `GALLIUM_HUD=fps,requested-VRAM+VRAM-usage,requested-GTT+GTT` to start the
> benchmark.

Should be ...requested-GTT+GTT-usage

I used to have similar issues with valley, but for my setup/card (R9270X) they are fixed with current mesa + drm-next-3.19-wip.

One thing I always do is set CPUs to performance in case cpufreq messes things up - may be worth a try to see if it helps.

What setting(s)/res do you run valley with?

It may be less hassle for you to use a phone, but FWIW the way I get screenshots that include the HUD is to use xwd - for something fullscreen I would, before starting valley, run something like this from a different xterm/console/whatever -

sleep 100 && xwd -root -out whatever.xwd

then start valley and wait. To view "whatever.xwd" you can use xwud; to upload it you could convert it to another "normal" format. You need some image program to do this - I have ImageMagick installed and can just type in a terminal -

convert whatever.xwd whatever.png

Created attachment 156161 [details]
vally screenshot

(In reply to Andy Furniss from comment #19)
> Should be ...requested-GTT+GTT-usage
>
> I used to have similar issues with valley, but for my setup/card (R9270X)
> they are fixed with current mesa + drm-next-3.19-wip.
>
> One thing I always do is set CPUs to performance in case cpufreq messes
> things up - may be worth a try to see if it helps.
>
> What setting(s)/res do you run valley with?

It's fixed! (for me) - mesa git did the miracle :)

FYI - changing the CPUs to performance didn't have any influence. I've made another screenshot, this time with xwd (thanks for the tip btw) and with GTT-usage (thanks for pointing that out - that was a copy-paste error). I don't know if it's still important, but I'll upload it anyway.

Settings for Valley are as follows:
Quality: Ultra
Stereo 3d: Disabled
Monitors: Single
Anti-aliasing: Off
Full Screen: Yes
Resolution: 2560x1600

Just to clarify - the screenshot was made with mesa-10.3.2 and the CPU frequency governor set to performance. With mesa git I got the following results:

FPS:      20.0
Score:    838
Min FPS:  7.4
Max FPS:  30.5

Pretty neat actually! I'm going to run another benchmark with my patched kernel (git revert 59bc1d8) and see whether it has an influence on performance :) |