Bug 13713

Summary: [drm/i915] Possible regression due to commit "Change GEM throttling to be 20ms (...)"
Product: Drivers
Reporter: kazikcz
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: CLOSED DOCUMENTED
Severity: normal
CC: dwalker, eric, karabaja4, maximlevitsky, rjw, sa
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.31-rc2
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 13615

Description kazikcz 2009-07-05 10:49:04 UTC
Some undemanding 3D-accelerated applications (tested: quake3-demo, glxgears, stepmania 3.9) seem to suffer from starvation of some sort. They are unusable and temporarily freeze X itself.

The applications render their frames in bursts: a few frames show up, then rendering freezes for a split second (X itself freezes too). The bursts occur quite regularly, a few times per second. In quake3-demo with cg_drawfps enabled, the reported frame rate jumps between values like 50 and 300.

Reverting the commit b962442e46a9340bdbc6711982c59ff0cc2b5afb - "drm/i915: Change GEM throttling to be 20ms like the comment says." fixes the issue for me.

Also, limiting the maximum frame rate (com_maxfps) in quake3-demo to about 50 (1000 ms / 20 ms = 50, I thought) eliminates the bursts. The higher the limit, the stronger the bursts. Enabling vertical sync seems to help too (my display runs at 60 Hz).
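
The arithmetic behind that guess can be written out as a small sketch. It only assumes the 20 ms figure from the commit title; the function name is made up for illustration and nothing here is taken from the driver. It counts how many whole frames an application capped at a given rate can submit inside one throttle window:

```python
def frames_per_window(fps_cap: int, window_ms: int = 20) -> int:
    """Whole frames an app capped at fps_cap can submit within window_ms.

    Hypothetical helper for illustration; 20 ms is the throttle interval
    named in the commit title, everything else is an assumption.
    """
    # fps_cap frames per 1000 ms, scaled to the window; integer math keeps
    # the result exact.
    return (fps_cap * window_ms) // 1000


if __name__ == "__main__":
    for cap in (50, 60, 125, 300):
        print(f"com_maxfps={cap}: {frames_per_window(cap)} frame(s) per 20 ms")
```

At a cap of 50 or 60 at most one frame falls into each window, while at 300 several do, which would be consistent with the bursts getting stronger as the limit goes up.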

I tested against mesa/xorg-server/xf86-video-intel from git and against the latest official releases. This happens with all three combinations (EXA, UXA, UXA+KMS), regardless of the Tiling option.

My laptop: Intel GM965/GL960, 1 GB of RAM, Intel Core 2 Duo T7100 @ 1.80GHz.
Comment 1 Maxim Levitsky 2009-07-05 19:04:16 UTC
I confirm similar findings.

1) Many games, especially neverball, are very bursty, as you describe.
2) glxgears performance recently dropped from 950 to 750 FPS.

I changed the throttling timeout to 5 ms. Neverball is no longer bursty, glxgears is back to 950 FPS, and googleearth may work again without crashes (it does now, but I am not sure this change is what fixed it).

On the other hand, increasing the timeout to 300 ms, as I initially did, only made the problem much worse.
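
A toy model may help picture why a longer timeout could make things worse. This is a sketch under loud assumptions, not the i915 code: the client needs `submit_ms` of CPU time per frame, the GPU needs `render_ms` per frame, and before each submission the client blocks until every frame it submitted more than `window_ms` ago has completed. All names and the default timings are invented for illustration.

```python
def max_queue_depth(window_ms, submit_ms=1, render_ms=4, n_frames=50):
    """Toy model: largest number of frames queued on the GPU at once.

    Not the kernel's implementation; timings and names are assumptions.
    """
    submitted = []  # client time at which each frame was queued (ms)
    finished = []   # time at which the GPU completes each frame (ms)
    now = 0         # client clock (ms)
    gpu_free = 0    # time at which the GPU has drained its backlog (ms)
    deepest = 0
    for _ in range(n_frames):
        # Throttle: the cutoff is fixed on entry; block until every frame
        # submitted at or before the cutoff has completed.
        cutoff = now - window_ms
        old = [f for s, f in zip(submitted, finished) if s <= cutoff]
        if old:
            now = max(now, max(old))
        # Queue the new frame behind whatever the GPU is still working on.
        gpu_free = max(now, gpu_free) + render_ms
        submitted.append(now)
        finished.append(gpu_free)
        # Count frames (including the new one) not yet completed.
        deepest = max(deepest, sum(1 for f in finished if f > now))
        now += submit_ms  # CPU cost of preparing the next frame
    return deepest


if __name__ == "__main__":
    for window in (5, 20, 300):
        print(f"window={window} ms: max queued frames = {max_queue_depth(window)}")
```

In this model the backlog of queued frames grows with the window, so the client runs further ahead of the GPU before it is stopped. Whether that is the actual mechanism behind the burstiness reported here is not established by this thread.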
Comment 2 Maxim Levitsky 2009-07-05 19:06:04 UTC
I also want to note that I couldn't revert this commit: it creates conflicts, and even after a manual merge the kernel still didn't compile.


I use -git of everything (updated daily)
Comment 3 Daniel Walker 2009-07-15 15:31:03 UTC
Adding the commit owner to the CC.
Comment 4 Rafael J. Wysocki 2009-07-21 21:36:44 UTC
Caused by:

commit b962442e46a9340bdbc6711982c59ff0cc2b5afb
Author: Eric Anholt <eric@anholt.net>
Date:   Wed Jun 3 07:27:35 2009 +0000

    drm/i915: Change GEM throttling to be 20ms like the comment says.

First-Bad-Commit : b962442e46a9340bdbc6711982c59ff0cc2b5afb

Notify-Also : Jesse Barnes <jbarnes@virtuousgeek.org>
Comment 5 Eric Anholt 2009-08-03 20:37:02 UTC
Mesa fix (yes, our Mesa driver was really buggy for not already behaving something like this commit):

commit 0828579a658af01a64b5e699175dc9bbbedcd685
Author: Eric Anholt <eric@anholt.net>
Date:   Tue Jul 21 11:23:18 2009 -0700

    intel: Wait on the last swapbuffers to complete before queuing a new one.
    
    This fixes jerkiness in doom3 and other apps since the kernel change to
    throttle less absurdly, which led to a thundering herd of frames.
    
    Because this is a rather minimal fix, there is at least one downside: If
    the whole scene completes in one batchbuffer, we'll end up stalling the GPU.
    
    Thanks to Michel Dänzer for suggesting using glFlush to signal frame end
    instead of going to all the effort of adding a new DRI2 extension.
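
The idea in the commit title, waiting on the previous swap before queuing the next, can be sketched with a toy pacing model. This is not the Mesa code: the function name, the `wait_for_previous` flag, and the timings are all hypothetical, and the model does not capture the GPU stall the commit message warns about.

```python
def submit_times(n_frames, submit_ms=1, render_ms=4, wait_for_previous=True):
    """Toy model: client-side times (ms) at which each frame is queued.

    Hypothetical sketch; not taken from the Mesa or kernel sources.
    """
    times = []
    now = 0       # client clock (ms)
    gpu_free = 0  # completion time of the most recently queued frame (ms)
    for _ in range(n_frames):
        if wait_for_previous:
            # Block until the previously queued frame has completed.
            now = max(now, gpu_free)
        # The GPU starts the new frame once it is free.
        gpu_free = max(now, gpu_free) + render_ms
        times.append(now)
        now += submit_ms  # CPU cost of preparing the next frame
    return times


if __name__ == "__main__":
    print("waiting:    ", submit_times(5))
    print("not waiting:", submit_times(5, wait_for_previous=False))
```

With the wait, the client queues frames at the pace the GPU finishes them. Without it, frames are queued as fast as the CPU can produce them and pile up until something else throttles the client.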
Comment 6 Dule Dulic 2009-08-09 21:54:16 UTC
Was this bug fixed?

I still have the problem on my i915 card... 3D rendering is very choppy (as described in the original report).

I'm using: xf86-video-intel 2.8.0, xorg-server 1.6.3, latest mesa etc., latest git kernel (built today).
Comment 7 Dule Dulic 2009-08-14 21:32:55 UTC
Never mind, Mesa from git fixed it.
Comment 8 Rafael J. Wysocki 2009-08-14 21:43:22 UTC
Should the bug be closed, then?
Comment 9 Dule Dulic 2009-08-14 22:04:22 UTC
As stated above:

"Because this is a rather minimal fix, there is at least one downside: If the whole scene completes in one batchbuffer, we'll end up stalling the GPU."

So, this is more of a quick workaround than a fix? Maybe it's not such a good idea to rush to close the bug, unless it's not (or will not be) kernel-related?
Comment 10 Rafael J. Wysocki 2009-08-15 14:08:52 UTC
I was just asking and I don't get your point.  Sorry.
Comment 11 Eric Anholt 2009-08-19 00:56:21 UTC
The kernel bugzilla is the wrong place to be tracking a Mesa bug.

(Yes, this should be closed)
Comment 12 Rafael J. Wysocki 2009-08-19 19:14:31 UTC
Thanks Eric, closing.
Comment 13 Michel Dänzer 2009-08-21 16:38:39 UTC
FWIW, I followed up with some other possible solutions. It actually seems best overall to handle this in the X driver rather than in the Mesa driver.