Bug 16881

Summary: [REGRESSION, Radeon-KMS] 2.6.36-rc[1-3] - missing textures in 0 A.D.
Product: Drivers Reporter: trapdoor6
Component: Video(DRI - non Intel) Assignee: drivers_video-dri
Status: CLOSED INVALID    
Severity: normal CC: alexdeucher, florian, maciej.rutecki, rjw, trapdoor6
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 2.6.36-rc2,3 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 16444    
Attachments: 0 A.D. on kernel 2.6.35.3
0 A.D. on kernel 2.6.36-rc1-git3
logs generated on 2.6.35.3 when 0 A.D. was running
logs generated on 2.6.36-rc1-git3 when 0 A.D. was running

Description trapdoor6 2010-08-24 12:20:43 UTC
Created attachment 27751
0 A.D. on kernel 2.6.35.3

Pasting below what I already posted on LKML. These e-mails describe the problem.
Also attaching related screenshots and archived log files.


---------- Forwarded messages ----------
### Link to LKML thread: http://lkml.org/lkml/2010/8/24/19

From: trapDoor <trapdoor6@gmail.com>
Date: Tue, Aug 24, 2010 at 12:27 PM
Subject: Re: [REGRESSION, Radeon-KMS] 2.6.36-rc1 - graphic issues in 0. A.D.
To: Alex Deucher <alexdeucher@gmail.com>


On Tue, Aug 24, 2010 at 6:09 AM, Alex Deucher <alexdeucher@gmail.com> wrote:
> On Mon, Aug 23, 2010 at 12:00 PM, trapDoor <trapdoor6@gmail.com> wrote:
>> On Mon, Aug 23, 2010 at 6:12 AM, Alex Deucher <alexdeucher@gmail.com> wrote:
>>> On Sun, Aug 22, 2010 at 3:41 PM, trapDoor <trapdoor6@gmail.com> wrote:
>>>> Hello,
>>>> Just wanted to let you know about this before bisecting which
>>>> hopefully I will be able to start tomorrow. Please take a look at the
>>>> screenshots in below links to see the difference with rendering
>>>> textures in 0 A.D. alpha1 on kernel 2.6.35.3 and 2.6.36-rc1-git3. [0
>>>> A.D. - strategic game, OS clone of Age of Empires; home page:
>>>> http://wildfiregames.com/0ad/]
>>>>
>>>> 0 A.D. on kernel 2.6.35.3 [good]:
>>>>
>>>> http://picasaweb.google.co.uk/104351852606666221362/0ad?authkey=Gv1sRgCMne0NqXpuzR_gE#5508312214769142226
>>>>
>>>> 0 A.D. on kernel 2.6.36-rc1-git3:
>>>>
>>>> http://picasaweb.google.co.uk/104351852606666221362/0ad?authkey=Gv1sRgCMne0NqXpuzR_gE#5508312218519295058
>>>>
>>>> Please note that in both cases only kernel was different. The other
>>>> components [including hardware] were in the same versions and had the
>>>> same options (like the game itself, xorg-ati drivers, mesa, libdrm
>>>> [S3TC disabled], etc.). So it must be kernel then.
>>>
>>> What card?  Anything in your dmesg?
>>>
>>> Alex
>>>
>>
>> Hi Alex,
>> Sorry for lack of details in my first e-mail. I was hoping to send you
>> the logs together with bisecting results. Unfortunately there are
>> other issues between 2.6.35-git2 - the last 'good' kernel (which
>> doesn't include the first drm pull for 2.6.36 window merge) - and
>> 2.6.36-rc1. Due to those issues I can't bisect.
>>
>> For example:
>> 1) First I tried to narrow the problem down to the closest affected
>> kernel snapshot. So I marked 2.6.35-git2 as good and I was expecting
>> that 2.6.35-git3 will be a bad one (as -git3 is the first snapshot
>> that includes drm patches from the first drm pull). But on -git3 0
>> A.D. even fails to start.
>>
>> 2) Then I was hoping to do bisecting between 2.6.35-git11 and
>> 2.6.35-git12 (-git12 is the first snapshot that includes patches from
>> the second drm pull). But both of these snapshots won't even compile for
>> me. It stops suddenly at the second stage of making modules without
>> giving any errors (despite kernel debugging enabled in .config). I
>> remember I had this problem for a while, AFAIR since around
>> 2.6.35-git5 to -git15 - between these I couldn't compile any snapshot
>> I had tried.
>>
>> It's very likely that the issue I wanted to bisect is related to one
>> of or both drm pulls . So doing bisect between e.g. these kernels:
>> 2.6.35-git16 (which includes both drm pulls; assuming it's the first
>> snapshot since -git5 I could compile) and 2.6.36-rc1 would be
>> pointless - they both will be bad.
>>
>>
>> ------------
>> Now, this is my card:
>>
>> Asus ATI Radeon HD3650 Silent 512MB
>> some glxinfo details:
>>        OpenGL vendor string: Advanced Micro Devices, Inc.
>>        OpenGL renderer string: Mesa DRI R600 (RV635 9598) 20090101  TCL DRI2
>>        OpenGL version string: 2.1 Mesa 7.9-devel
>>        OpenGL shading language version string: 1.20
>>
>>
>> ------------
>> About the logs. It happens that 0 A.D. does produce really huge logs.
>> Running the game just for about 10 SECONDS (counting after chosen map
>> has been loaded) resulted in these:
>>
>> On a 'good' kernel 2.6.35.3
>>        60K     ./Xorg.0.log
>>        48K     ./dmesg
>>        27M     ./kern.log
>>        27M     ./syslog
>>        27M     ./messages
>>        81M     [total]
>>
>> on a bad kernel 2.6.36-rc1
>>        60K     ./Xorg.0.log
>>        52K     ./dmesg
>>        21M     ./kern.log
>>        21M     ./syslog
>>        21M     ./messages
>>        63M     [total]
>>
>> There's no mistake above: 3 logs had been blown up to over 20M during
>> only the little time when the game was running.
>> How I did it: deleted all those logs completely, re-booted from the
>> 'good' kernel, ran the game and stopped it. Then archived the logs,
>> deleted them again and repeated the same on the 'bad' kernel.
>>
>> So it looks like 99,999% of all lines in kern.log, messages and syslog
>> were produced on both kernels when 0 A.D. was running. Before I ran it
>> for the first time, I had never seen kern.log and messages in /var/log at
>> all, and syslog was never bigger than around 1,5M (with logs collected
>> from several boots and during a couple of days). Running 0 A.D. for a
>> couple of minutes results in about 1G for each of those 3 logs.
>>
>> After compressing, the archive files are relatively small:
>>        2.0M    ./2.6.35.3_0ad-logs.tar.bz2
>>        680K    ./2.6.35.3_0ad-logs.tar.lzma
>>
>>        1.6M    ./2.6.36-rc1-00159-g36423a5_0ad-logs.tar.bz2
>>        748K    ./2.6.36-rc1-00159-g36423a5_0ad-logs.tar.lzma [yes, somehow
>>        this came up bigger than 2.6.35.3(...).lzma]
>>
>>
>> Do you mind if I send the .lzma's to you (unless you prefer .bz2)? I
>> won't cc LKML or anyone of course, they will be sent only to you.
>> Please let me know.
>
> I don't really need the whole files, likely it's just a few messages
> repeated.  Go ahead and send me the files and I'll take a look.
>
> Alex
>

Files attached.
The problem still occurs on 2.6.36-rc2-00098-gd1b113b (which includes
drm patches from yesterday's pull).

--
Thanks
Tomasz
---------- End of forwarded messages ----------


Tomasz
Comment 1 trapdoor6 2010-08-24 12:26:19 UTC
Created attachment 27761
0 A.D. on kernel 2.6.36-rc1-git3

Lots of missing textures; they are displayed correctly on kernel 2.6.35.3
Comment 2 trapdoor6 2010-08-24 12:32:21 UTC
Created attachment 27771
logs generated on 2.6.35.3 when 0 A.D. was running

These logs uncompressed are 81 MB. Archived in tar.lzma format for best compression ratio.
Comment 3 trapdoor6 2010-08-24 12:34:13 UTC
Created attachment 27781
logs generated on 2.6.36-rc1-git3 when 0 A.D. was running

These logs uncompressed are 62 MB. Archived in tar.lzma format for best compression ratio
Comment 4 Rafael J. Wysocki 2010-08-26 20:00:53 UTC
*** Bug 17141 has been marked as a duplicate of this bug. ***
Comment 5 Florian Mickler 2010-08-30 13:38:27 UTC
On Mon, 30 Aug 2010 10:54:41 +0100
trapDoor <trapdoor6-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> There was at least one drm pull between 2.6.36-rc2 and -rc3. So I
> tested on -rc3 and I confirm that the issue is still present.
> 
> I was going to update the bugzilla entry but it won't let me log in at
> this moment.
>
Comment 6 Alex Deucher 2010-08-30 17:01:45 UTC
Do you know what specific commit caused the breakage?  Both sets of logs have the same vbo message.
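
If the logs really are just a few messages repeated, a frequency count makes that visible without reading tens of megabytes. A minimal sketch in Python, assuming syslog-style lines of the form `<date> <host> kernel: [<timestamp>] <message>`; the function name and the prefix pattern are illustrative and may need adjusting for other log formats:

```python
import re
from collections import Counter

# Assumed prefix: an optional "<date> <host> kernel: " part followed by an
# optional "[seconds.fraction]" timestamp. Anything else is left untouched.
_PREFIX = re.compile(r"^(?:.*?kernel: )?(?:\[\s*\d+\.\d+\]\s*)?")


def top_messages(lines, n=5):
    """Return the n most frequent messages with the variable prefix stripped."""
    counts = Counter(
        _PREFIX.sub("", line.strip(), count=1)
        for line in lines
        if line.strip()  # ignore blank lines
    )
    return counts.most_common(n)
```

Passing an open file works, since the function only iterates over lines, for example `top_messages(open("kern.log", errors="replace"))`. Running it on the logs from both kernels and comparing the two lists would show whether any message is specific to the 'bad' kernel.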
Comment 7 trapdoor6 2010-08-30 18:07:05 UTC
Alex,
Unfortunately I don't know. I would if I could do bisecting. But due to the problems I described in the first e-mail I couldn't. So far I've got only what I had initially: the screenshot and the logs. Maybe there are some other logs which could give us something more specific? I have searched in my home folder in the 0 A.D. config and cache files but I can't see anything useful there.

I wonder if anyone else has come across the same issue? Or maybe you would be able to reproduce it? That involves installing 0 A.D., unfortunately (I don't know what other software would be good to use as an equivalent for testing this case).
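
On the bisecting problem: commits that fail to compile, or on which the game does not start at all, do not have to stop a bisection, because `git bisect skip` marks them as untestable and moves on. A minimal sketch of that bookkeeping in Python, assuming a git checkout of the kernel tree rather than the -git snapshots; the helper names and the `make` command line are placeholders, not a tested procedure:

```python
import subprocess

VERDICTS = ("good", "bad", "skip")


def bisect_command(built, verdict=None):
    """Choose the git bisect subcommand for the current commit.

    built   -- True if the kernel compiled
    verdict -- manual test result, one of VERDICTS; "skip" covers cases
               such as the game failing to start at all
    """
    if not built:
        # An unbuildable commit cannot be judged, so skip it.
        return ["git", "bisect", "skip"]
    if verdict not in VERDICTS:
        raise ValueError(f"verdict must be one of {VERDICTS}")
    return ["git", "bisect", verdict]


def build_kernel(make_cmd=("make", "-j4")):
    """Run the (assumed) build command in the current tree; True on success."""
    return subprocess.run(list(make_cmd)).returncode == 0


def mark_current_commit(built, verdict=None):
    """Report the result to git bisect, which checks out the next candidate."""
    subprocess.run(bisect_command(built, verdict), check=True)
```

This would be used after `git bisect start`, with for example `v2.6.36-rc1` marked bad and `v2.6.35` marked good. If too many commits around the culprit are skipped, git can only report a range of candidates instead of a single commit.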
Comment 8 Alex Deucher 2010-08-30 19:28:00 UTC
I've installed 0ad here, however, I've never used it before.  Can you explain how to trigger the bug?
Comment 9 trapdoor6 2010-08-30 20:53:51 UTC
[originally posted on LKML]

I can't log on to bugzilla again, so replying from here [LKML].

When you start the game please choose single player, default map
(Arcadia) and accept (you can skip the players setup). The bug appears
right away when the map has been loaded and the game begins. On my box
that happens only on kernels 2.6.36-rc[1-3]. On 2.6.35.[0-4] it's OK.
Comment 10 trapdoor6 2010-08-30 20:54:56 UTC
One more thing. I have no support for S3TC compressed textures enabled
in mesa and that turns up a performance warning message. I just accept
it to get to the initial game menu. Lack of S3TC support causes
performance slowdown on both 'good' and 'bad' kernels but this is
rather irrelevant for the case we are testing. What's important is to
keep the same settings within libdrm, mesa etc when running the game
on 2.6.35.x and 2.6.36-rcx
Comment 11 Alex Deucher 2010-08-30 21:33:07 UTC
Weird; works fine here (other than the vbo warnings).  Everything renders correctly.  What version of mesa are you using?
Comment 12 trapdoor6 2010-08-30 22:29:27 UTC
Here you are:
mesa 7.9.0+git20100830.f3eebb84-0ubuntu0sarvatt~lucid [classic drivers]

Installed from the following Ubuntu (PPA) repository:
https://edge.launchpad.net/~xorg-edgers/+archive/ppa/

Published on 2010-08-24
  * Checkout from git 20100823 (master branch) up to commit: 1288d5c39234e7c54ae2fbb81dd788c98c62a7b3 [http://cgit.freedesktop.org/mesa/mesa/commit/?id=1288d5c39234e7c54ae2fbb81dd788c98c62a7b3]
Comment 13 trapdoor6 2010-08-30 22:34:39 UTC
Also here is my driconf config [~/.drirc], maybe it's something there?

<driconf>
    <device screen="0" driver="dri2">
        <application name="Default">
            <option name="vblank_mode" value="0" />
        </application>
    </device>
    <device screen="0" driver="r600">
        <application name="all">
            <option name="force_s3tc_enable" value="false" />
            <option name="disable_s3tc" value="true" />
            <option name="fp_optimization" value="0" />
            <option name="fthrottle_mode" value="2" />
            <option name="disable_stencil_two_side" value="false" />
            <option name="tcl_mode" value="3" />
            <option name="texture_depth" value="0" />
            <option name="def_max_anisotropy" value="1.0" />
            <option name="no_rast" value="false" />
            <option name="command_buffer_size" value="8" />
            <option name="round_mode" value="0" />
            <option name="dither_mode" value="0" />
            <option name="texture_coord_units" value="8" />
            <option name="disable_lowimpact_fallback" value="true" />
            <option name="texture_image_units" value="8" />
            <option name="color_reduction" value="1" />
            <option name="vblank_mode" value="0" />
        </application>
    </device>
</driconf>
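
To compare a configuration like this between two setups, the options can be listed per driver and application. A minimal sketch in Python, assuming the file has the structure shown above; the function name is illustrative:

```python
import xml.etree.ElementTree as ET


def driconf_options(xml_text):
    """Return {(driver, application): {option_name: value}} from driconf XML."""
    root = ET.fromstring(xml_text)
    result = {}
    for device in root.iter("device"):
        for app in device.iter("application"):
            key = (device.get("driver"), app.get("name"))
            # Values stay strings, exactly as written in the file.
            result[key] = {
                opt.get("name"): opt.get("value") for opt in app.iter("option")
            }
    return result
```

With the configuration above, `driconf_options(text)[("r600", "all")]["disable_s3tc"]` would give `"true"`, and comparing the dictionaries from two machines shows which options differ.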
Comment 14 Alex Deucher 2010-08-31 07:11:58 UTC
I was finally able to reproduce this.  It's a bug in the mesa development snapshot you are using.  Upgrade to a newer snapshot or use a released version.  The drm code is fine.
Comment 15 trapdoor6 2010-08-31 07:35:39 UTC
There's no newer snapshot available for my system (Ubuntu Lucid) in that PPA yet, only for Maverick. Yesterday's snapshot failed to build on Launchpad for Lucid (i386 and amd64). So I'll wait until the next one shows up, which I'd expect very soon (and hopefully it will be built successfully on amd64).
I have never compiled mesa myself and don't want to do it now, nor do I want to revert to 7.8.

But still: on kernel 2.6.35.4, with the same mesa version, the game runs fine, with no empty black squares at all. As I understand it, this mesa bug only shows up on 2.6.36-rc: it is triggered by some recent drm-radeon commit in the kernel, but there is nothing wrong with that commit itself.

I'll test again when a new mesa snapshot is available and let you know the outcome right away.

Thanks a lot for looking into this.
Comment 16 Alex Deucher 2010-08-31 07:47:15 UTC
It's sort of random luck.  The vbo size was getting garbage values.  You'll also note that the vbo size errors in your kernel log are gone with a newer mesa snapshot.
Comment 17 Rafael J. Wysocki 2010-08-31 20:02:36 UTC
Not a kernel bug, so closing.