Created attachment 90801 [details]
return to desktop after bad rendering

Hi,

I'm experiencing major rendering corruption with linux 3.8 and my HD 6870.

Software:
latest kernel from linus as of 2013-08-01
latest mesa git as of 2013-08-01
latest llvm from tstellar git as of 2013-08-01
latest DDX from git as of 2013-08-01
libdrm 2.4.40

Symptoms:
I triggered this several times running Heroes of Newerth. When a match starts, sometimes the textures are all black, or sometimes my cursor is missing. (It looks like my mesa builds with LLVM enabled for the GLSL compiler trigger the black textures more often.)

When this happens and I quit the game and return to my desktop, everything is garbled and things do not refresh correctly. See screenshot.

Keeping the same userland and just downgrading to linux 3.7 solves everything.

Nothing gets added to dmesg...

I don't have much time for bisecting this. I'll try asap but it won't be for a few days, so if someone has similar hardware, please try to reproduce it. HoN is free to play and runs natively on linux. (http://www.heroesofnewerth.com)
Created attachment 90811 [details] When the rendering is bad inside the game right before I quit the game and all the bad stuff happens.
Still happening in rc3
I bisected it. Though it looks like the behavior changed halfway: I was getting kernel crashes for this commit, for example.

d2ead3eaf8a4bf92129eda69189ce18a6c1cc8bd is the first bad commit
commit d2ead3eaf8a4bf92129eda69189ce18a6c1cc8bd
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Thu Dec 13 09:55:45 2012 -0500

    drm/radeon/kms: add evergreen/cayman CS parser for async DMA (v2)

    Allows us to use the DMA ring from userspace. DMA doesn't have
    a good NOP packet in which to embed the reloc idx, so userspace
    has to add a reloc for each buffer used and order them
    to match the command stream.

    v2: fix address bounds checking

    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 7183de0d56e5c01b40775244d5dc4b5441406786 f3abce52c375cc4598cd23739df825771e6fb46e M drivers
Probably a bad bisect. That commit just enables userspace accel drivers to utilize the DMA engine, but the userspace drivers do not take advantage of that yet so the code is currently never called.
Hm, I was expecting something like that. I'm experiencing 2 bugs in 1. This one was a plain crash, while the first one, which I opened the report about, was a corruption with no crash. Should I re-bisect the kernel and flag "good" when it just crashes without corruption?
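One way to keep the two bugs apart while re-bisecting can be sketched as a small helper that turns what was observed on a given kernel into the exit code `git bisect run` expects (0 for good, 125 for skip, any other code from 1 to 127 for bad). This is only an illustration: the function name, its parameters, and the idea of skipping a commit that crashes before the rendering can be checked are assumptions, not something proposed in this report. The `crash_counts_as_good` switch models the alternative of flagging crash-only commits as good.

```python
# Exit codes understood by `git bisect run`.
GOOD = 0    # the bug being bisected (the corruption) is absent
BAD = 1     # the bug is present
SKIP = 125  # this commit cannot be judged either way


def bisect_exit_code(corruption: bool, crashed: bool,
                     crash_counts_as_good: bool = False) -> int:
    """Map one manual test result to a bisect verdict (hypothetical helper)."""
    if corruption:
        # The corruption is the bug this bisect is tracking.
        return BAD
    if crashed:
        # A crash is the *other* bug. Either treat it as "not this bug"
        # (good) or admit the commit could not be tested (skip).
        return GOOD if crash_counts_as_good else SKIP
    return GOOD
```

Skipping is the more cautious choice, since a kernel that crashes early may hide the corruption rather than prove its absence; the cost is that the bisect may end on a range of commits instead of a single one.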
After further testing, it seems that there are 2 bugs, as I said. The first one I'm seeing (and the one the report is about) seems to be caused by commit dd54fee7d440c4a9756cce2c24a50c15e4c17ccb. The crash I also got probably comes from an older commit, but as the crash is pretty random, I got the bisect wrong.

The commit message of dd54fee7d440c4a9756cce2c24a50c15e4c17ccb says it fixes a kernel crash that looks like mine. I'll attach a screenshot of mine (poor quality :/)

==> So maybe dd54fee DID fix the kernel crash but replaced it with the corruption I'm seeing?
Created attachment 91101 [details] kernel crash I got from older commits while bisecting
(In reply to comment #6)
> ==> So maybe dd54fee DID fix the kernel crash but replaced it with the
> corruption I'm seeing ?

Does the corruption also occur with dd54fee7d440c4a9756cce2c24a50c15e4c17ccb applied manually on top of 0d0b3e7443bed6b49cb90fe7ddc4b5578a83a88d?
Does reverting the following commit fix the issue?

commit d025e9e2b890db679f1246037bf65bd4be512627
Author: Jerome Glisse <jglisse@redhat.com>
Date:   Thu Nov 29 10:35:41 2012 -0500

    drm/radeon: do not move bo to different placement at each cs

    The bo creation placement is where the bo will be. Instead of trying
    to move bo at each command stream let this work to another worker
    thread that will use more advance heuristic.

    agd5f: remove leftover unused variable

    Signed-off-by: Jerome Glisse <jglisse@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(In reply to comment #8)
> (In reply to comment #6)
> > ==> So maybe dd54fee DID fix the kernel crash but replaced it with the
> > corruption I'm seeing ?
>
> Does the corruption also occur with dd54fee7d440c4a9756cce2c24a50c15e4c17ccb
> applied manually on top of 0d0b3e7443bed6b49cb90fe7ddc4b5578a83a88d?

With g0d0b3e7 plus patch dd54fee7d, I see no corruption.
(In reply to comment #9)
> Does reverting the following commit fix the issue?
>
> commit d025e9e2b890db679f1246037bf65bd4be512627
> Author: Jerome Glisse <jglisse@redhat.com>
> Date: Thu Nov 29 10:35:41 2012 -0500
>
> drm/radeon: do not move bo to different placement at each cs
>
> The bo creation placement is where the bo will be. Instead of trying
> to move bo at each command stream let this work to another worker
> thread that will use more advance heuristic.
>
> agd5f: remove leftover unused variable
>
> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

It does fix the corruption.
Same issue as: https://bugs.freedesktop.org/show_bug.cgi?id=58659
Created attachment 91421 [details]
Exclude system placement

Does applying this patch without reverting anything fix the issue?
Better to try this patch instead first : http://people.freedesktop.org/~glisse/0001-drm-radeon-exclude-system-placement-when-validating-.patch
(In reply to comment #14)
> Better to try this patch instead first :
>
> http://people.freedesktop.org/~glisse/0001-drm-radeon-exclude-system-placement-when-validating-.patch

With this patch, my game froze before I could even check the rendering. My cursor still moved, I could switch to tty1. I checked dmesg: nothing added. I went back to tty7 (X) and then it was stuck there.
A patch referencing this bug report has been merged in Linux v3.8-rc5:

commit 20707874fd4fd37e09513f508e642fa8bd06365a
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Thu Jan 17 13:10:50 2013 -0500

    Revert "drm/radeon: do not move bo to different placement at each cs"
Indeed, linux 3.8-rc5 with no patch applied is working now, I see no corruption.
Final 3.8 is working