Bug 113341 - GPU Lockup on AMD Kaveri
Summary: GPU Lockup on AMD Kaveri
Status: RESOLVED UNREPRODUCIBLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-27 20:18 UTC by Bernd Steinhauser
Modified: 2016-06-03 05:22 UTC
0 users

See Also:
Kernel Version: 4.4.2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
journal log from the time of the gpu lockup. (9.11 KB, text/plain)
2016-02-27 20:18 UTC, Bernd Steinhauser
glxinfo (98.17 KB, text/plain)
2016-02-29 05:33 UTC, Bernd Steinhauser
Xorg.0.log (58.98 KB, text/x-log)
2016-03-05 11:48 UTC, Bernd Steinhauser

Description Bernd Steinhauser 2016-02-27 20:18:57 UTC
Created attachment 206321
journal log from the time of the gpu lockup.

GPU is an AMD Kaveri: 1002:130f
Happened when I started an application (way before the application was actually shown) on KDE Plasma 5.5.
The kernel version was 4.4.2 with an additional vblank fix applied (see [1]). 

Since I wasn't prepared for this and ssh wasn't activated, I couldn't save the dmesg output. Instead I'll attach what I found in the journal.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=93746
Comment 1 Michel Dänzer 2016-02-29 01:25:44 UTC
Please attach the output of glxinfo and the Xorg log.
Comment 2 Bernd Steinhauser 2016-02-29 05:33:39 UTC
Created attachment 206381
glxinfo
Comment 3 Bernd Steinhauser 2016-02-29 05:37:40 UTC
I forgot about Xorg log, sorry. Since only one old one is kept, it's gone already.
Maybe logging to syslog/journal is a good idea ...
Comment 4 Bernd Steinhauser 2016-03-05 11:47:44 UTC
Today I ran into this again and got the Xorg log, but there doesn't seem to be anything interesting in it.
Kernel is now 4.4.4.

When the freeze occurs, I can still ssh into the system. Applications continue to run (e.g. I had a music player running), and I could even reboot the system, although that seemed to take longer because X didn't want to shut down (it was marked as DSsl+ in ps aux).
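
For reading a ps STAT value like the one quoted above, here is a minimal Python sketch. The letter meanings are taken from the procps ps(1) man page; the function names decode_stat and is_uninterruptible are made up for illustration. The reported string DSsl+ does not fit the usual layout of one state letter followed by flags, so the sketch treats only the first character as the state and reports anything it does not recognise as unknown.

```python
# State letters and flag characters as documented in the procps ps(1) man page.
STATES = {
    "D": "uninterruptible sleep (usually I/O)",
    "I": "idle kernel thread",
    "R": "running or runnable",
    "S": "interruptible sleep",
    "T": "stopped by job control signal",
    "t": "stopped by debugger during tracing",
    "X": "dead",
    "Z": "defunct (zombie)",
}
FLAGS = {
    "<": "high priority",
    "N": "low priority",
    "L": "has pages locked into memory",
    "s": "session leader",
    "l": "multi-threaded",
    "+": "in the foreground process group",
}


def decode_stat(stat):
    """Return (state description, list of flag descriptions) for a STAT string."""
    if not stat:
        raise ValueError("empty STAT string")
    # The first character is the process state, the rest are modifier flags.
    state = STATES.get(stat[0], "unknown state " + repr(stat[0]))
    flags = [FLAGS.get(c, "unknown flag " + repr(c)) for c in stat[1:]]
    return state, flags


def is_uninterruptible(stat):
    """True if the leading state letter is D (uninterruptible sleep)."""
    return stat.startswith("D")
```

A leading D means the process is blocked inside the kernel and cannot act on signals until the call it is waiting on returns, which would be consistent with X taking a long time to shut down.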
Comment 5 Bernd Steinhauser 2016-03-05 11:48:09 UTC
Created attachment 207721
Xorg.0.log
Comment 6 Michel Dänzer 2016-03-07 09:27:24 UTC
Any chance you could try if this also happens with LLVM 3.8 or even current SVN/Git?

Does it always happen when starting a particular application?
Comment 7 Bernd Steinhauser 2016-03-07 14:11:48 UTC
I tried to build LLVM/clang scm, but it failed (have to check why and if I can get around that).

I've had several freezes over the last few weeks. Most of them happened while I was bisecting the other bug mentioned above (so they could be unrelated), and only a few after staying on 4.4.x. This one was the only one I could relate to a specific event: starting an application (thunderbird here).
The other ones seemed to happen out of nowhere, so it could be that it was just coincidence that it happened when I started that application.
Comment 8 Michel Dänzer 2016-03-08 01:50:14 UTC
FWIW, right now it's better to try LLVM 3.8 than SVN/Git, because the latter will expose you to https://bugs.freedesktop.org/show_bug.cgi?id=94242 .
Comment 9 Bernd Steinhauser 2016-03-11 18:14:50 UTC
I will try to build llvm, but it could take me a few days since it's not yet provided by my distribution and I have to check the changes in the build system.

BTW, I can usually ssh into the system. Is there any way I could gather more debug info when this happens?
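
One possible way to capture state over ssh while the display is frozen, as a minimal Python sketch rather than a recommendation from the driver developers. It saves the output of dmesg and journalctl -k -b (kernel messages of the current boot) and copies the Xorg log before it gets rotated away. The function name collect, the output file names, and the /var/log/Xorg.0.log location are assumptions; dmesg may also need root depending on the system configuration.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed commands: the kernel ring buffer, and the kernel messages of the
# current boot as stored in the systemd journal.
COMMANDS = {
    "dmesg.txt": ["dmesg"],
    "journal-kernel.txt": ["journalctl", "-k", "-b", "--no-pager"],
}
# Assumed log location; it differs between distributions and X setups.
XORG_LOG = Path("/var/log/Xorg.0.log")


def collect(outdir, commands=COMMANDS, xorg_log=XORG_LOG):
    """Run each command, save its output under outdir, and copy the Xorg log.

    Returns the list of files written.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, argv in commands.items():
        target = outdir / name
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, timeout=30
            )
            target.write_text(result.stdout + result.stderr)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Record the failure instead of aborting, so the remaining
            # commands still get a chance to run.
            target.write_text("failed to run {}: {}\n".format(" ".join(argv), exc))
        written.append(target)
    if xorg_log.is_file():
        copy = outdir / xorg_log.name
        shutil.copy2(xorg_log, copy)
        written.append(copy)
    return written
```

Calling collect("gpu-lockup-debug") from an ssh session would leave the files in a directory of that name, ready to attach to the report.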
Comment 10 Bernd Steinhauser 2016-03-30 17:46:26 UTC
During the last 2 weeks I switched between amdgpu and radeon a couple of times.
What I noticed is that with radeon I do get lockups here and there, I think almost always (not 100% sure though) when a video is running.
Both with xv and vdpau as video output.
llvm is now 3.8.

On amdgpu I haven't seen a lockup yet, except for a few when bisecting 4.4-rc2 or -rc3, but I guess that was a different problem which got fixed by the release.
Comment 11 Bernd Steinhauser 2016-06-03 05:22:49 UTC
Since I'm now exclusively using amdgpu and since that works very well for me, I haven't done many more tests with radeon and thus cannot tell if this is still present or not.
Therefore closing the bug report.
