Created attachment 152191 [details]
/var/log/messages from "radeon 0000:00:01.0: ring 0 stalled" to reboot.

I was away from the computer when the radeon DRI driver crashed. I had left a fair number of firefox windows/tabs open; some of them may have had videos (from the BBC news web site) and animated gifs from another web site playing. It crashed about 5-10 minutes after I left, and I was aware of it because the laptop blipped.

# lspci | grep VGA
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Mullins [Radeon R3 Graphics]

Some excerpts from the attached log:

...
[ 8770.250116] radeon 0000:00:01.0: ring 0 stalled for more than 10012msec
[ 8770.250128] radeon 0000:00:01.0: GPU lockup (waiting for 0x0000000000056034 last fence id 0x0000000000056031 on ring 0)
[ 8770.298635] radeon 0000:00:01.0: Saved 14196 dwords of commands on ring 0.
[ 8770.298663] radeon 0000:00:01.0: GPU softreset: 0x0000000C
...
[ 8770.313299] radeon 0000:00:01.0: GPU reset succeeded, trying to resume
...
[ 8770.339724] [drm] ring test on 0 succeeded in 3 usecs
[ 8770.518568] [drm:cik_ring_test] *ERROR* radeon: ring 1 test failed (scratch(0x3010C)=0xCAFEDEAD)
[ 8770.752885] [drm:cik_sdma_ring_test] *ERROR* radeon: ring 3 test failed (0xCAFEDEAD)
[ 8770.752892] [drm:cik_resume] *ERROR* cik startup failed on resume
[ 8780.753181] radeon 0000:00:01.0: ring 0 stalled for more than 10001msec
[ 8780.753193] radeon 0000:00:01.0: GPU lockup (waiting for 0x00000000000560f7 last fence id 0x0000000000056031 on ring 0)
[ 8780.753199] [drm:cik_ib_test] *ERROR* radeon: fence wait failed (-35).
[ 8780.753209] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).
[ 8780.753215] radeon 0000:00:01.0: ib ring test failed (-35).
[ 8780.762131] radeon 0000:00:01.0: GPU softreset: 0x0000000C
...
The kernel is largely the Fedora 3.16.3-200 one, built from the koji srpm, but with the additional patch from https://bugzilla.kernel.org/show_bug.cgi?id=71051#c8 applied. Other versions: drv ati 7.4.0, mesa 10.2.8, glamor from git 347ef4.
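For anyone comparing the "GPU lockup" lines across the attached logs, here is a minimal sketch of how they could be pulled apart. It only handles the "waiting for ... last fence id ..." form quoted in this report. The function name `parse_lockup` and the `pending` field are made up for illustration, and reading the difference between the two fence ids as "fences still outstanding" is my assumption, not something the driver states.

```python
import re

# Matches lines such as:
#   radeon 0000:00:01.0: GPU lockup (waiting for 0x...56034 last fence id 0x...56031 on ring 0)
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def parse_lockup(line):
    """Return the ring and fence ids from one lockup line, or None if it is not one."""
    match = LOCKUP_RE.search(line)
    if match is None:
        return None
    waiting = int(match.group(1), 16)
    last = int(match.group(2), 16)
    return {
        "ring": int(match.group(3)),
        "waiting": waiting,
        "last": last,
        # Assumption: how far the awaited fence is ahead of the last one reported.
        "pending": waiting - last,
    }


if __name__ == "__main__":
    sample = (
        "[ 8770.250128] radeon 0000:00:01.0: GPU lockup (waiting for "
        "0x0000000000056034 last fence id 0x0000000000056031 on ring 0)"
    )
    print(parse_lockup(sample))
```

Feeding it every line of /var/log/messages and keeping the non-None results gives one record per reported lockup, which makes it easier to see whether the last fence id ever advances between resets.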
Created attachment 152841 [details]
var log message, another crash, 4 days later.

Same description as last time, from stalled to reboot. This time it happened when I was running mplayer, with -vo xv I think (in addition to firefox).

From now on I am using some additional patches picked from 3.16.3..3.17-rc7: basically anything that affects gpu/radeon and which would apply cleanly to 3.16.3, besides the sleep patch.

0001-drm-radeon-disable-gfx-cgcg-on-cik.patch
0001-drm-radeon-cik-Read-back-SDMA-WPTR-register-after-wr.patch
0001-drm-radeon-don-t-reset-dma-on-NI-SI-init.patch
0001-drm-radeon-don-t-reset-dma-on-r6xx-evergreen-init.patch
0001-drm-radeon-don-t-reset-sdma-on-CIK-init.patch
0001-drm-radeon-cik-use-a-separate-counter-for-CP-init-ti.patch

The sdma one might be relevant? Also the counter one, since the ring test failed on rings 1 and 3?
The patches listed above don't fix the problem. I had another GPU lock-up while just switching windows from a file upload dialog in firefox (for another bugzilla elsewhere) to a terminal, to change the permissions of the file being uploaded...
GPU lockups are usually caused by a problem with the command buffers generated in the usermode acceleration drivers in mesa. I would suggest trying a newer version of mesa.
(In reply to Alex Deucher from comment #3)
> GPU lockups are usually caused by a problem with the command buffers
> generated in the usermode acceleration drivers in mesa. I would suggest
> trying a newer version of mesa.

How recent should I try? I am already using mesa 10.2.8 and libdrm 2.4.58 . libdrm seems to be a more recent install on 5th Oct, both mesa 10.2.8 was already on when the lock-up happened (twice). mesa 10.3 was released around the same time as 10.2.8 .
(In reply to Hin-Tak Leung from comment #4)
> (In reply to Alex Deucher from comment #3)
> > GPU lockups are usually caused by a problem with the command buffers
> > generated in the usermode acceleration drivers in mesa. I would suggest
> > trying a newer version of mesa.
>
> How recent should I try? I am already using mesa 10.2.8 and libdrm 2.4.58 .
> libdrm seems to be a more recent install on 5th Oct, both mesa 10.2.8 was
> already on when the lock-up happened (twice). mesa 10.3 was released around
> the same time as 10.2.8 .

Just try a newer or older version and see if it helps. If so, try and bisect to narrow down what change on the mesa side caused the problem.
(In reply to Alex Deucher from comment #5)
> Just try a newer or older version and see if it helps. If so, try and
> bisect to narrow down what change on the mesa side caused the problem.

Unfortunately it doesn't happen often/"reproducibly" enough to do git bisect... This is a new machine/hardware which I put linux on exactly a month ago, with things stabilising perhaps around when I put mesa 10.2.8 on, on 25th Sept. It is my "main" machine now, and it has locked up twice in 18 days, which is frequent enough to be troublesome but not frequent enough to bisect or go back/forward through versions to try...

I do think it is a kernel problem though, as it seems to be accompanied by X and gnome-shell segfaulting. I still have the core dump from X if that helps?
A reproducible case seems to be using firefox to review/edit an object you've uploaded to Shapeways. Another one is to load a fairly curved shape into OpenSCAD, hit F5 to view it, then rotate it.
(In reply to Alan from comment #7)
> Reproducible case seems to be using firefox to review/edit an object you've
> uploaded to Shapeways. Another one is to load a fairly curved shape into
> OpenScad and hit F5 to view then rotate it.

Those seem sufficiently different from the scenario described in this report that they should be tracked separately.
FWIW, I am glad I haven't had a lock up since the last time I wrote (over two weeks ago). FWIW, all the patches I mentioned in comment 1 except two are integrated and therefore dropped with my current kernel 3.16.6- (and I haven't upgraded/downgraded anything else actively); so it looks like improvements are being made. I hope I don't see that error again :-).
Created attachment 157061 [details]
whole dmesg from vt with 3.16.6 when it crashed.

This time it crashed while I was running just a few terminals and a qemu/kvm window, and I was switching terminals (in gnome-shell) to type something; I forget what it was, maybe just ls -l to check on the growth of the VM's disk image. I had had something running for a few hours inside the VM, which was minimized. If you need to know, it was just the gcc testsuite run from an ssh session into it, so the VM wasn't using much of its graphics capability.

This is the whole dmesg since boot, so it should have all the hardware info, history, etc. if those are important. gnome-shell died but I still seemed to have a VT or two, so I just ran dmesg, waited a bit to see that the drm was not coming back, and rebooted.

I am upgrading to mesa 10.2.9 (from 10.2.8) and also to kernel 3.17.2-200 (dropping all those patches since they were merged), and hoping not to see this problem again.
This time I don't have firefox running. Just a few terminals and qemu/kvm. The gcc testsuite inside the vm is demanding enough I didn't want to run anything else.
Created attachment 158451 [details]
/var/log/message, another GPU crash under mesa 10.3.3

Fedora shipped mesa 10.3.3 http://koji.fedoraproject.org/koji/buildinfo?buildID=593648 and it replaced my custom-built 10.2.9 . Bad idea! The GPU crashed again the first time I resumed from a suspend. I had been suspending/resuming under 10.2.9 happily for two+ weeks and was generally happy with it for that period. Though it looks like I upgraded from kernel 3.17.2-200 to 3.17.3-200 yesterday and have not needed to suspend during that time.

This time the log is interesting in that an hour into using the newer 10.3.3, I have a pile of:

Nov 21 13:47:47 localhost kernel: radeon 0000:00:01.0: GPU fault detected: 146 0x02690004
Nov 21 13:47:47 localhost kernel: radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00007D93
Nov 21 13:47:47 localhost kernel: radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x09000004
Nov 21 13:47:47 localhost kernel: VM fault (0x04, vmid 4) at page 32147, write from 'CB0' (0x43423000) (0)

though it looks like I continued to use the machine for another hour, suspended, and then the GPU crashed on resume.

Oh, just firefox (plus a few terminals) in gnome-shell classic mode in gnome 2.12 copr.
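For reading the "VM fault" lines quoted above: the page number appears to be just VM_CONTEXT1_PROTECTION_FAULT_ADDR in decimal, and the value in parentheses after the client name appears to be that name as ASCII bytes. A minimal sketch checking this against the quoted line follows; the helper name `decode_vm_fault` is made up, and the interpretation is inferred from the log text only, not from the driver source.

```python
def decode_vm_fault(fault_addr_hex, client_tag_hex):
    """Convert the two hex fields of a VM fault report into readable form.

    fault_addr_hex -- the VM_CONTEXT1_PROTECTION_FAULT_ADDR value, e.g. "0x00007D93"
    client_tag_hex -- the value printed after the client name, e.g. "0x43423000"
    """
    page = int(fault_addr_hex, 16)
    # Assumption: the tag is up to four ASCII characters, padded with NUL bytes.
    raw = int(client_tag_hex, 16).to_bytes(4, "big")
    client = raw.rstrip(b"\x00").decode("ascii")
    return page, client


if __name__ == "__main__":
    # Values taken from the log excerpt in this comment.
    print(decode_vm_fault("0x00007D93", "0x43423000"))
```

For the excerpt above this gives page 32147 and client 'CB0', matching what the kernel printed, so the decimal page numbers in other fault reports can be compared directly with the hex FAULT_ADDR values.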
I also experience similar hangs, with an F20 system with mesa 10.3.3. In total, I have F20 installed on four systems, two using the nvidia kernel driver, one with i915 and one with radeon. Only the radeon system experiences hangs. I get

Nov 28 10:17:15 t kernel: radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
Nov 28 10:17:15 t kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)
Nov 28 10:17:15 t kernel: [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
Nov 28 10:17:15 t kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
Nov 28 10:17:15 t kernel: [drm:si_dpm_set_power_state] *ERROR* si_set_sw_state failed
Nov 28 10:17:15 t kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x06c24804
Nov 28 10:17:15 t kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x000125B6
Nov 28 10:17:15 t kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02048004
Nov 28 10:17:15 t kernel: VM fault (0x04, vmid 1) at page 75190, read from TC (72)
Nov 28 10:17:15 t kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x04a33d04
Nov 28 10:17:15 t kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00012EA5
Nov 28 10:17:15 t kernel: radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0303D004
Nov 28 10:17:15 t kernel: VM fault (0x04, vmid 1) at page 77477, write from DMA1 (61)
Nov 28 10:17:42 t sh: abrt-watch-log: Warning, '/usr/bin/abrt-dump-xorg' did not process its input
...

I cannot pinpoint one specific action which triggers the hang. The first time it was firefox, the next time Konsole, and other times it simply hangs a while after locking the screen.

I am considering downgrading mesa. Let me know if there are other measures I can take to help resolve this bug.
Please make sure your version of mesa has this patch: http://cgit.freedesktop.org/mesa/mesa/commit/?id=ae4536b4f71cbe76230ea7edc7eb4d6041e651b4
FWIW, so far my best experience seems to be with 10.2.9, with which I haven't had a lockup yet (I had two(?) lockups with 10.2.8, and one within a few hours of upgrading to 10.3.3). So far I have spent about 6 weeks under 10.2.8 and 3 weeks under 10.2.9.
I finally had a lock-up with mesa 10.2.9. Looking at the logs, I have had a fair number of GPU faults which I did not notice: briefly, for a few seconds, about 4 hours before an extended period of half an hour of such faults. I believe in the 2nd period I was watching a video with mplayer (the first period might have been a trial run of the same video).

I suspended the machine to RAM; then on waking up, the screen flashed a few times between black and the last desktop image, with some corruption in the desktop image. The mouse was still responsive to movement but clicking no longer worked, nor did the keyboard (I tried to switch to a vt to shutdown/reboot, and it did not respond).

Anyway, I am onto mesa 10.3.4 now, which includes http://cgit.freedesktop.org/mesa/mesa/commit/?id=ae4536b4f71cbe76230ea7edc7eb4d6041e651b4 . I hope this gets fixed properly though, since the change looks like it just band-aided over something.
I rebuilt the Fedora20 RPM's from the F20 v-10.3.3-1 mesa.spec file, after applying http://cgit.freedesktop.org/mesa/mesa/commit/?id=ae4536b4f71cbe76230ea7edc7eb4d6041e651b4, creating new RPM's v.10.3.3-2. Now I am up and running with the patched Mesa 10.3.3-2 and the 3.17.4-200 kernel.
(In reply to Alex Deucher from comment #14)
> Please make sure your version of mesa has this patch:
> http://cgit.freedesktop.org/mesa/mesa/commit/
> ?id=ae4536b4f71cbe76230ea7edc7eb4d6041e651b4

This seems no good/insufficient. I just had a lock-up with 10.3.5, which includes it, with kernel 3.17.6-200.fc20.x86_64, if that means anything. Also it looks like I upgraded firefox to v34 (from v33) 5 days ago.

I was merely opening a few more tabs in firefox when it happened. Though 20 minutes before then my computer came out of a suspend, and before the suspend, I was using kvm and virtualbox a bit. Switching VT was still possible so I was able to reboot cleanly.

The failure message seems slightly different, so just in case it means anything:

...
[71241.232157] radeon 0000:00:01.0: ring 0 stalled for more than 10002msec
[71241.232173] radeon 0000:00:01.0: GPU lockup (waiting for 0x000000000052e910 last fence id 0x000000000052e90d on ring 0)
[71241.232337] radeon 0000:00:01.0: failed to get a new IB (-35)
[71241.232347] [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib !
[71241.279772] radeon 0000:00:01.0: Saved 15657 dwords of commands on ring 0.
...
[71252.356774] [drm:cik_ring_test] *ERROR* radeon: ring 1 test failed (scratch(0x3010C)=0xCAFEDEAD)
[71252.718837] [drm:cik_ring_test] *ERROR* radeon: ring 2 test failed (scratch(0x3010C)=0xCAFEDEAD)
[71252.836977] [drm:cik_sdma_ring_test] *ERROR* radeon: ring 3 test failed (0xCAFEDEAD)
[71252.836992] [drm:cik_resume] *ERROR* cik startup failed on resume
[71252.837260] [drm] ib test on ring 0 succeeded in 0 usecs
[71252.837790] [drm] ib test on ring 6 succeeded
[71252.838167] [drm] ib test on ring 7 succeeded
[71254.210168] [drm:radeon_dp_link_train_cr] *ERROR* displayport link status failed
[71254.210182] [drm:radeon_dp_link_train_cr] *ERROR* clock recovery failed
[71257.654395] radeon 0000:00:01.0: still active bo inside vm
[71257.765448] radeon 0000:00:01.0: still active bo inside vm
[71258.526881] radeon 0000:00:01.0: still active bo inside vm
[71265.473102] radeon 0000:00:01.0: couldn't schedule ib
...

I can supply the dmesg if needed.

Seeing as the patch does not work or is insufficient, and my best experience so far is 10.2.9 (lasted 3 weeks, without the patch), my worst experience is 10.3.3 (less than a day), and 10.3.4/10.3.5 (patch included) lasted a week, I am going back to 10.2.9 and adding the patch to it. If the patch improves 10.2.9 the way it did from 10.3.3 -> 10.3.4/10.3.5, i.e. makes 10.2.9 last a few months, I'd be happy enough.
(In reply to Hin-Tak Leung from comment #18)
> (In reply to Alex Deucher from comment #14)
> > Please make sure your version of mesa has this patch:
> > http://cgit.freedesktop.org/mesa/mesa/commit/
> > ?id=ae4536b4f71cbe76230ea7edc7eb4d6041e651b4
>
> This seems no good /insufficient. I just had a lock-up with 10.3.5, which
...
> Seeing as the patch does not work/insufficient, and my best experience so
> far is 10.2.9 (lasted 3 weeks, without the patch), my worst experience is
> 10.3.3 (less than a day), and 10.3.4/10.3.5 (patch included) lasted a week,
> I am going back to 10.2.9, and adding the patch to it. If the patch improves
> 10.2.9 the way it did from 10.3.3 -> 10.3.4/10.3.5, i.e. make 10.2.9 lasts a
> few months, I'd be happy enough.

Hi Hin-Tak,

I would just like to add that I have been running for one week without problems after patching Mesa 10.3.3 with the patch in Comment #14. Without the patch, it hung every 10-15 minutes.

--joern
mesa 10.4.0 also crashed on me, on the first day (fedora provides it, so I thought I'd let my 10.2.9 upgrade and have a go). First there were some errors in dmesg (while running mplayer; not as many as under 10.3.3); then on suspend/resume, the screen lit up again but did not show anything (e.g. busy on resume, I guess).

So my experience so far is that if I ever see any errors in dmesg, I should reboot as soon as is convenient, instead of trying to suspend/resume to continue, as it will not survive a suspend/resume.

I am going back to 10.2.9 patched, until the next mesa that's not 10.3.5 or 10.4.0...
If there really is a significant difference in stability between Mesa 10.2.y and 10.3.y, it would be interesting if you could isolate which change between them made the difference. However, from your description so far, I'm afraid the difference might just be coincidence, because we don't understand yet what triggers the problem, so it happens 'randomly'.
Am just following the advice Alex gave in comment 3 (try different versions of mesa) and reporting on my experience. It would appear that my problem is orthogonal to what the commit "radeonsi: Disable asynchronous DMA except for PIPE_BUFFER" was trying to address. That commit was in 10.3.4/10.3.5 and 10.4.0, and *not* in 10.2.9; but I have had crashes with 10.3.5 after ~5 days of use, 10.4.0 within a day, and 10.2.9 after nearly 4 weeks.

10.2.8: two crashes in 6 weeks
10.2.9: crash after almost 4 weeks
10.3.3: crash within first day
10.3.4: insufficient data - used it only for a day or two before 10.3.5
10.3.5: crashed after 5 days
10.4.0: crash within first day

My crash-free periods with 10.3.x/10.4.0 are measured in days if not hours, but with 10.2.x in weeks. I'll continue to switch to a newer mesa as it comes out, and if I get burned, go back to the longest crash-free version until another mesa version comes out.

I think there is a bug with xv (so probably in either mesa or glamor; it does not seem to be sensitive to which version), because some videos play skewed, as in playing a square as:

    ----------------
   /              /
  /              /
 /              /
/              /
----------------

It happens only with certain specific videos (vdpau, gl and x11 are fine), so I am not sure whether it is a bug in mplayer's use of xv, glamor's implementation of xv, or what. It seems to happen to videos which print "Movie-Aspect is 1.xx:1 - prescaling to correct movie aspect." when played, but not all such videos play badly.

I am mentioning this, just in case digging further on that video playing problem might help fix the crash...
(In reply to Hin-Tak Leung from comment #22)
> My crash-free days with 10.3.x/10.4.0 are measured in days if not hours, but
> with 10.2.x is in weeks. I'll continue to switch to a newer mesa as it comes
> out, and if I get burned, go back to the longest crash-free version until
> the another mesa version comes out.

So apparently there was a change between 10.2 and 10.3 which significantly decreased stability on your system. Without isolating that change, it's unlikely that we can reverse the effect unless we get lucky and an independent change happens to help.

> I am mentioning this, just in case digging further on that video playing
> problem might help fix the crash...

That seems unlikely. Please report the Xv problem at https://bugs.freedesktop.org/enter_bug.cgi?product=xorg , component Driver/glamor.
I had a quick look at 10.2.x vs 10.3.x (specifically, just doing "git log mesa-10.2.8..mesa-10.3.3 | grep '^commit' | wc -l" and vice versa), and it is more like they diverged from their most recent common ancestor by 300 and 3000 commits respectively. Though a grep for 'cherry-pick' says about 250-300 of those are cherry-picked, so the actual difference might be 50 vs 2700 commits, which is still a big bunch to look through for what made 10.3.x unstable.

It would be easier if the crashes were more "reproducible".

The corrupted video playback issue was filed as https://bugs.freedesktop.org/show_bug.cgi?id=87455 , just in case it is of interest.
(In reply to Hin-Tak Leung from comment #24)
> [...] which is still a big bunch to look at what made 10.3.x
> unstable.

That's what git bisect is for. It can isolate a change with the minimum number of tests required (approximately log2 of the number of commits between the known good and bad).

> It would be easier if the crashes are more "reproducible".

Indeed, so if you do try to bisect it, it's important that you test each commit long enough to be sure it's 'more stable' before declaring it as good.
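To put rough numbers on the log2 estimate above, here is a minimal sketch. The 2700-commit figure is the estimate from comment #24; the four-week soak time per step is purely an assumption for illustration (roughly how long 10.2.9 ran before its first lock-up), and `bisect_steps` is a made-up helper, not a git command.

```python
def bisect_steps(n_commits):
    """Approximate worst-case number of builds to test when bisecting n_commits.

    Each test halves the remaining range, so about ceil(log2(n)) steps are needed.
    """
    if n_commits < 1:
        raise ValueError("need at least one commit to bisect")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (n_commits - 1).bit_length()


if __name__ == "__main__":
    commits = 2700          # estimate quoted in comment #24
    weeks_per_step = 4      # assumed soak time before calling a build "good"
    steps = bisect_steps(commits)
    print(steps, "steps, about", steps * weeks_per_step, "weeks of testing")
```

So the number of builds to test is small, but with a bug that takes weeks to show up, the wall-clock time is dominated by how long each build has to be run before it can be declared good.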
I had 10.2.9 + patch for 24 days before it locked up on the 10th; so 10.2.9 (with or without the extra patch) is still by far the best. I had 10.4.1 for a few days (about 4) before upgrading to 10.4.2; so far I have been on 10.4.2 for 10 days now and it is good enough. The crash with 10.2.9 + patch was with kernel 3.17.7-300.fc21 . I booted to 3.17.8-300.fc21 after that and spent 8 days in it, and another 6 in 3.18.3-200.fc21; so a newer kernel might be contributing too. I'll write again if I can go beyond a month without crash.
Created attachment 166191 [details]
screen corruption just before suspend & GPU crash on resume

See screenshot. I had been playing a few videos with mplayer -vo xv http://*.mp4 from the BBC web site, on and off; then I did some browsing and noticed a few firefox tabs were corrupted (as shown; only about 3 of them were). I then checked and saw some messages in dmesg:

[14452.823499] radeon 0000:00:01.0: GPU fault detected: 146 0x02050004
[14452.823512] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00015A90
[14452.823516] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x05000004
[14452.823521] VM fault (0x04, vmid 2) at page 88720, write from 'CB0' (0x43423000) (0)

I suspended anyway, and the GPU crashed on resume. I was able to switch to a vt to reboot safely (not always the case), so if there is anything I can look at while the GPU is still stuck, please let me know.

I have been on 10.4.3 since 29 Jan (just a routine fedora upgrade) and never did have a problem with 10.4.2 before that, so I guess 10.4.2/10.4.3 is at least as good as 10.2.9 + patch.

The kernel is 3.18.6-200.fc21.x86_64. Oh! I rebooted from 3.18.5-200.fc21, and libdrm went 2.4.58-3.fc21.* -> 2.4.59-4.fc21.*

So if the screen corruption or the lock-up is a regression from 25 days of goodness under 10.4.2/10.4.3 (and perhaps even a month if you include 10.4.1), then either the kernel or libdrm might be it.
So a quick summary is that later 10.4.x (excluding 10.4.0, which locked up within first day) is about as stable as 10.2.9 + patch. None of 10.3.x tried was any good. I'll just continue with 10.4.x and upgrade as they are available, and hope not to report too often about further lock ups.
5 days later, another crash. This time it was a hard lock-up: the X server crashed and the screen was back-lit but blank; there were no warnings (nothing interesting in dmesg since boot), and I couldn't switch vt either. I was not doing anything interesting, just reading web mail in firefox, and it suddenly went blank.

I haven't made any interesting changes since last time; but since I booted with libdrm-2.4.58-3.fc21.* -> 2.4.59-4.fc21.* on the 7th, had a crash on the 9th (2 days later) and another 5 days after that, and before that had been lock-up-free for almost a whole month, I should have a look at what changed in libdrm-2.4.58-3.fc21.* -> 2.4.59-4.fc21.* .

Will write again if libdrm turns out to be interesting.
Had another hard lock-up under 10.4.3/10.4.4. I was just using mplayer and not running firefox. It was right before a scheduled reboot to upgrade to 10.4.4, so the system had 10.4.4 installed but 10.4.3 was probably cached, which is probably not a good idea. So that's 10 days since the last crash.
Apparently, according to the log, I had a GPU crash on 1st March on suspend, which I did not notice.

With kernel 3.18.9-200.fc21.x86_64, besides the GPU lock-up, the kernel also oops'ed. Maybe it is a good thing?

Mar 17 00:35:43 localhost kernel: [12172.039701] WARNING: CPU: 1 PID: 1938 at drivers/gpu/drm/radeon/radeon_object.c:84 radeon_ttm_bo_destroy+0xf1/0x100 [radeon]()
...
Mar 17 00:35:43 localhost kernel: [12172.039816] CPU: 1 PID: 1938 Comm: gnome-shell Not tainted 3.18.9-200.fc21.x86_64 #1
Mar 17 00:35:43 localhost kernel: [12172.039820] Hardware name: TOSHIBA SATELLITE C50D-B/ZBWAE, BIOS 1.30 06/06/2014
Mar 17 00:35:43 localhost kernel: [12172.039824] 0000000000000000 00000000a817c8ac ffff880233173b68 ffffffff8175b71c
Mar 17 00:35:43 localhost kernel: [12172.039830] 0000000000000000 0000000000000000 ffff880233173ba8 ffffffff81098eb1
Mar 17 00:35:43 localhost kernel: [12172.039835] 0000000000000002 ffff880170180058 ffff880170180000 ffffffffffffffff
Mar 17 00:35:43 localhost kernel: [12172.039841] Call Trace:
Mar 17 00:35:43 localhost kernel: [12172.039853] [<ffffffff8175b71c>] dump_stack+0x46/0x58
Mar 17 00:35:43 localhost kernel: [12172.039862] [<ffffffff81098eb1>] warn_slowpath_common+0x81/0xa0
Mar 17 00:35:43 localhost kernel: [12172.039868] [<ffffffff81098fca>] warn_slowpath_null+0x1a/0x20
Mar 17 00:35:43 localhost kernel: [12172.039901] [<ffffffffa011c091>] radeon_ttm_bo_destroy+0xf1/0x100 [radeon]
Mar 17 00:35:43 localhost kernel: [12172.039919] [<ffffffffa00a41b6>] ttm_bo_release_list+0xa6/0x1a0 [ttm]
Mar 17 00:35:43 localhost kernel: [12172.039933] [<ffffffffa00a4575>] ttm_bo_release+0x105/0x250 [ttm]
Mar 17 00:35:43 localhost kernel: [12172.039948] [<ffffffffa00a46e9>] ttm_bo_unref+0x29/0x30 [ttm]
Mar 17 00:35:43 localhost kernel: [12172.039980] [<ffffffffa011c559>] radeon_bo_unref+0x39/0x70 [radeon]
Mar 17 00:35:43 localhost kernel: [12172.040017] [<ffffffffa013182b>] radeon_gem_object_free+0x4b/0x70 [radeon]
Mar 17 00:35:43 localhost kernel: [12172.040112] [<ffffffffa00443e7>] drm_gem_object_free+0x27/0x40 [drm]
Mar 17 00:35:43 localhost kernel: [12172.040151] [<ffffffffa0044970>] drm_gem_object_handle_unreference_unlocked+0x120/0x130 [drm]
Mar 17 00:35:43 localhost kernel: [12172.040182] [<ffffffffa0044a26>] drm_gem_handle_delete+0xa6/0x100 [drm]
Mar 17 00:35:43 localhost kernel: [12172.040204] [<ffffffffa00450c5>] drm_gem_close_ioctl+0x25/0x30 [drm]
Mar 17 00:35:43 localhost kernel: [12172.040224] [<ffffffffa0045a9f>] drm_ioctl+0x1df/0x680 [drm]
Mar 17 00:35:43 localhost kernel: [12172.040236] [<ffffffff811cc452>] ? unmap_region+0xe2/0x130
Mar 17 00:35:43 localhost kernel: [12172.040264] [<ffffffffa00fb04c>] radeon_drm_ioctl+0x4c/0x80 [radeon]
Mar 17 00:35:43 localhost kernel: [12172.040283] [<ffffffff812282b0>] do_vfs_ioctl+0x2d0/0x4b0
Mar 17 00:35:43 localhost kernel: [12172.040289] [<ffffffff81228511>] SyS_ioctl+0x81/0xa0
Mar 17 00:35:43 localhost kernel: [12172.040297] [<ffffffff81139b86>] ? __audit_syscall_exit+0x1f6/0x2a0
Mar 17 00:35:43 localhost kernel: [12172.040310] [<ffffffff81762129>] system_call_fastpath+0x12/0x17
Mar 17 00:35:43 localhost kernel: [12172.040314] ---[ end trace 844a94b2a6ea5f19 ]---
Created attachment 170901 [details] the part of /var/log/messages about GPU lock up and oops in 3.18.9-200.fc21.x86_64 gz'ed.
Created attachment 171791 [details]
output of: sudo journalctl -b -1 --all --no-pager

Hi. I am getting something similar and I think it may be reproducible. It started after I upgraded to kernel 4 and also upgraded to:

libdrm-git (2.4.60.17.g8dff7a0-1)
xf86-video-ati-git (7.5.0.r34.g6291baa-1)
mesa-dri-git 10.6.0_devel.68990-1 (I don't know what commit, but it's from 1-2 days ago; currently recompiling with the latest commit to retry)
mesa-git 10.6.0_devel.68990-1
mesa-libgl-git 10.6.0_devel.68990-1
mesa-vaapi-git 10.6.0_devel.68990-1
mesa-vdpau-git 10.6.0_devel.68990-1
opencl-mesa-git 10.6.0_devel.68990-1

I got one random screen freeze (no blanking though), with the system locked up, while viewing a youtube video in chromium and using the volume buttons. (Nothing in the logs, probably because the system froze.)

And this one, which is probably reproducible, happened the next day: I plugged in my webcam, and as soon as vlc attempted to display it (even though it looked like a black picture) the screen froze, then went black (can't remember if with backlight or not). I then unplugged the webcam before I shut down (pressed the power button once after Ctrl+Alt+F2 to switch to a non-gfx virtual terminal console), and so I could see the log on the next boot (journalctl -b -1):

Mar 23 15:06:15 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 10483msec
Mar 23 15:06:15 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x0000000000012c15 last fence id 0x0000000000012cdc on ring 0)

(full log included in attachment)

The above messages are from the beginning of when the screen went blank.

I don't know if this is a new issue, because I hadn't tried my webcam for at least 1-2 months before this. But I've only recently (1-2 days ago) upgraded to kernel 4, from 3.19.

Apparently, after updating mesa (to try next), the new version is 10.6.0_devel.67962-1, but the old one (with the above error) was 10.6.0_devel.68990-1.

Using manjaro linux.

brb
There were no further lockups for me thus far, although that v4l2_release stack dump I still got when closing vlc(or was it when I unplugged, i forget) but that seems to be a different issue: pasted here https://bugzilla.kernel.org/show_bug.cgi?id=81581#c2
I retried with the sole purpose of reproducing: I plugged in the (FSC) webcam while vlc was running, opened Media->Capture Device, selected Video Device name /dev/video0 (I think), Advanced Options, Width: 800, Height: 600 (this was working the last time, but probably as 640x480), hit Ok, then Play. At this point a 640x480 black vlc window appeared and everything froze; the mouse was still moving (I think), and after a few seconds the screen went black (forgot if with backlight or not). Then I proceeded to ctrl+alt+del, and that's how I saved the log. I didn't unplug the usb webcam this time, so there are no v4l2_release stacktraces anymore.

Mar 24 15:15:21 manji kernel: ehci-pci 0000:00:12.2: restoring config space at offset 0x4 (was 0x2b00000, writing 0x2b00012)
Mar 24 15:15:21 manji kernel: ehci-pci 0000:00:12.2: PME# disabled
Mar 24 15:15:21 manji kernel: ehci-pci 0000:00:12.2: enabling bus mastering
Mar 24 15:15:21 manji kernel: device: 'ep_81': device_add
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 10163msec
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea19 last fence id 0x000000000002ea20 on ring 0)
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: Saved 226 dwords of commands on ring 0.
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GPU softreset: 0x0000000D
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GRBM_STATUS = 0xF5702828
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0xFC000005
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: SRBM_STATUS = 0x20000840
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x400C0000
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00048002
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_008680_CP_STAT = 0x80268647
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44483106
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GRBM_SOFT_RESET=0x00007F6B
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00100100
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GRBM_STATUS = 0x00003828
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x00000007
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: SRBM_STATUS = 0x20000040
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00000000
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_008680_CP_STAT = 0x00000000
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: GPU reset succeeded, trying to resume
Mar 24 15:15:32 manji kernel: [drm] Found smc ucode version: 0x00011100
Mar 24 15:15:32 manji kernel: [drm] PCIE GART of 1024M enabled (table at 0x0000000000274000).
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: WB enabled
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8804099a5c00
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8804099a5c0c
Mar 24 15:15:32 manji kernel: radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc90005d32118
Mar 24 15:15:32 manji kernel: [drm] ring test on 0 succeeded in 1 usecs
Mar 24 15:15:32 manji kernel: [drm] ring test on 3 succeeded in 3 usecs
Mar 24 15:15:32 manji kernel: [drm] ring test on 5 succeeded in 1 usecs
Mar 24 15:15:33 manji kernel: [drm] UVD initialized successfully.
Mar 24 15:15:33 manji kernel: [drm:radeon_dp_link_train] *ERROR* displayport link status failed
Mar 24 15:15:33 manji kernel: [drm:radeon_dp_link_train] *ERROR* clock recovery failed
Mar 24 15:15:33 manji kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Mar 24 15:15:33 manji kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Mar 24 15:15:33 manji kernel: i2c i2c-8: master_xfer[0] W, addr=0x50, len=1
Mar 24 15:15:33 manji kernel: i2c i2c-8: master_xfer[1] R, addr=0x50, len=8
Mar 24 15:15:33 manji kernel: [drm] ib test on ring 5 succeeded
Mar 24 15:15:35 manji kernel: r8169 0000:01:00.0 net0: link down
Mar 24 15:15:41 manji kernel: r8169 0000:01:00.0: PME# enabled
Mar 24 15:15:44 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 10490msec
Mar 24 15:15:44 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:44 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 10990msec
Mar 24 15:15:44 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:45 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 11490msec
Mar 24 15:15:45 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:45 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 11990msec
Mar 24 15:15:45 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:46 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 12490msec
Mar 24 15:15:46 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:46 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 12990msec
Mar 24 15:15:46 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:47 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 13490msec
Mar 24 15:15:47 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:47 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 13990msec
Mar 24 15:15:47 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:48 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 14490msec
Mar 24 15:15:48 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:48 manji kernel: radeon 0000:00:01.0: ring 0 stalled for more than 14990msec
Mar 24 15:15:48 manji kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a last fence id 0x000000000002ea76 on ring 0)
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: Saved 1906 dwords of commands on ring 0.
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GPU softreset: 0x00000009
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GRBM_STATUS = 0xF5702828
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0xFC000005
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: SRBM_STATUS = 0x20000840
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x400C0000
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00048002
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_008680_CP_STAT = 0x80268647
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GRBM_SOFT_RESET=0x00007F6B
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00000100
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GRBM_STATUS = 0x00003828
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x00000007
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: SRBM_STATUS = 0x20000040
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00000000
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_008680_CP_STAT = 0x00000000
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: GPU reset succeeded, trying to resume
Mar 24 15:15:49 manji kernel: 
[drm] Found smc ucode version: 0x00011100 Mar 24 15:15:49 manji kernel: [drm] PCIE GART of 1024M enabled (table at 0x0000000000274000). Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: WB enabled Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8804099a5c00 Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8804099a5c0c Mar 24 15:15:49 manji kernel: radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc90005d32118 Mar 24 15:15:49 manji kernel: [drm] ring test on 0 succeeded in 1 usecs Mar 24 15:15:49 manji kernel: [drm] ring test on 3 succeeded in 3 usecs Mar 24 15:15:49 manji kernel: [drm] ring test on 5 succeeded in 1 usecs Mar 24 15:15:49 manji kernel: [drm] UVD initialized successfully. Mar 24 15:15:49 manji kernel: [drm:radeon_dp_link_train] *ERROR* displayport link status failed Mar 24 15:15:49 manji kernel: [drm:radeon_dp_link_train] *ERROR* clock recovery failed Mar 24 15:15:49 manji kernel: [drm] ib test on ring 0 succeeded in 0 usecs Mar 24 15:15:49 manji kernel: [drm] ib test on ring 3 succeeded in 0 usecs Mar 24 15:15:49 manji kernel: i2c i2c-8: master_xfer[0] W, addr=0x50, len=1 Mar 24 15:15:49 manji kernel: i2c i2c-8: master_xfer[1] R, addr=0x50, len=8 Mar 24 15:15:49 manji kernel: [drm] ib test on ring 5 succeeded Mar 24 15:15:50 manji systemd[1]: Starting Getty on tty2... Mar 24 15:15:50 manji systemd[1]: Started Getty on tty2. Mar 24 15:15:50 manji acpid[2529]: client 3299[1000:100] has disconnected Mar 24 15:15:50 manji systemd[1]: Received SIGINT. ... at this point rebooting was in progress And I also had added radeon.hard_reset=1 at kernel cmdline (since my last dmesg) because I saw in another bug that that helped someone fix it, but apparently not for me. 
Mar 24 14:40:57 manji kernel: Linux version 4.0.0-rc5-gbc465aa (emacs@manji) (gcc version 4.9.2 20150304 (prerelease) (GCC) ) #56 SMP Mon Mar 23 14:50:12 CET 2015
Mar 24 14:40:57 manji kernel: Command line: BOOT_IMAGE=/vmlinuz-linux-git root=UUID=bfa4ab6e-19a3-4601-ba2b-267c55841c73 rw cryptdevice=/dev/disk/by-uuid/70c08890-417a-497d-b6ab-c0d0357a63e2:cryptManjaro:allow-discards ipv6.disable=1 pnp.debug=1 loglevel=9 log_buf_len=10M printk.always_kmsg_dump=y printk.time=y mminit_loglevel=0 memory_corruption_check=1 fbcon=scrollback:4096k fbcon=font:ProFont6x11 apic=debug earlyprintk=vga dynamic_debug.verbose=1 "dyndbg=file arch/x86/kernel/apic/* +pflmt ; file drivers/video/* +pflmt ; file drivers/acpi/* +pflmt" i8042.debug acpi_backlight=vendor radeon.hard_reset=1

I am willing to test patches or try any suggestions... I have time.

Cheers
Created attachment 172001 [details]
dmesg

OK, I can always reproduce this (tried three times) just by opening the webcam in vlc twice: the second time fails.

1. Plug in the USB webcam (and never unplug it).
2. Reboot (not needed, but hey).
3. modprobe uvcvideo
4. Run vlc: Media->Open Capture Device, /dev/video0, Play.
5. That works OK; now exit vlc.
6. Do steps 4 and 5 again. This time the screen freezes, the mouse keeps moving, and the vlc window shows a black screen instead of the actual webcam picture.
7. After about 10 sec the screen blanks, without backlight.
8. A few seconds later (give or take), I can press Ctrl+Alt+F2 to switch to virtual terminal 2, which is not graphical, and then press Ctrl+Alt+Del to reboot and thus save the log (included in the attachment).

Steps 1 and 3 do not need to be done in that order (ignoring step 2, that is).
Created attachment 172011 [details]
instant blanking without recovery when radeon.lockup_timeout=20

I accidentally added the kernel param radeon.lockup_timeout=20, which made the screen blank in firefox without it being locked up or frozen first. It remained blank, but I was able to Ctrl+Alt+Del, of course.

This tells me that the (soft?) GPU reset isn't working. Is there a way to make it work? If the reset doesn't work normally, that explains why, when the GPU really locks up, it can't come back to life unless a warm boot happens. If the reset worked, I assume the GPU would be able to recover from the issue described in this thread.

The log for this case is attached (search for: 20msec). I guess the problem is these lines:

Mar 24 17:00:42 manji kernel: [drm:radeon_dp_link_train] *ERROR* displayport link status failed
Mar 24 17:00:42 manji kernel: [drm:radeon_dp_link_train] *ERROR* clock recovery failed

BTW, the param I actually wanted was radeon.lockup_timeout=20000, for 20 sec.

By now I have also added some extra kernel params:

nohz=on rcu_nocbs=1-3 pcie_aspm=force radeon.audio=0 radeon.lockup_timeout=20000 radeon.test=0 radeon.agpmode=-1 radeon.benchmark=0 radeon.tv=0 radeon.hard_reset=1 radeon.aspm=1 radeon.msi=1 radeon.pcie_gen2=-1 radeon.no_wb=1 radeon.dynclks=1 radeon.r4xx_atom=0 radeonfb radeon.fastfb=1 radeon.modeset=1 radeon.dpm=1 radeon.runpm=1

These still allow me to reproduce the issue described in the previous comment.
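Since radeon.lockup_timeout is given in milliseconds (20000 for 20 sec, as noted above), a value like 20 is easy to type by mistake. The following is only a rough sketch of a sanity check on a kernel command line string. The function name check_lockup_timeout and the 1000 ms threshold are my own assumptions for illustration, not anything the driver defines.

```python
import re

# Assumption: anything below this many milliseconds is probably a typo
# (seconds entered instead of milliseconds). The threshold is arbitrary.
SUSPICIOUS_BELOW_MS = 1000


def check_lockup_timeout(cmdline):
    """Return (value_ms, warning_or_None) for radeon.lockup_timeout,
    or None if the parameter is not present on the command line."""
    match = re.search(r"(?:^|\s)radeon\.lockup_timeout=(\d+)(?=\s|$)", cmdline)
    if match is None:
        return None
    value_ms = int(match.group(1))
    warning = None
    # 0 is deliberately not flagged here (assumed to be an intentional setting).
    if 0 < value_ms < SUSPICIOUS_BELOW_MS:
        warning = f"{value_ms} ms is very short; did you mean {value_ms * 1000}?"
    return value_ms, warning


# Example with a shortened, made-up command line:
print(check_lockup_timeout("root=/dev/sda1 rw radeon.lockup_timeout=20 radeon.dpm=1"))
```

On a running system the string could be read from /proc/cmdline and passed to the same function.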
Emanuel, please file your own report for the webcam related hang.
Created attachment 172311 [details]
same error as OP (dmesg)

Sorry, I thought it was the same bug. I actually did get the same error as the OP once (*ERROR* radeon: failed testing IB on GFX ring (-35)) with radeon.lockup_timeout=4 (which was intended to be 4000).

My apologies, though.
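For anyone wanting to check quickly whether a dmesg contains the same messages as the original report, here is a small sketch that counts a few signature lines. The message fragments are copied from the logs quoted in this thread; the labels, the selection of patterns, and the function name count_signatures are just my own choices for illustration.

```python
import re
from collections import Counter

# Message fragments taken from logs quoted in this thread; labels are made up.
SIGNATURES = {
    "ring_stalled": re.compile(r"ring \d+ stalled for more than \d+msec"),
    "gpu_lockup": re.compile(r"GPU lockup \("),
    "ib_test_failed": re.compile(r"failed testing IB on GFX ring \(-35\)"),
    "dp_link_train": re.compile(r"radeon_dp_link_train\] \*ERROR\*"),
}


def count_signatures(lines):
    """Count how many log lines match each signature pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


sample = [
    "radeon 0000:00:01.0: ring 0 stalled for more than 10490msec",
    "radeon 0000:00:01.0: GPU lockup (current fence id 0x000000000002ea3a "
    "last fence id 0x000000000002ea76 on ring 0)",
    "[drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).",
    "[drm] ring test on 0 succeeded in 1 usecs",
]
print(dict(count_signatures(sample)))
```

To run it on a saved log, pass an open file object instead of the sample list, since iterating over a file yields its lines.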
I have been using mesa-10.4.7 since 02 Apr 2015 (and haven't had a crash since March 17, most of that time on mesa 10.4.6, I think). So 10.4.7 itself is certainly as good as the end of the 10.2.x series.

FYI, the video playback issues I mentioned in comment 22 were filed and fixed:

https://bugs.freedesktop.org/show_bug.cgi?id=87455
corrupted xv video playback

https://bugzilla.redhat.com/show_bug.cgi?id=1213021
gnome-shell and mutter mis-use PAspect for XSetWMNormalHints()

So at the moment, my X is working as well as I hoped it would. If I don't get a GPU lockup in another month (when I upgrade to fc22, which may break something), I'd probably say the problem has somehow disappeared.

Michel: I see you were the one who fixed the X glamor bug, so thanks!

Emanuel: I really don't think that filing bugs while using so many experimental versions (kernel, mesa, etc.) is constructive; development code is what it is. Since you can crash "reliably", can you at least try to find out which of the experimental components causes your problem?
I discovered on Ubuntu that from 17.04 onward (since it uses glamor rather than EXA), I get a similar bug on resume on an RS780C (Radeon 3100 card [2008]). The problem goes away when using EXA acceleration, on Mint 20.2 at least. Details in: https://bugs.launchpad.net/bugs/1944991