Basically, some time after login I experience a GPU stall; after that the X environment becomes unusable due to further errors ("couldn't schedule IB"), so I have to switch to the console to reboot the PC. If I'm not mistaken, I hit this as soon as I tried 3.1 (rc2, perhaps even rc1?), but at this point I'm not sure about that. Kernel 3.0 (+ some stuff from airlied - radeon-testing/fixes) works fine.

I've no idea how to reproduce this. Usually I'm running only Firefox and/or Thunderbird and some terminal windows (ssh sessions, rtorrent). It can happen a few minutes after login or after a few hours; the one below actually took some time, as normally it happens within an hour.

I'm using...
- Radeon 6850
- Gentoo/AMD64 system
- Xfce/xfwm v4.8.1 w/ compositor enabled
- xorg-server 1.10.4/1.11
- ati-driver (r600g), libdrm and mesa from git

I could try to bisect this, but due to the nature of the bug I don't know how long it would take.

Sep 01 15:06:00 [kernel] [15000.167025] radeon 0000:01:00.0: GPU lockup CP stall for more than 10036msec
Sep 01 15:06:00 [kernel] [15000.167032] ------------[ cut here ]------------
Sep 01 15:06:00 [kernel] [15000.167046] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x39f/0x3d0()
Sep 01 15:06:00 [kernel] [15000.167053] Hardware name: GA-MA78G-DS3H
Sep 01 15:06:00 [kernel] [15000.167058] GPU lockup (waiting for 0x000A529F last fence id 0x000A529C)
Sep 01 15:06:00 [kernel] [15000.167062] Modules linked in: reiserfs
Sep 01 15:06:00 [kernel] [15000.167073] Pid: 3280, comm: X Not tainted 3.1.0-rc4-00131-g9e79e3e #242
Sep 01 15:06:00 [kernel] [15000.167078] Call Trace:
Sep 01 15:06:00 [kernel] [15000.167092] [<ffffffff810694ab>] ? warn_slowpath_common+0x7b/0xc0
Sep 01 15:06:00 [kernel] [15000.167101] [<ffffffff810695a5>] ? warn_slowpath_fmt+0x45/0x50
Sep 01 15:06:00 [kernel] [15000.167111] [<ffffffff812ea3cf>] ? radeon_fence_wait+0x39f/0x3d0
Sep 01 15:06:00 [kernel] [15000.167119] [<ffffffff81084be0>] ? wake_up_bit+0x40/0x40
Sep 01 15:06:00 [kernel] [15000.167129] [<ffffffff812b47ef>] ? ttm_bo_wait+0x10f/0x1b0
Sep 01 15:06:00 [kernel] [15000.167139] [<ffffffff8130413f>] ? radeon_gem_wait_idle_ioctl+0x8f/0x110
Sep 01 15:06:00 [kernel] [15000.167147] [<ffffffff8129d4e1>] ? drm_ioctl+0x401/0x4a0
Sep 01 15:06:00 [kernel] [15000.167156] [<ffffffff813040b0>] ? radeon_gem_set_tiling_ioctl+0xb0/0xb0
Sep 01 15:06:00 [kernel] [15000.167164] [<ffffffff810773a8>] ? set_current_blocked+0x38/0x60
Sep 01 15:06:00 [kernel] [15000.167172] [<ffffffff81031d2a>] ? do_signal+0x21a/0x770
Sep 01 15:06:00 [kernel] [15000.167181] [<ffffffff8110da7c>] ? do_vfs_ioctl+0x9c/0x540
Sep 01 15:06:00 [kernel] [15000.167188] [<ffffffff810773a8>] ? set_current_blocked+0x38/0x60
Sep 01 15:06:00 [kernel] [15000.167195] [<ffffffff810324f8>] ? sys_rt_sigreturn+0x1e8/0x200
Sep 01 15:06:00 [kernel] [15000.167203] [<ffffffff8110df69>] ? sys_ioctl+0x49/0x80
Sep 01 15:06:00 [kernel] [15000.167212] [<ffffffff815d6b7b>] ? system_call_fastpath+0x16/0x1b
Sep 01 15:06:00 [kernel] [15000.167218] ---[ end trace f6bfd0dc5ce37413 ]---
Sep 01 15:06:00 [kernel] [15000.168398] radeon 0000:01:00.0: GPU softreset
Sep 01 15:06:00 [kernel] [15000.168405] radeon 0000:01:00.0: GRBM_STATUS=0xA0003828
Sep 01 15:06:00 [kernel] [15000.168410] radeon 0000:01:00.0: GRBM_STATUS_SE0=0x00000007
Sep 01 15:06:00 [kernel] [15000.168416] radeon 0000:01:00.0: GRBM_STATUS_SE1=0x00000007
Sep 01 15:06:00 [kernel] [15000.168422] radeon 0000:01:00.0: SRBM_STATUS=0x20020EC0
Sep 01 15:06:00 [kernel] [15000.345258] radeon 0000:01:00.0: Wait for MC idle timedout !
Sep 01 15:06:00 [kernel] [15000.345265] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
Sep 01 15:06:00 [kernel] [15000.345372] radeon 0000:01:00.0: GRBM_STATUS=0x00003828
Sep 01 15:06:00 [kernel] [15000.345377] radeon 0000:01:00.0: GRBM_STATUS_SE0=0x00000007
Sep 01 15:06:00 [kernel] [15000.345383] radeon 0000:01:00.0: GRBM_STATUS_SE1=0x00000007
Sep 01 15:06:00 [kernel] [15000.345388] radeon 0000:01:00.0: SRBM_STATUS=0x200206C0
Sep 01 15:06:00 [kernel] [15000.346396] radeon 0000:01:00.0: GPU reset succeed
Sep 01 15:06:00 [kernel] [15000.556795] radeon 0000:01:00.0: Wait for MC idle timedout !
Sep 01 15:06:00 [kernel] [15000.744306] radeon 0000:01:00.0: Wait for MC idle timedout !
Sep 01 15:06:00 [kernel] [15000.747925] radeon 0000:01:00.0: WB enabled
Sep 01 15:06:00 [kernel] [15000.969041] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0xCAFEDEAD)
Sep 01 15:06:00 [kernel] [15000.969049] [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
Sep 01 15:06:00 [kernel] [15000.977509] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(15).
Sep 01 15:06:00 [kernel] [15000.977518] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
Sep 01 15:06:00 [kernel] [15000.982304] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(0).
Sep 01 15:06:00 [kernel] [15000.982307] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
Sep 01 15:06:00 [kernel] [15000.984280] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(1).
Sep 01 15:06:00 [kernel] [15000.984283] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
Sep 01 15:06:00 [kernel] [15000.984819] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(2).
Sep 01 15:06:00 [kernel] [15000.984821] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
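
For anyone comparing their own logs against this report, here is a minimal Python sketch that pulls the key facts out of a saved kernel log: the stall duration, the two fence ids, which IBs failed to schedule, and whether the ring test failed after the reset. The function name `summarize_lockup` and the regular expressions are illustrative assumptions modeled only on the messages quoted above; other kernel versions may word these messages differently.

```python
import re

# Patterns modeled on the messages quoted in this report (assumption:
# other kernel versions may format them differently).
STALL_RE = re.compile(r"GPU lockup CP stall for more than (\d+)msec")
FENCE_RE = re.compile(
    r"waiting for (0x[0-9A-Fa-f]+) last fence id (0x[0-9A-Fa-f]+)"
)
IB_RE = re.compile(r"couldn't schedule IB\((\d+)\)")


def summarize_lockup(log_text):
    """Return a dict describing the radeon lockup found in log_text.

    Hypothetical helper, not part of any driver tooling. Fields are None
    (or an empty list) when the corresponding message is absent.
    """
    stall = STALL_RE.search(log_text)
    fence = FENCE_RE.search(log_text)
    return {
        # Stall duration in milliseconds, as reported by the driver.
        "stall_msec": int(stall.group(1)) if stall else None,
        # Fence the process was waiting for, and the last one signaled.
        "waiting_fence": int(fence.group(1), 16) if fence else None,
        "last_fence": int(fence.group(2), 16) if fence else None,
        # Indices of the indirect buffers that could not be scheduled.
        "failed_ibs": [int(n) for n in IB_RE.findall(log_text)],
        # True when the ring test failed after the soft reset.
        "ring_test_failed": "ring test failed" in log_text,
    }
```

Fed the log above, this reports a 10036 ms stall with the waited-for fence three ids ahead of the last signaled one.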
This might be the same as bug 42162, which I bisected. You could check whether commit b03e7495a862b028294f59fc87286d6d78ee7fa1 is the first bad commit...
OK, after some "extensive" testing (as in glxgears x 20, duh), b03e7495a862b028294f59fc87286d6d78ee7fa1 "crashed" near the 30-minute mark. Currently I'm running 5f66d2b58ca879e70740c82422354144845d6dd3; let's see what happens. As a side note, I've noticed you are also using a Gigabyte motherboard... I might be oversensitive here, but I've encountered some strange bugs before: bugs that can be annoying, yet I haven't seen many people reporting them.
More than 8 hours have passed without a single glitch (same workload as before: glxgears, web browser, wesnoth, etc.), so yeah, it appears that b03e7495a862b028294f59fc87286d6d78ee7fa1 is most likely the cause.
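
The good/bad call made in the last two comments (a lockup near the 30-minute mark counts as bad, more than 8 hours without a glitch counts as good) can be written down as a small Python sketch. The function name `bisect_verdict`, the marker strings, and the 480-minute default are assumptions chosen to match this thread, not a rule from the kernel or git documentation. Because the bug is intermittent, a short clean run is treated as inconclusive rather than good.

```python
# Substrings taken from the log in this report; treat them as assumptions
# for any other kernel version.
LOCKUP_MARKERS = ("GPU lockup", "couldn't schedule IB")


def bisect_verdict(log_text, minutes_run, min_soak_minutes=480):
    """Classify one test run of a kernel build during a manual bisect.

    Hypothetical helper: returns "bad" if the log shows a lockup, "good"
    if the run lasted at least min_soak_minutes without one, and
    "inconclusive" otherwise.
    """
    if any(marker in log_text for marker in LOCKUP_MARKERS):
        return "bad"
    if minutes_run >= min_soak_minutes:
        return "good"
    return "inconclusive"
```

The result would then be passed by hand to `git bisect good` or `git bisect bad`; an inconclusive run just means the build needs more testing before marking it.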
Should mark this as a dupe of bug 42162 then. Does the patch on bug 42162 help?
Yes, it seems to be the same issue. I'm currently testing the patch; I'll let you know later whether it works for me.
The patch works, thanks.

*** This bug has been marked as a duplicate of bug 42162 ***