The problem exists with Linux 2.6.38-rc* (tested rc5 to rc8), but not with Linux 2.6.37 or earlier.

Reproduce:
Browse the web with Chromium (tested v9 and v10). After some minutes, the screen partly freezes, colors are mixed up, fonts aren't rendered correctly, windows can't be closed, the mouse cursor is corrupted, etc. It is hardly possible to shut down the X server. If this succeeds, it is not possible to restart it. The whole system has to be rebooted.

dmesg extract:
[drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 118947 at 118945, next 118948)
[drm:i915_reset] *ERROR* Failed to reset chip.

Software:
Gentoo Linux
xorg-server-1.9.4
xf86-video-intel-2.14.0
libdrm-2.4.23
mesa-7.9.1
KDE 4.4.5

Hardware:
Thinkpad Z60m

lspci
00:00.0 Host bridge: Intel Corporation Mobile 915GM/PM/GMS/910GML Express Processor to DRAM Controller (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)
00:1b.0 Audio device: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) High Definition Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 2 (rev 03)
00:1c.2 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 3 (rev 03)
00:1c.3 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 4 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 03)
00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev d3)
00:1f.0 ISA bridge: Intel Corporation 82801FBM (ICH6M) LPC Interface Bridge (rev 03)
00:1f.2 IDE interface: Intel Corporation 82801FBM (ICH6M) SATA Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751M Gigabit Ethernet PCI Express (rev 11)
14:00.0 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev b3)
14:00.1 FireWire (IEEE 1394): Ricoh Co Ltd R5C552 IEEE 1394 Controller (rev 08)
14:00.2 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 17)
14:00.3 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 08)
14:02.0 Network controller: Intel Corporation PRO/Wireless 2915ABG [Calexico2] Network Connection (rev 05)
Created attachment 51252
Fix tiling corruption.

This is the v2.6.38 version of the patch to fix the tiling corruption from which Chromium suffers.
Thank you. Unfortunately, the patch does not apply on 2.6.38:

$ git am ../0001-drm-i915-Fix-tiling-corruption-from-pipelined-fencin.patch
Applying: drm/i915: Fix tiling corruption from pipelined fencing
error: patch failed: drivers/gpu/drm/i915/i915_gem.c:2601
error: drivers/gpu/drm/i915/i915_gem.c: patch does not apply
Patch failed at 0001 drm/i915: Fix tiling corruption from pipelined fencing

A more recent kernel from Linus' tree (5bab188a316718a26346cdb25c4cc6b319f8f907) crashes on my system even before X starts, so I can't test your patches based on this version.
Today I upgraded Chromium from 10.0.648.133 to 10.0.648.204, and now the problem is not reproducible anymore. I tested Kernels 2.6.38-rc8 and 2.6.38 from Linus' tree without any additional patches. I can't decide whether there is still need for a fix in the kernel.
*** Bug 32102 has been marked as a duplicate of this bug. ***
Is this bug the same as bug 30512 ?
(In reply to comment #5)
> Is this bug the same as bug 30512 ?

No. Different class of hardware unaffected by the pipelined fencing bug. That smells like a userspace driver bug cross-posted to kernel.org.
(In reply to comment #3)
> Today I upgraded Chromium from 10.0.648.133 to 10.0.648.204, and now the
> problem is not reproducible anymore. I tested Kernels 2.6.38-rc8 and 2.6.38
> from Linus' tree without any additional patches.
>
> I can't decide whether there is still need for a fix in the kernel.

Well your description matched others where the patch was useful, so assuming that it would have helped here:

commit 29c5a587284195278e233eec5c2234c24fb2c204
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Mar 17 15:23:22 2011 +0000

    drm/i915: Fix tiling corruption from pipelined fencing

    ... even though it was disabled. A mistake in the handling of fence reuse
    caused us to skip the vital delay of waiting for the object to finish
    rendering before changing the register. This resulted in us changing the
    fence register whilst the bo was active and so causing the blits to
    complete using the wrong stride or even the wrong tiling. (Visually the
    effect is that small blocks of the screen look like they have been
    interlaced). The fix is to wait for the GPU to finish using the memory
    region pointed to by the fence before changing it.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584
    Cc: Andy Whitcroft <apw@canonical.com>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    [Note for 2.6.38-stable, we need to reintroduce the interruptible passing]
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Tested-by: Dave Airlie <airlied@linux.ie>
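To make the ordering problem described in the commit message concrete, here is a minimal user-space sketch in C. It is a toy model, not the i915 code: every name in it (toy_fence, toy_object, toy_gpu_retire, and the two update functions) is hypothetical, and "waiting" is simulated by letting the queued blit run immediately. The only assumption taken from the commit message is that a blit picks up whatever stride the fence register holds at the time it executes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all names are hypothetical, not i915 identifiers. */
struct toy_fence {
    uint32_t stride;        /* stride currently programmed in the register */
};

struct toy_object {
    bool active;            /* a blit on this object is still queued */
    uint32_t stride_seen;   /* stride the blit observed when it ran */
};

/* The GPU executes the queued blit; it reads the fence at execution time. */
static void toy_gpu_retire(struct toy_object *obj, const struct toy_fence *fence)
{
    if (obj->active) {
        obj->stride_seen = fence->stride;
        obj->active = false;
    }
}

/* Racy variant: rewrites the register while the object may still be active. */
static void toy_fence_update_racy(struct toy_fence *fence, uint32_t stride)
{
    fence->stride = stride;
}

/* Waiting variant: let outstanding rendering finish, then rewrite. */
static void toy_fence_update_waited(struct toy_fence *fence,
                                    struct toy_object *obj, uint32_t stride)
{
    toy_gpu_retire(obj, fence);   /* stand-in for blocking until idle */
    assert(!obj->active);         /* register is only touched when idle */
    fence->stride = stride;
}
```

With the racy variant, a blit queued under a stride of 512 ends up running with whatever value was written afterwards. With the waiting variant the blit completes with the original stride before the register changes, which is the behaviour the commit message describes as the fix.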
(In reply to comment #7)
> (In reply to comment #3)
> > Today I upgraded Chromium from 10.0.648.133 to 10.0.648.204, and now the
> > problem is not reproducible anymore. I tested Kernels 2.6.38-rc8 and 2.6.38
> > from Linus' tree without any additional patches.

I was wrong, I can still reproduce this bug with 2.6.38 and 2.6.39-rc6.

> > I can't decide whether there is still need for a fix in the kernel.
>
> Well your description matched others where the patch was useful, so assuming
> that it would have helped here:
>
> commit 29c5a587284195278e233eec5c2234c24fb2c204
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Thu Mar 17 15:23:22 2011 +0000
>
> drm/i915: Fix tiling corruption from pipelined fencing
>
> ... even though it was disabled. A mistake in the handling of fence reuse
> caused us to skip the vital delay of waiting for the object to finish
> rendering before changing the register. This resulted in us changing the
> fence register whilst the bo was active and so causing the blits to
> complete using the wrong stride or even the wrong tiling. (Visually the
> effect is that small blocks of the screen look like they have been
> interlaced). The fix is to wait for the GPU to finish using the memory
> region pointed to by the fence before changing it.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584
> Cc: Andy Whitcroft <apw@canonical.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> [Note for 2.6.38-stable, we need to reintroduce the interruptible
> passing]
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Tested-by: Dave Airlie <airlied@linux.ie>

As the problem still exists in 2.6.39-rc6 and I cannot see this patch in either mainline or 2.6.38-stable, what will be the next step? Please also note that I wasn't able to test it, since it does not apply against .38.
Fixed by commit 29c5a587284195278e233eec5c2234c24fb2c204 .
(In reply to comment #9)
> Fixed by commit 29c5a587284195278e233eec5c2234c24fb2c204 .

I'm sorry, but this issue isn't completely solved. I can still reproduce errors like the following just by using Chromium on some JavaScript- and image-intensive websites:

[drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 59733 at 59731, next 59734)
[drm:i915_reset] *ERROR* Failed to reset chip.

I'm currently using:
Linux 2.6.39
xorg-server 1.9.5
libdrm 2.4.25
xf86-video-intel 2.14.0
mesa 7.10.2
chromium 11.0.696.68
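For readers decoding the wait_request line: on Linux, errno 11 is EAGAIN, so "returns -11" is -EAGAIN. Assuming the three numbers are, in order, the sequence number being waited for, the last sequence number the GPU reported as complete, and the next one to be handed out (that is how the message reads, but it is an assumption here), the GPU stopped two requests short of the awaited one. The sketch below shows that arithmetic with a wraparound-safe comparison. The helper names are hypothetical and not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper names; only the arithmetic is the point. */

/* True once `current` has reached or passed `awaited`. The 32-bit counter
 * is treated as circular, so the check still holds across wraparound. */
static bool hang_seqno_passed(uint32_t current, uint32_t awaited)
{
    return (int32_t)(current - awaited) >= 0;
}

/* Number of requests still outstanding before `awaited` completes. */
static uint32_t hang_requests_behind(uint32_t current, uint32_t awaited)
{
    return hang_seqno_passed(current, awaited) ? 0 : awaited - current;
}
```

Under that reading, hang_requests_behind(59731, 59733) is 2, and the same gap of 2 shows up in the original report (awaiting 118947 at 118945).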
Just to complete the picture: Since 2.6.39, the effects of these GPU hangs aren't that drastic anymore; the system remains responsive. But here is what I just experienced:

- surfed with Chromium on google.com
- suddenly: X server completely blocked for about 2 sec
- X server recovered somehow, but lots of colors are confused, fonts are rendered incorrectly
- but windows still could be moved, closed etc.
- dmesg says "GPU hung" like in comment #10
- to be evil, now try to run glxgears
- X/kdm (?) crashed and restarted immediately, I log back in
- colors / fonts look fine now!
- try to run glxgears again: works fine now!
- the only thing that's still not working is XVideo

I will be testing 3.0-rcx sooner or later, hopefully we get this issue fixed for 3.0 ;)
This issue has been solved in the meantime.
Do you have a pointer to the fix? Else we rather close this as unreproducible.
(In reply to comment #13)
> Do you have a pointer to the fix?

Not really. v3.0 works, v2.6.39 doesn't, but I don't know which exact commit fixed it.

> Else we rather close this as unreproducible.

I'm fine with unreproducible, thanks.