Bug 35122 - [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Summary: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Status: RESOLVED INVALID
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-dri-intel@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-05-15 12:30 UTC by Toralf Förster
Modified: 2012-03-25 13:03 UTC

See Also:
Kernel Version: 3.2.2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
/sys/kernel/debug/dri/*/i915_error_state (7.79 KB, text/plain)
2011-05-15 12:30 UTC, Toralf Förster
/sys/kernel/debug/dri/0/i915_error_state (4.79 KB, text/plain)
2011-08-27 15:28 UTC, Toralf Förster
head -n 200 /sys/kernel/debug/dri/0/i915_error_state > error_state (4.93 KB, application/octet-stream)
2012-01-26 21:12 UTC, Toralf Förster
gdb back trace (13.42 KB, text/plain)
2012-01-26 21:24 UTC, Toralf Förster

Description Toralf Förster 2011-05-15 12:30:43 UTC
Created attachment 57922 [details]
/sys/kernel/debug/dri/*/i915_error_state

When I run the winetest suite for Wine 1.3.20 with the gallium drivers "i965" and "sw" enabled on an almost stable Gentoo system:

OpenGL vendor string:                   Tungsten Graphics, Inc
OpenGL renderer string:                 Mesa DRI Mobile Intel® GM45 Express Chipset  x86/MMX/SSE2
OpenGL version string:                  1.4 (2.1 Mesa 7.10.2)
Driver:                                 Intel
GPU class:                              i965
OpenGL version:                         1.4
Mesa version:                           7.10.2
X server version:                       1.10.1
Linux kernel version:                   2.6.38
Direct rendering:                       no
Requires strict binding:                yes
GLSL shaders:                           no
Texture NPOT support:                   limited

I got this in /var/log/messages:

2011-05-15T13:51:21.016+02:00 n22 kernel: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
2011-05-15T13:51:21.020+02:00 n22 kernel: [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 165531 at -16777216, next 165532)
2011-05-15T13:51:21.522+02:00 n22 kernel: [drm:i915_reset] *ERROR* Failed to reset chip.
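
For anyone reading the i915_do_wait_request line above, here is a minimal sketch that pulls the numbers out of it. It assumes the message format seen in this report (it is not a stable kernel interface), and the name parse_wait_request is hypothetical. It flags a return value of -11, which is -EAGAIN on Linux, and reinterprets the "at" value as an unsigned 32-bit number, assuming it was a 32-bit value printed with a signed format.

```python
import re

# Matches the message format seen in this report. This is an assumption
# based on the log lines here, not a documented kernel format.
WAIT_RE = re.compile(
    r"returns (?P<ret>-?\d+) "
    r"\(awaiting (?P<awaiting>\d+) at (?P<at>-?\d+), next (?P<next>\d+)\)"
)

LINUX_EAGAIN = 11  # errno value of EAGAIN on Linux


def parse_wait_request(line):
    """Extract return code and sequence numbers, or None if no match."""
    m = WAIT_RE.search(line)
    if m is None:
        return None
    ret = int(m.group("ret"))
    at = int(m.group("at"))
    return {
        "ret": ret,
        "is_eagain": -ret == LINUX_EAGAIN,
        "awaiting": int(m.group("awaiting")),
        # Reinterpret the signed number as unsigned 32-bit (assumption).
        "at_u32": at & 0xFFFFFFFF,
        "next": int(m.group("next")),
    }


if __name__ == "__main__":
    line = ("[drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 "
            "(awaiting 165531 at -16777216, next 165532)")
    info = parse_wait_request(line)
    print(info["is_eagain"], hex(info["at_u32"]))
```

Seen this way, -16777216 is 0xff000000, which looks more like a bogus read than a real sequence number, though that is only a guess from the log.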

I'll attach the output of this command:
$> head -n 100 /sys/kernel/debug/dri/*/i915_error_state > error
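
The same capture can be scripted. This is a rough sketch only, assuming debugfs is mounted at /sys/kernel/debug and the script has permission to read it (usually root); the helper name head_of_error_state is hypothetical.

```python
import glob


def head_of_error_state(pattern="/sys/kernel/debug/dri/*/i915_error_state",
                        max_lines=100):
    """Return {path: text of the first max_lines lines} for each matching file."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        lines = []
        try:
            with open(path, "r", errors="replace") as f:
                for i, line in enumerate(f):
                    if i >= max_lines:
                        break
                    lines.append(line)
        except OSError:
            # Unreadable file (for example, missing permissions): skip it.
            continue
        result[path] = "".join(lines)
    return result


if __name__ == "__main__":
    for path, text in head_of_error_state().items():
        print("==> %s <==" % path)
        print(text, end="")
```

Unlike the shell redirect, this keeps the output of each DRI minor separate, which helps when the glob matches more than one directory.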
Comment 1 Paul Bolle 2011-08-09 20:45:34 UTC
See bug #27892. Perhaps this is a duplicate, perhaps this is just similar.
Comment 2 Toralf Förster 2011-08-10 07:52:08 UTC
(In reply to comment #1)
> See bug #27892. Perhaps this is a duplicate, perhaps this is just similar.

I don't know.
But the bug itself is still present here with x11 driver 2.16, Mesa 7.10.3, and kernel 2.6.39.4 (although /sys/kernel/debug/dri/*/i915_error_state was empty).
Comment 3 Toralf Förster 2011-08-27 15:28:37 UTC
Created attachment 70462 [details]
/sys/kernel/debug/dri/0/i915_error_state

With intel driver 2.16.0 and kernel 2.6.39.4 I run into this in two Wine test cases: the "device" test for the DLLs "d3d8" and "d3d9".
Comment 4 Toralf Förster 2012-01-26 21:12:15 UTC
Created attachment 72208 [details]
 head -n 200 /sys/kernel/debug/dri/0/i915_error_state > error_state

The issue is still present with the current kernel 3.2.2, Mesa 7.11.2, and Xorg 1.11.3:

2012-01-26T22:03:24.771+01:00 n22 kernel: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
2012-01-26T22:03:24.771+01:00 n22 kernel: [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
2012-01-26T22:03:24.774+01:00 n22 kernel: [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 842330 at -16777216, next 842331)
2012-01-26T22:03:25.276+01:00 n22 kernel: [drm:i915_reset] *ERROR* Failed to reset chip.
Comment 5 Toralf Förster 2012-01-26 21:24:22 UTC
Created attachment 72209 [details]
gdb back trace

Furthermore, Xorg dumped core, and I got this backtrace after trying to restart it (though this might be a bug in X11):

Backtrace:
[ 38762.533] 0: /usr/bin/X (xorg_backtrace+0x3c) [0x81be98c]
[ 38762.533] 1: /usr/bin/X (0x8048000+0x17a4b1) [0x81c24b1]
[ 38762.533] 2: (vdso) (__kernel_rt_sigreturn+0x0) [0xb776340c]
[ 38762.533] 3: /usr/lib/libpixman-1.so.0 (0x43f6c000+0x7f91b) [0x43feb91b]
[ 38762.533] 4: /usr/lib/libpixman-1.so.0 (0x43f6c000+0x4d436) [0x43fb9436]
[ 38762.533] 5: /usr/lib/libpixman-1.so.0 (0x43f6c000+0x4d723) [0x43fb9723]
[ 38762.533] 6: /usr/lib/libpixman-1.so.0 (pixman_blt+0x75) [0x43f724c5]
[ 38762.533] 7: /usr/lib/xorg/modules/libfb.so (fbCopyNtoN+0x269) [0xb7373cb9]
[ 38762.533] 8: /usr/bin/X (miCopyRegion+0x163) [0x819bfa3]
[ 38762.533] 9: /usr/bin/X (miDoCopy+0x3a0) [0x819c480]
[ 38762.533] 10: /usr/lib/xorg/modules/libfb.so (fbCopyArea+0x79) [0xb7373f19]
[ 38762.533] 11: /usr/lib/xorg/modules/drivers/intel_drv.so (0xb7381000+0x3ae05) [0xb73bbe05]
[ 38762.533] 12: /usr/lib/xorg/modules/drivers/intel_drv.so (0xb7381000+0x3130a) [0xb73b230a]
[ 38762.533] 13: /usr/bin/X (0x8048000+0x101bcf) [0x8149bcf]
[ 38762.533] 14: /usr/bin/X (0x8048000+0xbff89) [0x8107f89]
[ 38762.533] 15: /usr/bin/X (0x8048000+0xc10aa) [0x81090aa]
[ 38762.533] 16: /usr/bin/X (0x8048000+0xbf2a0) [0x81072a0]
[ 38762.533] 17: /usr/bin/X (0x8048000+0xc03f8) [0x81083f8]
[ 38762.533] 18: /usr/bin/X (0x8048000+0xbba27) [0x8103a27]
[ 38762.533] 19: /usr/bin/X (0x8048000+0xbad54) [0x8102d54]
[ 38762.533] 20: /usr/bin/X (0x8048000+0x2f767) [0x8077767]
[ 38762.533] 21: /usr/bin/X (0x8048000+0x1d86a) [0x806586a]
[ 38762.533] 22: /lib/libc.so.6 (__libc_start_main+0xe7) [0xb74da2a7]
[ 38762.533] 23: /usr/bin/X (0x8048000+0x1d421) [0x8065421]
[ 38762.533] Segmentation fault at address 0x4
[ 38762.534] 
Fatal server error:
[ 38762.534] Caught signal 11 (Segmentation fault). Server aborting
[ 38762.534] 
[ 38762.534] 
Please consult the The X.Org Foundation support
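
Most frames in the backtrace above are unresolved and only show a module, a load base, and an offset. As a minimal sketch, assuming the frame format shown in this log, the following splits each frame into its parts so the offsets can be handed to a symbolizer (for example addr2line, given a binary with matching debug symbols). The name parse_frame is hypothetical.

```python
import re

# Frame format as it appears in this Xorg log (an assumption drawn from
# the lines above): "N: module (symbol+off | base+off) [address]".
FRAME_RE = re.compile(
    r"(?P<index>\d+): (?P<module>\S+) \((?P<where>[^)]*)\) "
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]"
)
BASE_OFF_RE = re.compile(r"^(0x[0-9a-fA-F]+)\+(0x[0-9a-fA-F]+)$")


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if no match."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    frame = {
        "index": int(m.group("index")),
        "module": m.group("module"),
        "addr": int(m.group("addr"), 16),
        "symbol": None,      # set when the frame was already resolved
        "offset": None,      # module-relative offset for unresolved frames
        "consistent": None,  # True when base + offset equals the address
    }
    where = m.group("where")
    bo = BASE_OFF_RE.match(where)
    if bo:
        base = int(bo.group(1), 16)
        frame["offset"] = int(bo.group(2), 16)
        frame["consistent"] = base + frame["offset"] == frame["addr"]
    else:
        frame["symbol"] = where
    return frame


if __name__ == "__main__":
    sample = ("[ 38762.533] 3: /usr/lib/libpixman-1.so.0 "
              "(0x43f6c000+0x7f91b) [0x43feb91b]")
    f = parse_frame(sample)
    print(f["module"], hex(f["offset"]), f["consistent"])
```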
Comment 6 Daniel Vetter 2012-03-25 13:03:45 UTC
The i965 gallium driver is completely broken and has therefore been removed.
