Bug 30512

Summary: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Product: Drivers
Reporter: Cristian Aravena Romero (caravena)
Component: Video(DRI - Intel)
Assignee: drivers_video-dri-intel (drivers_video-dri-intel)
Status: RESOLVED CODE_FIX
Severity: normal
CC: caravena, eddy.petrisor+linbug, yermandu.dev
Priority: P1
Hardware: All
OS: Linux
URL: https://bugs.freedesktop.org/show_bug.cgi?id=35048
Kernel Version: 2.6.38-rc7
Subsystem:
Regression: No
Bisected commit-id:
Attachments: kern.log catching the hang

Description Cristian Aravena Romero 2011-03-06 01:10:28 UTC
Also reported at freedesktop.org:
https://bugs.freedesktop.org/show_bug.cgi?id=35048

InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
PackageArchitecture: amd64
SourcePackage: xserver-xorg-video-intel
UnreportableReason: The running kernel is not an Ubuntu kernel
system:
 distro:             Ubuntu
 codename:           maverick
 architecture:       x86_64
 kernel:             2.6.38-020638rc7-generic


 [161891.789663] [drm:i915_gem_mmap_gtt_ioctl] *ERROR* Attempting to mmap a purgeable buffer
 [162806.360098] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
 [162806.361285] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 1680485 at 1680481, next 1680487)
 [162806.880076] [drm:i915_reset] *ERROR* Failed to reset chip.
 [162807.477686] intel_gpu_dump: page allocation failure. order:8, mode:0x40d0
 [162807.477692] Pid: 30379, comm: intel_gpu_dump Not tainted 2.6.38-020638rc7-generic #201103020909
 [162807.477695] Call Trace:
 [162807.477706]  [<ffffffff81110453>] ? __alloc_pages_slowpath+0x553/0x720
 [162807.477710]  [<ffffffff811107b3>] ? __alloc_pages_nodemask+0x193/0x1d0
 [162807.477715]  [<ffffffff815b3b5e>] ? _raw_spin_lock+0xe/0x20
 [162807.477718]  [<ffffffff81111a81>] ? free_one_page+0x291/0x360
 [162807.477724]  [<ffffffff81145c63>] ? alloc_pages_current+0xa3/0x110
 [162807.477728]  [<ffffffff8110e03e>] ? __get_free_pages+0xe/0x50
 [162807.477733]  [<ffffffff8114f478>] ? kmalloc_order_trace+0x38/0xb0
 [162807.477737]  [<ffffffff81151135>] ? __kmalloc+0x125/0x190
 [162807.477740]  [<ffffffff81114a6c>] ? put_page+0x2c/0x40
 [162807.477749]  [<ffffffff8118019b>] ? seq_read+0x14b/0x3e0
 [162807.477753]  [<ffffffff8115fda9>] ? vfs_read+0xc9/0x180
 [162807.477756]  [<ffffffff81160315>] ? sys_read+0x55/0x90
 [162807.477761]  [<ffffffff8100c002>] ? system_call_fastpath+0x16/0x1b
 [162807.477763] Mem-Info:
 [162807.477765] Node 0 DMA per-cpu:
 [162807.477769] CPU    0: hi:    0, btch:   1 usd:   0
 [162807.477771] CPU    1: hi:    0, btch:   1 usd:   0
 [162807.477772] Node 0 DMA32 per-cpu:
 [162807.477775] CPU    0: hi:  186, btch:  31 usd:   0
 [162807.477777] CPU    1: hi:  186, btch:  31 usd:  25
 [162807.477782] active_anon:312352 inactive_anon:87698 isolated_anon:30
 [162807.477784]  active_file:137893 inactive_file:137256 isolated_file:0
 [162807.477784]  unevictable:38 dirty:516 writeback:0 unstable:0
 [162807.477785]  free:5861 slab_reclaimable:13833 slab_unreclaimable:8735
 [162807.477786]  mapped:29278 shmem:30937 pagetables:10615 bounce:0
 [162807.477789] Node 0 DMA free:11576kB min:36kB low:44kB high:52kB active_anon:120kB inactive_anon:828kB active_file:2896kB inactive_file:284kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15668kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:104kB slab_unreclaimable:76kB kernel_stack:8kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
 [162807.477800] lowmem_reserve[]: 0 2883 2883 2883
 [162807.477804] Node 0 DMA32 free:11868kB min:6848kB low:8560kB high:10272kB active_anon:1249288kB inactive_anon:349964kB active_file:548676kB inactive_file:548740kB unevictable:152kB isolated(anon):120kB isolated(file):0kB present:2952660kB mlocked:152kB dirty:2064kB writeback:0kB mapped:117112kB shmem:123748kB slab_reclaimable:55228kB slab_unreclaimable:34864kB kernel_stack:3072kB pagetables:42460kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
 [162807.477815] lowmem_reserve[]: 0 0 0 0
 [162807.477818] Node 0 DMA: 10*4kB 4*8kB 5*16kB 5*32kB 6*64kB 3*128kB 3*256kB 3*512kB 2*1024kB 3*2048kB 0*4096kB = 11576kB
 [162807.477829] Node 0 DMA32: 1421*4kB 135*8kB 107*16kB 8*32kB 5*64kB 5*128kB 2*256kB 3*512kB 0*1024kB 0*2048kB 0*4096kB = 11740kB
 [162807.478444] 325119 total pagecache pages
 [162807.478446] 19011 pages in swap cache
 [162807.478448] Swap cache stats: add 135815, delete 116804, find 128125/132695
 [162807.478450] Free swap  = 3219068kB
 [162807.478451] Total swap = 3574424kB
 [162807.500260] 752624 pages RAM
 [162807.500263] 13843 pages reserved
 [162807.500265] 406144 pages shared
 [162807.500266] 482944 pages non-shared
Date: Sat Mar  5 21:47:28 2011
DistroRelease: Ubuntu 10.10
DumpSignature: 67f80f19
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
IntelGpuDump:
 ACTHD: 0xffffffff
 EIR: 0x00000000
 EMR: 0xffffff05
 ESR: 0x00000000
 PGTBL_ER: 0x00000000
 IPEHR: 0x00000000
 IPEIR: 0x00000000
 INSTDONE: 0xfffffffe
 INSTDONE1: 0xffffffff
 Ringbuffer: Reminder: head pointer is GPU read, tail pointer is CPU write
InterpreterPath: /usr/bin/python2.6
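
Note on the "page allocation failure. order:8" message above: an order-8 request needs one contiguous block of 2^8 pages, which is 1024kB assuming 4kB pages (the usual x86_64 base page size, an assumption here). The per-zone free-list line for DMA32 shows 0 blocks at 1024kB and above, which would be consistent with the failure even though about 11MB is free in total. The sketch below checks that arithmetic; the helper names are illustrative only and are not part of any kernel or apport tool.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages (x86_64 default)


def parse_free_list(line):
    """Map block size in kB -> free block count for one per-zone buddy line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def free_blocks_for_order(free, order):
    """Count free blocks large enough to satisfy a 2**order-page request."""
    need_kb = PAGE_KB << order
    return sum(count for size, count in free.items() if size >= need_kb)


# Free-list line copied from the log above (joined onto one line).
DMA32 = ("Node 0 DMA32: 1421*4kB 135*8kB 107*16kB 8*32kB 5*64kB 5*128kB "
         "2*256kB 3*512kB 0*1024kB 0*2048kB 0*4096kB = 11740kB")

free = parse_free_list(DMA32)
total_kb = sum(size * count for size, count in free.items())
print(total_kb)                        # 11740, matches the logged total
print(free_blocks_for_order(free, 8))  # 0, no block of 1024kB or larger
```

The DMA zone does list a few 1024kB and 2048kB blocks, but the lowmem_reserve values in the log suggest that zone is held back from ordinary allocations, so this only illustrates fragmentation in DMA32 and is not a diagnosis of the GPU hang itself.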
Comment 1 Eddy Petrișor 2011-03-27 00:25:24 UTC
I got a similar hang on a self-compiled 2.6.38-rc8-g35d34df7 (that is, vanilla git at 35d34df7).

Unfortunately I don't have a complete dmesg, since the screen goes blank when this happens, just some info from syslog.



Mar 27 01:46:56 heidi kernel: [40722.780040] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 27 01:46:56 heidi kernel: [40722.781357] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 1689189 at 1689185, next 1689190)
Mar 27 01:46:56 heidi kernel: [40722.781742] [drm:init_ring_common] *ERROR* render ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000
Mar 27 01:47:03 heidi kernel: [40729.744035] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 27 01:47:03 heidi kernel: [40729.744680] [drm:init_ring_common] *ERROR* render ring initialization failed ctl 00000000 head 00000000 tail 00000000 start 00000000
Mar 27 01:47:05 heidi kernel: [40732.232051] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 27 01:47:05 heidi kernel: [40732.232943] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
Mar 27 01:47:05 heidi kernel: [40732.232950] [drm:i915_reset] *ERROR* Failed to reset chip.



I got this twice in a row while trying to work on a clip in Kino.
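
For comparing hangs across reports, the i915_do_wait_request line can be pulled apart mechanically. The sketch below assumes the fields mean "sequence number being waited for", "sequence number the GPU has reached" and "next sequence number to be issued", which is a reading of the message text and is not confirmed in this report; -11 corresponds to -EAGAIN on Linux. The function name is illustrative only.

```python
import re

# Matches the message body as it appears in the syslog lines above.
WAIT_RE = re.compile(
    r"i915_do_wait_request returns (-?\d+) "
    r"\(awaiting (\d+) at (\d+), next (\d+)\)")


def parse_wait(line):
    """Extract return code and sequence numbers, or None if no match."""
    m = WAIT_RE.search(line)
    if m is None:
        return None
    ret, awaiting, current, nxt = map(int, m.groups())
    return {
        "ret": ret,                    # -11 is -EAGAIN on Linux
        "awaiting": awaiting,
        "current": current,
        "next": nxt,
        "behind": awaiting - current,  # requests not yet completed (assumed meaning)
    }


line = ("Mar 27 01:46:56 heidi kernel: [40722.781357] "
        "[drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 "
        "(awaiting 1689189 at 1689185, next 1689190)")
info = parse_wait(line)
print(info["ret"], info["behind"])  # -11 4
```

Under that reading, both this hang and the one in the description stopped 4 sequence numbers short of the awaited request (1680485 at 1680481 there).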
Comment 2 Eddy Petrișor 2011-04-02 13:40:24 UTC
Created attachment 53242
kern.log catching the hang

This is an extract (by date) of the kern.log that catches the hang.
Comment 3 Cristian Aravena Romero 2011-06-11 15:10:58 UTC
In https://bugs.freedesktop.org/show_bug.cgi?id=35048

Chris Wilson 2011-03-13 12:06:58 PDT: CODE_FIX