Bug 12771 (i915_gem_flush)

Summary: Oops in i915_gem_flush
Product: Drivers
Reporter: Kalev Lember (kalev)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: CLOSED UNREPRODUCIBLE
Severity: normal
CC: rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.29-rc6-git1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 12398

Description Kalev Lember 2009-02-24 08:35:07 UTC
Distribution: Gentoo
Hardware Environment: IBM R51 laptop with Intel 855GME
Software Environment: xorg 1.5.3, xf86-video-intel 2.6.1, libdrm-2.4.4; using kernel modesetting
Problem Description:
Stopping and then restarting X server reliably triggers the following oops:

[drm:i915_mem_init_heap] *ERROR* heap already initialized?<1>BUG: unable to handle kernel paging request at 000041a8
IP: [<c0287db8>] i915_gem_flush+0xd8/0x130
*pde = 00000000 
Oops: 0002 [#1] PREEMPT 
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:02:08.0/resource
Modules linked in: ipv6 michael_mic arc4 ecb lib80211_crypt_tkip snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_intel8x0 snd_ac97_codec irtty_sir thinkpad_acpi ac97_bus sir_dev snd_pcm irda snd_timer rfkill snd ipw2200 soundcore led_class sr_mod e100 libipw cdrom yenta_socket snd_page_alloc crc_ccitt rsrc_nonstatic mii pcmcia_core lib80211 8250_pci sg ehci_hcd uhci_hcd video 8250_pnp output 8250 serial_core

Pid: 9948, comm: X Not tainted (2.6.29-rc6-git1 #1) 2887AVG
EIP: 0060:[<c0287db8>] EFLAGS: 00213212 CPU: 0
EIP is at i915_gem_flush+0xd8/0x130
EAX: 000041a8 EBX: 0001ffff ECX: 000041ac EDX: 000041b0
ESI: 02000004 EDI: f703b000 EBP: ce497e0c ESP: ce497df8
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process X (pid: 9948, ti=ce496000 task=ce62a380 task.ti=ce496000)
Stack:
00000008 f7137400 00000009 ce513940 00000001 ce497ec8 c028ae62 b7b52000
c12659a0 00000000 c014de20 00000000 ce497e6c c01625c6 ce6b5b78 0000005b
00000000 ce589d40 ce63e640 f7137400 ce586f40 f7137410 ce670f00 ce670ee0
Call Trace:
[<c028ae62>] ? i915_gem_execbuffer+0xca2/0xd80
[<c014de20>] ? filemap_fault+0x0/0x480
[<c01625c6>] ? handle_mm_fault+0x126/0x600
[<c01134f6>] ? do_page_fault+0x2d6/0x750
[<c0273100>] ? drm_ioctl+0xe0/0x2f0
[<c028a1c0>] ? i915_gem_execbuffer+0x0/0xd80
[<c0181571>] ? vfs_ioctl+0x81/0x90
[<c0181702>] ? do_vfs_ioctl+0x72/0x590
[<c0175060>] ? vfs_write+0x100/0x140
[<c01745a0>] ? do_sync_write+0x0/0x110
[<c0181c59>] ? sys_ioctl+0x39/0x70
[<c0103371>] ? sysenter_do_call+0x12/0x25
[<c0370000>] ? intelfb_pci_register+0x310/0x1000
Code: f0 66 90 89 f0 83 c8 02 f6 45 ec 10 0f 45 f0 83 7f 20 07 7e 4e 8b 57 1c 8b 4f 14 8b 5f 0c 8d 04 11 83 c2 04 21 da 01 d1 83 c2 04 <89> 30 21 da c7 01 00 00 00 00 8b 47 08 89 57 1c 83 6f 20 08 05 
EIP: [<c0287db8>] i915_gem_flush+0xd8/0x130 SS:ESP 0068:ce497df8
---[ end trace 6084dc6175dae6ae ]---

Comment 1 Kalev Lember 2009-03-04 12:00:22 UTC
The oops still occurs with 2.6.29-rc7. Note that it only happens with EXA enabled, not with UXA.


I compiled my kernel with CONFIG_DEBUG_INFO and examined vmlinux with gdb:

(gdb) list *i915_gem_flush+0x131
0xc02be421 is in i915_gem_flush (drivers/gpu/drm/i915/i915_gem.c:1203).
1198
1199    #if WATCH_EXEC
1200                    DRM_INFO("%s: queue flush %08x to ring\n", __func__, cmd);
1201    #endif
1202                    BEGIN_LP_RING(2);
1203                    OUT_RING(cmd);
1204                    OUT_RING(0); /* noop */
1205                    ADVANCE_LP_RING();
1206            }
1207    }


And the oops itself with WATCH_EXEC defined in i915_drv.h:

[drm:i915_mem_init_heap] *ERROR* heap already initialized?<6>[drm] buffers_ptr 136692264 buffer_count 1 len 000001d8
[drm] i915_gem_execbuffer: invalidate_domains 00000008 flush_domains 00000001
[drm] i915_gem_flush: invalidate 00000008 flush 00000001
[drm] i915_gem_flush: queue flush 02000004 to ring
BUG: unable to handle kernel paging request at 000034c8
IP: [<c02be421>] i915_gem_flush+0x131/0x170
*pde = 00000000
Oops: 0002 [#1] PREEMPT
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:02:08.0/resource
Modules linked in: ipv6 snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_intel8x0 irtty_sir snd_ac97_codec ac97_bus sir_dev snd_pcm irda ipw2200 snd_timer snd libipw thinkpad_acpi soundcore yenta_socket e100 rfkill sr_mod rsrc_nonstatic snd_page_alloc ehci_hcd 8250_pnp 8250_pci lib80211 mii video crc_ccitt led_class uhci_hcd pcmcia_core sg cdrom 8250 serial_core output

Pid: 2733, comm: X Not tainted (2.6.29-rc7 #2) 2887AVG
EIP: 0060:[<c02be421>] EFLAGS: 00213212 CPU: 0
EIP is at i915_gem_flush+0x131/0x170
EAX: 000034c8 EBX: 0001ffff ECX: 000034cc EDX: 000034d0
ESI: 02000004 EDI: f7134400 EBP: f5e07dfc ESP: f5e07dd8
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process X (pid: 2733, ti=f5e06000 task=f5efb840 task.ti=f5e06000)
Stack:
c043f548 c03d113d 02000004 00000001 f703b000 02000004 00000009 00000008
00000001 f5e07eb8 c02c15b2 c043ffd0 c03d1237 00000008 00000001 b7999000
c178c600 00000000 c0152640 00000000 f5e07e6c f5d10ec0 f68aedc0 f7134400
Call Trace:
[<c02c15b2>] ? i915_gem_execbuffer+0xda2/0xdd0
[<c0152640>] ? filemap_fault+0x0/0x4c0
[<c0113f26>] ? do_page_fault+0x2d6/0x750
[<c02aabc2>] ? drm_gem_object_lookup+0x32/0x70
[<c02a9290>] ? drm_ioctl+0xe0/0x2f0
[<c017a121>] ? do_sync_write+0xd1/0x110
[<c02c0810>] ? i915_gem_execbuffer+0x0/0xdd0
[<c01874a1>] ? vfs_ioctl+0x81/0x90
[<c0187632>] ? do_vfs_ioctl+0x72/0x5c0
[<c0125d97>] ? _local_bh_enable+0x27/0xa0
[<c017ab10>] ? vfs_write+0x100/0x140
[<c017a050>] ? do_sync_write+0x0/0x110
[<c0187bb9>] ? sys_ioctl+0x39/0x70
[<c0103431>] ? sysenter_do_call+0x12/0x25
[<c03a0000>] ? unix_stream_connect+0x3a0/0x490
Code: 04 e8 39 db 0e 00 8b 75 ec 83 7e 20 07 7e 42 8b 45 ec 8b 75 f0 8b 50 1c 8b 48 14 8b 58 0c 8d 04 11 83 c2 04 21 da 01 d1 83 c2 04 <89> 30 21 da c7 01 00 00 00 00 8b 45 ec 83 68 20 08 89 50 1c 8b
EIP: [<c02be421>] i915_gem_flush+0x131/0x170 SS:ESP 0068:f5e07dd8
---[ end trace 832e8b45814ff199 ]---

Comment 2 Kalev Lember 2009-04-22 19:37:35 UTC
Closing the bug because I can no longer reproduce it with 2.6.30-rc3 and the latest Xorg packages.

I probably no longer get the oops because it only triggered with EXA, and newer xf86-video-intel versions force the acceleration method to UXA when KMS is used.