Laptop: HP Compaq nx9020
Display controller: Intel Corporation 82852/855GM Integrated Graphics Device (rev 02)
OS: Arch Linux i686
kernel-2.6.32.8 with patch from http://bugzilla.kernel.org/show_bug.cgi?id=14957
xorg-server-1.7.4.901
xf86-video-intel-2.10.0
mesa-7.7

By "freeze" I mean that it's not possible to move the mouse pointer or use the keyboard.

Kernel log message:

Feb 16 20:38:01 takron kernel: ------------[ cut here ]------------
Feb 16 20:38:01 takron kernel: kernel BUG at drivers/gpu/drm/i915/i915_gem.c:2108!
Feb 16 20:38:01 takron kernel: invalid opcode: 0000 [#1] PREEMPT SMP
Feb 16 20:38:01 takron kernel: last sysfs file: /sys/devices/virtual/hwmon/hwmon0/temp1_input
Feb 16 20:38:01 takron kernel: Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss sunrpc michael_mic arc4 ecb lib80211_crypt_tkip ipv6 ext2 pcmcia snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device yenta_socket rsrc_nonstatic pcmcia_core snd_pcm_oss snd_mixer_oss 8139too mii ipw2200 libipw lib80211 snd_intel8x0m joydev snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer iTCO_wdt iTCO_vendor_support psmouse snd fuse uhci_hcd shpchp container soundcore snd_page_alloc wmi ac battery ehci_hcd i2c_i801 sg processor pci_hotplug thermal usbcore evdev serio_raw vboxdrv rtc_cmos rtc_core rtc_lib ext4 mbcache jbd2 crc16 sr_mod sd_mod cdrom ata_piix ata_generic pata_acpi libata scsi_mod i915 drm_kms_helper drm i2c_algo_bit button i2c_core video output intel_agp agpgart
Feb 16 20:38:01 takron kernel:
Feb 16 20:38:01 takron kernel: Pid: 4819, comm: X Not tainted (2.6.32-ARCH #1) compaq nx9020 (PG711ES#ABB)
Feb 16 20:38:01 takron kernel: EIP: 0060:[<ee8af465>] EFLAGS: 00213246 CPU: 0
Feb 16 20:38:01 takron kernel: EIP is at i915_gem_evict_everything+0xe5/0x120 [i915]
Feb 16 20:38:01 takron kernel: EAX: ed2f2000 EBX: ed315000 ECX: 00000000 EDX: 0000fdfd
Feb 16 20:38:01 takron kernel: ESI: ed399400 EDI: ed315e0c EBP: ed315e20 ESP: ed2f3da4
Feb 16 20:38:01 takron kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Feb 16 20:38:01 takron kernel: Process X (pid: 4819, ti=ed2f2000 task=c21b0c90 task.ti=ed2f2000)
Feb 16 20:38:01 takron kernel: Stack:
Feb 16 20:38:01 takron kernel: 00000000 0000000a c02e4480 0000000a c02e49c0 ee8b0a80 00203292 ed278184
Feb 16 20:38:01 takron kernel: <0> 00203246 ed278000 00203286 00000001 c129a088 d6c53300 ed315000 c12b8a9c
Feb 16 20:38:01 takron kernel: <0> de3b9940 c2381800 c22268a0 d12bf380 ed315000 c02e4480 de3b9800 c2381800
Feb 16 20:38:01 takron kernel: Call Trace:
Feb 16 20:38:01 takron kernel: [<ee8b0a80>] ? i915_gem_execbuffer+0x830/0x1310 [i915]
Feb 16 20:38:01 takron kernel: [<c129a088>] ? unix_stream_recvmsg+0x218/0x540
Feb 16 20:38:01 takron kernel: [<c12b8a9c>] ? __mutex_lock_slowpath+0x1ec/0x2c0
Feb 16 20:38:01 takron kernel: [<ee75b298>] ? drm_ioctl+0x158/0x320 [drm]
Feb 16 20:38:01 takron kernel: [<ee8b0250>] ? i915_gem_execbuffer+0x0/0x1310 [i915]
Feb 16 20:38:01 takron kernel: [<c10e3c95>] ? do_sync_read+0xd5/0x120
Feb 16 20:38:01 takron kernel: [<c10f1d29>] ? vfs_ioctl+0x89/0xa0
Feb 16 20:38:01 takron kernel: [<c10f1ea9>] ? do_vfs_ioctl+0x79/0x5c0
Feb 16 20:38:01 takron kernel: [<ee8b18b0>] ? i915_gem_fault+0x0/0x150 [i915]
Feb 16 20:38:01 takron kernel: [<c10e3d46>] ? rw_verify_area+0x66/0xe0
Feb 16 20:38:01 takron kernel: [<c1064a30>] ? ktime_get_ts+0xd0/0x100
Feb 16 20:38:01 takron kernel: [<c10f2466>] ? sys_ioctl+0x76/0x90
Feb 16 20:38:01 takron kernel: [<c10039f3>] ? sysenter_do_call+0x12/0x28
Feb 16 20:38:01 takron kernel: Code: c0 89 c1 75 a6 89 f0 e8 ca fa ff ff 85 c0 89 c1 75 99 89 f8 89 0c 24 e8 6a ab a0 d2 3b ab 20 0e 00 00 74 0b 89 f8 e8 3b ae a0 d2 <0f> 0b eb fe 8d 83 18 0e 00 00 39 83 18 0e 00 00 75 e7 8d 83 10
Feb 16 20:38:01 takron kernel: EIP: [<ee8af465>] i915_gem_evict_everything+0xe5/0x120 [i915] SS:ESP 0068:ed2f3da4
Feb 16 20:38:01 takron kernel: ---[ end trace d4b77122adeb6f75 ]---
Can you test the drm-intel-next branch from git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel.git?
Ok, I'll try it
BTW, what was the last working kernel?
2.6.32.7
Well, it should be pretty straightforward to carry out bisection of commits between 2.6.32.7 and 2.6.32.8. Can you please try that?
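For reference, the bisection suggested above amounts to a binary search over the commits between the last good and the first bad release. The following is only a minimal sketch of that idea in Python: the function name, the placeholder commit labels, and the `is_bad` callback (which stands in for "build this kernel, boot it, and see whether X freezes") are all hypothetical and not taken from this report.

```python
def first_bad_commit(commits, is_bad):
    """Return the first bad commit in `commits` (ordered oldest to newest).

    Assumes commits[0] is known good and commits[-1] is known bad, so
    neither endpoint is tested again.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return commits[bad]


# Placeholder history: only the two endpoints are real release names.
history = ["2.6.32.7", "commit-1", "commit-2", "commit-3", "commit-4", "2.6.32.8"]
culprit = first_bad_commit(history, lambda c: history.index(c) >= 3)
```

Each step halves the remaining range, so even a few hundred commits need fewer than ten test boots, provided every test gives a reliable good/bad answer.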
Interaction with userspace would muddle the bisection. This bug should be fixed by:

commit 99fcb766a3a50466fe31d743260a3400c1aee855
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sun Feb 7 16:20:18 2010 +0100

    drm/i915: Update write_domains on active list after flush.

    Before changing the status of a buffer with a pending write we will await upon a new flush for that buffer. So we can take advantage of any flushes posted whilst the buffer is active and pending processing by the GPU, by clearing its write_domain and updating its last_rendering_seqno -- thus saving a potential flush in deep queues and improves flushing behaviour upon eviction for both GTT space and fences.

    In order to reduce the time spent searching the active list for matching write_domains, we move those to a separate list whose elements are the buffers belong to the active/flushing list with pending writes.

    Orignal patch by Chris Wilson <chris@chris-wilson.co.uk>, forward-ported by me.

    In addition to better performance, this also fixes a real bug. Before this changes, i915_gem_evict_everything didn't work as advertised. When the gpu was actually busy and processing request, the flush and subsequent wait would not move active and dirty buffers to the inactive list, but just to the flushing list. Which triggered the BUG_ON at the end of this function. With the more tight dirty buffer tracking, all currently busy and dirty buffers get moved to the inactive list by one i915_gem_flush operation.

    I've left the BUG_ON I've used to prove this in there.

    References:
      Bug 25911 - 2.10.0 causes kernel oops and system hangs
      http://bugs.freedesktop.org/show_bug.cgi?id=25911

      Bug 26101 - [i915] xf86-video-intel 2.10.0 (and git) triggers kernel oops within seconds after login
      http://bugs.freedesktop.org/show_bug.cgi?id=26101

    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Tested-by: Adam Lantos <hege@playma.org>
    Cc: stable@kernel.org
    Signed-off-by: Eric Anholt <eric@anholt.net>
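To make the failure mode in that commit message easier to follow, here is a toy model in Python of the three buffer lists it mentions. This is not the i915 code: the class names, the `track_active_writes` switch, and the simplified flush/wait steps are assumptions made purely for illustration, based only on the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Buffer:
    name: str
    dirty: bool  # stands in for a non-zero write_domain


@dataclass
class ToyGem:
    # False models the behaviour described as "before this change",
    # True models the tighter dirty tracking after it (hypothetical switch).
    track_active_writes: bool
    active: list = field(default_factory=list)
    flushing: list = field(default_factory=list)
    inactive: list = field(default_factory=list)

    def flush(self):
        # Buffers already waiting on the flushing list become clean and idle.
        for buf in self.flushing:
            buf.dirty = False
        self.inactive.extend(self.flushing)
        self.flushing.clear()
        # With the fix, the same flush also cleans buffers the GPU still uses.
        if self.track_active_writes:
            for buf in self.active:
                buf.dirty = False

    def wait_for_gpu(self):
        # When rendering finishes, dirty buffers still need a flush.
        for buf in self.active:
            (self.flushing if buf.dirty else self.inactive).append(buf)
        self.active.clear()

    def evict_everything(self):
        self.flush()
        self.wait_for_gpu()
        # The BUG_ON in the report corresponds to this check failing.
        return not self.active and not self.flushing


old = ToyGem(track_active_writes=False, active=[Buffer("fb", dirty=True)])
new = ToyGem(track_active_writes=True, active=[Buffer("fb", dirty=True)])
old_ok, new_ok = old.evict_everything(), new.evict_everything()
```

In this model the old behaviour leaves the busy, dirty buffer stranded on the flushing list after one flush and wait, while the new behaviour moves it to the inactive list, which matches the explanation given in the commit message.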
Ok closing per Chris's add.