Subject : 2.6.30-rc2: WARNING at i915_gem.c for i915_gem_idle
Submitter : Niel Lambrechts <firstname.lastname@example.org>
Date : 2009-04-21 21:35
References : http://marc.info/?l=linux-kernel&m=124034980819102&w=4
This entry is being used for tracking a regression from 2.6.29. Please don't
close it until the problem is fixed in the mainline.
References : http://lkml.org/lkml/2009/4/27/290
On Sunday 07 June 2009, Niel Lambrechts wrote:
> Rafael J. Wysocki wrote:
> I have not seen the same problem in later levels up to rc8, and there
> has been quite a few more drm patches. I think this one can be closed
> and I will log a new entry if it happens again.
Reopening at the reporter's request.
The problem may be caused by the PAT issues that have patches available (e.g. bug #13884).
Created attachment 22570 [details]
i915_gem_idle warning (126.96.36.199 stable)
Created attachment 22571 [details]
i915_gem_idle warning (2.6.31-rc2)
Created attachment 22572 [details]
i915_gem_idle warning (2.6.31-rc4)
I originally noticed the attached warnings after an issue where Xorg became 100% busy. I use hibernate almost daily, so these problems could originate from hibernate or suspend-to-RAM. The logged warnings, however, do not occur directly after resuming.
There have also been a few occurrences of soft-locked blank Xorg screens on my laptop, although they were not always accompanied by these warnings (note: I have not seen them on 188.8.131.52 yet, which I upgraded to a day or so ago).
I will also try 2.6.31-rc5 based on Rafael's comment about PAT.
Sorry, I should have mentioned that I upgraded to 184.108.40.206 shortly after 220.127.116.11 - I will be testing this as well.
I got a similar WARNING after telinit 3 (I gave the command via the network, the display was sleeping in the kdm login screen). x86_64 kernel 18.104.22.168, intel driver 2.7.1, UXA, no KMS. CONFIG_SUSPEND is not set, CONFIG_HIBERNATION is not set.
Created attachment 22921 [details]
another i915_gem_idle warning
Not quite enough info here to distinguish whether this warning is caused by a hung GPU or unflushed buffers. For the latter case, that should be fixed with:
Author: Daniel Vetter <email@example.com>
Date: Sun Feb 7 16:20:18 2010 +0100
drm/i915: Update write_domains on active list after flush.
Before changing the status of a buffer with a pending write we will wait
for a new flush for that buffer. So we can take advantage of any flushes
posted whilst the buffer is active and pending processing by the GPU, by
clearing its write_domain and updating its last_rendering_seqno -- thus
saving a potential flush in deep queues and improving flushing behaviour
upon eviction for both GTT space and fences.
In order to reduce the time spent searching the active list for matching
write_domains, we move those to a separate list whose elements are
the buffers that belong to the active/flushing list with pending writes.
Original patch by Chris Wilson <firstname.lastname@example.org>, forward-ported
In addition to better performance, this also fixes a real bug. Before
this change, i915_gem_evict_everything didn't work as advertised. When
the gpu was actually busy and processing requests, the flush and subsequent
wait would not move active and dirty buffers to the inactive list, but
just to the flushing list, which triggered the BUG_ON at the end of this
function. With the tighter dirty buffer tracking, all currently busy and
dirty buffers get moved to the inactive list by one i915_gem_flush operation.
I've left the BUG_ON I've used to prove this in there.
Bug 25911 - 2.10.0 causes kernel oops and system hangs
Bug 26101 - [i915] xf86-video-intel 2.10.0 (and git) triggers kernel oops within seconds after login
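For anyone following the commit message above, here is a rough userspace sketch of the bookkeeping it describes: buffers with a pending GPU write sit on a separate list, and a flush clears the write_domain and updates last_rendering_seqno for every matching buffer on that list. This is not the i915 code. The names sketch_buf, sketch_dev, sketch_mark_gpu_write and sketch_process_flush, the fixed-size array, and the domain bit values are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16

/* Hypothetical stand-in for a buffer object; not the kernel structure. */
struct sketch_buf {
    uint32_t write_domain;         /* nonzero: a GPU write is still unflushed */
    uint32_t last_rendering_seqno; /* request the buffer must wait for */
};

/* Hypothetical device state: only buffers with pending writes are listed. */
struct sketch_dev {
    struct sketch_buf *pending[MAX_PENDING];
    size_t n_pending;
};

/* Record a GPU write. Returns 0 on success, -1 if the list is full. */
static int sketch_mark_gpu_write(struct sketch_dev *dev,
                                 struct sketch_buf *buf, uint32_t domain)
{
    assert(dev != NULL && buf != NULL);
    if (domain == 0)
        return 0;
    if (buf->write_domain == 0) {
        /* First pending write: put the buffer on the pending list. */
        if (dev->n_pending == MAX_PENDING)
            return -1;
        dev->pending[dev->n_pending++] = buf;
    }
    buf->write_domain |= domain;
    return 0;
}

/*
 * Account for a flush of flush_domains emitted with request seqno.
 * Every listed buffer whose write_domain overlaps the flush gets its
 * write_domain cleared and its last_rendering_seqno updated, and drops
 * off the pending list. Returns the number of buffers cleared.
 */
static size_t sketch_process_flush(struct sketch_dev *dev,
                                   uint32_t flush_domains, uint32_t seqno)
{
    size_t kept = 0, cleared = 0;

    assert(dev != NULL);
    for (size_t i = 0; i < dev->n_pending; i++) {
        struct sketch_buf *buf = dev->pending[i];

        if (buf->write_domain & flush_domains) {
            buf->write_domain = 0;
            buf->last_rendering_seqno = seqno;
            cleared++;
        } else {
            dev->pending[kept++] = buf; /* still dirty, keep tracking it */
        }
    }
    dev->n_pending = kept;
    return cleared;
}
```

The point of the separate list is that a flush only has to walk buffers that are actually dirty, and once a buffer's write_domain is cleared it only needs its request to retire before it can become inactive, rather than being parked waiting for another flush.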