Bug 14662 - Dell E5500 kernel panic with KMS
Summary: Dell E5500 kernel panic with KMS
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: All Linux
Importance: P1 high
Assignee: Chris Wilson
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-11-22 07:27 UTC by Mateusz Kaduk
Modified: 2010-01-09 11:55 UTC

See Also:
Kernel Version: 2.6.32-rc8
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Pictures with kernel panic (858.87 KB, application/octet-stream)
2009-11-22 07:29 UTC, Mateusz Kaduk

Description Mateusz Kaduk 2009-11-22 07:27:44 UTC
Hi,

I compiled 2.6.32-rc8 and 2.6.31.6, but both give a kernel panic that complains about some Intel GEM related parts.

My distribution kernel is 2.6.31, and it works apart from OpenGL. It boots fine, but it looks like it runs in UMS mode or sets KMS later. The boot option i915.modeset=1 is not recognized there. Also, OpenGL graphics show a black screen and desktop compositing is damn slow: moving a window takes half a minute or more.

The difference between the compiled kernels and the distribution one is that the compiled ones have i915 KMS mode set by default.

I took pictures of the call trace and registers with a camera, using boot_delay for printk. They are in the attachment.

My graphics hardware is
00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)

OpenGL renderer string: Mesa DRI Mobile Intel® GM45 Express Chipset GEM 20090712 2009Q2 RC3 
OpenGL version string: 2.1 Mesa 7.6
Comment 1 Mateusz Kaduk 2009-11-22 07:29:10 UTC
Created attachment 23867
Pictures with kernel panic
Comment 2 Hugh Dickins 2009-12-01 19:26:42 UTC
Hi Chris,

I'm slightly worried by this patch which went into Linus's git yesterday.

Thank you for looking into the issue, and coming up with what Mateusz
has found to be a good workaround.

But is it the right fix?

drm/i915 ought to be working without CONFIG_SHMEM: it then falls back
to using ramfs instead of shmem.  So why isn't that working here?

I'd take the NULL function pointer in read_cache_page_async() to be a
NULL filler function; but that should be pointing to simple_readpage(),
as specified in ramfs_aops.

You don't go into more detail in the comments: do you have any more
info on how this came about?  I don't get it.

I'd prefer not to "select SHMEM" there, because it shouldn't be
necessary; and will put CONFIG_SHMEM=y into the .configs of users
who chose not to have it before, and who will forget to reverse it
if we come up with a better fix later.

If there is a better fix: I just don't understand what happened here.

Thanks,
Hugh

From: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sun, 22 Nov 2009 15:40:31 +0000 (+0000)
Subject: drm/i915: Select CONFIG_SHMEM
X-Git-Url: http://git.kernel.org/gitweb.cgi?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Flinux-2.6.git;a=commitdiff_plain;h=ca9ab10033d190c1ede85fdf456307bdfdabf079

drm/i915: Select CONFIG_SHMEM

The driver requires shmfs as the backing filesystem to handle the buffer
objects, so ensure it is selected if the user chooses to build our
driver.

Fixes: Bug 14662 - Dell E5500 kernel panic with KMS
http://bugzilla.kernel.org/show_bug.cgi?id=14662

The revealing nature of the panic is the NULL function pointer
dereference in read_cache_page_async().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-and-tested-by: Mateusz Kaduk <mateusz.kaduk@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: stable@kernel.org
---

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index f831ea1..96eddd1 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -92,6 +92,7 @@ config DRM_I830
 config DRM_I915
 	tristate "i915 driver"
 	depends on AGP_INTEL
+	select SHMEM
 	select DRM_KMS_HELPER
 	select FB_CFB_FILLRECT
 	select FB_CFB_COPYAREA
Comment 3 Matt Mackall 2009-12-01 20:31:51 UTC
On Tue, 2009-12-01 at 19:26 +0000, Hugh Dickins wrote:
> Hi Chris,
> 
> I'm slightly worried by this patch which went into Linus's git yesterday.
> 
> Thank you for looking into the issue, and coming up with what Mateusz
> has found to be a good workaround.
> 
> But is it the right fix?

This really does look like a band-aid. There's either a real problem in
tiny-shmem or alternately something scribbling on its aops? In which
case turning shmem on is just causing something different to get
scribbled on.
Comment 4 Chris Wilson 2009-12-01 21:22:55 UTC
Now that you've pointed out the fallback support for !SHMEM, yes, it does look like a band-aid. As you can probably guess, the diagnosis was a simple scan over the config, in complete ignorance of the ramfs fallback.

If something really was scribbling over the aops, then surely it would be a problem with SHMEM as well? I'll see if I can reproduce the failure on a test machine and debug the issue properly.
Comment 5 Chris Wilson 2009-12-01 21:42:50 UTC
I've started digesting mm/shmem.c, and it appears that shmem_aops.readpage is only defined if TMPFS is set, with no protection if read_mapping_page() is called without the filler defined.

From that reading it looks like i915 actually requires TMPFS, which depends on SHMEM, and not SHMEM itself.
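
To illustrate the failure mode described above, here is a minimal user-space sketch in C. It is not kernel code: struct page, struct aops, simple_fill() and read_page_checked() are hypothetical stand-ins invented for this example, and the NULL check only shows the kind of protection that the comment says is missing. It assumes nothing about how the kernel was actually fixed.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct page {
    int uptodate;
};

typedef int (*filler_fn)(struct page *page);

struct aops {
    filler_fn readpage; /* may be left NULL when a feature is compiled out */
};

/* A trivial filler, playing the role a readpage callback would play. */
int simple_fill(struct page *page)
{
    page->uptodate = 1;
    return 0;
}

/* One operations table with the callback set, one without it. */
const struct aops aops_with_readpage = { .readpage = simple_fill };
const struct aops aops_without_readpage = { .readpage = NULL };

/*
 * Calls the filler only if it exists. Without this check, calling through
 * a NULL readpage pointer is a NULL function pointer dereference.
 */
int read_page_checked(const struct aops *a, struct page *page)
{
    if (a == NULL || a->readpage == NULL)
        return -EINVAL;
    return a->readpage(page);
}
```

With the first table the call returns 0 and marks the page up to date. With the second it returns -EINVAL instead of jumping through a NULL pointer.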
Comment 6 Jesse Barnes 2009-12-02 19:38:39 UTC
Tag, you're it.
Comment 7 Mateusz Kaduk 2010-01-09 11:55:26 UTC
I don't experience it anymore. Closing.