Kernel Bug Tracker – Bug 16891
Kernel panic while loading intel module during boot
Last modified: 2011-01-23 16:12:31 UTC
Created attachment 27791 [details]
Dmesg output of the crash
The kernel panics during boot while loading the intel module (in the initramfs).
This is an Atom-based machine: MSI AE1920 with video card 8086:a001.
It was working with kernel 18.104.22.168; I might bisect when I have some time.
Dmesg output is attached. I'm getting it with netconsole.
Someone suggested on #intel-gfx that the problem might be that agp was modular. After setting CONFIG_AGP=y and CONFIG_AGP_INTEL=y, the problem still happens, only earlier in boot (obviously).
I bisected the problem to this commit:
Author: Tim Gardner <email@example.com>
Date: Fri Jul 9 14:48:50 2010 -0600
agp/intel: Use the correct mask to detect i830 aperture size.
commit f1befe71fa7a79ab733011b045639d8d809924ad introduced a
regression when detecting aperture size of some i915 adapters, e.g.,
those on the Intel Q35 chipset.
The original report: https://bugzilla.kernel.org/show_bug.cgi?id=15733
The regression report: https://bugzilla.kernel.org/show_bug.cgi?id=16294
According to the specification found at
http://intellinuxgraphics.org/VOL_1_graphics_core.pdf, the PCI config
space register I830_GMCH_CTRL is a mirror of GMCH Graphics
Control. The correct macro for isolating the aperture size bits is
therefore I830_GMCH_GMS_MASK along with the attendant changes to the
Signed-off-by: Tim Gardner <firstname.lastname@example.org>
Tested-by: Kees Cook <email@example.com>
Cc: Chris Wilson <firstname.lastname@example.org>
Cc: Eric Anholt <email@example.com>
Cc: Jesse Barnes <firstname.lastname@example.org>
Signed-off-by: Eric Anholt <email@example.com>
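For anyone following the mask discussion in the commit message above, here is a minimal sketch of why the choice of mask matters when decoding a field from a config register. It is not the kernel code: the mask values, the register value, and the helper name extract_field are all assumptions chosen for illustration, not taken from the kernel headers or the Intel specification.

```python
# Illustrative only: both masks and the register value are hypothetical,
# not the real I830_GMCH_* definitions.
NARROW_MASK = 0x01  # hypothetical 1-bit field at bit 0
WIDE_MASK = 0x70    # hypothetical 3-bit field at bits 4..6


def extract_field(reg: int, mask: int) -> int:
    """Return the bits of reg selected by mask, shifted down to bit 0."""
    if mask == 0:
        raise ValueError("mask must be non-zero")
    # mask & -mask isolates the lowest set bit; its index is the shift.
    shift = (mask & -mask).bit_length() - 1
    return (reg & mask) >> shift


reg = 0x34  # example register value, binary 0011 0100

# The same register decodes to different field values depending on the
# mask, so any size lookup keyed on the field changes with it.
narrow = extract_field(reg, NARROW_MASK)
wide = extract_field(reg, WIDE_MASK)
print(f"narrow field = {narrow}, wide field = {wide}")
```

If the decoded field is then used to size a table, picking the wrong mask gives a wrong size, which is the kind of failure this bug and the two earlier reports revolve around.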
Created attachment 28041 [details]
lspci -vvvnn of the machine from a working 22.214.171.124 kernel
The bug is still reproducible with 126.96.36.199.
Reverting the offending commit allows the boot to reach completion (no hang), but then I have other problems with input devices in Xorg and Ethernet networking. It's the same if I revert f1befe71fa7a79ab733011b045639d8d809924ad, which introduced the previous regression. I don't know if the problems are related.
Can you try my intel-gtt rework branch?
It contains a patch that should fix problems due to e7b96f28c58ca09f15f6c.
I can, and will, but it might take a few days.
(In reply to comment #3)
> but then I have other problems with input devices in Xorg and Ethernet
These problems were in userland on my side.
Your branch fixes this specific bug, and boot is able to reach completion.
Tested-by: Anisse Astier <firstname.lastname@example.org>
As it stands the code is in drm-intel-next, but I think Daniel is planning to submit a stable patch as well.
(In reply to comment #7)
> As it stands the code is in drm-intel-next, but I think Daniel is planning to
> submit a stable patch as well.
Did he?
(In reply to comment #8)
> (In reply to comment #7)
> > As it stands the code is in drm-intel-next, but I think Daniel is planning to
> > submit a stable patch as well.
> Did he?
AFAIK, not yet.
> --- Comment #9 from Anisse Astier <email@example.com> 2010-09-29 09:07:23 ---
> (In reply to comment #8)
> > (In reply to comment #7)
> > > As it stands the code is in drm-intel-next, but I think Daniel is planning to
> > > submit a stable patch as well.
> > Did he?
> AFAIK, not yet.
[Sorry for the late reply, looks like bz.k.org fail has eaten my response.]
Nope, not yet. I'd like to give this some vetting time in -linus. It's
marked cc: stable, so Greg KH will remind me in time to backport. I'd
simply like to avoid yet another regression in -stable over the same
problem - if I'm counting correctly, my patches are trial number three :(
The current solution seems to be the right one, but that's also what the
changelog of the previous two patches claimed.
On Monday, October 04, 2010, Anisse Astier wrote:
> On Sun, 3 Oct 2010 23:53:02 +0200 (CEST), "Rafael J. Wysocki" <firstname.lastname@example.org> wrote :
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.34 and 2.6.35.
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.34 and 2.6.35. Please verify if it still should
> > be listed and let the tracking team know (either way).
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16891
> > Subject : Kernel panic while loading intel module during boot
> > Submitter : Anisse Astier <email@example.com>
> > Date : 2010-08-24 13:19 (41 days old)
> This bug is still valid, and should be listed as a regression.
> I tried to upload on bugzilla a patch authored by Daniel Vetter that fixes
> the problem, but then bugzilla went into blackhole mode.
Created attachment 32602 [details]
Fix for intel-gtt 2.6.35 regression
Patch fixing the problem.
It still needs a meaningful description and Daniel's Signed-off-by.
The original respondent (Kees Cook) has built a kernel with this patch. He reports no regressions.
Patch : https://bugzilla.kernel.org/attachment.cgi?id=32602
Handled-By : Anisse Astier <firstname.lastname@example.org>
Fixed in 2.6.37-rc1 by:
Author: Daniel Vetter <email@example.com>
Date: Sat Aug 28 11:04:32 2010 +0200
intel-gtt: fix gtt_total_entries detection