Created attachment 27791 [details]
Dmesg output of the crash

The kernel panics during boot while loading the intel module (in the initramfs).

This is an Atom-based machine: MSI AE1920 with video card 8086:a001.

It was working with kernel 2.6.34.1; I might bisect when I have some time.

The dmesg output is attached. I captured it with netconsole.
Someone suggested on #intel-gfx that the problem might be that agp was built as a module. After setting CONFIG_AGP=y and CONFIG_AGP_INTEL=y, the problem was still happening, only earlier in boot (obviously, since the driver is now built in).

I bisected the problem to this commit:

commit e7b96f28c58ca09f15f6c2e8ccbb889a30fab4f7
Author: Tim Gardner <tim.gardner@canonical.com>
Date:   Fri Jul 9 14:48:50 2010 -0600

    agp/intel: Use the correct mask to detect i830 aperture size.

    BugLink: https://bugs.launchpad.net/bugs/597075

    commit f1befe71fa7a79ab733011b045639d8d809924ad introduced a
    regression when detecting aperture size of some i915 adapters, e.g.,
    those on the Intel Q35 chipset.

    The original report: https://bugzilla.kernel.org/show_bug.cgi?id=15733
    The regression report: https://bugzilla.kernel.org/show_bug.cgi?id=16294

    According to the specification found at
    http://intellinuxgraphics.org/VOL_1_graphics_core.pdf, the PCI config
    space register I830_GMCH_CTRL is a mirror of GMCH Graphics
    Control. The correct macro for isolating the aperture size bits
    is therefore I830_GMCH_GMS_MASK along with the attendant changes
    to the case statement.

    Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
    Tested-by: Kees Cook <kees.cook@canonical.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Eric Anholt <eric@anholt.net>
    Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>
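For readers unfamiliar with why the choice of mask matters here, the following is a minimal sketch of the general technique the commit message describes: isolating a bit field from a config register and dispatching on it in a case statement. All names, mask values, and size encodings below are hypothetical and chosen only for illustration; they are not the real I830_GMCH_CTRL definitions and this is not the intel-agp code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical register layout, for illustration only. These are NOT the
 * real I830_GMCH_CTRL definitions from the kernel headers.
 */
#define EXAMPLE_SIZE_MASK   0x70u /* size field occupies bits 6:4 */
#define EXAMPLE_WIDE_MASK   0xF0u /* wrong mask: also picks up bit 7 */
#define EXAMPLE_SIZE_SHIFT  4

/* Isolate a bit field from a 16-bit config register value. */
static unsigned int extract_field(uint16_t reg, unsigned int mask,
                                  unsigned int shift)
{
    assert(mask != 0);
    return (reg & mask) >> shift;
}

/* Map the (hypothetical) field encoding to a size in KB; 0 means unknown. */
static unsigned int field_to_kb(unsigned int field)
{
    switch (field) {
    case 1: return 512;
    case 2: return 1024;
    case 3: return 8192;
    default: return 0;
    }
}
```

With a register value of 0xA0, the narrow mask yields field value 2, which the switch recognises, while the wide mask yields 10, which matches no case. A mask that is one bit too wide or too narrow can therefore make a driver fall through to a default branch or pick the wrong size on hardware that happens to set the neighbouring bit.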
Created attachment 28041 [details] lspci -vvvnn of the machine from a working 2.6.34.1 kernel
The bug is still reproducible with 2.6.35.4. Reverting the offending commit allows the boot to reach completion (no hang), but then I have other problems with input devices in Xorg and Ethernet networking. It's the same if I revert f1befe71fa7a79ab733011b045639d8d809924ad, which introduced the previous regression. I don't know if the problems are related.
Can you try my intel-gtt rework branch? http://cgit.freedesktop.org/~danvet/drm/log/?h=intel_gtt_rework It contains a patch that should fix problems due to e7b96f28c58ca09f15f6c.
I can, and will, but it might take a few days. Thanks
(In reply to comment #3)
> but then I have other problems with input devices in Xorg and Ethernet
> networking.

These problems were in userland on my side.

Your branch fixes this specific bug, and boot is able to reach completion.

Tested-by: Anisse Astier <anisse@astier.eu>
As it stands the code is in drm-intel-next, but I think Daniel is planning to submit a stable patch as well.
(In reply to comment #7) > As it stands the code is in drm-intel-next, but I think Daniel is planning to > submit a stable patch as well. Did he?
(In reply to comment #8)
> (In reply to comment #7)
> > As it stands the code is in drm-intel-next, but I think Daniel is planning to
> > submit a stable patch as well.
>
> Did he?

AFAIK, not yet.
> --- Comment #9 from Anisse Astier <anisse@astier.eu> 2010-09-29 09:07:23 ---
> (In reply to comment #8)
> > (In reply to comment #7)
> > > As it stands the code is in drm-intel-next, but I think Daniel is planning to
> > > submit a stable patch as well.
> >
> > Did he?
>
> AFAIK, not yet.

[Sorry for the late reply, looks like bz.k.org fail has eaten my response.]

Nope, not yet. I'd like to give this some vetting time in -linus. It's marked cc: stable, so Greg KH will remind me in time to backport. I'd simply like to avoid yet another regression in -stable over the same problem - if I'm counting correctly, my patches are trial number three :(

The current solution seems to be the right one, but that's also what the changelogs of the previous two patches claimed.

-Daniel
On Monday, October 04, 2010, Anisse Astier wrote:
> On Sun, 3 Oct 2010 23:53:02 +0200 (CEST), "Rafael J. Wysocki" <rjw@sisk.pl>
> wrote :
>
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.34 and 2.6.35.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.34 and 2.6.35. Please verify if it still should
> > be listed and let the tracking team know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16891
> > Subject : Kernel panic while loading intel module during boot
> > Submitter : Anisse Astier <anisse@astier.eu>
> > Date : 2010-08-24 13:19 (41 days old)
> >
> >
>
> This bug is still valid, and should be listed as a regression.
> I tried to upload on bugzilla a patch authored by Daniel Vetter that fixes
> the problem, but then bugzilla went into blackhole mode.
Created attachment 32602 [details]
Fix for intel-gtt 2.6.35 regression

Patch fixing the problem. It still needs a meaningful description and Daniel's Signed-off-by.
The original respondent (Kees Cook) has built a kernel with this patch. He reports no regressions.
Patch : https://bugzilla.kernel.org/attachment.cgi?id=32602
Handled-By : Anisse Astier <anisse@astier.eu>
Fixed in 2.6.37-rc1 by:

commit e5e408fc94595aab897f613b6f4e2f5b36870a6f
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sat Aug 28 11:04:32 2010 +0200

    intel-gtt: fix gtt_total_entries detection