Bug 16891
Summary: | Kernel panic while loading intel module during boot | ||
---|---|---|---|
Product: | Drivers | Reporter: | Anisse Astier (anisse) |
Component: | Video(DRI - Intel) | Assignee: | Daniel Vetter (daniel) |
Status: | CLOSED CODE_FIX | ||
Severity: | high | CC: | akpm, anisse, chris, daniel, florian, rjw, timg |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.35.4 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 16055 | ||
Attachments: | Dmesg output of the crash; lspci -vvvnn of the machine from a working 2.6.34.1 kernel; Fix for intel-gtt 2.6.35 regression | ||
Some suggested on #intel-gfx that the problem might be that agp was modular. After setting CONFIG_AGP=y and CONFIG_AGP_INTEL=y, the problem was still happening, except earlier in boot (obviously).

I bisected the problem to this commit:

commit e7b96f28c58ca09f15f6c2e8ccbb889a30fab4f7
Author: Tim Gardner <tim.gardner@canonical.com>
Date: Fri Jul 9 14:48:50 2010 -0600

    agp/intel: Use the correct mask to detect i830 aperture size.

    BugLink: https://bugs.launchpad.net/bugs/597075

    commit f1befe71fa7a79ab733011b045639d8d809924ad introduced a
    regression when detecting aperture size of some i915 adapters, e.g.,
    those on the Intel Q35 chipset.

    The original report: https://bugzilla.kernel.org/show_bug.cgi?id=15733
    The regression report: https://bugzilla.kernel.org/show_bug.cgi?id=16294

    According to the specification found at
    http://intellinuxgraphics.org/VOL_1_graphics_core.pdf, the PCI config
    space register I830_GMCH_CTRL is a mirror of GMCH Graphics Control.
    The correct macro for isolating the aperture size bits is therefore
    I830_GMCH_GMS_MASK along with the attendant changes to the case
    statement.

    Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
    Tested-by: Kees Cook <kees.cook@canonical.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Eric Anholt <eric@anholt.net>
    Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>

Created attachment 28041 [details]
lspci -vvvnn of the machine from a working 2.6.34.1 kernel
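
For readers unfamiliar with the pattern the commit message above describes (isolating a bit field of a config register with a mask, then matching it in a case statement), here is a minimal sketch. Every name and value in it (the EX_ macros, the field position, the sizes) is hypothetical and chosen only for illustration; none of it is taken from the kernel headers or from the Intel specification.

```c
#include <stdint.h>

/*
 * Hypothetical register layout, for illustration only. These names and
 * values are assumptions, not the real I830_GMCH_* definitions.
 */
#define EX_GMCH_CTRL_ENABLE  0x0004u  /* unrelated control bit */
#define EX_GMCH_FIELD_MASK   0x0070u  /* bits 6:4 hold the size field */
#define EX_GMCH_FIELD_1M     0x0010u
#define EX_GMCH_FIELD_4M     0x0020u
#define EX_GMCH_FIELD_8M     0x0030u

/*
 * Decode the size field of a 16-bit control register value.
 * Returns the size in KiB, or 0 if the field value is not recognized.
 */
static uint32_t ex_decode_size_kib(uint16_t gmch_ctrl)
{
    /* Mask first so unrelated bits cannot disturb the comparison. */
    switch (gmch_ctrl & EX_GMCH_FIELD_MASK) {
    case EX_GMCH_FIELD_1M:
        return 1024;
    case EX_GMCH_FIELD_4M:
        return 4096;
    case EX_GMCH_FIELD_8M:
        return 8192;
    default:
        return 0;
    }
}
```

The mask and the case labels have to describe the same field. If the mask selects different bits than the labels were written for, no label matches and the code falls through to the default.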
The bug is still reproducible with 2.6.35.4.

Reverting the offending commit allows the boot to reach completion (no hang), but then I have other problems with input devices in Xorg and Ethernet networking. It's the same if I revert f1befe71fa7a79ab733011b045639d8d809924ad, which introduced the previous regression. I don't know if the problems are related.

Can you try my intel-gtt rework branch?
http://cgit.freedesktop.org/~danvet/drm/log/?h=intel_gtt_rework
It contains a patch that should fix problems due to e7b96f28c58ca09f15f6c.

I can, and will, but it might take a few days. Thanks

(In reply to comment #3)
> but then I have other problems with input devices in Xorg and Ethernet
> networking.

These problems were in userland on my side.

Your branch fixes this specific bug, and boot is able to reach completion.
Tested-by: Anisse Astier <anisse@astier.eu>

As it stands the code is in drm-intel-next, but I think Daniel is planning to submit a stable patch as well.

(In reply to comment #7)
> As it stands the code is in drm-intel-next, but I think Daniel is planning to
> submit a stable patch as well.

Did he?

(In reply to comment #8)
> (In reply to comment #7)
> > As it stands the code is in drm-intel-next, but I think Daniel is planning
> > to submit a stable patch as well.
>
> Did he?

AFAIK, not yet.

> --- Comment #9 from Anisse Astier <anisse@astier.eu> 2010-09-29 09:07:23 ---
> (In reply to comment #8)
> > (In reply to comment #7)
> > > As it stands the code is in drm-intel-next, but I think Daniel is
> planning to
> > > submit a stable patch as well.
> >
> > Did he?
>
> AFAIK, not yet.
[Sorry for the late reply, looks like bz.k.org fail has eaten my response.]
Nope, not yet. I'd like to give this some vetting time in -linus. It's
marked cc: stable, so Greg KH will remind me in time to backport. I'd
simply like to avoid yet another regression in -stable over the same
problem - if I'm counting correctly, my patches are trial number three :(
The current solution seems to be the right one, but that's also what the
changelog of the previous two patches claimed.
-Daniel
On Monday, October 04, 2010, Anisse Astier wrote:
> On Sun, 3 Oct 2010 23:53:02 +0200 (CEST), "Rafael J. Wysocki" <rjw@sisk.pl>
> wrote :
>
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.34 and 2.6.35.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.34 and 2.6.35. Please verify if it still should
> > be listed and let the tracking team know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16891
> > Subject : Kernel panic while loading intel module during boot
> > Submitter : Anisse Astier <anisse@astier.eu>
> > Date : 2010-08-24 13:19 (41 days old)
> >
> >
>
> This bug is still valid, and should be listed as a regression.
> I tried to upload on bugzilla a patch authored by Daniel Vetter that fixes
> the problem, but then bugzilla went into blackhole mode.
Created attachment 32602 [details]
Fix for intel-gtt 2.6.35 regression
Patch fixing the problem.
It still needs a meaningful description and Daniel's Signed-off-by.
The original respondent (Kees Cook) has built a kernel with this patch. He reports no regressions.

Patch : https://bugzilla.kernel.org/attachment.cgi?id=32602
Handled-By : Anisse Astier <anisse@astier.eu>

fixed in .37-rc1 by

commit e5e408fc94595aab897f613b6f4e2f5b36870a6f
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sat Aug 28 11:04:32 2010 +0200

    intel-gtt: fix gtt_total_entries detection
Created attachment 27791 [details]
Dmesg output of the crash

The kernel will panic during boot while loading intel module (in initramfs)

This is an Atom-based machine: MSI AE1920 with video card 8086:a001.
It was working with kernel 2.6.34.1; I might bisect when I have some time.

Dmesg output is attached. I'm getting it with netconsole.