Bug 8136 - 2.6.21-rc2-mm2 won't boot
Summary: 2.6.21-rc2-mm2 won't boot
Status: CLOSED CODE_FIX
Alias: None
Product: Alternate Trees
Classification: Unclassified
Component: mm
Hardware: i386 Linux
Importance: P2 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-03-06 12:07 UTC by Nicolas Mailhot
Modified: 2007-03-08 10:51 UTC
CC List: 0 users

See Also:
Kernel Version: 2.6.21-rc2-mm2
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
config for 2.6.21-rc2.mm2 (43.08 KB, text/plain)
2007-03-06 12:07 UTC, Nicolas Mailhot
diff with working 2.6.21-rc2.mm1 config (3.66 KB, text/plain)
2007-03-06 12:08 UTC, Nicolas Mailhot
lspci (117.65 KB, text/plain)
2007-03-06 12:09 UTC, Nicolas Mailhot
2.6.21-rc2.mm1 dmesg (26.65 KB, text/plain)
2007-03-06 12:43 UTC, Nicolas Mailhot
Printscreen with vesafb disabled and earlyprintk=vga (420.54 KB, image/jpeg)
2007-03-06 15:36 UTC, Nicolas Mailhot

Description Nicolas Mailhot 2007-03-06 12:07:06 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.21-rc2-mm1
Distribution: Fedora Devel
Hardware Environment: Giga-byte Technology GA-K8N Ultra-9 Mainboard (AMD 64 X2 +
Nvidia CK804)
Software Environment: N/A
Problem Description: The kernel does not boot. The screen stays blank and there is
no activity after leaving the bootloader (no messages, no penguins).

Steps to reproduce: Try to boot 2.6.21-rc2-mm2.
Comment 1 Nicolas Mailhot 2007-03-06 12:07:55 UTC
Created attachment 10624 [details]
config for 2.6.21-rc2.mm2
Comment 2 Nicolas Mailhot 2007-03-06 12:08:47 UTC
Created attachment 10625 [details]
diff with working 2.6.21-rc2.mm1 config
Comment 3 Nicolas Mailhot 2007-03-06 12:09:15 UTC
Created attachment 10626 [details]
lspci
Comment 4 Nicolas Mailhot 2007-03-06 12:43:31 UTC
Created attachment 10627 [details]
2.6.21-rc2.mm1 dmesg
Comment 5 Anonymous Emailer 2007-03-06 14:42:28 UTC
Reply-To: akpm@linux-foundation.org


Can you please add

	earlyprintk=vga

to the kernel boot parameters, see if we get any useful
information?  If so, a digital photograph of the screen
might be useful.

Thanks.

Comment 6 Nicolas Mailhot 2007-03-06 15:36:29 UTC
Created attachment 10631 [details]
Printscreen with vesafb disabled and earlyprintk=vga
Comment 7 Nicolas Mailhot 2007-03-06 15:41:30 UTC
That's funny, I was just re-reading http://lkml.org/lkml/2006/10/25/26 and
wondering whether HPET is working on my CK804 system.

My dmesg says
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
then
..MP-BIOS bug: 8254 timer not connected to IO-APIC

(and on a working kernel
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Using local APIC timer interrupts.
result 12558072
Detected 12.558 MHz APIC timer.)

It looks like the problem described in the LKML thread.
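
The comparison above (the MP-BIOS warning is followed by "Using local APIC timer interrupts." on the working kernel but not on the failing one) could be checked mechanically. The following is only a minimal sketch, assuming the boot log is available as plain text; the function name `falls_back_to_apic_timer` is hypothetical and not part of any kernel tool.

```python
# Sketch only: the two strings are copied from the dmesg excerpts in this
# comment; the function name is a hypothetical helper.
MP_BIOS_WARNING = "MP-BIOS bug: 8254 timer not connected to IO-APIC"
APIC_TIMER_NOTE = "Using local APIC timer interrupts."


def falls_back_to_apic_timer(dmesg_text):
    """Return True if the APIC timer note appears after the MP-BIOS warning.

    Returns False if the warning is absent or nothing follows it.
    """
    lines = dmesg_text.splitlines()
    for index, line in enumerate(lines):
        if MP_BIOS_WARNING in line:
            # Only look at what the kernel printed after the warning.
            return any(APIC_TIMER_NOTE in later for later in lines[index + 1:])
    return False


working = """\
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Using local APIC timer interrupts.
result 12558072
Detected 12.558 MHz APIC timer.
"""

failing = """\
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
..MP-BIOS bug: 8254 timer not connected to IO-APIC
"""

print(falls_back_to_apic_timer(working))  # True
print(falls_back_to_apic_timer(failing))  # False
```

Substring matching is used rather than exact line equality so that a log with timestamp prefixes would still match.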

Comment 8 Anonymous Emailer 2007-03-06 16:16:03 UTC
Reply-To: akpm@linux-foundation.org

On Tue, 6 Mar 2007 15:36:29 -0800
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8136
> 

Let's take this to email.

> 
> 
> 
> ------- Additional Comments From Nicolas.Mailhot@LaPoste.net  2007-03-06 15:36 -------
> Created an attachment (id=10631)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=10631&action=view)
> Printscreen with vesfg disabled and earlyprintk=vga
> 

So rc2-mm2 panics due to "MP-BIOS bug: 8254 timer not connected to IO-APIC" and
rc2-mm1 does not.

Could be ACPI, could be x86_64 timer changes, could be something else.

Would you have time to bisect it? 
http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
explains how.

If so, I'd suggest you drill in on the patches between
x86_64-mm-defconfig-update.patch and
optimize-and-simplify-get_cycles_sync.patch: the x86 changes.
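
The idea behind bisecting an ordered patch series can be sketched as a binary search. This is not a reproduction of bisecting-mm-trees.txt; it is a minimal illustration that assumes the tree with no patches applied boots, the tree with all patches applied fails, and exactly one patch introduces the failure. The `boots` callback and the patch names in the demo are hypothetical stand-ins for "apply these patches, build, and try to boot".

```python
from typing import Callable, Sequence


def first_bad_patch(series: Sequence[str],
                    boots: Callable[[Sequence[str]], bool]) -> str:
    """Return the first patch whose application makes the kernel fail to boot.

    Assumes series[:0] boots and the full series does not.
    """
    if not series:
        raise ValueError("empty patch series")
    lo, hi = 0, len(series)  # invariant: series[:lo] boots, series[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots(series[:mid]):
            lo = mid
        else:
            hi = mid
    return series[hi - 1]


# Hypothetical demo: five made-up patches, the third one breaks booting.
demo_series = ["a.patch", "b.patch", "c.patch", "d.patch", "e.patch"]
culprit = first_bad_patch(demo_series, lambda applied: "c.patch" not in applied)
print(culprit)  # c.patch
```

Each iteration halves the candidate range, so a range of N patches needs roughly log2(N) build-and-boot cycles.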

Comment 9 Anonymous Emailer 2007-03-06 16:29:37 UTC
Reply-To: rjw@sisk.pl

Hi,

On Wednesday, 7 March 2007 01:15, Andrew Morton wrote:
> On Tue, 6 Mar 2007 15:36:29 -0800
> bugme-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=8136
> > 
> 
> Let's take this to email.
> 
> > 
> > 
> > 
> > ------- Additional Comments From Nicolas.Mailhot@LaPoste.net  2007-03-06 15:36 -------
> > Created an attachment (id=10631)
> >  --> (http://bugzilla.kernel.org/attachment.cgi?id=10631&action=view)
> > Printscreen with vesfg disabled and earlyprintk=vga
> > 
> 
> So rc2-mm2 panics due to "MP-BIOS bug: 8254 timer not connected to IO-APIC" and
> rc2-mm1 does not.

I'm observing a similar thing on my dual-core AMD64 testbed desktop.  Still,
another dual-core AMD64 machine I have runs -rc2-mm2 just fine.

One of the differences between them is that the failing one uses gcc 4.1.0 (sigh).

> Could be ACPI, could be x86_64 timer changes, could be something else.
> 
> Would you have time to bisect it? 
> http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> explains how.
> 
> If so, I'd suggest you drill in on the patches between
> x86_64-mm-defconfig-update.patch and
> optimize-and-simplify-get_cycles_sync.patch: the x86 changes.

I'll try to debug it tomorrow.

Greetings,
Rafael

Comment 10 Nicolas Mailhot 2007-03-06 22:56:57 UTC
$ rpm -q gcc
gcc-4.1.2-3.x86_64
Comment 11 Nicolas Mailhot 2007-03-06 23:36:35 UTC
On Tuesday, 6 March 2007 at 16:15 -0800, Andrew Morton wrote:

> So rc2-mm2 panics due to "MP-BIOS bug: 8254 timer not connected to IO-APIC" and
> rc2-mm1 does not.
> 
> Could be ACPI, could be x86_64 timer changes, could be something else.
> 
> Would you have time to bisect it? 
> http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> explains how.
> 
> If so, I'd suggest you drill in on the patches between
> x86_64-mm-defconfig-update.patch and
> optimize-and-simplify-get_cycles_sync.patch: the x86 changes.

I may have some more debug time this evening (CET), but probably not enough
for a full bisection. I'd really love to have timer/clock problems
nailed once and for all on this box (MP BIOS, RTC, HPET, whatever).

Comment 12 Nicolas Mailhot 2007-03-07 14:07:02 UTC
On Tuesday, 6 March 2007 at 16:15 -0800, Andrew Morton wrote:
> On Tue, 6 Mar 2007 15:36:29 -0800
> bugme-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=8136

> So rc2-mm2 panics due to "MP-BIOS bug: 8254 timer not connected to IO-APIC" and
> rc2-mm1 does not.
> 
> Could be ACPI, could be x86_64 timer changes, could be something else.
> 
> Would you have time to bisect it? 
> 
> I'd suggest you drill in on the patches between
> x86_64-mm-defconfig-update.patch and
> optimize-and-simplify-get_cycles_sync.patch: the x86 changes.

Removing the x86 patchset (342-430) and utrace (647-663) makes the
system boot (no surprise, but good to confirm). I'll try a few more
tests tomorrow; I need to sleep now.
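
Dropping numbered ranges from a patch series can be sketched as follows. This is a minimal illustration that assumes the numbers above are 1-based, inclusive positions in the series; the helper name `drop_ranges` and the generated patch names are hypothetical.

```python
def drop_ranges(series, ranges):
    """Return the series without the patches at the given 1-based inclusive ranges."""
    skip = set()
    for first, last in ranges:
        skip.update(range(first, last + 1))
    return [name for pos, name in enumerate(series, start=1) if pos not in skip]


# Hypothetical series of 700 placeholder patch names.
series = [f"patch-{n:03d}.patch" for n in range(1, 701)]
kept = drop_ranges(series, [(342, 430), (647, 663)])

print(len(kept))                   # 594
print("patch-341.patch" in kept)   # True
print("patch-342.patch" in kept)   # False
```

The remaining list keeps its original order, so it can be applied as a shortened series.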

Comment 13 Nicolas Mailhot 2007-03-08 10:51:30 UTC
2.6.21-rc3-mm2 works again, so I'll close the bug.
Comment 14 Anonymous Emailer 2007-03-08 13:30:55 UTC
Reply-To: rjw@sisk.pl

On Wednesday, 7 March 2007 01:32, Rafael J. Wysocki wrote:
> Hi,
> 
> On Wednesday, 7 March 2007 01:15, Andrew Morton wrote:
> > On Tue, 6 Mar 2007 15:36:29 -0800
> > bugme-daemon@bugzilla.kernel.org wrote:
> > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=8136
> > > 
> > 
> > Let's take this to email.
> > 
> > > 
> > > 
> > > 
> > > ------- Additional Comments From Nicolas.Mailhot@LaPoste.net  2007-03-06 15:36 -------
> > > Created an attachment (id=10631)
> > >  --> (http://bugzilla.kernel.org/attachment.cgi?id=10631&action=view)
> > > Printscreen with vesfg disabled and earlyprintk=vga
> > > 
> > 
> > So rc2-mm2 panics due to "MP-BIOS bug: 8254 timer not connected to IO-APIC" and
> > rc2-mm1 does not.
> 
> I'm observing a similar thing on my dual-core AMD64 testbed desktop.  Still,
> another dual-core AMD64 machine I have runs -rc2-mm2 just fine.
> 
> One of the differences between them is that the failing one uses gcc 4.1.0 (sigh).
> 
> > Could be ACPI, could be x86_64 timer changes, could be something else.
> > 
> > Would you have time to bisect it? 
> > http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
> > explains how.
> > 
> > If so, I'd suggest you drill in on the patches between
> > x86_64-mm-defconfig-update.patch and
> > optimize-and-simplify-get_cycles_sync.patch: the x86 changes.
> 
> I'll try to debug it tomorrow.

OK, this seems to be fixed in 2.6.21-rc3-mm2.

