Bug 6563 - 2.6.16.16 boot hang in ACPI mode, 2.6.12 boots 1 of 2 processors - Pentium D
Summary: 2.6.16.16 boot hang in ACPI mode, 2.6.12 boots 1 of 2 processors - Pentium D
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Other
Hardware: i386 Linux
Importance: P2 blocking
Assignee: Len Brown
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-05-16 08:53 UTC by David Ronis
Modified: 2006-11-15 02:30 UTC

See Also:
Kernel Version: 2.6.16.16
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
dmsg file after boot with acpi=off (9.69 KB, text/plain)
2006-05-16 08:58 UTC, David Ronis
.config file used to build the kernel (39.41 KB, text/plain)
2006-05-16 08:59 UTC, David Ronis
/proc/interrupts after booting with acpi=off (528 bytes, text/plain)
2006-05-16 09:01 UTC, David Ronis
dmsg file from knoppix boot (13.04 KB, text/plain)
2006-05-17 08:22 UTC, David Ronis
/proc/interrupts from knoppix boot (520 bytes, text/plain)
2006-05-17 08:23 UTC, David Ronis
/proc/cpuinfo from Knoppix boot (532 bytes, text/plain)
2006-05-17 08:23 UTC, David Ronis
diffs of dmesg between Knoppix & 2.6.16.16 (14.67 KB, text/plain)
2006-05-18 15:40 UTC, David Ronis
Full hardware listing (14.04 KB, text/plain)
2006-05-22 07:57 UTC, David Ronis

Description David Ronis 2006-05-16 08:53:54 UTC
Most recent kernel where this bug did not occur:  Unclear (seems to work under
knoppix)
Distribution:  Slackware
Hardware Environment:  Pentium D dual core, nvidia graphics, SATA drive
Software Environment:  gcc-4.1.0

I posted this to the linux-acpi list and was advised to file a bug report. 
Here's where I am with the problem up until now.

Thanks for the reply, I've answered in context below.

On Mon, 2006-05-15 at 18:43 -0400, Brown, Len wrote: 
> >I'm trying to get a new dual-core Intel Pentium D box to work, but I'm
> >having trouble getting SMP working.
> >
> >Here's where I am:
> >
> >1.  With the 2.6.12 kernel from the knoppix disk, things work.  Two
> >processors are recognized at boot time.
> >
> >2.  With 2.6.16.16 things hang at boot time when ACPI is enabled (even
> >when passing acpi=ht at boot).  I see the following:
> >
> >Enabling IO-APIC IRQ's
> >...TIMER:  vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> 
> Is this the last thing printed before the hang?
> While this line is printed by horrible cruft that should
> have been deleted from the kernel years ago, it is
> actually nominal output.

Yes, this is the last thing I see (unless I fool around with the ACPI
boot parameters, where I don't even see this line).

> Any chance you can capture the console output with "debug"
> from the failure case?

You mean pass debug as a boot flag?  Sure, I can try that.

> >On another pentium 4 hyperthreaded box I see the same followed by:
> >.
> >.MP-BIOS bug: 8254 timer not connected to IO-APIC
> >...trying to set up timer (IRQ0) through the 8259A ...  failed.
> >...trying to set up timer as Virtual Wire IRQ... works.
> >checking TSC synchronization across 2 CPUs: passed.
> >Brought up 2 CPUs
> >and the boot succeeds.
> 
> This isn't a good model of "working" to follow.
> It is in MP mode, MP mode is broken, and a workaround is being invoked.
> 
> >3.  Disabling ACPI (yes, ACPI, not APIC) either by passing acpi=off or
> >in building the kernel, works; I boot, but only one processor is
> >recognized.
> 
> if you boot with "apic=off" (yes APIC, not ACPI:-), does the system
> come up in ACPI+PIC mode?

I tried acpi=off and noapic and that worked.  Either separately (or
apic=off) hangs.

> 
> how about if you boot with "maxcpus=1", does that come up in ACPI
> mode normally?
> 

Fails.

> Please file a bugzilla here where you can attach the dmesg
> and /proc/interrupts from these cases
> 
> http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
> 

Will do this later today.  Thanks.  

David

> thanks,
> -Len
> 
>
Comment 1 David Ronis 2006-05-16 08:58:43 UTC
Created attachment 8118 [details]
dmsg file after boot with acpi=off
Comment 2 David Ronis 2006-05-16 08:59:49 UTC
Created attachment 8119 [details]
.config file used to build the kernel
Comment 3 David Ronis 2006-05-16 09:01:24 UTC
Created attachment 8120 [details]
/proc/interrupts after booting with acpi=off
Comment 4 Shaohua 2006-05-16 20:49:13 UTC
Can you post the dmesg from 2.6.12 (the working case)?
> >.MP-BIOS bug: 8254 timer not connected to IO-APIC
> >...trying to set up timer (IRQ0) through the 8259A ...  failed.
> >...trying to set up timer as Virtual Wire IRQ... works.
> >checking TSC synchronization across 2 CPUs: passed.
Is this the same motherboard used by the Pentium D CPU?
Comment 5 David Ronis 2006-05-17 08:21:13 UTC
I've attached the information requested.  I was in error, though: the knoppix
version actually comes up with only 1 CPU.  I saw the 2 penguins on the boot
screen and assumed that both would be working.

One other thing:  I compiled the 2.6.16.16 kernel with hyperthreading turned on
(cpuinfo says it's there).

Comment 6 David Ronis 2006-05-17 08:22:28 UTC
Created attachment 8129 [details]
dmsg file from knoppix boot
Comment 7 David Ronis 2006-05-17 08:23:17 UTC
Created attachment 8130 [details]
/proc/interrupts from knoppix boot
Comment 8 David Ronis 2006-05-17 08:23:52 UTC
Created attachment 8131 [details]
/proc/cpuinfo from Knoppix boot
Comment 9 Len Brown 2006-05-18 00:39:45 UTC
There are two problems here:

1. 2.6.12/Knoppix boots in ACPI mode, but enables only 1 processor?

The BIOS is not enabling the 2nd processor:

Linux version 2.6.12 (root@Knoppix) (gcc-Version 3.3.6 (Debian 1:3.3.6-7)) #2 SMP Tue Aug 9 23:20:52 CEST 2005
...
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:6 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)

Update your board with the latest BIOS from the vendor,
then go into SETUP and make sure there is no option
that is turning off the 2nd processor.
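
As a rough aid for checking logs like the one above, here is a minimal Python sketch that counts how many LAPIC entries the kernel reports as enabled. It assumes the dmesg lines have exactly the "ACPI: LAPIC (acpi_id[...] lapic_id[...] enabled)" shape quoted in this comment; other kernel versions may print them differently. The function names are hypothetical and not part of any kernel tool.

```python
import re

# Matches lines such as:
#   ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
# The format is assumed from the dmesg excerpt quoted above.
LAPIC_RE = re.compile(
    r"ACPI: LAPIC \(acpi_id\[(0x[0-9a-fA-F]+)\] "
    r"lapic_id\[(0x[0-9a-fA-F]+)\] (enabled|disabled)\)"
)


def parse_lapic_entries(dmesg_text):
    """Return a list of (acpi_id, lapic_id, enabled) tuples found in the text."""
    entries = []
    for match in LAPIC_RE.finditer(dmesg_text):
        acpi_id = int(match.group(1), 16)
        lapic_id = int(match.group(2), 16)
        entries.append((acpi_id, lapic_id, match.group(3) == "enabled"))
    return entries


def count_enabled(dmesg_text):
    """Count the LAPIC entries that are marked as enabled."""
    return sum(1 for _, _, enabled in parse_lapic_entries(dmesg_text) if enabled)
```

Feeding it the saved dmesg text (for example, the contents of the attached file read with open(...).read()) and seeing fewer enabled entries than physical cores would point at the BIOS tables rather than at the kernel.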

2. 2.6.16.16 ACPI-mode boot hang:

Please always add "debug" to the cmdline.
The "acpi=off" dmesg is missing some lines.

Is it possible to capture the serial console log of
the failure with just "debug" as a parameter?

Also, try booting with "noapic".
If there is something funky about the timer IRQ on this board
that 2.6.16 is tripping over, this should get around it and
run in legacy PIC mode just like the "acpi=off" case does.
Comment 10 David Ronis 2006-05-18 15:39:24 UTC
OK, I upgraded the BIOS and made sure that every setup option that looked like
it should enable two processors was enabled.

Booting 2.6.12 from knoppix (with no extra flags other than debug) works; i.e.,
I see both processors.

Booting 2.6.16.16 hangs unless I pass acpi=off, in which case I don't see the
second processor.  I tried simply booting with noapic and debug, but that hangs too.
The last line on the screen is:

Total of 2 processors activated (13606.11 BogoMIPS).

I diff'd the dmesg output from before and after the BIOS upgrade (with acpi=off
and debug):

diff dmesg /var/log/dmesg
6,7c6,7
<  BIOS-e820: 0000000000100000 - 000000007fe5f000 (usable)
<  BIOS-e820: 000000007fe5f000 - 000000007fee9000 (ACPI NVS)
---
>  BIOS-e820: 0000000000100000 - 000000007fe5c000 (usable)
>  BIOS-e820: 000000007fe5c000 - 000000007fee9000 (ACPI NVS)
35c35
< Detected 3401.795 MHz processor.
---
> Detected 3400.704 MHz processor.
40c40
< Memory: 2074024k/2096128k available (1961k kernel code, 20452k reserved, 655k data, 200k init, 1178000k highmem)
---
> Memory: 2073900k/2096128k available (1961k kernel code, 20452k reserved, 655k data, 200k init, 1177988k highmem)
42c42
< Calibrating delay using timer specific routine.. 6807.51 BogoMIPS (lpj=3403756)
---
> Calibrating delay using timer specific routine.. 6807.50 BogoMIPS (lpj=3403754)
59c59
< Total of 1 processors activated (6807.51 BogoMIPS).
---
> Total of 1 processors activated (6807.50 BogoMIPS).
97c97
<   MEM window: 95300000-953fffff
---
>   MEM window: disabled.
101c101
<   MEM window: 95400000-954fffff
---
>   MEM window: disabled.

Lines marked with > are from after the upgrade.
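
For anyone repeating this comparison, here is a minimal Python sketch of the same before/after check using the standard difflib module. It is only an illustration: the function name changed_lines is hypothetical, and it marks removed lines with "- " and added lines with "+ " instead of the "<" and ">" that diff(1) prints.

```python
import difflib


def changed_lines(before_text, after_text):
    """Return only the lines that differ between two dmesg captures.

    Lines present only in the 'before' capture are prefixed with "- ",
    lines present only in the 'after' capture with "+ ".
    """
    diff = difflib.ndiff(before_text.splitlines(), after_text.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

Calling changed_lines(open("dmesg.before").read(), open("dmesg.after").read()) on two saved captures (the file names are placeholders) gives a list that can be printed or filtered further, for example to ignore BogoMIPS jitter.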

I also diffed the 2.6.16.16 (acpi=off) with knoppix and I've attached that.

One question:  I built the kernel with hyperthreading enabled.  /proc/cpuinfo
says it's there, but this is really a dual core chip.  Is this a problem?
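
One hedged way to tell hyperthreading apart from true dual core is to compare the "siblings" and "cpu cores" fields in /proc/cpuinfo. The sketch below assumes the running kernel exposes both fields (not every kernel version does, and the attached cpuinfo is not reproduced here); the function names are hypothetical. On multi-core parts the "ht" flag by itself is generally not conclusive, which is why the sketch looks at the counts instead.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict per logical processor."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the block for one logical processor.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def topology(cpu):
    """Summarise one processor block.

    Assumes 'siblings' is the number of logical CPUs per package and
    'cpu cores' the number of physical cores per package; both default
    to 1 when the field is missing.
    """
    siblings = int(cpu.get("siblings", 1))
    cores = int(cpu.get("cpu cores", 1))
    return {
        "siblings": siblings,
        "cores": cores,
        # More logical CPUs than cores suggests hyperthreading is active.
        "hyperthreading": siblings > cores,
    }
```

For example, topology(parse_cpuinfo(open("/proc/cpuinfo").read())[0]) reporting two siblings and two cores would be consistent with a dual-core chip without active hyperthreading.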
Comment 11 David Ronis 2006-05-18 15:40:44 UTC
Created attachment 8142 [details]
diffs of dmesg between Knoppix & 2.6.16.16
Comment 12 David Ronis 2006-05-22 07:57:36 UTC
Created attachment 8164 [details]
Full hardware listing

I just noticed your return e-mail address.  This box has an Intel motherboard
and seems to be largely Intel-based.   Perhaps you have a similar one at work
that has a working SMP kernel?  If so, could you send me a copy of the .config
file used to build the kernel?
Comment 13 David Ronis 2006-05-26 06:19:38 UTC
I upgraded the kernel to 2.6.16.18 today (I'd already tried 2.6.16.17, with no
improvement).   The problem is solved (although it beats me as to why, given the
changes listed in the Changelog).   One thing slightly different in the
configuration was that I turned on IRQBALANCE (it had been on when I initially
was trying to get things to work).
