Bug 8834 (RK886EX) - ACPI boot failure w/o CONFIG_SMP - Rocky III+ RK886EX
Summary: ACPI boot failure w/o CONFIG_SMP - Rocky III+ RK886EX
Status: REJECTED INSUFFICIENT_DATA
Alias: RK886EX
Product: ACPI
Classification: Unclassified
Component: Config-Processors
Hardware: All Linux
Importance: P1 normal
Assignee: ykzhao
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-01 03:40 UTC by Ph. Marek
Modified: 2008-01-23 21:16 UTC

See Also:
Kernel Version: 2.6.20.1, 2.6.22.1
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
acpidump (170.21 KB, text/plain)
2007-08-01 03:41 UTC, Ph. Marek
lspci with acpi=off (17.83 KB, text/plain)
2007-08-01 03:41 UTC, Ph. Marek
dmesg with acpi=off (6.96 KB, text/plain)
2007-08-01 03:44 UTC, Ph. Marek
dmesg with SMP (25.13 KB, text/plain)
2007-08-17 05:05 UTC, Ph. Marek
dump tool to get bios PRT/MPS table (6.33 KB, application/x-gzip)
2007-12-20 23:51 UTC, ykzhao

Description Ph. Marek 2007-08-01 03:40:32 UTC
Most recent kernel where this bug did not occur: ?
Distribution: Debian
Hardware Environment:
Software Environment: PXE-booted kernel, with some custom initrd

--- Problem Description:

Kernel boots, but hangs with a blank screen shortly after "checking if image is initramfs" (some messages scroll by), unless "acpi=off" or "acpi=ht" is given.
"acpi=force", "acpi=strict", "pci=conf1", "acpi=noirq", "pci=noacpi", 
"noapic hda=noprobe hdc=noprobe" don't help.

The initrd doesn't clear the screen; maybe it's some kind of powersaving that kicks in and kills the machine.
NumLock does not respond, and neither does SysRq.

--- Steps to reproduce: Booting the kernel.
Comment 1 Ph. Marek 2007-08-01 03:41:24 UTC
Created attachment 12221
acpidump
Comment 2 Ph. Marek 2007-08-01 03:41:57 UTC
Created attachment 12222
lspci with acpi=off
Comment 3 Ph. Marek 2007-08-01 03:44:17 UTC
Created attachment 12223
dmesg with acpi=off
Comment 4 Len Brown 2007-08-03 21:18:00 UTC
This one also has an MCFG, please try "pci=nommconf"
Comment 5 Ph. Marek 2007-08-05 22:24:15 UTC
(In reply to comment #4)
> This one also has an MCFG, please try "pci=nommconf"
Same problem.

I think it's something with power saving ... why should it shut off the screen?
Comment 6 Len Brown 2007-08-14 13:53:05 UTC
Are there any patches in the kernel, such as the one that
checks the initrd for a DSDT override?
Comment 7 Ph. Marek 2007-08-15 00:51:11 UTC
Not that I know of - I took a clean kernel from kernel.org, extracted it, compiled and tried ... no patches.
The only thing I tried (but failed?) was to enter this device into the ACPI blacklist - so I took it out again.
Comment 8 Ph. Marek 2007-08-17 05:05:22 UTC
Inspired by the fact that Knoppix 5.1.1 boots, I tried to make my config as similar to its config as possible ... I turned SMP on, and the machine works.
I will attach the new dmesg, in case it's interesting.
Comment 9 Ph. Marek 2007-08-17 05:05:54 UTC
Created attachment 12422
dmesg with SMP
Comment 10 Len Brown 2007-08-17 08:38:48 UTC
CONFIG_SMP adds multi-processor support and force-enables IO-APIC support.
It looks like your UP kernel already included IO-APIC support,
so that is probably not it.  (I.e. your UP kernel will probably also
boot with "acpi=off noapic".  You might try the ACPI-enabled UP kernel
with "noapic" just for grins.)

The fact that SMP support was necessary probably means the issue
is with the APIC IDs used to enumerate the processors and the IO-APIC.
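
For readers who want to look at those APIC IDs directly, here is a minimal Python sketch that walks a raw MADT (the ACPI table with signature "APIC") and lists its Local APIC and I/O APIC entries. It assumes the table has already been extracted to a binary blob; how that is done is not covered here. The function name parse_madt and the dictionary field names are illustrative and do not belong to any kernel or ACPI tool. The entry layouts used (type 0 = Processor Local APIC, type 1 = I/O APIC) follow the ACPI specification.

```python
import struct

# The common ACPI table header is 36 bytes. The MADT then carries a 4-byte
# local APIC address and a 4-byte flags field before its variable-length entries.
ACPI_HEADER_LEN = 36
MADT_FIXED_LEN = ACPI_HEADER_LEN + 8


def parse_madt(blob):
    """Return (local_apics, io_apics) found in a raw MADT ("APIC") table.

    Hypothetical helper for illustration; only entry types 0 and 1 are decoded.
    """
    if blob[:4] != b"APIC":
        raise ValueError("not a MADT: signature is %r" % blob[:4])
    (length,) = struct.unpack_from("<I", blob, 4)
    end = min(length, len(blob))

    local_apics, io_apics = [], []
    offset = MADT_FIXED_LEN
    while offset + 2 <= end:
        entry_type, entry_len = struct.unpack_from("<BB", blob, offset)
        if entry_len < 2 or offset + entry_len > end:
            break  # malformed entry: stop instead of looping forever
        if entry_type == 0 and entry_len >= 8:
            # Processor Local APIC: ACPI processor ID, APIC ID, flags
            acpi_id, apic_id, flags = struct.unpack_from("<BBI", blob, offset + 2)
            local_apics.append(
                {"acpi_id": acpi_id, "apic_id": apic_id, "enabled": bool(flags & 1)}
            )
        elif entry_type == 1 and entry_len >= 12:
            # I/O APIC: ID, reserved byte, MMIO address, global IRQ base
            ioapic_id, _reserved, address, gsi_base = struct.unpack_from(
                "<BBII", blob, offset + 2
            )
            io_apics.append(
                {"id": ioapic_id, "address": address, "gsi_base": gsi_base}
            )
        offset += entry_len
    return local_apics, io_apics
```

A table such as the 0x68-byte APIC tables reported later in this thread would be passed in as the bytes read from a file, e.g. parse_madt(open(path, "rb").read()), where path is wherever the extracted table was saved.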
Comment 11 Ph. Marek 2007-08-20 04:44:48 UTC
I have now tested a bit more using our production kernel (2.6.20.1, SMP compiled in) with "nosmp". Please note that the transcript might not be exact, as it is typed from a scratchpad.

Parameter "nosmp":
  apm: BIOS not found
  assign_int_mode: Found MSI cap
  assign_int_mode: Found MSI cap
  assign_int_mode: Found MSI cap
  ACPI: Ac Adapter [ADP1] online
30 sec pause
  ACPI: ... 
unreadable, scrolls by too fast
  ide: Assigned 33MHz system bus for PIO modes; ...
  ICH7: IDE controller at PCI 0000:00:1f.1
  ACPI: PCI Int. 0000:00:1f.1[B] ->  GSI 19 (level, low) -> IRQ 18
  ICH7: chipset rev. 2
  ICH7: not 100% compatible, ...
    ide0: BM-DMA at 0x1810-0x1817; Bios set hda:DMA, hdb:PIO
  hda: Optiarc DVD RW AD-7530, ATAPI CD/DVD-ROM
  ide0 at 0x1f0-0x1f7,0x3f6 on irq 19
5 sec pause
  ide-cd: cmd 0x5a timed out
  hda: lost interrupt
1 min pause
  ide-cd: cmd 0x5a timed out
  hda: lost interrupt
1 min pause
  hda: ATAPI 24x DVD ROM ... UDMA(33)
  Uniform CD-ROM driver rev. 3.20
  hda: lost interrupt
  ide-cd: cmd 0x3 timed out

and I killed the machine after some more timeouts.


With "nosmp hda=noprobe" I got:
  ata1: SATA max UDMA 133
  ata2: SATA max UDMA 133
  scsi0: ata_piix
  ata 1.00: ATA-7, max UDMA/100
  ata 1.00: ata1-dev 0 multcount 16
  ata 1.00: qc timeout (... 0xef)
  ata 1.00: failed to set xfermode (err_mask=0x4)
followed by some timeout (15 sec.?), then more messages and the famous black screen.

Anything else?
Comment 12 Len Brown 2007-08-20 09:28:04 UTC
Using "nosmp" was an unfortunate choice, as it is broken on ACPI systems
until the fix in bug #1641.  Note that "maxcpus=0" is the same as "nosmp".

However, you should be able to use "maxcpus=1" on your SMP kernel,
and it should boot uni-processor.  If it fails, that would be a bug.
You may also be able to re-create the failure using the SMP kernel
but by reducing CONFIG_NR_CPUS.

Also, from above, please confirm that your UP kernel
that fails by default boots with "acpi=off noapic",
but fails with just "noapic".

Please include the dmesg from the successful boots.
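
The relationship between "nosmp" and "maxcpus=" described above can be sketched as a small Python helper that classifies a kernel command line. This is only an illustration of the statements in this comment, not kernel code: the function names are made up, and the rule that the last matching option wins is an assumption.

```python
def cpu_limit_from_cmdline(cmdline):
    """Return the CPU limit requested on a kernel command line, or None.

    "nosmp" is treated as "maxcpus=0", as stated in the comment above.
    Assumption: if several options are given, the last one wins.
    """
    limit = None
    for token in cmdline.split():
        if token == "nosmp":
            limit = 0
        elif token.startswith("maxcpus="):
            try:
                limit = int(token[len("maxcpus="):])
            except ValueError:
                pass  # ignore a malformed value in this sketch
    return limit


def boot_mode(cmdline):
    """Describe what the thread says to expect for this command line."""
    limit = cpu_limit_from_cmdline(cmdline)
    if limit is None:
        return "no CPU limit requested"
    if limit == 0:
        return "nosmp/maxcpus=0: reported broken on ACPI systems until the fix in bug #1641"
    if limit == 1:
        return "maxcpus=1: should boot uni-processor on an SMP kernel"
    return "maxcpus=%d" % limit
```

On a running system the input string could be the contents of /proc/cmdline.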
Comment 13 Len Brown 2007-08-20 11:57:24 UTC
ACPI: APIC 7F682E62, 0068 (r1 INTEL  CALISTGA  6040000 LOHR       5A)

ACPI: APIC 7F682F70, 0068 (r1 PTLTD  	 APIC    6040000  LTP        0)

ACPI: BIOS bug: multiple APIC/MADT found, using 0
ACPI: If "acpi_apic_instance=2" works better, notify linux-acpi@vger.kernel.org

Please boot with "acpi_apic_instance=2" and report if that
has any effect on the system.  A diff of the dmesg before/after
would be helpful too.

It appears the only difference is the id of the IOAPIC,
since the default for the timer on IRQ0 is edge/high anyway.

< ACPI: APIC (v001 INTEL  CALISTGA 0x06040000 LOHR 0x0000005a) @ 0x(nil)
---
> ACPI: APIC (v001 PTLTD         APIC   0x06040000  LTP 0x00000000) @ 0x(nil)
4,6c4
< ACPI: IOAPIC (id[0x01] address[0xfec00000] global_irq_base[0x0])
< ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
< ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
---
> ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
8a7,8
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
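
A comparison like the one above can also be done programmatically. The following is a rough Python sketch that pulls the IOAPIC and INT_SRC_OVR lines out of two dmesg captures and reports what differs. The regular expressions are written against the line formats quoted in this comment and may not match other kernel versions; the function names madt_summary and differences are illustrative.

```python
import re

# Patterns modelled on the dmesg lines quoted in this comment.
IOAPIC_RE = re.compile(
    r"ACPI: IOAPIC \(id\[(0x[0-9a-fA-F]+)\] address\[(0x[0-9a-fA-F]+)\] "
    r"global_irq_base\[(0x[0-9a-fA-F]+)\]\)"
)
OVERRIDE_RE = re.compile(
    r"ACPI: INT_SRC_OVR \(bus (\d+) bus_irq (\d+) global_irq (\d+) (\w+) (\w+)\)"
)


def madt_summary(dmesg_text):
    """Collect I/O APIC entries and interrupt source overrides from dmesg text."""
    ioapics = [
        tuple(int(value, 16) for value in match.groups())
        for match in IOAPIC_RE.finditer(dmesg_text)
    ]
    overrides = {}
    for match in OVERRIDE_RE.finditer(dmesg_text):
        bus, bus_irq, gsi, polarity, trigger = match.groups()
        overrides[(int(bus), int(bus_irq))] = (int(gsi), polarity, trigger)
    return {"ioapics": ioapics, "overrides": overrides}


def differences(first, second):
    """Return a dict of the items that differ between two summaries."""
    changed = {}
    if first["ioapics"] != second["ioapics"]:
        changed["ioapics"] = (first["ioapics"], second["ioapics"])
    for key in sorted(set(first["overrides"]) | set(second["overrides"])):
        a = first["overrides"].get(key)
        b = second["overrides"].get(key)
        if a != b:
            changed[key] = (a, b)
    return changed
```

Fed with the two sets of lines shown in the diff, this reports the I/O APIC id and the textual polarity/trigger of the IRQ 0 override as the only differences, which matches the observation above.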
Comment 14 Ph. Marek 2007-08-20 23:44:53 UTC
"acpi=off noapic" boots, as expected.
"noapic hda=noprobe hdb=noprobe" (as above) doesn't work.
"acpi_apic_instance=2" doesn't work - blanks screen.
Comment 15 Fu Michael 2007-10-10 01:36:26 UTC
Could you please try Len's suggestion in comment #12, i.e. to use maxcpus=1?

Please refer to http://bugzilla.kernel.org/show_bug.cgi?id=1641.
Comment 16 Fu Michael 2007-11-12 17:30:39 UTC
Rejecting the bug due to no response from the bug reporter.
Comment 17 Ph. Marek 2007-11-13 01:06:29 UTC
Sorry .. you could have simply reminded me :-)

2.6.23, maxcpus=1, hangs with blank screen as before.
Comment 18 ykzhao 2007-12-20 23:50:04 UTC
From the log in comment #9, it seems that the system boots successfully once SMP support is added.

Could you please test the SMP kernel with the boot option "maxcpus=1"?
If it boots, please attach the output of dmesg.
If it does not boot, please capture a picture of the screen when the system hangs.
Thanks.
Comment 19 ykzhao 2007-12-20 23:51:43 UTC
Created attachment 14139
dump tool to get bios PRT/MPS table

Could you please use the attached tool to dump the BIOS PRT/MPS table?
Comment 20 ykzhao 2007-12-20 23:58:12 UTC
Could you please test the following boot options on the SMP kernel?
a. pci=noacpi
b. acpi=noirq
c. noapic
If it boots, please attach the output of dmesg.
If it does not boot, please capture a picture of the screen when the system hangs.
Thanks.
Comment 21 ykzhao 2008-01-23 21:16:48 UTC
Since there is no response from the bug reporter, the bug will be rejected. 
If the problem still exists, please reopen the bug.
