Bug 8834 (RK886EX)

Summary: ACPI boot failure w/o CONFIG_SMP - Rocky III+ RK886EX
Product: ACPI
Reporter: Ph. Marek (philipp+kernel-bugs)
Component: Config-Processors
Assignee: ykzhao (yakui.zhao)
Status: REJECTED INSUFFICIENT_DATA
Severity: normal
CC: acpi-bugzilla
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.20.1, 2.6.22.1
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
  acpidump
  lspci with acpi=off
  dmesg with acpi=off
  dmesg with SMP
  dump tool to get bios PRT/MPS table

Description Ph. Marek 2007-08-01 03:40:32 UTC
Most recent kernel where this bug did not occur: ?
Distribution: Debian
Hardware Environment:
Software Environment: PXE-booted kernel, with some custom initrd

--- Problem Description:

Kernel boots, but hangs with a blank screen shortly after "checking if image is initramfs" (some messages scroll by), unless "acpi=off" or "acpi=ht" is given.
"acpi=force", "acpi=strict", "pci=conf1", "acpi=noirq", "pci=noacpi", 
"noapic hda=noprobe hdc=noprobe" don't help.

The initrd doesn't clear the screen; maybe it's some kind of powersaving that kicks in and kills the machine.
No NumLock, no SysRq.

--- Steps to reproduce: Booting the kernel.
Comment 1 Ph. Marek 2007-08-01 03:41:24 UTC
Created attachment 12221 [details]
acpidump
Comment 2 Ph. Marek 2007-08-01 03:41:57 UTC
Created attachment 12222 [details]
lspci with acpi=off
Comment 3 Ph. Marek 2007-08-01 03:44:17 UTC
Created attachment 12223 [details]
dmesg with acpi=off
Comment 4 Len Brown 2007-08-03 21:18:00 UTC
This one also has an MCFG, please try "pci=nommconf"
Comment 5 Ph. Marek 2007-08-05 22:24:15 UTC
(In reply to comment #4)
> This one also has an MCFG, please try "pci=nommconf"
Same problem.

I think it's something with power saving ... why should it shut off the screen?
Comment 6 Len Brown 2007-08-14 13:53:05 UTC
are there any patches in the kernel, such as the one that
checks the initrd for a DSDT override?
Comment 7 Ph. Marek 2007-08-15 00:51:11 UTC
Not that I know of - I took a clean kernel from kernel.org, extracted it, compiled and tried ... no patches.
The only thing I tried (but failed?) was to enter this device into the ACPI blacklist - so I took it out again.
Comment 8 Ph. Marek 2007-08-17 05:05:22 UTC
Inspired by the fact that Knoppix 5.1.1 boots, I tried to make my config as similar as possible ... I turned SMP on, and the machine works.
Will attach the new dmesg, in case it's interesting.
Comment 9 Ph. Marek 2007-08-17 05:05:54 UTC
Created attachment 12422 [details]
dmesg with SMP
Comment 10 Len Brown 2007-08-17 08:38:48 UTC
CONFIG_SMP adds multi-processor support and force-enables IO-APIC support.
It looks like your UP kernel already included IO-APIC support,
so that is probably not it.  (i.e. your UP kernel will probably also
boot with "acpi=off noapic".  You might try the ACPI-enabled UP kernel
with "noapic" just for grins.)

The fact that SMP support was necessary probably means the issue
is with the APIC IDs used to enumerate the processors and the IO-APIC.
Comment 11 Ph. Marek 2007-08-20 04:44:48 UTC
I have now tested a bit more using our production kernel (2.6.20.1, SMP compiled in) and "nosmp". Please note that the transcript might not be exact, as it was typed from a scratchpad.

Parameter "nosmp":
  apm: BIOS not found
  assign_int_mode: Found MSI cap
  assign_int_mode: Found MSI cap
  assign_int_mode: Found MSI cap
  ACPI: Ac Adapter [ADP1] online
30 sec pause
  ACPI: ... 
unreadable, scrolls by too fast
  ide: Assigned 33MHz system bus for PIO modes; ...
  ICH7: IDE controller at PCI 0000:00:1f.1
  ACPI: PCI Int. 0000:00:1f.1[B] ->  GSI 19 (level, low) -> IRQ 18
  ICH7: chipset rev. 2
  ICH7: not 100% compatible, ...
    ide0: BM-DMA at 0x1810-0x1817; Bios set hda:DMA, hdb:PIO
  hda: Optiarc DVD RW AD-7530, ATAPI CD/DVD-ROM
  ide0 at 0x1f0-0x1f7,0x3f6 on irq 19
5 sec pause
  ide-cd: cmd 0x5a timed out
  hda: lost interrupt
1 min pause
  ide-cd: cmd 0x5a timed out
  hda: lost interrupt
1 min pause
  hda: ATAPI 24x DVD ROM ... UDMA(33)
  Uniform CD-ROM driver rev. 3.20
  hda: lost interrupt
  ide-cd: cmd 0x3 timed out

and I killed the machine after some more timeouts.


With "nosmp hda=noprobe" I got:
  ata1: SATA max UDMA 133
  ata2: SATA max UDMA 133
  scsi0: ata_piix
  ata 1.00: ATA-7, max UDMA/100
  ata 1.00: ata1-dev 0 multcount 16
  ata 1.00: qc timeout (... 0xef)
  ata 1.00: failed to set xfermode (err_mask=0x4)
followed by some timeout (15 sec.?), then more messages and the famous black screen.

Anything else?
Comment 12 Len Brown 2007-08-20 09:28:04 UTC
Using "nosmp" was an unfortunate choice, as it is broken on ACPI systems
until the fix in bug #1641.  Note that "maxcpus=0" is the same as "nosmp".

However, you should be able to use "maxcpus=1" on your SMP kernel,
and it should boot uni-processor.  If it fails, that would be a bug.
You may also be able to re-create the failure using the SMP kernel
but by reducing CONFIG_NR_CPUS.

Also, from above, please confirm that your UP kernel,
which fails by default, boots with "acpi=off noapic"
but fails with just "noapic".

Please include the dmesg from the successful boots.
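
The relationship between "nosmp", "maxcpus=0" and "maxcpus=1" described in this comment can be written down as a small helper for labelling test boots. This is only a sketch of the equivalence as stated above, not the kernel's own command-line parser; the function name smp_mode and the labels it returns are made up for illustration.

```python
def smp_mode(cmdline: str) -> str:
    """Classify the CPU bring-up mode requested on a kernel command line.

    Labels are hypothetical, chosen for this sketch:
      "smp-disabled": "nosmp" or "maxcpus=0" (described above as equivalent)
      "single-cpu":   "maxcpus=1" (SMP kernel expected to boot uni-processor)
      "limited":      any other numeric maxcpus= value
      "default":      no such option present
    """
    nosmp = False
    maxcpus = None
    for token in cmdline.split():
        if token == "nosmp":
            nosmp = True
        elif token.startswith("maxcpus="):
            value = token.split("=", 1)[1]
            if value.isdigit():
                maxcpus = int(value)

    if nosmp or maxcpus == 0:
        return "smp-disabled"
    if maxcpus == 1:
        return "single-cpu"
    if maxcpus is not None:
        return "limited"
    return "default"
```

For example, smp_mode("acpi=force maxcpus=1") gives "single-cpu", while both smp_mode("nosmp") and smp_mode("maxcpus=0") give "smp-disabled".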
Comment 13 Len Brown 2007-08-20 11:57:24 UTC
ACPI: APIC 7F682E62, 0068 (r1 INTEL  CALISTGA  6040000 LOHR       5A)

ACPI: APIC 7F682F70, 0068 (r1 PTLTD  	 APIC    6040000  LTP        0)

ACPI: BIOS bug: multiple APIC/MADT found, using 0
ACPI: If "acpi_apic_instance=2" works better, notify linux-acpi@vger.kernel.org

Please boot with "acpi_apic_instance=2" and report if that
has any effect on the system.  A diff of the dmesg before/after
would be helpful too.

It appears the only difference is the id of the IOAPIC,
since the default for the timer on IRQ0 is edge/high anyway.

< ACPI: APIC (v001 INTEL  CALISTGA 0x06040000 LOHR 0x0000005a) @ 0x(nil)
---
> ACPI: APIC (v001 PTLTD         APIC   0x06040000  LTP 0x00000000) @ 0x(nil)
4,6c4
< ACPI: IOAPIC (id[0x01] address[0xfec00000] global_irq_base[0x0])
< ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
< ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
---
> ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
8a7,8
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
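
A comparison like the one above can be reproduced from two saved dmesg captures with a few lines of Python. This is a minimal sketch that assumes the IOAPIC and INT_SRC_OVR lines look exactly like the ones quoted in this comment; the helper names (madt_lines, madt_diff, ioapic_ids) are made up for illustration.

```python
import difflib
import re

# Matches the MADT-derived lines quoted in this comment, e.g.
# "ACPI: IOAPIC (id[0x01] address[0xfec00000] global_irq_base[0x0])"
MADT_LINE = re.compile(r"ACPI: (IOAPIC|INT_SRC_OVR) \(")
IOAPIC_ID = re.compile(r"ACPI: IOAPIC \(id\[(0x[0-9a-fA-F]+)\]")


def madt_lines(dmesg_text):
    """Return the IOAPIC / INT_SRC_OVR lines of a dmesg capture, in order."""
    return [line.strip() for line in dmesg_text.splitlines()
            if MADT_LINE.search(line)]


def madt_diff(before, after):
    """Unified diff of the MADT-related lines of two dmesg captures."""
    return list(difflib.unified_diff(
        madt_lines(before), madt_lines(after),
        fromfile="before", tofile="after", lineterm=""))


def ioapic_ids(dmesg_text):
    """Return the IO-APIC ids reported in a dmesg capture as integers."""
    return [int(value, 16) for value in IOAPIC_ID.findall(dmesg_text)]
```

Reading each capture with open(path).read() and printing "\n".join(madt_diff(before, after)) shows only the changed APIC entries, so the IOAPIC id difference is not buried in unrelated boot messages.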
Comment 14 Ph. Marek 2007-08-20 23:44:53 UTC
"acpi=off noapic" boots, as expected.
"noapic hda=noprobe hdb=noprobe" (as above) doesn't work.
"acpi_apic_instance=2" doesn't work - blanks screen.
Comment 15 Fu Michael 2007-10-10 01:36:26 UTC
Could you please try Len's suggestion in comment #12, i.e. to use maxcpus=1?

Please refer to http://bugzilla.kernel.org/show_bug.cgi?id=1641
Comment 16 Fu Michael 2007-11-12 17:30:39 UTC
Rejecting the bug due to no response from the bug reporter.
Comment 17 Ph. Marek 2007-11-13 01:06:29 UTC
Sorry .. you could have simply reminded me :-)

2.6.23, maxcpus=1, hangs with blank screen as before.
Comment 18 ykzhao 2007-12-20 23:50:04 UTC
From the log in comment #9 it seems that the system boots successfully once SMP support is added.

Could you please test the SMP kernel with the boot option "maxcpus=1"?
If it boots, please attach the output of dmesg.
If it does not boot, please take a picture of the screen when the system hangs.
Thanks.
Comment 19 ykzhao 2007-12-20 23:51:43 UTC
Created attachment 14139 [details]
dump tool to get bios PRT/MPS table

Could you please use the attached tool to dump the BIOS PRT/MPS table?
Comment 20 ykzhao 2007-12-20 23:58:12 UTC
Could you please test each of the following boot options on the SMP kernel?
a. pci=noacpi
b. acpi=noirq
c. noapic
If it boots, please attach the output of dmesg.
If it does not boot, please take a picture of the screen when the system hangs.
Thanks.
Comment 21 ykzhao 2008-01-23 21:16:48 UTC
Since there is no response from the bug reporter, the bug will be rejected. 
If the problem still exists, please reopen the bug.