Distribution: SuSE Linux Enterprise Server 8
Hardware Environment: IBM Blade Server
Software Environment:
Problem Description:

Newer blades (or newer BIOS revisions) require acpi=off to boot. This happens with the SLES8 kernel, and also with plain 2.4.23pre and 2.6.0-testN. Older blades (or older BIOS revisions) do not require any acpi boot options. Maybe these older systems are blacklisted somewhere.

I will attach various outputs from a new blade server; this one does not boot without acpi=off. IDE doesn't work: acpi=on leads to DMA timeouts. I was able to use nfsroot to gather the data.

2.4.23pre8 does not use the second CPU; this worked OK with earlier 23pre kernels.
Created attachment 1215
BladeCenter_HS20_8832-31Z.tar.gz

New blade server data; note the *acpi_off* and *acpi_on* files.
There are 2 issues here:
#1 Why doesn't ACPI mode work on this system?
#2 Should this system be blacklisted, and does the list need updating?

#1: ACPI mode

It appears that in ACPI/APIC mode the IDE interrupt is set to IO-APIC-level, whereas in non-ACPI mode it is set to IO-APIC-edge.

Please boot the system with ACPI enabled and the non-optimal "noapic" on the cmdline as an experiment, to see whether ACPI actually works and it is only the IDE interrupt setting that is causing the problems. (Please attach dmesg and /proc/interrupts.)

The dmesg for the failure shows that we set IRQ14 to level/low (mode:1/active:1), as we're hard-coded to do for all PCI link device interrupts:

    ACPI: PCI Interrupt Link [LPID] (IRQs *14)
    ACPI: PCI Interrupt Link [LPID] enabled at IRQ 14
    IOAPIC[0]: Set PCI routing entry (14-14 -> 0x99 -> IRQ 14 Mode:1 Active:1)
    00:00:0f[B] -> 14-14 -> IRQ 14

Hmmm, I've seen PCI Interrupt Link devices used for PIRQ routers in PIC mode, but I've never seen them used in APIC mode. Surely we have this case hard-coded to level/low...

The AML for this machine is quite curious. _PRT in PIC mode returns {} -- nothing. _PRT in APIC mode returns entries with PCI link devices.

The code for LPID is quite unusual:

    Method (_PRS, 0, NotSerialized)
    {
        Store (0x81, IOPT)
        Return (_CRS ())
    }

Curious to have a programmable device whose only possible setting is the current one...

#2: blacklist

Please attach the dmidecode and the dmesg from the (old) working system/BIOS.
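The Mode/Active fields in the IOAPIC routing line quoted above can be decoded mechanically when comparing the acpi_on and acpi_off logs. This is only a rough sketch: the regex is fitted to the one line shown in this report, the helper name decode_routing is made up for illustration, and the meaning of the 0 values (edge, high) is assumed to be the opposite of the 1 values described above.

```python
import re

# Per this report: Mode:1 means level-triggered, Active:1 means active-low.
# Assumption: 0 means the opposite setting in each case (edge, high).
TRIGGER = {0: "edge", 1: "level"}
POLARITY = {0: "high", 1: "low"}

# Pattern fitted to the single dmesg line quoted in this bug report.
ROUTING_RE = re.compile(
    r"IOAPIC\[\d+\]: Set PCI routing entry "
    r"\(\d+-\d+ -> 0x[0-9a-fA-F]+ -> "
    r"IRQ (?P<irq>\d+) Mode:(?P<mode>[01]) Active:(?P<active>[01])\)"
)

def decode_routing(line):
    """Return (irq, trigger, polarity) for an IOAPIC routing line, else None."""
    m = ROUTING_RE.search(line)
    if m is None:
        return None
    return (int(m["irq"]), TRIGGER[int(m["mode"])], POLARITY[int(m["active"])])

line = "IOAPIC[0]: Set PCI routing entry (14-14 -> 0x99 -> IRQ 14 Mode:1 Active:1)"
print(decode_routing(line))  # (14, 'level', 'low')
```

Running this over a full dmesg, line by line, would list every interrupt that the kernel programmed as level/low, which makes IRQ14 easy to spot.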
Please see if the (new) failing system/BIOS boots with "acpi=ht".

Yes, there is a dmi_scan.c blacklist entry that probably matches the old system:

    { force_acpi_ht, "IBM Bladecenter", {
            MATCH(DMI_BOARD_VENDOR, "IBM"),
            MATCH(DMI_BOARD_NAME, "IBM eServer BladeCenter HS20"),
            NO_MATCH, NO_MATCH }},

If it matches, you'll see in the dmesg:

    IBM Bladecenter detected: force use of acpi=ht

The DMI info you attached suggests that IBM may have renamed some fields:

    BIOS Information
        Vendor: IBM
        Version: -[BSE105DUS-1.01]-
        Release Date: 07/30/2003
    System Information
        Manufacturer: IBM
        Product Name: IBM eServer BladeCenter HS20 -[883231Z]-
    Base Board Information
        Manufacturer: IBM
        Product Name: Server Blade

#3: other comments ;-)

Lots of USB messages in the dmesg -- I guess you've got some USB debugging enabled? Is there an actual problem with USB, or just IDE?
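Returning to the blacklist entry above: here is a simplified model of why it would miss the new blade. It is a sketch, not the kernel code. The dictionary keys and helper names are invented for illustration, and it assumes each MATCH is a substring test against the named DMI field. The DMI strings are the ones quoted in this comment.

```python
# Simplified model of the "IBM Bladecenter" blacklist entry.
# Assumption: each MATCH is a substring test on one DMI field.
BLACKLIST_ENTRY = {
    "ident": "IBM Bladecenter",
    "matches": [
        ("board_vendor", "IBM"),
        ("board_name", "IBM eServer BladeCenter HS20"),
    ],
}

# DMI strings from the failing blade, as quoted in this report.
NEW_BLADE = {
    "bios_vendor": "IBM",
    "bios_version": "-[BSE105DUS-1.01]-",
    "bios_date": "07/30/2003",
    "sys_vendor": "IBM",
    "product_name": "IBM eServer BladeCenter HS20 -[883231Z]-",
    "board_vendor": "IBM",
    "board_name": "Server Blade",
}

def failed_fields(entry, dmi):
    """List the DMI fields whose required substring is absent."""
    return [field for field, sub in entry["matches"]
            if sub not in dmi.get(field, "")]

def entry_matches(entry, dmi):
    """An entry applies only if every one of its matches succeeds."""
    return not failed_fields(entry, dmi)

print(entry_matches(BLACKLIST_ENTRY, NEW_BLADE))  # False
print(failed_fields(BLACKLIST_ENTRY, NEW_BLADE))  # ['board_name']
```

Under these assumptions the board vendor still matches, but the board name "Server Blade" does not, so the force_acpi_ht callback would never run on the new BIOS.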
Looking closer at the AML for link device LPID...

_PRS returns _CRS -- the current IRQ is the only one possible.
_SRS is a no-op -- trying to set this IRQ to anything does nothing.
_CRS does actually ask an IO port (PIDP) what the IRQ# is:

    Method (_CRS, 0, NotSerialized)
    {
        Store (0x83, IOPT)   /* just a debug line: writes 0x83 to port 80 */
        Name (RRET, ResourceTemplate ()
        {
            IRQ (Level, ActiveLow, Shared) {0}
        })
        CreateWordField (RRET, 0x01, RINT)
        Store (PIDP, Local0)
        If (LEqual (Local0, 0x00))
        {
            Store (0x00, RINT)
        }
        Else
        {
            ShiftLeft (One, Local0, RINT)
        }
        Return (RRET)
    }

But _CRS is hard-coded to return (Level, ActiveLow, Shared) -- which is exactly how Linux set up the interrupt. I.e., the AML is telling us to program the IOAPIC for IDE as level triggered, not edge triggered.

Indeed, I don't know why this link device exists, except for the purpose of specifying level/low.

---

Re: the 2nd CPU. This is a dual-Xeon HT-capable system, yes? If booting with acpi=ht does not bring up all 4 logical processors, then try increasing CONFIG_NR_CPUS from 4 to 8.
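For the _CRS method quoted above, the only computed part is the IRQ mask word: RINT gets 1 shifted left by the value read from PIDP, or 0 if PIDP reads 0. A small model of that arithmetic follows. The function names are made up, and the assumption that PIDP reads 14 on this blade is inferred from the "enabled at IRQ 14" dmesg line, not read from the hardware.

```python
def crs_irq_mask(pidp):
    """Model of the RINT value that _CRS stores for a given PIDP reading."""
    if pidp == 0:
        return 0                     # no IRQ assigned
    return (1 << pidp) & 0xFFFF      # RINT is a 16-bit word field

def irqs_in_mask(mask):
    """Decode an IRQ descriptor mask word back into IRQ numbers."""
    return [irq for irq in range(16) if mask & (1 << irq)]

# Assumption: PIDP reads 14 here, as the dmesg suggests.
mask = crs_irq_mask(14)
print(hex(mask), irqs_in_mask(mask))  # 0x4000 [14]
```

Note that nothing in this computation touches the trigger or polarity flags; those come from the fixed IRQ (Level, ActiveLow, Shared) template, whatever PIDP says.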
Created attachment 1335
debug patch to hard-code PCI link on IRQ14 to edge/high

This debug patch should ignore the level/low specification for a link device on IRQ14 and hard-code it to edge/high. Please try it out on the blade server in IOAPIC mode, and attach the resulting dmesg and /proc/interrupts.

If this patch works as intended and the system functions properly, then this confirms that the issue is the IRQ14 setting resulting from this BIOS link device.
Leah tested the patch and confirmed that if Linux ignores the BIOS, then IDE works.
http://bugzilla.suse.de/show_bug.cgi?id=32567

Closing as will-not-fix, since Linux is correctly doing exactly what the BIOS asks: setting IRQ14 to level/low. The distros can add this system to their version of the "acpi=ht" blacklist until the BIOS is fixed.