Bug 5518 - Reboot-through-BIOS does not mask APIC LVT interrupts - BIOS hang
Summary: Reboot-through-BIOS does not mask APIC LVT interrupts - BIOS hang
Status: CLOSED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: i386 (show other bugs)
Hardware: i386 Linux
: P2 normal
Assignee: Zwane Mwaikambo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-10-28 11:59 UTC by Boris Weissman
Modified: 2005-11-13 22:43 UTC (History)
1 user (show)

See Also:
Kernel Version: 2.6.12 (earlier?)
Tree: Mainline
Regression: ---


Attachments
Clear LVTs on reboot through BIOS (1.00 KB, patch)
2005-10-31 16:56 UTC, Zwane Mwaikambo
Details | Diff
patch-reboot_through_BIOS_does_not_mask_APIC_LVT_interrupts (1.01 KB, patch)
2005-10-31 23:53 UTC, Zwane Mwaikambo
Details | Diff

Description Boris Weissman 2005-10-28 11:59:12 UTC
Most recent kernel where this bug did not occur: 2.6.14
Distribution: Ubuntu 5.10
Hardware Environment:P4, Opteron
Software Environment:
Problem Description: 

i386 version of Reboot-through-BIOS is unsafe: it forgets to mask
APIC LVT interrupts before jumping to a BIOS entry point. As a result,
BIOS ends up bombarded with interrupts early on boot. The BIOS does
not expect it since following a "normal" hardware cpu reset, all APIC
LVT registers have the Mask bit (16) set and can't generate interrupts.

For example, the version of Phoenix BIOS used by VMware enables interrupts
for the first time before masking/clearing APIC LVT. The APIC Timer LVT
register is still set up for a timer interrupt delivery with a high 
vector from the previous Linux incarnation (0xef in our case). The BIOS
has not fully initialized its IDT at this point and the real mode gate for
0xef remains all zeros. Vector 0xef dispatches BIOS to address 0:0, BIOS
takes a #GP and eventually hangs.

machine_shutdown() does attempt to shut down APIC before jumping to
BIOS, but it is ineffective:

/*
 * If Linux enabled the LAPIC against the BIOS default
 * disable it down before re-entering the BIOS on shutdown.
 * Otherwise the BIOS may get confused and not power-off.
 */
void lapic_shutdown(void)
{
	if (!cpu_has_apic || !enabled_via_apicbase)
		return;

	local_irq_disable();
	disable_local_APIC();
	local_irq_enable();
}

In the common case when LAPIC was correctly enabled by the BIOS initially,
enabled_via_apicbase=0 and lapic_shutdown() does nothing; LVT remains
active. Ubuntu 5.1 uses reboot through BIOS by default.

One way to fix this bug is to move a call to clear_local_APIC() from 
disable_local_APIC() up to lapic_shutdown() as in (against 2.6.14):


*** arch/i386/kernel/apic.c     2005-10-27 17:02:08.000000000 -0700
--- arch/i386/kernel/apic-patched.c     2005-10-28 11:54:38.856286802 -0700
***************
*** 267,274 ****
  {
        unsigned long value;

-       clear_local_APIC();
-
        /*
         * Disable APIC (implies clearing of registers
         * for 82489DX!).
--- 267,272 ----
***************
*** 562,567 ****
--- 560,572 ----
   */
  void lapic_shutdown(void)
  {
+        /*
+         * Always clear/mask APIC LVT even when APIC remains enabled
+         * to match the hardware state following reset and avoid
+         * interrupting the BIOS when it does not expect it.
+         */
+       clear_local_APIC();
+
        if (!cpu_has_apic || !enabled_via_apicbase)
                return;
Comment 1 Andrew Morton 2005-10-28 13:00:43 UTC
Could you please test 2.6.14?
Comment 2 Zwane Mwaikambo 2005-10-30 23:45:53 UTC
If the APIC was setup by Linux cpu_has_apic should also be set so you should
fall through to disable_local_APIC, what am i missing here?
Comment 3 Boris Weissman 2005-10-31 13:46:59 UTC
In response to Comment 1 by Andrew Morton:

I have tried the fix proposed in the original filing on 2.6.14 with 
Phoenix BIOS 4.0 Release 6.0. The fix worked fine. I used these config
options to be similar to the Ubuntu 5.1 kernel where I first observed
the hang (the bug affects only APIC-enabled UP systems):

CONFIG_X86_REBOOTFIXUPS=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y

I also used a boot option "reboot=b" to force reboot through BIOS.
Just like Ubuntu 5.1, the unmodified kernel hanged at:
   md: md0 switched to read-only mode.
   Restarting system
   .

The patched kernel worked fine. 
Comment 4 Boris Weissman 2005-10-31 13:51:32 UTC
In response to Comment 2 by Zwane Mwaikambo:

cpu_has_apic=1 is insufficient to proceed to disable_local_APIC().
The problem is that if BIOS enabled APIC (as it normally would),
enabled_via_apicbase would remain 0 since it means "enabled-by-Linux".
Since enabled_via_apicbase=0, lapic_shutdown() returns immediately
and never executes disable_local_APIC():

/*
 * If Linux enabled the LAPIC against the BIOS default
 * disable it down before re-entering the BIOS on shutdown.
 * Otherwise the BIOS may get confused and not power-off.
 */
void lapic_shutdown(void)
{
	if (!cpu_has_apic || !enabled_via_apicbase)
		return;

	local_irq_disable();
	disable_local_APIC();
	local_irq_enable();
}
Comment 5 Zwane Mwaikambo 2005-10-31 16:56:43 UTC
Created attachment 6430 [details]
Clear LVTs on reboot through BIOS
Comment 6 Zwane Mwaikambo 2005-10-31 16:57:49 UTC
Sorry, i forgot the explanation. Essentially change the logic and do the
clear_local_APIC in this case we do it twice, but that's ok. But do not remove
it from disable_local_APIC as that is called from other places.
Comment 7 Boris Weissman 2005-10-31 23:10:57 UTC
Zwane,

Your patch looks good. Calling disable_local_APIC() twice is alright and
better than not calling it at all. If you don't mind, I'd restructure it a
little to get rid of goto and avoid local_irq_disable/enable when there
is nothing to do:

natasha% diff -u -p -B apic.c apic-patched.c
--- apic.c      2005-10-27 17:02:08.000000000 -0700
+++ apic-patched.c      2005-10-31 23:06:42.733651641 -0800
@@ -562,12 +562,15 @@ void __devinit setup_local_APIC(void)
  */
 void lapic_shutdown(void)
 {
-       if (!cpu_has_apic || !enabled_via_apicbase)
+       if (!cpu_has_apic)
                return;

-       local_irq_disable();
-       disable_local_APIC();
-       local_irq_enable();
+        clear_local_APIC();
+        if (enabled_via_apicbase) {
+                local_irq_disable();
+                disable_local_APIC();
+                local_irq_enable();
+        }
 }

 #ifdef CONFIG_PM

Feel free to ignore it though.

Boris
Comment 8 Zwane Mwaikambo 2005-10-31 23:53:40 UTC
Created attachment 6437 [details]
patch-reboot_through_BIOS_does_not_mask_APIC_LVT_interrupts

I don't feel comfortable calling clear_local_APIC with interrupts enabled, but
i removed the goto at least ;}
Comment 9 Boris Weissman 2005-11-02 11:00:55 UTC
Zwane, your last patch looks great. Many thanks!

Boris
Comment 10 Zwane Mwaikambo 2005-11-02 11:33:03 UTC
Thank you very much for the well presented bug report + patch, i'll make sure
this gets sent upstream.

Note You need to log in before you can comment on or make changes to this bug.