Bug 12122 - sync_core crashes on 486 class CPUs
Summary: sync_core crashes on 486 class CPUs
Status: CLOSED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: i386 (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: platform_i386
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-11-29 08:29 UTC by Michael Karcher
Modified: 2012-07-01 09:47 UTC (History)
3 users (show)

See Also:
Kernel Version: 2.6.26
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Proposed patch (3.71 KB, patch)
2008-12-01 09:32 UTC, H. Peter Anvin
Details | Diff

Description Michael Karcher 2008-11-29 08:29:37 UTC
Distribution: Debian etchnhalf
Hardware Environment: Cyrix 6x86 processor
Software Environment: Linux 2.6.24 as provided by debian
Problem Description:
The debian provided 2.6.24 kernel crashes on 6x86 during boot with an invalid opcode exception. The cause is that CONFIG_PARAVIRT is enabled, which causes the kernel to patch itself as soon as it detects it runs on bare hardware. The kernel patch function text_poke runs sync_core() after patching, which contains a CPUID instruction, as that is serializing. The 6x86 processor does not support this instruction unless explicitly enabled, so an invalid opcode exception is reported.

Steps to reproduce:
Compile a kernel with paravirt support and run it on a system with a non-CPUID-capable processor.

Suggested fix:
As sync_core is only used after microcode update or code patching, performance of sync_core doesn't seem critical, so add a test whether CPUID is supported to sync_core (or alternative.c instead). CPUs that might need this strong kind of serialization should all support CPUID.
Comment 1 Alan 2008-12-01 08:41:27 UTC
Patching it with a lock addl $0, (%esp) ought to do the trick for the 6x86 and other older processors.
Comment 2 H. Peter Anvin 2008-12-01 09:24:41 UTC
Actually, the IA32 manual, volume 3A, section 7.1.3 implies that replacing it with a forward JMP (EB 00) might be better; that is also consistent with what the early boot code does.

In fact, I don't see why we can't use the forward jump unconditionally -- it is known to work on all processors including the new ones.  Neither the JMP nor CPUID will synchronize other cores, so it should be equivalent in that way.

The problem with the current code is that sync_core() does two things which have separate rules in x86: text synchronization, and pipeline serialization (the latter appears to only be due to an erratum in the microcode patching code.)

I'm thinking of splitting this into two functions, sync_text() and serialize().
Comment 3 H. Peter Anvin 2008-12-01 09:32:11 UTC
Created attachment 19092 [details]
Proposed patch

Could you please try this patch?
Comment 4 Michael Karcher 2009-05-02 23:19:31 UTC
I backported the fix to 2.6.24 (for debian etchnhalf), and it works fine.
I also backported it to 2.6.26 (debian lenny) and started to upgrade the machine, but the hard drive died, so no results yet, but should work there too.
Comment 5 Michael Karcher 2009-05-08 21:20:33 UTC
A new HDD has been installed into that 6x86 system.

The backport of your patch does not make 2.6.26 (with debian default config) boot, because there is a new unprotected cpuid in kvm_para_available in include/asm-x86/kvm_para.h which is called by the kvm clock source driver during early setup.
Comment 6 Alan 2012-05-22 15:33:26 UTC
Kvm finally fixed in 3.4
Comment 7 Florian Mickler 2012-07-01 09:47:50 UTC
A patch referencing this bug report has been merged in Linux v3.4:

commit c3709e6734daa4d9b37fe31592ebb0eb57bae1bb
Author: Alan Cox <alan@linux.intel.com>
Date:   Mon May 14 18:41:09 2012 +0100

    x86, kvm: KVM paravirt kernels don't check for CPUID being unavailable

Note You need to log in before you can comment on or make changes to this bug.