Bug 8647

Summary: PANIC: CPU too old for this kernel. with Crusoe CPU
Product: Platform Specific/Hardware
Reporter: CHIKAMA Masaki (masaki.chikama)
Component: i386
Assignee: platform_i386
Status: CLOSED CODE_FIX
Severity: normal
CC: andi-bz, haveaniceday
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.22-rcX
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: Only require CMPXCHG64 with PAE

Description CHIKAMA Masaki 2007-06-17 22:57:32 UTC
Most recent kernel where this bug did not occur: 2.6.21.5
Distribution: Fedora 7

Hardware Environment: cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineTMx86
cpu family      : 6
model           : 4
model name      : Transmeta(tm) Crusoe(tm) Processor TM5600
stepping        : 3
cpu MHz         : 500.000
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr cx8 cmov mmx longrun constant_tsc up
bogomips        : 1196.55
clflush size    : 32

Software Environment:
Fedora 7 + rawhide kernel , Fedora 7 + vanilla 2.6.22-rcX kernel.

Problem Description:
My laptop with a Transmeta Crusoe CPU fails to boot with
this message: "PANIC: CPU too old for this kernel."
Rebuilding with CONFIG_M486 works fine.
I have only checked rc3 and rc5.

Steps to reproduce:
Compile with CONFIG_M686, then boot the resulting kernel.
Comment 1 Michal Piotrowski 2007-06-19 15:59:48 UTC
Please revert both patches
http://www.stardust.webpages.pl/files/patches/setup_revert/

(and mark this bug as regression)
Comment 2 CHIKAMA Masaki 2007-06-19 19:09:30 UTC
(In reply to comment #1)
Thanks for your attention.
But which tree is the patch against?

[chikama@nabal linux-2.6.22-rc5]$ patch -R -p1 < ~/boot1.patch
patching file arch/i386/boot/setup.S
[chikama@nabal linux-2.6.22-rc5]$ patch -R -p1 < ~/boot2.patch
patching file arch/i386/Kconfig.cpu
patching file arch/i386/boot/setup.S
patching file arch/i386/kernel/verify_cpu.S
Unreversed patch detected!  Ignore -R? [n]
Apply anyway? [n]
Skipping patch.
1 out of 1 hunk ignored -- saving rejects to file arch/i386/kernel/verify_cpu.S.rej
patching file include/asm-i386/cpufeature.h
patching file include/asm-i386/required-features.h
[chikama@nabal linux-2.6.22-rc5]$

May I go ahead?
Comment 3 Michal Piotrowski 2007-06-20 00:44:42 UTC
Sorry, please apply this
http://www.stardust.webpages.pl/files/patches/setup_revert/boot-fix.patch

(revert of:
commit 4c1f59d8be7e5da75d9380da23671005b363c45c
Author: Christian Volkmann <haveaniceday@cv-sv.de>
Date:   Mon May 21 14:31:48 2007 +0200

commit 39427d6e595ebee38fdd77bcf55d6b13d7a4324a
Author: Andi Kleen <ak@suse.de>
Date:   Mon May 21 14:31:50 2007 +0200

commit fd0581bbb40d8f4b0e4b3a4de2258a50df37bb57
Author: Andi Kleen <ak@suse.de>
Date:   Fri May 11 11:23:18 2007 +0200

commit c7f81c9453375d6416658995eafd3397cb9bba1d
Author: Andi Kleen <ak@suse.de>
Date:   Wed May 2 19:27:20 2007 +0200
)
Comment 4 CHIKAMA Masaki 2007-06-20 01:47:48 UTC
(In reply to comment #3)
> Sorry, please apply this
Thanks, but it seems more reverts are needed.
(patched against rc5)

  CC      arch/i386/kernel/process.o
arch/i386/kernel/process.c: In function `select_idle_routine':
arch/i386/kernel/process.c:268: error: `REQUIRED_MASK1' undeclared (first use in this function)
arch/i386/kernel/process.c:268: error: (Each undeclared identifier is reported only once
arch/i386/kernel/process.c:268: error: for each function it appears in.)
make[1]: *** [arch/i386/kernel/process.o] Error 1
make: *** [arch/i386/kernel] Error 2
Comment 5 Michal Piotrowski 2007-06-20 02:56:20 UTC
Ok, it will take more time - use git-bisect, it is the easiest way
(chapter 4 http://www.stardust.webpages.pl/files/handbook/handbook-en-0.3-rc1.pdf)
Comment 6 CHIKAMA Masaki 2007-06-21 03:47:50 UTC
(In reply to comment #5)
This is another revert that is required.

commit 3671df8572a299acff9c9cac2bf7279ee614d154
Author: Andi Kleen <ak@suse.de>
Date:   Wed May 2 19:27:20 2007 +0200
 
    [PATCH] i386: Evaluate constant cpu features at runtime

Now I have reverted 5 patches against rc5 and the machine boots.
Thanks.
Comment 7 Christian Volkmann 2007-06-21 13:15:55 UTC
I personally would set this bug as a blocker until an "all i386 CPU" regression
test is done!

I expect arch/i386/kernel/verify_cpu.S needs detection/handling for the Crusoe
processor. This newly introduced routine, verify_cpu.S, checks whether CMPXCHG64
exists for CPUs >= 586.

Andi Kleen patched verify_cpu.S to enable CMPXCHG64 via MSR for VIA. VIA has
this "feature" because of an old Win-NT "feature".

This thread contains the discussion about this issue:
 http://www.uwsg.indiana.edu/hypermail/linux/kernel/0705.2/0499.html

Hmm, I was right to expect this:
> - other X86-CPU than AMD (?) or Intel
>  => correct verify_cpu.S or set bits if required.


Could somebody decide to disable this feature before the rc phase ends?
Maybe Andi Kleen? Does somebody have a CPU test farm for proper regression
testing on all i386 CPUs? Otherwise some people might be very unhappy with 2.6.22.


Dirty patch to disable the functions of verify_cpu.S (not tested by me):
--- ./arch/i386/kernel/verify_cpu.S.orig       2007-06-21 21:48:25.887380643 +0200
+++ ./arch/i386/kernel/verify_cpu.S    2007-06-21 21:49:31.531117779 +0200
@@ -90,5 +90,5 @@

 bad:
        popfl
-       movl    $1,%eax
+       xor     %eax,%eax
        ret
Comment 8 Christian Volkmann 2007-06-21 13:33:56 UTC
I expect that the PAE bit is not set for the Crusoe:

http://www.labri.fr/perso/fleury/hacks/bug_cms/Crusoe_Exposed/download/TM5800_BIOSGuide_6-14-02.pdf

./include/asm-i386/required-features.h
=> REQUIRED_MASK1 consists of PAE (bit 6), CMOV (bit 15) and CMPXCHG64 (CX8 => bit 8)

According to the PDF linked above (chapter 1.1.3, page 12), bit 6 (PAE) is "reserved".
Maybe this hint helps Andi.
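
For illustration, the kind of required-feature check discussed in this comment can be sketched in C. This is only a minimal sketch, not the code in verify_cpu.S: the helper name missing_features and the FEATURE_* macro names are made up for this example, and the bit positions are the ones quoted above (they match the standard CPUID leaf 1 EDX layout).

```c
#include <stdint.h>

/* Bit positions in CPUID leaf 1, EDX, as quoted in the comment above.
 * The macro names are illustrative, not kernel identifiers. */
#define FEATURE_PAE   (1u << 6)
#define FEATURE_CX8   (1u << 8)
#define FEATURE_CMOV  (1u << 15)

/* Hypothetical helper: returns the subset of `required` bits that the
 * CPU does not report in `cpuid_edx`. A result of 0 means the CPU
 * passes the check. */
static uint32_t missing_features(uint32_t cpuid_edx, uint32_t required)
{
    return required & ~cpuid_edx;
}
```

The flags line in the description (fpu vme de pse tsc msr cx8 cmov mmx) corresponds to an EDX value of 0x0080813F. With that value, requiring CX8 and CMOV leaves nothing missing, while additionally requiring PAE reports bit 6 as missing. If CX8 is hidden at boot time (EDX 0x0080803F), the same check reports bit 8 as missing.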
Comment 9 Christian Volkmann 2007-06-21 14:25:56 UTC
I found another potential problem:
>
>        movl    $0x0,%eax               # See if cpuid 1 is implemented
>        cpuid
>        cmpl    $0x1,%eax
>        jb      bad                     # no cpuid 1

According to TM5800_BIOSGuide_6-14-02.pdf, page 11, %eax is not always 1.
It is 3 if the processor serial number is enabled (for TM5800).

My guess is: this detection does not work correctly if the processor serial number
is enabled. Maybe also for other processors?

I do not know i386 well enough to say that.
Comment 10 Andi Kleen 2007-06-21 15:43:20 UTC
Re #9: Hint: it's jb, not je as you seem to believe.
Re #8: The PAE bit is only in there if the kernel has been compiled with PAE.
I doubt that's the case here.
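
To spell out the jb hint: jb is an unsigned "below" branch, so the quoted check only fails when CPUID leaf 0 reports a maximum leaf smaller than 1. A C equivalent, as a sketch (the function name is made up for this example):

```c
#include <stdint.h>

/* Mirrors "cmpl $0x1,%eax; jb bad": the branch to `bad` is taken only
 * when eax is below 1 (unsigned). Returns nonzero when CPUID leaf 1 is
 * available. The function name is illustrative only. */
static int cpuid_leaf1_supported(uint32_t max_basic_leaf)
{
    return max_basic_leaf >= 1u;
}
```

A maximum leaf of 3 (serial number enabled on the TM5800) therefore passes the check just like a maximum leaf of 1. Only a value of 0 would be rejected.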

I suspect the problem is CX8 again. It probably also needs special case
code to enable it. Yes, it looks like it:

   /* Unhide possibly hidden capability flags */
        rdmsr(0x80860004, cap_mask, uk);
        wrmsr(0x80860004, ~0, uk);
        c->x86_capability[0] = cpuid_edx(0x00000001);
        wrmsr(0x80860004, cap_mask, uk);

Actually I rechecked the code now, and it looks like CX8 is only
strictly needed together with PAE. So we should make it depend
on that.
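
The idea of requiring CX8 only together with PAE can be sketched as follows. This is an assumption-laden illustration, not the attached patch: the real kernel selects the mask at build time through Kconfig and preprocessor symbols, whereas this sketch uses two hypothetical run-time flags (pae_enabled, cmov_enabled) and made-up macro names so that it can be compiled and tried in userspace.

```c
#include <stdint.h>

/* CPUID leaf 1 EDX bit positions; macro names are illustrative. */
#define FEATURE_PAE   (1u << 6)
#define FEATURE_CX8   (1u << 8)
#define FEATURE_CMOV  (1u << 15)

/* Hypothetical helper: builds the required-feature mask.
 * pae_enabled  stands in for a kernel built with PAE support.
 * cmov_enabled stands in for a CPU type option that assumes CMOV
 *              (assumed here to be the case for CONFIG_M686). */
static uint32_t required_mask(int pae_enabled, int cmov_enabled)
{
    uint32_t mask = 0;

    if (cmov_enabled)
        mask |= FEATURE_CMOV;
    if (pae_enabled)
        mask |= FEATURE_PAE | FEATURE_CX8;  /* CX8 only together with PAE */

    return mask;
}
```

With this arrangement, a non-PAE build that assumes CMOV requires only bit 15, so a CPU that hides CX8 at boot is no longer rejected.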


 
Comment 11 Andi Kleen 2007-06-21 15:44:06 UTC
Created attachment 11846 [details]
Only require CMPXCHG64 with PAE

Can you test with this patch please?
Comment 12 CHIKAMA Masaki 2007-06-21 18:40:04 UTC
(In reply to comment #11)
> Can you test with this patch please?
Patched against rc5; menuconfig fails:

arch/i386/Kconfig:543:error: found recursive dependency: HIGHMEM64G -> X86_PAE -> X86_CMPXCHG64 -> HIGHMEM64G -> I2O_EXT_ADAPTEC_DMA64
make[1]: *** [menuconfig] Error 1
make: *** [menuconfig] Error 2
Comment 13 CHIKAMA Masaki 2007-06-24 18:33:19 UTC
The Fedora rawhide kernel, which includes the latest git, works now.
Please close this bug.
Thanks all.