Bug 14211

Summary: Kernel hangs on Compaq Evo N800c with ACPI enabled
Product: ACPI
Reporter: Ondrej Zary (linux)
Component: Power-Processor
Assignee: Venkatesh Pallipadi (venki)
Status: CLOSED CODE_FIX
Severity: high
CC: akpm, chepioq, lenb, rjw, rui.zhang, yakui.zhao
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.31
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 13615
Attachments: Don't disable ARB_DISABLE when the family ID is 0x0F.

Description Ondrej Zary 2009-09-22 21:13:25 UTC
Kernel 2.6.31 hangs during boot on a Compaq Evo N800c laptop when ACPI is enabled (it works with acpi=off). 2.6.30 works fine.
See https://bugzilla.redhat.com/show_bug.cgi?id=522057

Bisection resulted in commit ee1ca48fae7e575d5e399d4fdcfe0afc1212a64c:

$ git bisect bad
ee1ca48fae7e575d5e399d4fdcfe0afc1212a64c is first bad commit
commit ee1ca48fae7e575d5e399d4fdcfe0afc1212a64c
Author: Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com>
Date:   Thu May 21 17:09:10 2009 -0700

    ACPI: Disable ARB_DISABLE on platforms where it is not needed

    ARB_DISABLE is a NOP on all of the recent Intel platforms.

    For such platforms, reduce contention on c3_lock
    by skipping the fake ARB_DISABLE.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

:040000 040000 f05492fbba2c068f22d7aefca36896f07ff8611b 47e784478de3c386591a92181e7784f5d224a989 M      arch
:040000 040000 a379b031d617def0094cca3f56c2eda510a0604b cc00f83544238d1757fdf2a5e62206d1a49de2de M      drivers

Reverting this commit fixes the regression.

The CPU is a Mobile Pentium 4-M:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Mobile Intel(R) Pentium(R) 4 - M CPU 2.20GHz
stepping        : 7
cpu MHz         : 2200.000
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe up pebs bts cid
bogomips        : 4395.79
clflush size    : 64
power management:
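
For comparing reports like this one, the identifying fields can be pulled out of /proc/cpuinfo text mechanically. The following is a minimal sketch; the helper names `parse_cpuinfo` and `cpu_signature` are hypothetical, and the sample is an excerpt of the dump above.

```python
def parse_cpuinfo(text):
    """Return one dict of field name -> value per processor entry."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates processor entries.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def cpu_signature(cpu):
    """Extract (vendor, family, model, stepping) from one parsed entry."""
    return (
        cpu["vendor_id"],
        int(cpu["cpu family"]),
        int(cpu["model"]),
        int(cpu["stepping"]),
    )


# Excerpt of the /proc/cpuinfo output quoted in this description.
SAMPLE = """\
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Mobile Intel(R) Pentium(R) 4 - M CPU 2.20GHz
stepping        : 7
"""

if __name__ == "__main__":
    print(cpu_signature(parse_cpuinfo(SAMPLE)[0]))
```

On a live system the same functions could be fed the contents of /proc/cpuinfo instead of `SAMPLE`.
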
Comment 1 Rafael J. Wysocki 2009-09-22 22:14:40 UTC
First-Bad-Commit : ee1ca48fae7e575d5e399d4fdcfe0afc1212a64c
Comment 2 ykzhao 2009-09-24 07:03:08 UTC
Created attachment 23165
Don't disable ARB_DISABLE when the family ID is 0x0F.

Could you please try the debug patch and see whether the issue persists?

thanks.
Comment 3 Ondrej Zary 2009-09-24 08:14:29 UTC
It works with the patch.

But I wonder if it's correct - can the family ever be bigger than 15?

Also it seems to work with many P4 CPUs - maybe it hangs only on Mobile P4-M? Or maybe only on some specific family/model/stepping.
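
On whether the family can exceed 15: the family value comes from CPUID leaf 1, where an 8-bit extended family field is added to the 4-bit base family whenever the base family is 0xF, so values above 15 are possible. Below is a minimal sketch of that decoding, following the convention in the Intel SDM. The function name is hypothetical, and the third signature in the usage note is only illustrative; none of this is taken from the patch.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is added only when the base family is 0xF.
    if base_family == 0xF:
        family = base_family + ext_family
    else:
        family = base_family

    # The extended model supplies the high nibble for families 6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return family, model, stepping


if __name__ == "__main__":
    for signature in (0x00000F27, 0x000006FB, 0x00100F23):
        print(hex(signature), decode_cpuid_signature(signature))
```

The signature 0x00000F27 corresponds to family 15, model 2, stepping 7 as reported above, while a signature with a non-zero extended family field, such as 0x00100F23, decodes to family 16.
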
Comment 4 chepioq 2009-09-24 17:31:20 UTC
Hi,
I have the same problem on my Toshiba X200 laptop; the CPU is an Intel Core 2 Duo T7500 on the Centrino platform.
I applied the patch to a 2.6.31 kernel, but it doesn't work for me: the kernel still doesn't boot without the "acpi=off" option.
2.6.30 works fine on this laptop.
Comment 5 chepioq 2009-09-24 17:38:50 UTC
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz
stepping        : 11
cpu MHz         : 2200.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida tpr_shadow vnmi flexpriority
bogomips        : 4388.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
Comment 6 Ondrej Zary 2009-09-24 20:37:13 UTC
You have a different CPU, so this patch does not affect your machine. It's possible that you also have a different problem.
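
To illustrate why the patch cannot change anything on a Core 2, here is a rough sketch of the check. Its shape is an assumption reconstructed from the attachment title and the question in comment 3, not quoted from the patch: the vendor test, the model cutoff, both thresholds and the function name are all hypothetical.

```python
INTEL = "GenuineIntel"

# ASSUMPTION: commit ee1ca48f compared the family against 6 and the
# attachment from comment 2 raises that to 0x0F. Neither value is
# quoted in this thread.
OLD_THRESHOLD = 0x6
PATCHED_THRESHOLD = 0xF


def skips_arb_disable(vendor, family, model, family_threshold):
    """Assumed form of the check that treats ARB_DISABLE as a NOP."""
    if vendor != INTEL:
        return False
    # ASSUMPTION: newer family 6 models are also covered by a model cutoff.
    return family > family_threshold or (family == 6 and model >= 14)


pentium4m = (INTEL, 15, 2)    # reporter's CPU
core2_t7500 = (INTEL, 6, 15)  # CPU from comment 5

if __name__ == "__main__":
    for name, cpu in (("Pentium 4-M", pentium4m), ("Core 2 T7500", core2_t7500)):
        old = skips_arb_disable(*cpu, OLD_THRESHOLD)
        new = skips_arb_disable(*cpu, PATCHED_THRESHOLD)
        print(name, "old:", old, "patched:", new)
```

Under these assumptions only the family 15 CPU changes behaviour between the two thresholds; the family 6, model 15 CPU takes the same branch either way.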

Try reverting the commit I posted. Get it from gitweb http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=ee1ca48fae7e575d5e399d4fdcfe0afc1212a64c and apply with "patch -R".
Comment 7 chepioq 2009-09-25 20:04:13 UTC
I applied the patch with -R, but the result is the same: kernel 2.6.31 doesn't boot without the "acpi=off" option.
Comment 8 ykzhao 2009-09-27 05:11:39 UTC
Hi Chepioq,
    The issue on your box may be different from this bug.
    Could you please open a new bug and check whether the box boots with the following boot options?
    a. idle=poll
    b. processor.max_cstate=2
    c. nolapic_timer

Thanks.
Comment 9 Len Brown 2009-09-27 07:28:03 UTC
patch in comment #2 applied to acpi tree
Comment 11 Andrew Morton 2009-09-30 19:20:17 UTC
(In reply to comment #9)
> patch in comment #2 applied to acpi tree

Did it also get queued for 2.6.31.x?
Comment 12 Len Brown 2009-10-02 14:45:18 UTC
commit 3e2ada5867b7e9fa0b296d30fa8f3726ebd0a8b7
ACPI: fix Compaq Evo N800c (Pentium 4m) boot hang regression

shipped in linux-2.6.32-rc1 and was submitted to 2.6.31.stable
after 2.6.31.1

closed