Bug 14700 - Broken system because of a bad ACPI commit
Summary: Broken system because of a bad ACPI commit
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Processor
Hardware: All Linux
Importance: P1 normal
Assignee: acpi_power-processor
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-11-27 12:40 UTC by Petri Lehtinen
Modified: 2009-12-17 03:41 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.31
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Don't disable ARB_DISABLE when the model id is less than 0x0f (775 bytes, patch)
2009-11-30 03:25 UTC, ykzhao

Description Petri Lehtinen 2009-11-27 12:40:55 UTC
After upgrading to the 2.6.31 kernel, the system is very unstable.
Sometimes it doesn't boot. When it does, the laptop keyboard doesn't
work, and after a few minutes the system hangs with ATA errors.

I bisected this problem down to the following commit:

commit ee1ca48fae7e575d5e399d4fdcfe0afc1212a64c
Author: Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com>
Date:   Thu May 21 17:09:10 2009 -0700

    ACPI: Disable ARB_DISABLE on platforms where it is not needed

    ARB_DISABLE is a NOP on all of the recent Intel platforms.

    For such platforms, reduce contention on c3_lock
    by skipping the fake ARB_DISABLE.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

With this commit reverted, the system works fine with 2.6.31.

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 14
model name      : Intel(R) Celeron(R) M CPU        410  @ 1.46GHz
stepping        : 8
cpu MHz         : 1463.194
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx constant_tsc up arch_perfmon bts pni monitor tm2 xtpr pdcm
bogomips        : 2926.38
clflush size    : 64
power management:

If I read the code correctly, "all of the recent Intel platforms"
in the commit message seems to mean those with family == 6 and
model >= 14. My processor's model is 14, so could we have an
off-by-one error here?
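
To make the suspected boundary concrete, here is a minimal sketch of the two checks side by side. The function names are hypothetical, the real kernel check may cover other CPU families as well, and only the family/model thresholds are taken from this report and from the title of the attached patch.

```c
#include <stdbool.h>

/* Hypothetical helper names; only the thresholds come from this report. */

/* Check as read from the bisected commit: family 6 with model >= 14
 * is treated as "recent", so ARB_DISABLE is skipped. */
static bool skips_arb_disable_reported(unsigned family, unsigned model)
{
    return family == 6 && model >= 14;
}

/* Boundary implied by the attached patch title: models below 0x0f
 * keep using ARB_DISABLE, so only model >= 0x0f skips it. */
static bool skips_arb_disable_patched(unsigned family, unsigned model)
{
    return family == 6 && model >= 0x0f;
}
```

With family 6 and model 14 (the Celeron M 410 above), the first check skips ARB_DISABLE and the second one keeps it, which is the one-model difference in question.
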
Comment 1 ykzhao 2009-11-30 00:56:00 UTC
Hi,
   Could you please try each of the following boot options and check whether the box boots correctly?
   a. processor.max_cstate=1
   b. idle=poll

Thanks.
Comment 2 ykzhao 2009-11-30 03:25:47 UTC
Created attachment 23973
Don't disable ARB_DISABLE when the model id is less than 0x0f

Could you please try the latest kernel (2.6.32-rc7/rc8) and check whether the box boots correctly?

If it still doesn't boot, please try the attached debug patch and check whether the box boots with it.

Thanks.
Comment 3 Petri Lehtinen 2009-11-30 19:54:42 UTC
With both 2.6.31 and 2.6.32-rc8 I had the following results:

 no extra kernel params  --> doesn't boot
 processor.max_cstate=1  --> works OK
 idle=poll               --> works OK

I'm unable to test with the patch right now, I'll do it later.
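
One reading of these results, sketched below, is that both workarounds keep the CPU out of C3, the idle state where ARB_DISABLE comes into play. The struct and function names are hypothetical, and the sketch assumes this laptop's firmware advertises C3, which this report does not state.

```c
#include <stdbool.h>

/* Hypothetical model of the three boot configurations tested above. */
struct boot_config {
    bool idle_poll;   /* idle=poll: the CPU spins instead of entering C-states */
    int  max_cstate;  /* processor.max_cstate=N; 0 here means "not given" */
};

/* Deepest C-state the idle loop may request. hw_deepest is the deepest
 * state the firmware advertises (assumed to be 3 for this machine). */
static int deepest_cstate(struct boot_config cfg, int hw_deepest)
{
    if (cfg.idle_poll)
        return 0;                      /* never leaves the polling loop */
    if (cfg.max_cstate > 0 && cfg.max_cstate < hw_deepest)
        return cfg.max_cstate;         /* capped by the boot parameter */
    return hw_deepest;
}

/* The ARB_DISABLE decision only matters if C3 can actually be entered. */
static bool c3_path_reachable(struct boot_config cfg, int hw_deepest)
{
    return deepest_cstate(cfg, hw_deepest) >= 3;
}
```

Under that assumption, only the boot with no extra parameters reaches the C3 path, which matches the one configuration that fails.
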
Comment 4 Petri Lehtinen 2009-12-01 20:36:01 UTC
2.6.32-rc8 plus the patch works fine.

BTW, the machine in question is an Acer TravelMate 2440, just in case you need it for the commit message or something.
Comment 5 Zhang Rui 2009-12-04 02:27:37 UTC
Yakui,
Is the patch in comment #2 an acceptable solution for the upstream kernel?
If yes, please resend it to linux devel.
Comment 6 Len Brown 2009-12-16 05:42:42 UTC
commit 03a05ed1152944000151d57b71000de287a1eb02
Author: Zhao Yakui <yakui.zhao@intel.com>
Date:   Fri Dec 11 15:17:20 2009 +0800

    ACPI: Use the ARB_DISABLE for the CPU which model id is less than 0x0f.


is queued in the acpi tree for linux-2.6.33
Comment 7 Len Brown 2009-12-17 03:41:09 UTC
shipped in linux-2.6.33 before -rc1

closed.
