Bug 42700 - Kernel won't boot on D102GGC2 board unless ACPI=off or idle=halt
Status: CLOSED CODE_FIX
Product: ACPI
Classification: Unclassified
Component: ACPICA-Core
Hardware: i386 Linux
Importance: P1 normal
Assigned To: Len Brown
Reported: 2012-01-31 05:55 UTC by Waleed Hamra
Modified: 2012-06-18 19:28 UTC

Kernel Version: 3.*
Tree: Mainline
Regression: Yes


Attachments
dmesg output on 2.6 (50.08 KB, text/plain)
2012-01-31 05:58 UTC, Waleed Hamra
lspci -vvnn (8.86 KB, text/plain)
2012-01-31 05:59 UTC, Waleed Hamra
dmidecode (8.00 KB, text/plain)
2012-01-31 06:01 UTC, Waleed Hamra
sudo acpidump (76.64 KB, text/plain)
2012-02-14 14:28 UTC, Waleed Hamra
001-ACPICA-Fix-regression-in-FADT-revision-checks.patch (2.20 KB, patch)
2012-02-27 19:49 UTC, Len Brown

Description Waleed Hamra 2012-01-31 05:55:57 UTC
this is a 32-bit kubuntu system.
kubuntu 11.04 was still using the 2.6 kernel, and the system booted fine, up to the last kernel i tried on that release:

willy@Hamra:~$ uname -r
2.6.38-12-generic

after the upgrade to 11.10, ubuntu switched to the 3.0 kernel, which won't boot on this board unless acpi=off is specified. but this, of course, removes all ACPI functionality and disables hyper-threading, halving the number of logical CPUs.
kernels tried as shipped by ubuntu:

willy@Hamra:~$ ls -1 /boot/vmlinuz-3.0*
/boot/vmlinuz-3.0.0-13-generic
/boot/vmlinuz-3.0.0-14-generic
/boot/vmlinuz-3.0.0-15-generic
/boot/vmlinuz-3.0.0-16-generic

so i tried compiling a linux kernel from last month's 3.1 tree, to check whether this is an ubuntu issue or a kernel issue. i used a config very close to what i would normally use on my LinuxFromScratch system and ended up with the same bug. then i took the config of the 2.6.38-12-generic kernel provided by ubuntu and used it as the starting point for configuring the 3.1 kernel. i only ran the configuration program so it could remove obsolete options and add the necessary new ones, didn't modify anything myself, and compiled. again, same result: the system won't boot.

the hang happens right after these lines in the boot log:

[    0.285128] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[    0.285432] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input0
[    0.285443] ACPI: Power Button [PWRB]
[    0.285571] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
[    0.285579] ACPI: Power Button [PWRF]

after the last PWRF line the system just hangs. no panic, no flashing keyboard LEDs, and no response at all. judging by the speed of the fans, the processor doesn't seem to be stuck in an intensive infinite loop either.

normally, on a 2.6 kernel, the lines following these are:

[    0.285849] ACPI: acpi_idle registered with cpuidle
[    0.287994] ERST: Table is not found!
[    0.288199] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[    0.292073] isapnp: Scanning for PnP cards...
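
The comparison above (last line printed before the hang versus the lines a working kernel prints next) can be automated. The following is only a minimal sketch, not something used in this report: the function names are made up for illustration, and it assumes both logs use the usual "[ seconds.micros ]" dmesg timestamp prefix.

```python
import re

# Matches the "[    0.285579] " prefix that dmesg puts in front of each message.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")

def messages(log_text):
    """Return the non-empty log messages with their timestamps removed."""
    return [TIMESTAMP.sub("", line.strip())
            for line in log_text.splitlines() if line.strip()]

def messages_after_hang(hanging_log, good_log, count=3):
    """Return up to `count` messages that follow, in the good log,
    the last message that appears in the hanging log."""
    hanging = messages(hanging_log)
    if not hanging:
        return []
    last_seen = hanging[-1]
    good = messages(good_log)
    if last_seen not in good:
        return []  # the two logs never line up on that message
    start = good.index(last_seen) + 1
    return good[start:start + count]
```

Fed the two excerpts from this description, the first message returned is the "acpi_idle registered with cpuidle" line, which at least suggests looking at the idle code path first.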


hope this is enough info to be able to debug this issue. a launchpad bug report has been filed as well: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/883441

thanks in advance
Comment 1 Waleed Hamra 2012-01-31 05:58:07 UTC
Created attachment 72239 [details]
dmesg output on 2.6
Comment 2 Waleed Hamra 2012-01-31 05:59:31 UTC
Created attachment 72240 [details]
lspci -vvnn
Comment 3 Waleed Hamra 2012-01-31 06:01:00 UTC
Created attachment 72241 [details]
dmidecode
Comment 4 Zhang Rui 2012-02-02 06:30:02 UTC
what if you boot with idle=poll or idle=halt?
Comment 5 Waleed Hamra 2012-02-03 03:44:25 UTC
that was easy... 
either option works fine, and the system boots. thanks a lot for your help.
any suggestions on which is preferable?
Comment 6 Len Brown 2012-02-07 03:35:56 UTC
does booting with "intel_idle.max_cstate=0" have any effect?

(I can't imagine it could, b/c acpi=off works, but let's confirm...)

how about booting with "processor.max_cstate=1"
"processor.max_cstate=2"
"processor.max_cstate=3"
etc?

For the last one that works, please show the output from
grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
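
If the output needs post-processing, the same information can be collected from a script. This is a rough Python equivalent of the grep command, offered as a sketch only: `dump_cpuidle` is a hypothetical helper name, and the default path is simply the one from the command above.

```python
import glob
import os

def dump_cpuidle(base="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return "path:value" lines for every attribute file of each idle state,
    mimicking the output format of `grep . <base>/*/*`."""
    lines = []
    for path in sorted(glob.glob(os.path.join(base, "*", "*"))):
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                content = f.read()
        except OSError:
            continue  # skip attributes that cannot be read
        for line in content.splitlines():
            if line:  # `grep .` only prints non-empty lines
                lines.append(f"{path}:{line}")
    return lines

for entry in dump_cpuidle():
    print(entry)
```

On a machine without that directory the function just returns an empty list.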
Comment 7 Waleed Hamra 2012-02-10 10:05:49 UTC
just to be clear, you want me to try these without the "idle=" parameter?
if that's the case, none worked: they all resulted in the same hang, at the same place during the boot process.
the only difference is an additional line stating that the CPU C-state got limited to X, etc.
Comment 8 Zhang Rui 2012-02-13 04:55:56 UTC
hmm, I guess "idle=nomwait" also works for you, right?
please attach the output of acpidump for your machine.
Comment 9 Waleed Hamra 2012-02-14 14:24:45 UTC
actually no, "idle=nomwait" also results in a hang. so far, only "idle=halt" and "idle=poll" boot the system.
i just booted with idle=halt and ran acpidump; the output is in the next attachment.
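
When trying several of these options in a row, it helps to confirm which ones the running kernel was actually booted with (the real line is in /proc/cmdline). A minimal sketch follows. The helper name is hypothetical, the option names are the ones mentioned in this thread, and the sample command line in the last statement is made up for illustration.

```python
# Option names taken from this thread; the helper itself is hypothetical.
IDLE_RELATED = ("acpi", "idle", "processor.max_cstate", "intel_idle.max_cstate")

def idle_params(cmdline):
    """Return the idle-related key=value options found in a kernel command line.

    Simplification: splits on whitespace and ignores quoted values.
    """
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in IDLE_RELATED:
            found[key] = value  # a later occurrence overwrites an earlier one here
    return found

# Hypothetical command line; on a live system pass the contents of /proc/cmdline.
print(idle_params("BOOT_IMAGE=/boot/vmlinuz-3.0.0-16-generic ro quiet idle=halt"))
```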
Comment 10 Waleed Hamra 2012-02-14 14:28:58 UTC
Created attachment 72373 [details]
sudo acpidump
Comment 11 Len Brown 2012-02-27 19:49:58 UTC
Created attachment 72490 [details]
001-ACPICA-Fix-regression-in-FADT-revision-checks.patch

please test the attached patch

which fixes
https://bugzilla.redhat.com/show_bug.cgi?id=727865
Comment 12 Waleed Hamra 2012-03-25 11:00:59 UTC
sorry for this very late reply, it does actually solve the bug.
i tried with a trunk kernel pulled from git 3 weeks ago. i compiled it without the patch, to make sure the bug existed in that kernel, then compiled it again with the patch, and lo and behold, bug gone :)
Comment 13 Zhang Rui 2012-03-27 01:27:57 UTC
good. Bug closed
Comment 14 Florian Mickler 2012-04-04 14:56:20 UTC
A patch referencing this bug report has been merged in Linux v3.4-rc1:

commit 3e80acd1af40fcd91a200b0416a7616b20c5d647
Author: Julian Anastasov <ja@ssi.bg>
Date:   Thu Feb 23 22:40:43 2012 +0200

    ACPICA: Fix regression in FADT revision checks
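
For anyone who wants to see what their own firmware reports, the FADT revision and length sit in the standard ACPI table header at the start of the table. The sketch below only decodes the first three header fields (4-byte signature, 32-bit little-endian length, 1-byte revision). It does not reproduce what the patch does, and `parse_acpi_header` is a hypothetical helper name.

```python
import struct

def parse_acpi_header(data):
    """Return (signature, length, revision) from the start of an ACPI table.

    Layout assumed: 4-byte signature, little-endian 32-bit table length,
    1-byte revision (the first 9 bytes of the common table header).
    """
    if len(data) < 9:
        raise ValueError("need at least 9 bytes of table header")
    signature, length, revision = struct.unpack_from("<4sIB", data, 0)
    return signature.decode("ascii", errors="replace"), length, revision

# Possible usage on Linux (usually needs root; the FADT's signature is "FACP"):
#   with open("/sys/firmware/acpi/tables/FACP", "rb") as f:
#       print(parse_acpi_header(f.read()))
```

The same header can be read from the first line of the FACP section in acpidump output.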
Comment 15 ichise82 2012-04-27 08:29:31 UTC
Hello,
this is my first post in this tracking system.

Sadly I have to say that the patch above does NOT fix the problem for me. I tried several kernels, but none from the 3.x branch works, patched or unpatched, unless I boot with acpi=off.

Here are the kernels I tried:
- kernel 2.6.32 --> OK
- kernel 3.0.0 patched/unpatched --> ERROR
- kernel vanilla 3.2 patched/unpatched --> ERROR

There is no way to boot with a kernel 3.x unless acpi=off.
Maybe there is something missing in my kernel configuration?

Let me know if you need more info or some command output.

Thanks in advance
Comment 16 Len Brown 2012-06-05 04:32:19 UTC
ichise82,
Per comment #14, the patch is included in Linux-3.4 --
does linux 3.4 work?

Which motherboard do you have, same as Waleed, or other?
Does 2.6.38 work?
Comment 17 Len Brown 2012-06-18 19:28:06 UTC
The original problem posted by the original submitter is fixed
and shipped per comment #14
So this bug is closed.

ichise82, if you have an additional problem, please
open a new bug report.
