Kernel Bug Tracker – Bug 31872
boot panic unless acpi=off, Thread overran stack, or stack corrupted - Toshiba Satellite/mobile P4
Last modified: 2011-04-30 19:53:03 UTC
Created attachment 51972 [details]
My laptop can no longer boot with the latest 2.6.38 kernel (also confirmed
with 2.6.38 rc6, rc7 and rc8) while ACPI support is enabled. Booting with acpi=off lets the boot sequence complete, but a lot of functionality is lost. A kernel panic occurs early during boot.
Older kernels up to 188.8.131.52 did not trigger this bug on that laptop.
The failing kernel is the official Debian kernel 2.6.38-1-686.
Created attachment 51982 [details]
Created attachment 51992 [details]
picture of boot message
Created attachment 52002 [details]
picture of boot message
Created attachment 52012 [details]
full boot sequence
I was able to capture the boot messages up to the crash with a camera, using the boot_delay option. The relevant messages may be those in the two attached pictures (I am not sure). I also linked a tarball with pictures of the whole boot sequence (just wget http://dl.free.fr/nbej8o6bE should do it). I stopped capturing boot messages once they seemed to repeat endlessly, but I can provide more if needed.
Please ask me if you need more information.
Please confirm that this fails with unmodified kernel.org 2.6.38,
and that it does not fail with the kernel.org 2.6.37.stable (now 184.108.40.206)
Can you bisect which change between 2.6.37 and 2.6.38-rc6 causes
the failure, or at least try the rc's, such as -rc1?
please attach the .config for the failing kernel,
in the hopes that it can be reproduced on an additional machine.
Created attachment 52512 [details]
config of failing kernel
I will do both (confirm which unmodified kernel.org version fails, and bisect) and report back when done.
In the meantime, please find attached the config of the failing kernel (sorry for forgetting it earlier).
FWIW, I seriously doubt this is an ACPICA problem. It rather looks like this
is related to interrupts (an I/O APIC or LAPIC issue, perhaps).
Created attachment 53002 [details]
2.6.38 dmesg with bf325f9538d8c89312be305b9779edbcb436af00 reverted
I tested kernels from kernel.org:
v220.127.116.11 -> OK
v18.104.22.168 -> failed
v2.6.38-rc1 -> failed
With git bisect I could isolate the commit that causes the crash:
Author: Rafael J. Wysocki <firstname.lastname@example.org>
Date: Thu Nov 25 00:10:44 2010 +0100
ACPI / PM: Register power resource devices as soon as they are needed
Depending on the organization of the ACPI namespace, power resource
device objects may generally be scanned after the "regular" device
objects that they are referred from through _PRn. This, in turn, may
cause acpi_bus_get_power_flags() to attempt to access them through
acpi_bus_init_power() before they are registered (and initialized by
acpi_power_driver). [This is not a theoretical issue, it actually
happens for one PnP device on my testbed HP nx6325.]
To fix this problem, make acpi_bus_get_power_flags() attempt to
register power resource devices as soon as they have been found in
the _PRn output for any other devices.
Signed-off-by: Rafael J. Wysocki <email@example.com>
Signed-off-by: Len Brown <firstname.lastname@example.org>
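The ordering issue described in the commit message above can be illustrated with a toy model. This is only a sketch and not kernel code: `toy_obj`, `toy_register_obj` and `toy_next_order` are hypothetical names, and the array of dependencies merely stands in for the power resources listed in _PRn. Registering an object first registers everything it depends on, so the scan order of the namespace no longer matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEPS 4

/* Hypothetical stand-in for an ACPI namespace object (not a kernel type). */
struct toy_obj {
    const char *name;
    bool registered;
    struct toy_obj *deps[TOY_MAX_DEPS]; /* objects named by _PRn */
    size_t ndeps;
    int order;                          /* registration order, 0 = not yet */
};

static int toy_next_order = 1;

/*
 * Register obj, registering its dependencies first ("as soon as they
 * are needed"). Calling it again on a registered object is a no-op.
 * Note that "registered" is only set after the dependencies are done,
 * so an object that lists itself would recurse without bound here.
 */
void toy_register_obj(struct toy_obj *obj)
{
    assert(obj != NULL);
    if (obj->registered)
        return;
    for (size_t i = 0; i < obj->ndeps; i++)
        toy_register_obj(obj->deps[i]);
    obj->registered = true;
    obj->order = toy_next_order++;
}
```

With a device that is scanned before the power resource it refers to, the resource still ends up registered first, and the later scan of the resource itself does nothing.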
I built and ran a 2.6.38 kernel with the above commit reverted. It boots without problems (with ACPI support enabled).
The corresponding dmesg is attached.
Please attach the output of acpidump from your machine.
Created attachment 54612 [details]
output of acpidump
The output of the acpidump command is attached.
Created attachment 55262 [details]
Debug early registration of power resources
Please apply think patch and see if the crash happens. If it doesn't, please
attach the dmesg output.
Sorry, the "think" above should be "this".
Created attachment 55272 [details]
ACPI: Avoid infinite recurrence in registering power resources
Well, I think I know what the problem is.
In your DSDT the _PR0 object of power resource PUT2 points back to this
power resource. As a consequence, while registering PUT2,
acpi_bus_get_power_flags() sees that it depends on PUT2 and tries to
register it again, which leads to infinitely deep recursion.
The attached patch should work around this issue. If it does, please
disregard the two previous comments.
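To make the failure mode concrete, here is a minimal sketch of a self-referencing object and one generic way to cut the cycle. It is not the attached patch and not kernel code: `toy_node`, `toy_register` and the `registering` flag are hypothetical, and an "in progress" marker is just one common guard, assumed here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a namespace object with a single _PR0 entry. */
struct toy_node {
    const char *name;
    struct toy_node *pr0;  /* object referenced by _PR0, or NULL */
    bool registering;      /* true while this node is on the call stack */
    bool registered;
    int visits;            /* how many times registration really ran */
};

/*
 * Register node, registering its _PR0 target first. Without the
 * "registering" check, a node whose _PR0 points back to itself would
 * call toy_register() on itself forever and overflow the stack.
 */
void toy_register(struct toy_node *node)
{
    assert(node != NULL);
    if (node->registered || node->registering)
        return;            /* done already, or a cycle: stop here */
    node->registering = true;
    node->visits++;
    if (node->pr0 != NULL)
        toy_register(node->pr0);
    node->registering = false;
    node->registered = true;
}
```

In this model a node that names itself in _PR0 is registered exactly once, and ordinary chains (a device pointing at a power resource) behave as before.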
The problem is fixed with the patch applied. Thanks a lot.
Should I understand from this that the DSDT is buggy here?
(I really have no understanding of the ACPI spec.)
In that case, should I remove PUT2 inside the
Name (_PR0, Package (0x01)
Created attachment 55342 [details]
ACPI: Avoid infinite recurrence in registering power resources (v2)
Well, it shouldn't be there, but I bet your BIOS is not the only one with
a problem of this kind, so we should add a safeguard against that.
Please check if the attached patch helps too.
The acpi-power-resources-fix.patch v2 also helps. Thanks.
Note that I tested v2 alone (not v1+v2).
(In reply to comment #18)
> The acpi-power-resources-fix.patch v2 also helps. Thanks.
> Note that I tested v2 alone (not v1+v2).
That was as intended. :-)
Thanks for testing, I'll submit the patch for merging shortly.
Fixed by http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7bed50c5edf5cba8dd515a31191cbfb6065ddc85 .