Bug 31872
Summary: | boot panic unless acpi=off, Thread overran stack, or stack corrupted - Toshiba Satellite/mobile P4 | ||
---|---|---|---|
Product: | ACPI | Reporter: | Pascal Dormeau (pdormeau) |
Component: | ACPICA-Core | Assignee: | Rafael J. Wysocki (rjw) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | florian, lenb, maciej.rutecki, rjw |
Priority: | P1 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.38 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 27352 | ||
Attachments:
lspci -vvv
/proc/cpuinfo
picture of boot message
picture of boot message
full boot sequence
config of failing kernel
2.6.38 dmesg with bf325f9538d8c89312be305b9779edbcb436af00 reverted
output of acpidump
Debug early registration of power resources
ACPI: Avoid infinite recurrence in registering power resources
ACPI: Avoid infinite recurrence in registering power resources (v2)
Created attachment 51982 [details]
/proc/cpuinfo
Created attachment 51992 [details]
picture of boot message
Created attachment 52002 [details]
picture of boot message
Created attachment 52012 [details]
full boot sequence
I captured the boot messages up to the crash with a camera, using the boot_delay option. The relevant messages may be those in the two attached pictures (I am not sure). I also linked a tarball with pictures of the whole boot sequence (just wget http://dl.free.fr/nbej8o6bE should do it). I stopped capturing boot messages once they seemed to repeat endlessly, but I can provide more if needed. Please ask me if you need more information.
Regards,
Pascal Dormeau

Please confirm that this fails with unmodified kernel.org 2.6.38, and that it does not fail with the kernel.org 2.6.37.stable (now 2.6.37.6).
Can you bisect which change between 2.6.37 and 2.6.38-rc6 causes the failure, or at least try the rc's, such as -rc1?

Please attach the .config for the failing kernel, in the hope that the failure can be reproduced on an additional machine.

Created attachment 52512 [details]
config of failing kernel
Thanks, I will do both (confirm which unmodified kernel.org version fails, and bisect) and report back when done. In the meantime, please find attached the config of the failing kernel (sorry for forgetting this one).
Regards

FWIW, I seriously doubt this is an ACPICA problem. It rather looks like this is related to interrupts (I/O APIC or LAPIC issue perhaps).

Created attachment 53002 [details]
2.6.38 dmesg with bf325f9538d8c89312be305b9779edbcb436af00 reverted
Hello,
I tested kernels from kernel.org:
v2.6.37.6 -> OK
v2.6.38.1 -> failed
v2.6.38-rc1 -> failed
With git bisect I could isolate the commit that causes the crash:
commit bf325f9538d8c89312be305b9779edbcb436af00
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date: Thu Nov 25 00:10:44 2010 +0100

    ACPI / PM: Register power resource devices as soon as they are needed

    Depending on the organization of the ACPI namespace, power resource
    device objects may generally be scanned after the "regular" device
    objects that they are referred from through _PRn. This, in turn, may
    cause acpi_bus_get_power_flags() to attempt to access them through
    acpi_bus_init_power() before they are registered (and initialized by
    acpi_power_driver). [This is not a theoretical issue, it actually
    happens for one PnP device on my testbed HP nx6325.]

    To fix this problem, make acpi_bus_get_power_flags() attempt to
    register power resource devices as soon as they have been found in
    the _PRn output for any other devices.

    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Signed-off-by: Len Brown <len.brown@intel.com>

I built and ran a 2.6.38 kernel with the above commit reverted. It boots with no problem (with ACPI support). The corresponding dmesg is attached.
Best regards,
Pascal Dormeau

Please attach the output of acpidump from your machine.

Created attachment 54612 [details]
output of acpidump
Please find attached the output of the acpidump command.
Best regards,
Pascal Dormeau
Created attachment 55262 [details]
Debug early registration of power resources
Please apply think patch and see if the crash happens. If it doesn't, please
attach the dmesg output.
Sorry, the "think" above should be "this".

Created attachment 55272 [details]
ACPI: Avoid infinite recurrence in registering power resources
Well, I think I know what the problem is.
In your DSDT the _PR0 object of power resource PUT2 points back to this
power resource. In consequence, while registering PUT2
acpi_bus_get_power_flags() sees that it depends on PUT2 and tries to
register it again, which leads to an infinitely deep recurrence.
The attached patch should work around this issue. If it does, please
disregard the two previous comments.
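
The self-reference described above can be reproduced outside the kernel. The following is a minimal sketch, not the attached patch: `struct power_resource`, `register_resource()` and `demo_self_reference()` are hypothetical names invented for illustration, and the "already registering" flag is only one assumed way of breaking the cycle. It models a resource whose dependency list names itself, as PUT2's _PR0 does, and shows that a guard lets registration terminate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical stand-in for an ACPI power resource; not a kernel structure. */
struct power_resource {
    const char *name;
    struct power_resource *deps[MAX_DEPS]; /* entries of a _PR0-like list */
    size_t ndeps;
    bool registering;   /* true while this resource is being registered */
    bool registered;    /* true once registration has completed */
    int register_calls; /* how many times registration was entered */
};

/*
 * Register a resource after registering everything it depends on.
 * Returns the number of resources newly registered by this call.
 */
static int register_resource(struct power_resource *res)
{
    int count = 0;

    assert(res != NULL);
    res->register_calls++;

    /* Guard: already done, or already on the call stack (a cycle). */
    if (res->registered || res->registering)
        return 0;

    res->registering = true;
    for (size_t i = 0; i < res->ndeps; i++)
        count += register_resource(res->deps[i]);
    res->registering = false;

    res->registered = true;
    return count + 1;
}

/* Build a resource whose dependency list points back at itself. */
static int demo_self_reference(void)
{
    struct power_resource put2 = { .name = "PUT2" };

    put2.deps[0] = &put2;
    put2.ndeps = 1;

    return register_resource(&put2);
}
```

Without the `registering` check, `register_resource()` would call itself on the same object until the stack is exhausted, which matches the "Thread overran stack, or stack corrupted" symptom in the summary. With the check, `demo_self_reference()` registers the single resource once and returns.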
Hello,
The problem is fixed with the patch applied. Thanks a lot.
Should I understand that the DSDT table is buggy here (I really have no understanding of the ACPI spec)? If so, should I remove PUT2 from the Name (_PR0, Package (0x01) { PUT2 }) stanza?
Best regards,
Pascal Dormeau

Created attachment 55342 [details]
ACPI: Avoid infinite recurrence in registering power resources (v2)
Well, it shouldn't be there, but I bet your BIOS is not the only one with
a problem of this kind, so we should add a safeguard against that.
Please check if the attached patch helps too.
Hello,
The acpi-power-resources-fix.patch v2 also helps. Thanks.
Note that I tested v2 alone (not v1+v2).
Best regards,
Pascal Dormeau

(In reply to comment #18)
> Hello,
>
> The acpi-power-resources-fix.patch v2 also helps. Thanks.
>
> Note that I tested v2 alone (not v1+v2).

That was as intended. :-)

Thanks for testing, I'll submit the patch for merging shortly.
Created attachment 51972 [details]
lspci -vvv
My laptop can no longer boot with the latest 2.6.38 kernel (also confirmed with 2.6.38 rc6, rc7 and rc8) while ACPI support is enabled (booting with acpi=off lets the boot sequence complete, but much functionality is lost). A kernel panic occurs early during boot. Older kernels up to 2.6.37.2 did not trigger this bug on this laptop. The failing kernel is the official Debian kernel 2.6.38-1-686.