Bug 8573
Summary: | ACPI Exception (battery-0216): AE_SUPPORT, Extracting _BST [20070126] - Acer Aspire 1640z | ||
---|---|---|---|
Product: | ACPI | Reporter: | jonne (bugzilla_kernel_org) |
Component: | Power-Battery | Assignee: | Alexey Starikovskiy (astarikovskiy) |
Status: | CLOSED CODE_FIX | ||
Severity: | high | CC: | acpi-bugzilla, bunk, casteyde.christian, funman, itrs.lin, Robert.Moore |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.21-1.3200.fc8 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
acpidump
dmesg.2.6.24-0.167-rc8.git4.fc9
acpidump.2.6.24-0.167-rc8.git4.fc9
patch to emit error messages for AE_SUPPORT
dmesg output of 2.6.24-custom kernel with utcopy.c patch
acpi output of 2.6.24-custom kernel with utcopy.c patch
patch vs 2.26.25-rc6 from bug 10202 |
Description
jonne
2007-06-03 11:18:22 UTC
Please attach the acpidump output. This exception occurs when evaluating the _BST method. Please recompile the kernel without the ACPI battery driver and see if there is any difference.

Created attachment 11722 [details]
acpidump
Thanks very much for your help, I really appreciate it. I installed acpidump and have now uploaded the attachment. Further, I'm going to try to compile the kernel without the ACPI battery driver like you said and will report any new findings then.

Also, I compiled kernel 2.6.21.4 without the ACPI battery driver and, well, what can I say. There was no battery exception. Most things seemed to work just fine (wireless network did not work, but I'm not an expert, there can be many reasons).

It seems the exception does not *always* occur. Yesterday I was able to boot Fedora's newest development kernel and the exception did not occur. Everything worked, except that my battery level seemed to be stuck at 2%. After a reboot several hours later I got the battery exception again. This weekend I'm going to try swapping the battery with one from somebody who owns the exact same notebook. Thanks for helping me!

Replacing the battery solved the problem. I must conclude it is a hardware failure. Sorry for the noise, the battery exception is quite a clear indication of faulty hardware, I guess. Anyway, I guess I should look up my warranty and hope for the best. Thanks for helping. I am rejecting the bug now as INVALID, because I think the battery exception message should have been good enough. (The Windows XP machine with the faulty battery ran *extremely* slowly and Acer's ePower management tool could not be started.)

Linux should not be getting a red-zone error.
This looks like a failure in our error path.
> Replacing the battery solved the problem.
Keep your old battery, we may need it for testing:-)
Just back from holidays. I did not throw the old battery away, so you can send me instructions on what to do with it if needed at some time. Thanks again, Jonne.

Reassigning to Len to see whether he wants to play with this malfunctioning battery... Thanks for the patience, Jonne.

jonne, can you reproduce this failure with 2.6.24-rc? (There have been a lot of updates to battery.c in 2.6.24.)

Hi, I just did another test. However, while writing this I see that you mention an even newer kernel version than the one I used to test. I tried Fedora's newest stable kernel:

kernel-devel-2.6.23.14-107.fc8
kernel-2.6.23.14-107.fc8
kernel-headers-2.6.23.14-107.fc8

I booted with the faulty battery and received a lot of error messages. I'll try to see whether I can find these in files somewhere, but for now I could only write some down by hand, which was a bit difficult because the screen was scrolling all the time. Here's what I could decipher:

....
BUG: Unable to handle kernel NULL pointer dereference at virtual address 000000001
....
Oops 0000 [#8] SMP
....
Call trace:
copy_process
do_filp_open
do_fork
copy_to_user
sys_clone
syscall_call
xfrm_alloc_spi

I further noticed that there were several other call traces, which were very different from the one above. Anyway, I see that you wanted me to test with a newer version, so I'll try to install that version now and send a message here later. Finally, if somebody wants to have my battery, it's fine with me, I'll send it over by mail, no problem. Regards, Jonne.

Created attachment 14568 [details]
dmesg.2.6.24-0.167-rc8.git4.fc9
Created attachment 14569 [details]
acpidump.2.6.24-0.167-rc8.git4.fc9
You guys are really funny :) I'm running the 2.6.24-0.167-rc8.git4.fc9 kernel now, from Fedora's development repository. I see it's an rc, maybe not the very latest, but you didn't specify which one either, so I hope it's okay. Anyway, ..., I could boot with the faulty battery !!! I didn't see any error message !!! Wireless is working ... everything just seems ok. However, ... ;) The dmesg output does show some errors, so I'm attaching dmesg and acpidump. Have a nice day, Jonne.

ps > battery monitor in the tray also works ;) It says I'm now at 50% and charging :P I'm going to pull the plug out now.

By the way, I remember that in the past the battery did work sometimes, with a lot of luck. I'll try to reboot another time and see what happens, maybe this is not what you expected to happen ...

I pulled the plug and now the battery monitor says 100% charged, with 0:00 hours remaining ...

dmesg shows the _same_ errors, but now they don't break the battery driver. slab shows the problem: acpi wrote past the end of an allocated buffer of 64 bytes. The size of the buffer is expected to be 4*sizeof(ACPI_OBJECT) + 1*sizeof(ACPI_OBJECT), as it is supposed to hold the PBST package of 4 integers, thus 80 bytes, not 64. There is a problem with this DSDT which might be connected: PBST references the absent Z00A object.

Acpiexec from 20080123 gives the buffer length as 96 bytes.

jonne, do you have the ability to patch the kernel?

I've tried to reproduce this here, no luck so far.

There exists the possibility that the caller of acpi_evaluate_object is passing in an acpi_buffer object with an incorrect length. In other words, the battery driver may be reporting a pre-allocated buffer of sufficient length, but the actual buffer is smaller than reported. This could result in the buffer overflow, as the ACPICA code will trust the reported length of the buffer. Alexey, please take a look at what parameters are being passed to evaluate_object. Thanks.
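The size arithmetic in the comment above can be sketched as a small user-space model. This is only an illustration: the 16-byte object size is inferred from the 80-byte figure quoted in the thread (five objects), and the names `MODEL_OBJECT_SIZE`, `model_package_size` and `model_missing_slots` are hypothetical, not ACPICA identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the size of one external ACPI object.
 * 16 bytes is inferred from "5 objects = 80 bytes" in the thread,
 * not taken from the ACPICA headers. */
enum { MODEL_OBJECT_SIZE = 16 };

/* One slot for the package object itself plus one slot per element. */
size_t model_package_size(size_t element_count)
{
    return (1 + element_count) * MODEL_OBJECT_SIZE;
}

/* Number of whole object slots by which an allocation falls short. */
size_t model_missing_slots(size_t allocated, size_t element_count)
{
    size_t needed = model_package_size(element_count);

    /* The model only deals in whole slots. */
    assert(allocated % MODEL_OBJECT_SIZE == 0);
    return allocated >= needed ? 0 : (needed - allocated) / MODEL_OBJECT_SIZE;
}
```

Under these assumptions a package of 4 integers needs 80 bytes, so the 64-byte allocation reported by slab is short by exactly one object slot.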
It is called with a NULL/ACPI_ALLOCATE buffer, so it is acpi_evaluate_object that creates the wrong size. Also, did you notice that ut_initialize_buffer may shrink a preallocated buffer?

Yes, the Buffer->Length is set to the amount of buffer actually used. It doesn't really "shrink" the buffer.

I think we will need an ACPI trace of the _BST execution to determine the reason behind both the AE_SUPPORT and the buffer problems.

Created attachment 14667 [details]
patch to emit error messages for AE_SUPPORT
This small patch to utcopy.c will emit an informative message when the AE_SUPPORT exception occurs. Please try it, it may give us some more information.
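The earlier exchange about Buffer->Length (it is set to the amount actually used, and the memory itself is not shrunk) can be illustrated with a user-space model of a {length, pointer} output buffer. Everything here is an assumption made for illustration: `struct model_buffer`, `model_prepare_buffer` and the status names are hypothetical, and a NULL pointer stands in for the "allocate for me" mode; this is not the ACPICA implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a caller-visible output buffer. */
struct model_buffer {
    size_t length;  /* in: capacity claimed by the caller; out: bytes used */
    void *pointer;  /* NULL means "allocate for me" in this model */
};

enum model_status { MODEL_OK, MODEL_OVERFLOW, MODEL_NO_MEMORY };

/* Prepare buf to receive `required` bytes (required > 0 is assumed). */
enum model_status model_prepare_buffer(struct model_buffer *buf, size_t required)
{
    assert(buf != NULL);

    if (buf->pointer == NULL) {
        /* Allocate mode: the callee picks the size. */
        buf->pointer = malloc(required);
        if (buf->pointer == NULL)
            return MODEL_NO_MEMORY;
    } else if (buf->length < required) {
        /* Caller's buffer is too small: report how much is needed. */
        buf->length = required;
        return MODEL_OVERFLOW;
    }

    /* The claimed length is trusted here; if the caller overstated it,
     * this memset would already write out of bounds. */
    memset(buf->pointer, 0, required);
    buf->length = required;  /* bytes used; the allocation is unchanged */
    return MODEL_OK;
}
```

In this model a 96-byte caller buffer asked to hold 80 bytes comes back with length 80 and the same pointer, which matches the "not really shrinking" description. In allocate mode the callee alone decides the size, so an undersized buffer there points at the size computation rather than at the caller.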
I've been working on this today. I'm following this guide: http://www.howtoforge.com/kernel_compilation_fedora

I downloaded this kernel source package: kernel-2.6.24-2.fc9.src.rpm

I could not apply the patch directly, because the patch was created against another version of utcopy.c, I guess. However, the comments in the file were unique in my version of utcopy.c, so I could easily add the two extra statements there. I'm now compiling the result - fingers crossed.

Created attachment 14766 [details]
dmesg output of 2.6.24-custom kernel with utcopy.c patch
Created attachment 14767 [details]
acpi output of 2.6.24-custom kernel with utcopy.c patch
I cannot find any "Unsupported object type" messages anywhere (I expected those due to the patch I applied). Don't know what to do next, so I'll just replace my battery again before it explodes and wait for any further suggestions ;)

Please check if this patch helps:

From: Lin Ming <ming.m.lin@intel.com>

Fix a memory overflow bug when copying NULL internal package element object to external.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/acpi/utilities/utobject.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/acpi/utilities/utobject.c
===================================================================
--- linux-2.6.orig/drivers/acpi/utilities/utobject.c
+++ linux-2.6/drivers/acpi/utilities/utobject.c
@@ -432,7 +432,7 @@ acpi_ut_get_simple_object_size(union acp
 	 * element -- which is legal)
 	 */
 	if (!internal_object) {
-		*obj_length = 0;
+		*obj_length = sizeof(union acpi_object);
 		return_ACPI_STATUS(AE_OK);
 	}

*** Bug 10202 has been marked as a duplicate of this bug. ***

Created attachment 15317 [details]
patch vs 2.26.25-rc6 from bug 10202

per bug #10132 Lin-Ming's patch in comment #26 is now upstream. Here is Alexey's patch from bug 10202, which has been applied to the ACPI tree to address this issue.

now in Linus' tree as commit b8a1bdb14940946fcf0438a6337b2a6c54294fb8

*** Bug 10339 has been marked as a duplicate of this bug. ***
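For readers following the utobject.c change quoted in this thread, here is a rough user-space model of why counting a NULL package element as zero bytes could undersize the buffer. It assumes that the copy step writes one object slot per element whether or not the element is NULL, which is consistent with the patch description but is not verified here. The 16-byte slot size and the names `MODEL_SLOT`, `model_size_pass` and `model_copy_pass_bytes` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot size, inferred from the 80-byte figure in the thread. */
enum { MODEL_SLOT = 16 };

/* Size pass: bytes to allocate for a package with n elements.
 * element_present[i] == 0 models a NULL (uninitialized) element.
 * count_null_slots == 0 models the old behaviour (*obj_length = 0),
 * count_null_slots != 0 models the patched behaviour. */
size_t model_size_pass(const unsigned char *element_present, size_t n,
                       int count_null_slots)
{
    size_t total = MODEL_SLOT;  /* the package object itself */

    assert(element_present != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (element_present[i] || count_null_slots)
            total += MODEL_SLOT;
    }
    return total;
}

/* Copy pass (assumed): every element occupies one slot in the output. */
size_t model_copy_pass_bytes(size_t n)
{
    return (1 + n) * MODEL_SLOT;
}
```

With four elements of which one is NULL, the old size pass yields 64 bytes while the copy pass writes 80, matching the 64-versus-80 discrepancy reported earlier in the thread. With NULL elements counted as a full slot, both passes agree on 80.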