Bug 10339 - Kernel oops (2.6.24) while booting with battery inserted
Summary: Kernel oops (2.6.24) while booting with battery inserted
Status: CLOSED DUPLICATE of bug 8573
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Battery
Hardware: All Linux
Importance: P1 normal
Assignee: Alexey Starikovskiy
Reported: 2008-03-27 08:54 UTC by Lin Yu-Cheng
Modified: 2008-06-13 22:19 UTC

Kernel Version: 2.6.24
Regression: ---

Attachments
oops dmesg (10.08 KB, application/x-gzip)
2008-03-27 08:56 UTC, Lin Yu-Cheng
BIOS DSDT (11.58 KB, application/x-gzip)
2008-03-27 08:57 UTC, Lin Yu-Cheng
kernel config (77.88 KB, application/octet-stream)
2008-03-27 09:37 UTC, Lin Yu-Cheng
Do not use netlink in battery, but print out values that would be used (890 bytes, patch)
2008-03-27 10:16 UTC, Thomas Renninger
oops dmesg (30.81 KB, application/octet-stream)
2008-03-27 18:50 UTC, Lin Yu-Cheng
BIOS DSDT (139.67 KB, text/x-dsl)
2008-03-27 18:51 UTC, Lin Yu-Cheng
oops dmesg with initcall_debug parameter set (48.89 KB, application/octet-stream)
2008-03-27 18:52 UTC, Lin Yu-Cheng
DSDT fix (2.97 KB, text/x-patch)
2008-03-28 02:30 UTC, Lin Yu-Cheng
git commit c8d16e27a3601d1cbcdfe657eb4ff5e762019e8d (945 bytes, patch)
2008-03-28 03:33 UTC, Thomas Renninger
acpidump result (104.62 KB, text/plain)
2008-03-28 03:37 UTC, Lin Yu-Cheng
dmesg with Thomas' patch (28.31 KB, application/octet-stream)
2008-04-02 08:51 UTC, Lin Yu-Cheng
b8a1bdb14940946fcf0438a6337b2a6c54294fb8 patch (1.56 KB, patch)
2008-04-02 09:05 UTC, Alexey Starikovskiy

Description Lin Yu-Cheng 2008-03-27 08:54:57 UTC
Latest working kernel version:
Earliest failing kernel version:
Distribution: Ubuntu hardy
Hardware Environment:
Software Environment:
Problem Description:

The kernel oopses (Ubuntu 2.6.24-12) while booting with the battery
inserted.
Workarounds are booting with acpi=off, not loading battery.ko, or
removing the battery.

dmesg and dsdt as attachments.

Thanks,
Sam
Comment 1 Lin Yu-Cheng 2008-03-27 08:56:41 UTC
Created attachment 15459
oops dmesg
Comment 2 Lin Yu-Cheng 2008-03-27 08:57:10 UTC
Created attachment 15460
BIOS DSDT
Comment 3 Lin Yu-Cheng 2008-03-27 09:37:19 UTC
Created attachment 15461
kernel config
Comment 4 Robert Moore 2008-03-27 09:45:20 UTC
Please don't post these things compressed; I just want to click on the link and look at them.
Comment 5 Thomas Renninger 2008-03-27 10:16:14 UTC
Created attachment 15462
Do not use netlink in battery, but print out values that would be used

Does this work? Can you show the values that are now printed when the battery driver is loaded?

For now I do not think the bug lies in ACPI.
Does AC status (status changing on (un-)plugging) work if you do not load the battery driver?

The oops happens in net/core/skbuff.c (while sending a netlink message).
This file states:
The functions in this file will not compile correctly with gcc 2.4.x
and I have no idea why...

If we get stuck here (and this could happen rather soon, as I do not know much about netlink), we should ask Alan or whoever else is more deeply involved in netlink for help.
Comment 6 Lin Yu-Cheng 2008-03-27 18:50:28 UTC
Created attachment 15469
oops dmesg
Comment 7 Lin Yu-Cheng 2008-03-27 18:51:25 UTC
Created attachment 15470
BIOS DSDT
Comment 8 Lin Yu-Cheng 2008-03-27 18:52:19 UTC
Created attachment 15471
oops dmesg with initcall_debug parameter set
Comment 9 Lin Yu-Cheng 2008-03-28 02:30:34 UTC
Created attachment 15474
DSDT fix

This seems to be a BIOS bug (several holes in the external references).
After applying this patch and using the result to override the machine's
DSDT table, the issue is gone.
Comment 10 Thomas Renninger 2008-03-28 03:11:45 UTC
Great job!
This is a nice example of why we need DSDT overriding; I can't resist adding Len to CC...

I am going to stop here; the rest should be reproducible in userspace.

You've made three changes here (probably you fixed up the compile warnings?):
  1) Get rid of undefined Z00X variable used in battery
  2) Rename _T_0 function
  3) Return a value for _WAK func

I expect the first one is what breaks things.
The Z00X functions are defined in an external SSDT. Can you please attach the whole acpidump output (as text/plain MIME type)? acpidump also includes external SSDTs. I wonder whether these Z00x functions are really declared somewhere, or whether they are used as some kind of predefined global variables, e.g. like localX locally.
Comment 11 Thomas Renninger 2008-03-28 03:23:31 UTC
One sec..., I think this is fixed in 2.6.25-rc6.
I'll dig out the patches for testing; this should also go to 2.6.24.X after some testing time?
Comment 12 Thomas Renninger 2008-03-28 03:33:28 UTC
Created attachment 15476
git commit c8d16e27a3601d1cbcdfe657eb4ff5e762019e8d

This should fix the mem corruption.
You might still get a NULL pointer exception (not sure), if evaluate_object does not like NULL handle package elements.
Please give it a try (with the original DSDT).
Comment 13 Lin Yu-Cheng 2008-03-28 03:37:58 UTC
Created attachment 15477
acpidump result

It seems that Z003, Z004 and Z005 are only referenced in the _BIF method.
Comment 14 ykzhao 2008-03-30 19:13:12 UTC
Hi, Lin
    Thanks for the work. As you pointed out in comment #9, the bug is caused by a broken BIOS.
    Will you please try the patch in comment #12 and see whether the oops message disappears? (Please don't use the custom DSDT.)
Comment 15 Thomas Renninger 2008-03-31 05:09:49 UTC
In fact, this is not a BIOS bug, but only a kernel bug.
With a recent ACPICA compiler there is also no warning for the Z00x usage (did you see an error/warning here?):

iasl -sa dsdt.dsl 

Intel ACPI Component Architecture
ASL Optimizing Compiler version 20071019 [Nov  6 2007]
Copyright (C) 2000 - 2007 Intel Corporation
Supports ACPI Specification Revision 3.0a

dsdt.dsl   254:     Method (\_WAK, 1, NotSerialized)
Warning  1079 -                 ^ Reserved method must return a value (_WAK)

dsdt.dsl  1923:                     Method (_RMV, 0, NotSerialized)
Warning  1079 -                                ^ Reserved method must return a value (_RMV)

dsdt.dsl  4220:                                     Name (_T_0, 0x00)
Error    4081 -                         Use of reserved word ^  (_T_0)


After fixing the _T_0 error and executing the new binary DSDT in userspace I get:
acpiexec dsdt.aml
...
- execute \_SB_.BAT0._BIF
Executing \_SB_.BAT0._BIF
ACPI Error (dsobject-0208): [Z003] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (dsobject-0208): [Z004] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (dsobject-0208): [Z005] Namespace lookup failure, AE_NOT_FOUND
**** AcpiExec: Exception AE_NOT_FOUND during execution of method [_BIF] Opcode [Package] @22

**** Exception AE_NOT_FOUND during execution of method [\_SB_.BAT0._BIF] (Node 0x404d9e8)
...
ACPI Error (psparse-0626): Method parse/execution failed [\_SB_.BAT0._BIF] (Node 0x404d9e8), AE_NOT_FOUND
Execution of \_SB_.BAT0._BIF failed with status AE_NOT_FOUND

I could imagine that the NOT_FOUND case is handled more gracefully in the kernel, but reading the battery may still fail even with the fixed kernel, because the package includes a NULL handle.

Yakui, on the other hand, this should be reproducible by modifying any existing laptop DSDT and adding undefined Z00x objects to the _BIF there...
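
When repeating this check against a modified DSDT, the AE_NOT_FOUND lines in the acpiexec output can be collected mechanically. A minimal sketch, assuming the log format shown in this comment; the function name `unresolved_names` is hypothetical and not part of ACPICA:

```python
import re

# Matches lines such as:
# ACPI Error (dsobject-0208): [Z003] Namespace lookup failure, AE_NOT_FOUND
LOOKUP_FAILURE = re.compile(r"\[(\w+)\] Namespace lookup failure, AE_NOT_FOUND")


def unresolved_names(log_text):
    """Return names that failed namespace lookup, in first-seen order."""
    seen = []
    for match in LOOKUP_FAILURE.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


# Excerpt of the acpiexec output quoted in this comment.
sample = """\
ACPI Error (dsobject-0208): [Z003] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (dsobject-0208): [Z004] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (dsobject-0208): [Z005] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (psparse-0626): Method parse/execution failed [\\_SB_.BAT0._BIF] (Node 0x404d9e8), AE_NOT_FOUND
"""

print(unresolved_names(sample))  # ['Z003', 'Z004', 'Z005']
```

The psparse line is deliberately not matched, since it reports the failing method rather than a missing name.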
Comment 16 Robert Moore 2008-03-31 10:01:56 UTC
The reason there is no compilation error for the Z00x objects is that the disassembler is automatically generating external statements for the missing objects:

DefinitionBlock ("DSDT.aml", "DSDT", 1, "VIA  ", "PTL_ACPI", 0x06040000)
{
    External (Z005)
    External (Z004)
    External (Z003)
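
A rough way to spot such names in a disassembled table is to list every External declaration that has no matching definition in the same file. A minimal sketch, assuming the simple `External (NAME)` form shown above; the helper name `undefined_externals` is hypothetical, and a name reported here may still be defined in an SSDT that was not scanned:

```python
import re

# External declarations generated by the disassembler, e.g. "External (Z005)".
EXTERNAL = re.compile(r"^\s*External\s*\(\s*\\?([\w.^]+)", re.MULTILINE)
# A few common ASL constructs that define a named object.
DEFINITION = re.compile(
    r"\b(?:Name|Method|Device|OperationRegion)\s*\(\s*\\?([\w.^]+)"
)


def undefined_externals(dsl_text):
    """Return External names with no definition in the same source text."""
    declared = EXTERNAL.findall(dsl_text)
    defined = set(DEFINITION.findall(dsl_text))
    return [name for name in declared if name not in defined]


# Header of the disassembled DSDT quoted in this comment.
dsl = """\
DefinitionBlock ("DSDT.aml", "DSDT", 1, "VIA  ", "PTL_ACPI", 0x06040000)
{
    External (Z005)
    External (Z004)
    External (Z003)
"""

print(undefined_externals(dsl))  # ['Z005', 'Z004', 'Z003']
```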
Comment 17 Robert Moore 2008-04-01 14:48:01 UTC
Does Windows return the correct battery information for this machine? This information would be very useful, as we are still trying to understand the Windows behavior in these cases.

FYI, the fields in the BIF package that correspond to the unresolved externals are as follows:

- execute \_SB_.BAT0._BIF
Executing \_SB_.BAT0._BIF
ACPI Error (dsobject-0208): [Z003] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (dsobject-0208): [Z004] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (dsobject-0208): [Z005] Namespace lookup failure, AE_NOT_FOUND
[AcpiExec] Exception AE_NOT_FOUND during execution of method [_BIF] Opcode [Package] @22

**** Exception AE_NOT_FOUND during execution of method [\_SB_.BAT0._BIF] (Node 004547B8)

Method Execution Stack:
    Method [_BIF] executing: [_BIF] @00005 #0012:  Package (0x0D)
{
    One,
    Ones,
    Ones,
    One,
    Ones,
    0x012C,
    0x96,
    One,
    One,
    Z003,
    Z004,
    "LIon",
    Z005
}

Z003 -- Model Number
Z004 -- Serial Number
Z005 -- OEM Information
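
The mapping above follows the 13-element _BIF package layout from the ACPI specification. A minimal sketch of decoding such a package while tolerating unresolved elements; this is illustrative Python, not the kernel battery driver's code, and `decode_bif`, the use of `None` for an unresolved reference, and the 32-bit value for `Ones` are assumptions:

```python
# Field order of the _BIF package as listed in the ACPI specification.
BIF_FIELDS = [
    "power unit",
    "design capacity",
    "last full charge capacity",
    "battery technology",
    "design voltage",
    "design capacity of warning",
    "design capacity of low",
    "battery capacity granularity 1",
    "battery capacity granularity 2",
    "model number",
    "serial number",
    "battery type",
    "OEM information",
]
STRING_FIELDS = {"model number", "serial number", "battery type", "OEM information"}


def decode_bif(package):
    """Map a _BIF package to named fields, defaulting unresolved elements."""
    if len(package) != len(BIF_FIELDS):
        raise ValueError("expected %d elements, got %d"
                         % (len(BIF_FIELDS), len(package)))
    info = {}
    for name, value in zip(BIF_FIELDS, package):
        if value is None:  # unresolved reference such as Z003
            value = "" if name in STRING_FIELDS else 0
        info[name] = value
    return info


ONES = 0xFFFFFFFF  # assumption: "Ones" evaluated as a 32-bit integer
# The package quoted above, with Z003, Z004 and Z005 represented as None.
package = [1, ONES, ONES, 1, ONES, 0x012C, 0x96, 1, 1, None, None, "LIon", None]
info = decode_bif(package)
print(repr(info["model number"]), info["battery type"])  # '' LIon
```

Substituting an empty string is only one possible policy; rejecting the whole package is the other obvious choice, at the cost of losing the fields that did resolve.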
Comment 18 Lin Yu-Cheng 2008-04-02 08:50:02 UTC
(In reply to comment #12)
> Created an attachment (id=15476) [details]
> git commit c8d16e27a3601d1cbcdfe657eb4ff5e762019e8d
> 
> This should fix the mem corruption.
> You might still get a NULL pointer exception (not sure), if evaluate_object
> does not like NULL handle package elements.
> Please give it a try (with the original DSDT).
> 

Yes, with the fix the machine gets through booting.
Of course, the battery info still doesn't work.

 # cat /proc/acpi/battery/BAT0/*

cat: /proc/acpi/battery/BAT0/alarm: Bad address
cat: /proc/acpi/battery/BAT0/info: Bad address
cat: /proc/acpi/battery/BAT0/state: Bad address
Comment 19 Lin Yu-Cheng 2008-04-02 08:51:37 UTC
Created attachment 15570
dmesg with Thomas' patch
Comment 20 Alexey Starikovskiy 2008-04-02 09:05:16 UTC
Created attachment 15571
b8a1bdb14940946fcf0438a6337b2a6c54294fb8 patch

Please check that this patch is enough.
Comment 21 Lin Yu-Cheng 2008-04-02 22:50:53 UTC
(In reply to comment #20)
> Created an attachment (id=15571) [details]
> b8a1bdb14940946fcf0438a6337b2a6c54294fb8 patch
> 
> Please check that this patch is enough
> 

Yes, this patch works nicely.

demo@demo2-desktop:~$ cat /proc/acpi/battery/BAT0/*
alarm:                   unsupported
present:                 yes
design capacity:         2200 mAh
last full capacity:      2051 mAh
battery technology:      rechargeable
design voltage:          14800 mV
design capacity warning: 205 mAh
design capacity low:     102 mAh
capacity granularity 1:  1 mAh
capacity granularity 2:  1 mAh
model number:            
serial number:           
battery type:            LIon
OEM info:                
present:                 yes
capacity state:          ok
charging state:          charged
present rate:            0 mA
remaining capacity:      2043 mAh
present voltage:         16367 mV
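
Output in this `key: value` form is easy to post-process when comparing results before and after a patch. A minimal sketch, assuming the layout shown above; `parse_battery` and `mah` are hypothetical helpers, and the sample is an excerpt of the output in this comment:

```python
def parse_battery(text):
    """Parse 'key: value' lines; the first occurrence of a key wins."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields.setdefault(key.strip(), value.strip())
    return fields


def mah(value):
    """Extract the integer from a value such as '2200 mAh'."""
    return int(value.split()[0])


sample = """\
design capacity:         2200 mAh
last full capacity:      2051 mAh
model number:
battery type:            LIon
remaining capacity:      2043 mAh
"""

fields = parse_battery(sample)
# Last full capacity relative to design capacity.
health = mah(fields["last full capacity"]) / mah(fields["design capacity"])
print(f"{health:.1%}")  # 93.2%
```

An empty value, as for the model number here, is kept as an empty string rather than dropped, so missing and blank fields stay distinguishable.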
Comment 22 Alexey Starikovskiy 2008-04-03 09:35:43 UTC
Ok, so it is another duplicate of 8573...

*** This bug has been marked as a duplicate of bug 8573 ***
