Kernel Bug Tracker – Bug 6980
Reading battery state causes oops, or AE_NOT_FOUND or AE_AML_OPERAND_TYPE
Last modified: 2007-03-08 00:52:50 UTC
Most recent kernel where this bug did not occur: unknown (also occurs with
Hardware Environment: 686
Software Environment: kde
Problem Description: frequent (e.g. daily) kernel oopses:
[17186277.420000] BUG: unable to handle kernel NULL pointer dereference at
virtual address 00000002
[17186277.420000] printing eip:
[17186277.420000] *pde = 00000000
[17186277.420000] Oops: 0000 [#1]
[17186277.420000] Modules linked in: binfmt_misc rfcomm l2cap bluetooth ppdev
speedstep_ich speedstep_lib cpufreq_userspace cpufreq_stats freq_table
cpufreq_powersave cpufreq_ondemand cpufreq_conservative video button battery
container ac dm_mod ipv6 af_packet md_mod ide_cs lp pcmcia 3c59x snd_intel8x0
snd_ac97_codec snd_ac97_bus yenta_socket rsrc_nonstatic mii pcmcia_core joydev
snd_pcm_oss snd_mixer_oss hw_random snd_pcm snd_timer snd parport_pc parport
psmouse soundcore snd_page_alloc floppy intel_agp agpgart shpchp pci_hotplug
serio_raw tsdev evdev usbhid ext3 jbd mbcache ide_generic uhci_hcd usbcore
ide_cd cdrom ide_disk piix generic thermal processor fan vga16fb cfbcopyarea
vgastate cfbimgblt cfbfillrect
[17186277.420000] CPU: 0
[17186277.420000] EIP: 0060:[acpi_ex_store+273/519] Not tainted VLI
[17186277.420000] EFLAGS: 00210282 (22.214.171.124 #1)
[17186277.420000] EIP is at acpi_ex_store+0x111/0x207
[17186277.420000] eax: 00000000 ebx: dfe7b8bc ecx: 00000000 edx:
[17186277.420000] esi: 00000000 edi: dfe7b754 ebp: 00000014 esp:
[17186277.420000] ds: 007b es: 007b ss: 0068
[17186277.420000] Process hald (pid: 3808, threadinfo=d9ee2000 task=defd1030)
[17186277.420000] Stack: dfe7b8bc d87ed5bc 00000000 d87ed400 c01f0eea
d87ed400 d87ed5bc 00000003
[17186277.420000] 00000000 dfe7b8bc d87ed400 00000000 c75a93a0
00000007 c01e84a8 d9ee3e60
[17186277.420000] d87ed400 00000000 d87ed5ec c01f61b3 e080bfac
00000000 00200286 c75a9288
[17186277.420000] Call Trace:
[17186277.420000] <c01f0eea> acpi_ex_opcode_2A_1T_1R+0x33b/0x368 <c01e84a8>
[17186277.420000] <c01f61b3> acpi_ps_parse_loop+0x55f/0x848 <c01f5acb>
[17186277.420000] <c01f68c8> acpi_ps_execute_pass+0x6b/0x7b <c01f698c>
[17186277.420000] <c01f3fcd> acpi_ns_evaluate_by_handle+0xd1/0x12a
[17186277.420000] <c01f38cd> acpi_evaluate_object+0x109/0x1b4 <c0170003>
[17186277.420000] <e0a5a66b> acpi_battery_read_state+0x94/0x242 [battery]
[17186277.420000] <c0180190> seq_read+0x67/0x2ba <c0180129>
[17186277.420000] <c01623fd> vfs_read+0xa2/0x158 <c0162ea0>
[17186277.420000] <c0102ceb> sysenter_past_esp+0x54/0x75
[17186277.420000] Code: 8b 43 18 8b 38 89 e2 89 f0 e8 1d 96 00 00 89 c2 85 c0
0f 85 01 01 00 00 31 f6 85 ff 75 0a eb 13 89 f8 e8 b2 97 00 00 46 8b 43 10
<0f> b7 40 02 39 c6 72 ed 8b 53 18 8b 04 24 89 02 be 01 00 00 00
[17186277.420000] EIP: [acpi_ex_store+273/519] acpi_ex_store+0x111/0x207
dmesg output of similar oops'es for other kernel versions can be found at
Looks like something is reading /proc/acpi/battery/*/state
and that is crashing the AML interpreter.
If this is specific to the battery, then it will go away
if you "rmmod battery". To make it happen sooner, you could
probably "cat /proc/acpi/battery/*/state" in a loop.
Can you reproduce this failure with 2.6.18-rc4 or later?
It would be interesting to know if this has _always_ happened --
ie. if you boot a kernel even older than 2.6.15 do you see this?
Please attach the output from acpidump, available in pmtools here:
Created attachment 8745 [details]
oops report with 2.6.18-rc4
It also happens with 2.6.18-rc4.
With 2.6.18-rc4, I get when booting:
[ 84.272000] Lukewarm IQ detected in hotplug locking
[ 84.272000] BUG: warning at kernel/cpu.c:38/lock_cpu_hotplug()
[ 84.272000] [<c0138fc5>] lock_cpu_hotplug+0x74/0x7d
[ 84.272000] [<c01306f4>] __create_workqueue+0x39/0x11d
[ 84.272000] [<e0a6b441>] cpufreq_governor_dbs+0x29d/0x2e7
[ 84.272000] [<c024540f>] __cpufreq_governor+0x3c/0xc2
[ 84.272000] [<c024565e>] __cpufreq_set_policy+0xc5/0x10c
[ 84.272000] [<c0245857>] store_scaling_governor+0xa2/0x1ba
[ 84.272000] [<c0246162>] handle_update+0x0/0x5
[ 84.272000] [<c01d3900>] kobject_set_name+0x80/0x98
[ 84.272000] [<c02457b5>] store_scaling_governor+0x0/0x1ba
[ 84.272000] [<c02452b3>] store+0x2e/0x3e
[ 84.272000] [<c01a38a3>] sysfs_write_file+0x84/0xc9
[ 84.272000] [<c01a381f>] sysfs_write_file+0x0/0xc9
[ 84.272000] [<c0167c2b>] vfs_write+0xa4/0x162
[ 84.272000] [<c0168598>] sys_write+0x41/0x6a
[ 84.272000] [<c0102dd5>] sysenter_past_esp+0x56/0x79
Not sure if this is related.
When issuing "cat /proc/acpi/battery/BAT1/state" in a loop, every once in a
while i get an error, with each time in dmesg:
ACPI Error (psargs-0355): [STAT] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (psparse-0537): Method parse/execution failed
[\_SB_.PCI0.LPCB.BAT1._BST] (Node df8c4784), AE_NOT_FOUND
ACPI Exception (acpi_battery-0206): AE_NOT_FOUND, Evaluating _BST 
and sometimes also:
ACPI Error (exstore-0296): Target is not a Reference or Constant object -
Integer [dfe7ee34] 
ACPI Error (psparse-0537): Method parse/execution failed
[\_SB_.PCI0.LPCB.BAT1._BST] (Node df8c4784), AE_AML_OPERAND_TYPE
ACPI Exception (acpi_battery-0206): AE_AML_OPERAND_TYPE, Evaluating _BST
I didn't succeed in reproducing the oops, but when running the loop long
enough I get hundreds of thousands of lines in the kernel log saying
ACPI Warning (utdelete-0397): Large Reference Count (801) in object dfe7eaec
with the count running up. This more or less completely uses up my processing
The exact command I issued was:
"for i in `seq 1 10000`; do cat /proc/acpi/battery/BAT1/state; dmesg |
tail -10; sleep 0.3; done"
Maybe 0.3 seconds sleep is too fast?
As far as I know, it didn't always happen, or at least not as frequently. It
must have started sometime with an ubuntu update installing a newer kernel.
I'll try with older versions and report.
Created attachment 8746 [details]
It would be interesting to know how often "once in a while" is. So far, I have
not been able to reproduce the problem. Of course, it will be interesting to
know if earlier versions did not show the problem.
> earlier versions ...
From above we know that this happens with
ACPICA 20050902 (Linux 2.6.15.stable)
ACPICA 20060127 (Linux 2.6.16 and Linux 2.6.17)
ACPICA 20060707 (Linux 2.6.18)
So it looks like we've been fooled by this machine for some time
and this isn't the result of a recent regression.
> [ 84.272000] Lukewarm IQ detected in hotplug locking
You can ignore this.
It isn't related and should be gone at 2.6.18.final.
> to know how often "once in a while" is. So far, I have
Sometimes 3 minutes, sometimes 15... It usually doesn't take long.
Note that until now I didn't succeed in reproducing the actual oops by doing
the 'cat /proc/acpi/.../state'. However, after a few minutes, the kernel
consistently gets into a loop flooding the logs with "(utdelete-0397): Large
Reference Count" messages. Sometimes this is preceded by some AE_NOT_FOUND and
AE_AML_OPERAND_TYPE errors, sometimes not.
> not been able to reproduce the problem.
Just spoke to a colleague of mine who has the same portable as I (Dell
Inspiron 2650): he gets the same oops'es if he enables the Gnome battery
> Of course, it will be interesting to know if earlier versions did not show
I'll report on this asap, I'm a bit short on time and I'm having troubles
compiling an older kernel which actually boots, using the .config's I have.
When booting with an Ubuntu Breezy live CD, with kernel version 2.6.12, I don't
seem to be getting any errors.
With 2.6.12, I get many of these:
[4302572.547000] read EC, IB not empty
[4302572.597000] read EC, OB not full
[4302572.597000] ACPI-0423: *** Error: Handler for [EmbeddedControl]
[4302572.597000] ACPI-0508: *** Error: Method execution failed
[\_SB_.PCI0.LPCB.BAT1._BST] (Node dfec5f20), AE_TIME
but AFAIK this was another issue which is resolved in more recent kernel versions.
I can reproduce one of the problems when I hit the method with 2 threads:
Creating 2 threads to execute 186A0 times each
ACPI Error (psargs-0459): [STAT] Namespace lookup failure, AE_NOT_FOUND
**** AcpiExec: Exception AE_NOT_FOUND during execution of method [_BST] Opcode
**** Exception AE_NOT_FOUND during execution of method
[\_SB_.PCI0.LPCB.BAT1._BST] (Node 00461298)
You should try setting the AcpiGbl_AllMethodsSerialized flag to TRUE in order
to force method serialization.
> You should try setting the AcpiGbl_AllMethodsSerialized flag to TRUE
> in order to force method serialization.
And how do I do that?
This looks like a bug in the DSDT. The _BST method is
declared "NotSerialized", yet it creates a namespace object (STAT) AND it
performs blocking I/O operations on the Embedded Controller that will
relinquish the interpreter. A second thread that attempts to reenter the
method at this time (during EC I/O) will fail, in unpredictable (timing-
Method (_BST, 0, NotSerialized)
Name (STAT, Package (0x04)
... /* EC operations to get battery status */
Changing the declaration above to:
Method (_BST, 0, Serialized)
Fixes the problem here.
You can either fix the DSDT and override the BIOS version, or set the
serialized flag in the Linux configuration, I think it's like this:
acpi_serialize = TRUE
Boot with the Linux kernel cmdline option "acpi_serialize".
Internally that sets acpi_gbl_all_methods_serialized = TRUE.
The "acpi_serialize" indeed seems to do the trick.
i didn't get a chance to test altering the DSDT yet.
The auto-serialize mechanism has been fixed for ACPICA version 20060912:
Fixed a regression where an error was no longer emitted if a control method
attempts to create 2 objects of the same name. This once again returns
AE_ALREADY_EXISTS. When this exception occurs, it invokes the mechanism that
will dynamically serialize the control method to possibly prevent future
*** Bug 7386 has been marked as a duplicate of this bug. ***
ACPICA 20060912 shipped in 2.6.21-rc1