Bug 4665

Summary: ACPI battery monitor does not work
Product: ACPI Reporter: Rafael J. Wysocki (rjwysocki)
Component: Power-BatteryAssignee: Luming Yu (luming.yu)
Status: CLOSED CODE_FIX    
Severity: blocking CC: acpi-bugzilla, bug-track, d.bellegueulle, daniel, oryn, osdl
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.12-rc1-mm1 - -rc5-mm1, 2.6.13-rc2-mm1 and above Subsystem:
Regression: --- Bisected commit-id:
Attachments: Patch to revert the changes that introduced the problem
dmesg output from 2.6.13-rc1-mm1 w/ ACPI debug
dmesg output from 2.6.13-rc1-mm1 w/ debug patch w/ ACPI debug
acpidmp output from 2.6.13-rc1-mm1 w/ debug patch
dmesg output from 2.6.13-rc1-mm1 w/ timing patch
dmesg output from 2.6.13-rc1-mm1 w/ timing patch w/ debug
dmesg output from 2.6.13-rc1-mm1 w/ timing patch w/ debug
dmesg output from 2.6.13-rc1-mm1 w/ timing patch + debug patch
dmesg output from 2.6.13-rc1-mm1 w/ timing patch + debug patch w/o preemption
patch against the disassembly of DSDT
override DSDT
dmesg output from 2.6.13-rc1-mm1 w/ override DSDT + timing patch
override DSDT -take 2
dmesg output from 2.6.13-rc1-mm1 w/ override DSDT -take 2 + timing patch
patch for falling back to polling mode with kernel option : ec_polling
ec_polling patch against vanilla 2.6.13-rc3
patch vs 2.6.13-rc5

Description Rafael J. Wysocki 2005-05-26 08:52:29 UTC
Distribution: SUSE 9.[23] AMD64 
 
Hardware Environment: Asus L5D 
 
Software Environment: SUSE 9.[23] AMD64 
 
Problem Description: ACPI battery monitor is very slow to respond and it does 
not return any numbers, eg 
 
rafael@albercik:/proc/acpi/battery/BAT0> time cat state 
present: 
Comment 1 Rafael J. Wysocki 2005-05-26 08:57:37 UTC
Created attachment 5092 [details]
Patch to revert the changes that introduced the problem

This patch reverts the patch that introduced the problem.  It is against
2.6.12-rc3-mm2 but applies up to and including 2.6.12-rc5-mm1.
Comment 2 Rafael J. Wysocki 2005-06-02 02:52:38 UTC
The problem is not present in 2.6.12-rc5-mm2 
 
Comment 3 Luming Yu 2005-06-29 23:55:19 UTC
Hi,
Do you use sbs-cm package ?

Thanks,
Luming

Comment 4 Rafael J. Wysocki 2005-06-30 05:25:55 UTC
No, I don't.  Why? 
 
Rafael 
 
Comment 5 Rafael J. Wysocki 2005-07-02 04:29:26 UTC
The problem is back in 2.6.13-rc1-mm1. 
 
Comment 6 Luming Yu 2005-07-04 23:19:51 UTC
Hi,
We need to figure out why burst-mode EC patch doesn't work for you.
Please build kernel with ACPI debug option on, and 
echo -n 0x00100000 > /proc/acpi/debug_layer
echo -n 0xffffffff > /proc/acpi/debug_level

Then test and send me dmesg.

Thanks,
Luming
Comment 7 Rafael J. Wysocki 2005-07-05 03:45:10 UTC
Created attachment 5267 [details]
dmesg output from 2.6.13-rc1-mm1 w/ ACPI debug

Done.  I tested it by executing 'cat /proc/acpi/battery/BAT0/state'.

I'm attaching the full output of dmesg (just in case).

Greets,
Rafael
Comment 8 Luming Yu 2005-07-05 06:50:33 UTC
Thanks for this info.  Please run :

#cat /proc/acpi/embedded_controller/EC/info

if use global lock  yes then

please try this debug patch.

--- ec.0        2005-07-05 21:43:52.000000000 +0800
+++ ec.c        2005-07-05 21:48:06.000000000 +0800
@@ -802,7 +802,8 @@
        acpi_driver_data(device) = ec;

        /* Use the global lock for all EC transactions? */
-       acpi_evaluate_integer(ec->handle, "_GLK", NULL, &ec->global_lock);
+//     acpi_evaluate_integer(ec->handle, "_GLK", NULL, &ec->global_lock);
+       ec->global_lock = 0;

        /* If our UID matches the UID for the ECDT-enumerated EC,
           we now have the *real* EC info, so kill the makeshift one.*/
@@ -1005,7 +1006,7 @@
        status = acpi_evaluate_integer(handle, "_GPE", NULL, &ec_ecdt-
>gpe_bit);
        if (ACPI_FAILURE(status))
                return status;
-       ec_ecdt->global_lock = TRUE;
+       ec_ecdt->global_lock = 0;
        ec_ecdt->handle = handle;

        printk(KERN_INFO PREFIX  "GPE=0x%02x, ports=0x%2x, 0x%2x\n",
@@ -1084,7 +1085,7 @@
        ec_ecdt->data_addr = ecdt_ptr->ec_data;
        ec_ecdt->gpe_bit = ecdt_ptr->gpe_bit;
        /* use the GL just to be safe */
-       ec_ecdt->global_lock = TRUE;
+       ec_ecdt->global_lock = 0;
        ec_ecdt->uid = ecdt_ptr->uid;

        status = acpi_get_handle(NULL, ecdt_ptr->ec_id, &ec_ecdt->handle);


Comment 9 Rafael J. Wysocki 2005-07-05 13:49:54 UTC
rafael@albercik:~> cat /proc/acpi/embedded_controller/EC0/info 
gpe bit:                 0x25 
ports:                   0x66, 0x62 
use global lock:         no 
 
Comment 10 Luming Yu 2005-07-05 23:44:45 UTC
Then, try this debug patch. 

--- ec.c        2005-07-06 14:35:55.000000000 +0800
+++ ec.c.1      2005-07-06 14:42:32.000000000 +0800
@@ -280,14 +280,6 @@
        if (ec->global_lock)
                acpi_release_global_lock(glk);

-       if(atomic_read(&ec->leaving_burst) == 2){
-               ACPI_DEBUG_PRINT((ACPI_DB_INFO,"aborted, retry ...\n"));
-               while(atomic_read(&ec->pending_gpe)){
-                       msleep(1);
-               }
-               acpi_enable_gpe(NULL, ec->gpe_bit, ACPI_NOT_ISR);
-               goto retry;
-       }

        return_VALUE(status);
 }
@@ -363,14 +355,6 @@
        if (ec->global_lock)
                acpi_release_global_lock(glk);

-       if(atomic_read(&ec->leaving_burst) == 2){
-               ACPI_DEBUG_PRINT((ACPI_DB_INFO,"aborted, retry ...\n"));
-               while(atomic_read(&ec->pending_gpe)){
-                       msleep(1);
-               }
-               acpi_enable_gpe(NULL, ec->gpe_bit, ACPI_NOT_ISR);
-               goto retry;
-       }

        return_VALUE(status);
 }
@@ -466,11 +450,6 @@
        if (ec->global_lock)
                acpi_release_global_lock(glk);

-       if(atomic_read(&ec->leaving_burst) == 2){
-               ACPI_DEBUG_PRINT((ACPI_DB_INFO,"aborted, retry ...\n"));
-               acpi_enable_gpe(NULL, ec->gpe_bit, ACPI_NOT_ISR);
-               status = -ENODATA;
-       }
        return_VALUE(status);
 }

Comment 11 Rafael J. Wysocki 2005-07-06 02:20:06 UTC
Created attachment 5283 [details]
dmesg output from 2.6.13-rc1-mm1 w/ debug patch w/ ACPI debug

Done.  The behavior has not changed but the messages in dmesg are different
(attached).
Comment 12 Luming Yu 2005-07-06 02:29:41 UTC
Please attach the output of acpidmp.
Comment 13 Rafael J. Wysocki 2005-07-06 02:46:01 UTC
Created attachment 5284 [details]
acpidmp output from 2.6.13-rc1-mm1 w/ debug patch
Comment 14 Luming Yu 2005-07-06 07:21:49 UTC
To track down the problem, I want to see the results with the following 
chanages that remove ec query.

--- ec.c        2005-07-06 14:35:55.000000000 +0800
+++ ec.c.2      2005-07-06 22:18:51.000000000 +0800
@@ -497,18 +497,6 @@

        ACPI_FUNCTION_TRACE("acpi_ec_gpe_query");

-       if (acpi_ec_read_status(ec) & ACPI_EC_FLAG_SCI)
-               result = acpi_ec_query(ec, &value);
-
-       if (result)
-               goto end;
-
-       object_name[2] = hex[((value >> 4) & 0x0F)];
-       object_name[3] = hex[(value & 0x0F)];
-
-       ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Evaluating %s\n", object_name));
-
-       acpi_evaluate_object(ec->handle, object_name, NULL, NULL);
 end:
        atomic_dec(&ec->pending_gpe);
        return;
~
Comment 15 Rafael J. Wysocki 2005-07-06 08:07:07 UTC
Done.  The behavior has not changed, the output of dmesg has not changed (ie. 
see Comment #11). 
 
 
Comment 16 Luming Yu 2005-07-06 08:36:54 UTC
Hmm, strange, Let's measure acpi_ec_read and acpi_ec_write :

--- ec.c        2005-07-06 23:32:10.000000000 +0800
+++ ec.c.3      2005-07-06 23:33:26.000000000 +0800
@@ -65,6 +65,7 @@
 #define ACPI_EC_BURST_DISABLE  0x83
 #define ACPI_EC_COMMAND_QUERY  0x84

+extern u64 jiffies;
 static int acpi_ec_add (struct acpi_device *device);
 static int acpi_ec_remove (struct acpi_device *device, int type);
 static int acpi_ec_start (struct acpi_device *device);
@@ -234,6 +235,7 @@
        u32                     glk;

        ACPI_FUNCTION_TRACE("acpi_ec_read");
+       printk("...jiffies = %lu \n", jiffies);

        if (!ec || !data)
                return_VALUE(-EINVAL);
@@ -289,6 +291,7 @@
                goto retry;
        }

+       printk("...jiffies = %lu \n", jiffies);
        return_VALUE(status);
 }

@@ -305,6 +308,7 @@

        ACPI_FUNCTION_TRACE("acpi_ec_write");

+       printk("...jiffies = %lu \n", jiffies);
        if (!ec)
                return_VALUE(-EINVAL);
 retry:
@@ -372,6 +376,7 @@
                goto retry;
        }

+       printk("...jiffies = %lu \n", jiffies);
        return_VALUE(status);
 }


Comment 17 Rafael J. Wysocki 2005-07-06 13:42:57 UTC
Created attachment 5291 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch

I have removed the declaration of jiffies, because the kernel did not compile
with it.  Additionally, I've figured out that I should revert the first debug
patch.

The resulting dmesg output is attached.
Comment 18 Luming Yu 2005-07-06 18:00:11 UTC
Thanks, to make the problem more clearly, could you 
build kernel with ACPI debug option on, and 
echo -n 0x00100000 > /proc/acpi/debug_layer
echo -n 0xffffffff > /proc/acpi/debug_level

Then, I could spot bottleneck.
Comment 19 Rafael J. Wysocki 2005-07-07 03:21:31 UTC
Created attachment 5293 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch w/ debug

It is complied with the ACPI debug, but I forgot the "echo" things last time. 
The (full) dmesg output is attached.
Comment 20 Luming Yu 2005-07-07 18:31:21 UTC
> From log, I found there is a long delay between EC write and read
> operations. It is about 0.6s.
> So, it is possible to take 15s to finish all EC operations for a battery
> status query.
> Here are some snippets :
> 
> ...jiffies = 4295177380
>  acpi_ec-0379 [111E] [24] acpi_ec_write         : ----Exit-
> 0000000000000000
>  acpi_ec-0664 [111E] [23] acpi_ec_space_handler : ----Exit-
> 0000000000000000
>  acpi_ec-0608 [111E] [22] acpi_ec_space_handler : ----Entry
>  acpi_ec-0236 [111E] [23] acpi_ec_read          : ----Entry
> ...jiffies = 4295178019
> 
> .......................
> ......................
> .................
> 
> ...jiffies = 4295178105
>  acpi_ec-0379 [111E] [24] acpi_ec_write         : ----Exit-
> 0000000000000000
>  acpi_ec-0664 [111E] [23] acpi_ec_space_handler : ----Exit-
> 0000000000000000
>  acpi_ec-0608 [111E] [22] acpi_ec_space_handler : ----Entry
>  acpi_ec-0236 [111E] [23] acpi_ec_read          : ----Entry
> ...jiffies = 4295178727
Comment 21 Luming Yu 2005-07-07 18:36:19 UTC
>   To debug further, Could you try this debug patch to skip all
> Acquire/Release statement in AML interpreter. And retest.
> I expect to see less delay.  

I want to see these debug info:

1.#time cat state 
2.# dmesg like comment#19


--- osl.c       2005-05-07 13:20:31.000000000 +0800
+++ osl.c.1     2005-07-08 09:26:09.000000000 +0800
@@ -792,11 +792,6 @@
        ACPI_DEBUG_PRINT ((ACPI_DB_MUTEX, "Acquiring spinlock[%p] from %s 
level\n", handle,
                ((flags & ACPI_NOT_ISR) ? "non-interrupt" : "interrupt")));

-       if (flags & ACPI_NOT_ISR)
-               ACPI_DISABLE_IRQS();
-
-       spin_lock((spinlock_t *)handle);
-
        return_VOID;
 }

@@ -814,11 +809,6 @@
        ACPI_DEBUG_PRINT ((ACPI_DB_MUTEX, "Releasing spinlock[%p] from %s 
level\n", handle,
                ((flags & ACPI_NOT_ISR) ? "non-interrupt" : "interrupt")));

-       spin_unlock((spinlock_t *)handle);
-
-       if (flags & ACPI_NOT_ISR)
-               ACPI_ENABLE_IRQS();
-
        return_VOID;
 }




Comment 22 Rafael J. Wysocki 2005-07-08 12:42:13 UTC
Created attachment 5302 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch w/ debug

rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
present:		 yes
capacity state: 	 ok
charging state: 	 charged
present rate:		 unknown
remaining capacity:	 unknown
present voltage:	 unknown

real	0m27.630s
user	0m0.004s
sys	0m4.212s
Comment 23 Luming Yu 2005-07-10 23:01:53 UTC
I'm sorry, I made a mistake with my last debug patch. Could you test this one:
--- osl.c       2005-07-06 14:35:55.000000000 +0800
+++ osl.c.1     2005-07-11 13:36:16.000000000 +0800
@@ -899,6 +899,7 @@
        int                     ret = 0;

        ACPI_FUNCTION_TRACE ("os_wait_semaphore");
+#if 0

        if (!sem || (units < 1))
                return_ACPI_STATUS (AE_BAD_PARAMETER);
@@ -963,6 +964,7 @@
        else {
                ACPI_DEBUG_PRINT ((ACPI_DB_MUTEX, "Acquired semaphore[%p|%d|%d]
\n", handle, units, timeout));
        }
+#endif

        return_ACPI_STATUS (status);
 }
@@ -981,6 +983,7 @@

        ACPI_FUNCTION_TRACE ("os_signal_semaphore");

+#if 0
        if (!sem || (units < 1))
                return_ACPI_STATUS (AE_BAD_PARAMETER);

@@ -990,6 +993,7 @@
        ACPI_DEBUG_PRINT ((ACPI_DB_MUTEX, "Signaling semaphore[%p|%d]\n", 
handle, units));

        up(sem);
+#endif

        return_ACPI_STATUS (AE_OK);
 }
~
~
Comment 24 Rafael J. Wysocki 2005-07-11 14:30:18 UTC
Created attachment 5309 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch + debug patch

Done.  Now I've got some nice Oopses and a segfault during reboot:

rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
Killed

real	0m0.472s
user	0m0.000s
sys	0m0.012s
rafael@albercik:/proc/acpi/battery/BAT0>
Message from syslogd@albercik at Mon Jul 11 23:20:39 2005 ...
albercik kernel: Oops: 0000 [3] PREEMPT

Message from syslogd@albercik at Mon Jul 11 23:20:40 2005 ...
albercik kernel: CR2: 0000000000001008

The output of dmesg is attached.
Comment 25 Luming Yu 2005-07-11 17:54:39 UTC
Did you config PREEMPT ? If yes, Please try disable it with a clean kernel and 
retest.
Comment 26 Rafael J. Wysocki 2005-07-12 03:33:17 UTC
Created attachment 5313 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch + debug patch w/o preemption

With CONFIG_PREEMPT_NONE=y

rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
present:		 yes
capacity state: 	 ok
charging state: 	 charged
present rate:		 unknown
remaining capacity:	 unknown
present voltage:	 unknown

real	0m18.840s
user	0m0.000s
sys	0m6.340s
Comment 27 Rafael J. Wysocki 2005-07-13 03:41:20 UTC
The problem has made it to mainline, as it is presend in 2.6.13-rc3. 
Comment 28 Luming Yu 2005-07-13 23:54:04 UTC
Created attachment 5327 [details]
patch against the disassembly of DSDT

Please try apply this patch to DSDT, and override DSDT, and test kernel w/
timing patch.
Comment 29 Luming Yu 2005-07-14 00:11:42 UTC
Created attachment 5328 [details]
override DSDT

Or, you can just include this file to override the original DSDT, and test with
timing patch.
Comment 30 Rafael J. Wysocki 2005-07-14 06:16:00 UTC
Created attachment 5331 [details]
dmesg output from 2.6.13-rc1-mm1 w/ override DSDT + timing patch

I used the DSDT from Comment #29.

This time the timing is much better but the rate, capacity and voltage of the
battery are still not recognized:

rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
present:		 yes
capacity state: 	 ok
charging state: 	 charged
present rate:		 unknown
remaining capacity:	 unknown
present voltage:	 unknown

real	0m0.515s
user	0m0.004s
sys	0m0.500s
Comment 31 Luming Yu 2005-07-14 18:46:06 UTC
Created attachment 5333 [details]
override DSDT -take 2

Please try this override DSDT -- take 2.
Comment 32 Rafael J. Wysocki 2005-07-15 04:01:03 UTC
Created attachment 5334 [details]
dmesg output from 2.6.13-rc1-mm1 w/ override DSDT -take 2 + timing patch

It seems to be better, but still not right.  For the system plugged-in:

rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
present:		 yes
capacity state: 	 critical
charging state: 	 charged
present rate:		 0 mA
remaining capacity:	 -14 mAh
present voltage:	 unknown

real	0m0.600s
user	0m0.000s
sys	0m0.580s

Do you need the numbers I get without the EC burst mode patch?
Comment 33 Luming Yu 2005-07-23 04:08:43 UTC
Created attachment 5366 [details]
patch for falling back to polling mode with kernel option : ec_polling

The patch is against linux-2.6.13-rc3.

With this patch, if you use kernel boot option ec_polling, the ec driver will 
fall back to polling mode, which suppose to work on L5D, otherwise, ec driver
will use burst mode. Hopefully it will workaround your problem. The real fix is
under-going. 
Ps. I have tested it on my T42. Both mode seems to work. Please tested it.
Thanks
Luming
Comment 34 Rafael J. Wysocki 2005-07-23 09:52:09 UTC
Created attachment 5368 [details]
ec_polling patch against vanilla 2.6.13-rc3

The patch works for me. :-)  Thanks a lot for it!

Initially it didn't apply to vanilla 2.6.13-rc3, so I've remade it a little
(the resulting patch is attached).

Greets,
Rafael
Comment 35 Len Brown 2005-07-29 15:09:18 UTC
applied to-linus
Comment 36 Len Brown 2005-08-03 19:33:25 UTC
Created attachment 5497 [details]
patch vs 2.6.13-rc5

This patch makes polling mode the default again.
use "ec_burst=1" if you'd like to try burst mode,
but it has enough problems that it will not be
the default till they're addressed.

This patch went into linus's tree
after 2.6.13-rc5 -- closing.
Comment 37 Luming Yu 2005-08-04 00:29:03 UTC
As a follow up about bust-mode, could you try debug patch filed at bug 3851
Comment 38 Rafael J. Wysocki 2005-08-09 14:54:41 UTC
Referring to Comment #37: 
 
Sorry for the delay. 
 
I have tested the patch at: 
  
http://bugzilla.kernel.org/attachment.cgi?id=5506&action=view 
 
I applied it to 2.6.13-rc5, booted the kernel without the ec_polling parameter, 
and it worked. 
 
Comment 39 Luming Yu 2005-08-10 01:45:05 UTC
Hi, 
I just updated the patch at bug 3851, please help testing. 
 
Thanks, 
Luming 
Comment 40 Rafael J. Wysocki 2005-08-11 02:04:50 UTC
I'm currently testing the patch at: 
 
http://bugzilla.kernel.org/attachment.cgi?id=5574&action=view 
 
I have booted with ec_burst=1.  It generally works, but I get "present rate" 
different to zero and the system is on AC power for a couple of hours. 
 
Comment 41 Rafael J. Wysocki 2005-08-11 02:06:22 UTC
Referring to Comment #40: 
 
Now "present rate" is zero.  It must have been a fluctuation then. 
 
Comment 42 Luming Yu 2005-08-21 00:15:51 UTC
*** Bug 1093 has been marked as a duplicate of this bug. ***
Comment 43 Luming Yu 2005-08-21 01:18:32 UTC
*** Bug 4793 has been marked as a duplicate of this bug. ***
Comment 44 Luming Yu 2005-08-21 01:24:21 UTC
*** Bug 4911 has been marked as a duplicate of this bug. ***
Comment 45 Luming Yu 2005-08-21 20:24:58 UTC
*** Bug 4846 has been marked as a duplicate of this bug. ***
Comment 46 Luming Yu 2005-08-31 06:16:01 UTC
*** Bug 5100 has been marked as a duplicate of this bug. ***