Bug 4665
Description
Rafael J. Wysocki
2005-05-26 08:52:29 UTC
Created attachment 5092 [details]
Patch to revert the changes that introduced the problem
This patch reverts the patch that introduced the problem. It is against
2.6.12-rc3-mm2 but applies up to and including 2.6.12-rc5-mm1.
The problem is not present in 2.6.12-rc5-mm2 Hi, Do you use sbs-cm package ? Thanks, Luming No, I don't. Why? Rafael The problem is back in 2.6.13-rc1-mm1. Hi, We need to figure out why burst-mode EC patch doesn't work for you. Please build kernel with ACPI debug option on, and echo -n 0x00100000 > /proc/acpi/debug_layer echo -n 0xffffffff > /proc/acpi/debug_level Then test and send me dmesg. Thanks, Luming Created attachment 5267 [details]
dmesg output from 2.6.13-rc1-mm1 w/ ACPI debug
Done. I tested it by executing 'cat /proc/acpi/battery/BAT0/state'.
I'm attaching the full output of dmesg (just in case).
Greets,
Rafael
Thanks for this info. Please run :
#cat /proc/acpi/embedded_controller/EC/info
if use global lock yes then
please try this debug patch.
--- ec.0 2005-07-05 21:43:52.000000000 +0800
+++ ec.c 2005-07-05 21:48:06.000000000 +0800
@@ -802,7 +802,8 @@
acpi_driver_data(device) = ec;
/* Use the global lock for all EC transactions? */
- acpi_evaluate_integer(ec->handle, "_GLK", NULL, &ec->global_lock);
+// acpi_evaluate_integer(ec->handle, "_GLK", NULL, &ec->global_lock);
+ ec->global_lock = 0;
/* If our UID matches the UID for the ECDT-enumerated EC,
we now have the *real* EC info, so kill the makeshift one.*/
@@ -1005,7 +1006,7 @@
status = acpi_evaluate_integer(handle, "_GPE", NULL, &ec_ecdt-
>gpe_bit);
if (ACPI_FAILURE(status))
return status;
- ec_ecdt->global_lock = TRUE;
+ ec_ecdt->global_lock = 0;
ec_ecdt->handle = handle;
printk(KERN_INFO PREFIX "GPE=0x%02x, ports=0x%2x, 0x%2x\n",
@@ -1084,7 +1085,7 @@
ec_ecdt->data_addr = ecdt_ptr->ec_data;
ec_ecdt->gpe_bit = ecdt_ptr->gpe_bit;
/* use the GL just to be safe */
- ec_ecdt->global_lock = TRUE;
+ ec_ecdt->global_lock = 0;
ec_ecdt->uid = ecdt_ptr->uid;
status = acpi_get_handle(NULL, ecdt_ptr->ec_id, &ec_ecdt->handle);
rafael@albercik:~> cat /proc/acpi/embedded_controller/EC0/info gpe bit: 0x25 ports: 0x66, 0x62 use global lock: no Then, try this debug patch. --- ec.c 2005-07-06 14:35:55.000000000 +0800 +++ ec.c.1 2005-07-06 14:42:32.000000000 +0800 @@ -280,14 +280,6 @@ if (ec->global_lock) acpi_release_global_lock(glk); - if(atomic_read(&ec->leaving_burst) == 2){ - ACPI_DEBUG_PRINT((ACPI_DB_INFO,"aborted, retry ...\n")); - while(atomic_read(&ec->pending_gpe)){ - msleep(1); - } - acpi_enable_gpe(NULL, ec->gpe_bit, ACPI_NOT_ISR); - goto retry; - } return_VALUE(status); } @@ -363,14 +355,6 @@ if (ec->global_lock) acpi_release_global_lock(glk); - if(atomic_read(&ec->leaving_burst) == 2){ - ACPI_DEBUG_PRINT((ACPI_DB_INFO,"aborted, retry ...\n")); - while(atomic_read(&ec->pending_gpe)){ - msleep(1); - } - acpi_enable_gpe(NULL, ec->gpe_bit, ACPI_NOT_ISR); - goto retry; - } return_VALUE(status); } @@ -466,11 +450,6 @@ if (ec->global_lock) acpi_release_global_lock(glk); - if(atomic_read(&ec->leaving_burst) == 2){ - ACPI_DEBUG_PRINT((ACPI_DB_INFO,"aborted, retry ...\n")); - acpi_enable_gpe(NULL, ec->gpe_bit, ACPI_NOT_ISR); - status = -ENODATA; - } return_VALUE(status); } Created attachment 5283 [details]
dmesg output from 2.6.13-rc1-mm1 w/ debug patch w/ ACPI debug
Done. The behavior has not changed but the messages in dmesg are different
(attached).
Please attach the output of acpidmp. Created attachment 5284 [details]
acpidmp output from 2.6.13-rc1-mm1 w/ debug patch
To track down the problem, I want to see the results with the following chanages that remove ec query. --- ec.c 2005-07-06 14:35:55.000000000 +0800 +++ ec.c.2 2005-07-06 22:18:51.000000000 +0800 @@ -497,18 +497,6 @@ ACPI_FUNCTION_TRACE("acpi_ec_gpe_query"); - if (acpi_ec_read_status(ec) & ACPI_EC_FLAG_SCI) - result = acpi_ec_query(ec, &value); - - if (result) - goto end; - - object_name[2] = hex[((value >> 4) & 0x0F)]; - object_name[3] = hex[(value & 0x0F)]; - - ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Evaluating %s\n", object_name)); - - acpi_evaluate_object(ec->handle, object_name, NULL, NULL); end: atomic_dec(&ec->pending_gpe); return; ~ Done. The behavior has not changed, the output of dmesg has not changed (ie. see Comment #11). Hmm, strange, Let's measure acpi_ec_read and acpi_ec_write : --- ec.c 2005-07-06 23:32:10.000000000 +0800 +++ ec.c.3 2005-07-06 23:33:26.000000000 +0800 @@ -65,6 +65,7 @@ #define ACPI_EC_BURST_DISABLE 0x83 #define ACPI_EC_COMMAND_QUERY 0x84 +extern u64 jiffies; static int acpi_ec_add (struct acpi_device *device); static int acpi_ec_remove (struct acpi_device *device, int type); static int acpi_ec_start (struct acpi_device *device); @@ -234,6 +235,7 @@ u32 glk; ACPI_FUNCTION_TRACE("acpi_ec_read"); + printk("...jiffies = %lu \n", jiffies); if (!ec || !data) return_VALUE(-EINVAL); @@ -289,6 +291,7 @@ goto retry; } + printk("...jiffies = %lu \n", jiffies); return_VALUE(status); } @@ -305,6 +308,7 @@ ACPI_FUNCTION_TRACE("acpi_ec_write"); + printk("...jiffies = %lu \n", jiffies); if (!ec) return_VALUE(-EINVAL); retry: @@ -372,6 +376,7 @@ goto retry; } + printk("...jiffies = %lu \n", jiffies); return_VALUE(status); } Created attachment 5291 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch
I have removed the declaration of jiffies, because the kernel did not compile
with it. Additionally, I've figured out that I should revert the first debug
patch.
The resulting dmesg output is attached.
Thanks, to make the problem more clearly, could you build kernel with ACPI debug option on, and echo -n 0x00100000 > /proc/acpi/debug_layer echo -n 0xffffffff > /proc/acpi/debug_level Then, I could spot bottleneck. Created attachment 5293 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch w/ debug
It is complied with the ACPI debug, but I forgot the "echo" things last time.
The (full) dmesg output is attached.
> From log, I found there is a long delay between EC write and read
> operations. It is about 0.6s.
> So, it is possible to take 15s to finish all EC operations for a battery
> status query.
> Here are some snippets :
>
> ...jiffies = 4295177380
> acpi_ec-0379 [111E] [24] acpi_ec_write : ----Exit-
> 0000000000000000
> acpi_ec-0664 [111E] [23] acpi_ec_space_handler : ----Exit-
> 0000000000000000
> acpi_ec-0608 [111E] [22] acpi_ec_space_handler : ----Entry
> acpi_ec-0236 [111E] [23] acpi_ec_read : ----Entry
> ...jiffies = 4295178019
>
> .......................
> ......................
> .................
>
> ...jiffies = 4295178105
> acpi_ec-0379 [111E] [24] acpi_ec_write : ----Exit-
> 0000000000000000
> acpi_ec-0664 [111E] [23] acpi_ec_space_handler : ----Exit-
> 0000000000000000
> acpi_ec-0608 [111E] [22] acpi_ec_space_handler : ----Entry
> acpi_ec-0236 [111E] [23] acpi_ec_read : ----Entry
> ...jiffies = 4295178727
> To debug further, Could you try this debug patch to skip all > Acquire/Release statement in AML interpreter. And retest. > I expect to see less delay. I want to see these debug info: 1.#time cat state 2.# dmesg like comment#19 --- osl.c 2005-05-07 13:20:31.000000000 +0800 +++ osl.c.1 2005-07-08 09:26:09.000000000 +0800 @@ -792,11 +792,6 @@ ACPI_DEBUG_PRINT ((ACPI_DB_MUTEX, "Acquiring spinlock[%p] from %s level\n", handle, ((flags & ACPI_NOT_ISR) ? "non-interrupt" : "interrupt"))); - if (flags & ACPI_NOT_ISR) - ACPI_DISABLE_IRQS(); - - spin_lock((spinlock_t *)handle); - return_VOID; } @@ -814,11 +809,6 @@ ACPI_DEBUG_PRINT ((ACPI_DB_MUTEX, "Releasing spinlock[%p] from %s level\n", handle, ((flags & ACPI_NOT_ISR) ? "non-interrupt" : "interrupt"))); - spin_unlock((spinlock_t *)handle); - - if (flags & ACPI_NOT_ISR) - ACPI_ENABLE_IRQS(); - return_VOID; } Created attachment 5302 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch w/ debug
rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
present: yes
capacity state: ok
charging state: charged
present rate: unknown
remaining capacity: unknown
present voltage: unknown
real 0m27.630s
user 0m0.004s
sys 0m4.212s
I'm sorry, I made a mistake with my last debug patch. Could you test this one: --- osl.c 2005-07-06 14:35:55.000000000 +0800 +++ osl.c.1 2005-07-11 13:36:16.000000000 +0800 @@ -899,6 +899,7 @@ int ret = 0; ACPI_FUNCTION_TRACE ("os_wait_semaphore"); +#if 0 if (!sem || (units < 1)) return_ACPI_STATUS (AE_BAD_PARAMETER); @@ -963,6 +964,7 @@ else { ACPI_DEBUG_PRINT ((ACPI_DB_MUTEX, "Acquired semaphore[%p|%d|%d] \n", handle, units, timeout)); } +#endif return_ACPI_STATUS (status); } @@ -981,6 +983,7 @@ ACPI_FUNCTION_TRACE ("os_signal_semaphore"); +#if 0 if (!sem || (units < 1)) return_ACPI_STATUS (AE_BAD_PARAMETER); @@ -990,6 +993,7 @@ ACPI_DEBUG_PRINT ((ACPI_DB_MUTEX, "Signaling semaphore[%p|%d]\n", handle, units)); up(sem); +#endif return_ACPI_STATUS (AE_OK); } ~ ~ Created attachment 5309 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch + debug patch
Done. Now I've got some nice Oopses and a segfault during reboot:
rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
Killed
real 0m0.472s
user 0m0.000s
sys 0m0.012s
rafael@albercik:/proc/acpi/battery/BAT0>
Message from syslogd@albercik at Mon Jul 11 23:20:39 2005 ...
albercik kernel: Oops: 0000 [3] PREEMPT
Message from syslogd@albercik at Mon Jul 11 23:20:40 2005 ...
albercik kernel: CR2: 0000000000001008
The output of dmesg is attached.
Did you config PREEMPT ? If yes, Please try disable it with a clean kernel and retest. Created attachment 5313 [details]
dmesg output from 2.6.13-rc1-mm1 w/ timing patch + debug patch w/o preemption
With CONFIG_PREEMPT_NONE=y
rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
present: yes
capacity state: ok
charging state: charged
present rate: unknown
remaining capacity: unknown
present voltage: unknown
real 0m18.840s
user 0m0.000s
sys 0m6.340s
The problem has made it to mainline, as it is presend in 2.6.13-rc3. Created attachment 5327 [details]
patch against the disassembly of DSDT
Please try apply this patch to DSDT, and override DSDT, and test kernel w/
timing patch.
Created attachment 5328 [details]
override DSDT
Or, you can just include this file to override the original DSDT, and test with
timing patch.
Created attachment 5331 [details] dmesg output from 2.6.13-rc1-mm1 w/ override DSDT + timing patch I used the DSDT from Comment #29. This time the timing is much better but the rate, capacity and voltage of the battery are still not recognized: rafael@albercik:~> cd /proc/acpi/battery/BAT0/ rafael@albercik:/proc/acpi/battery/BAT0> time cat state present: yes capacity state: ok charging state: charged present rate: unknown remaining capacity: unknown present voltage: unknown real 0m0.515s user 0m0.004s sys 0m0.500s Created attachment 5333 [details]
override DSDT -take 2
Please try this override DSDT -- take 2.
Created attachment 5334 [details]
dmesg output from 2.6.13-rc1-mm1 w/ override DSDT -take 2 + timing patch
It seems to be better, but still not right. For the system plugged-in:
rafael@albercik:~> cd /proc/acpi/battery/BAT0/
rafael@albercik:/proc/acpi/battery/BAT0> time cat state
present: yes
capacity state: critical
charging state: charged
present rate: 0 mA
remaining capacity: -14 mAh
present voltage: unknown
real 0m0.600s
user 0m0.000s
sys 0m0.580s
Do you need the numbers I get without the EC burst mode patch?
Created attachment 5366 [details]
patch for falling back to polling mode with kernel option : ec_polling
The patch is against linux-2.6.13-rc3.
With this patch, if you use kernel boot option ec_polling, the ec driver will
fall back to polling mode, which suppose to work on L5D, otherwise, ec driver
will use burst mode. Hopefully it will workaround your problem. The real fix is
under-going.
Ps. I have tested it on my T42. Both mode seems to work. Please tested it.
Thanks
Luming
Created attachment 5368 [details]
ec_polling patch against vanilla 2.6.13-rc3
The patch works for me. :-) Thanks a lot for it!
Initially it didn't apply to vanilla 2.6.13-rc3, so I've remade it a little
(the resulting patch is attached).
Greets,
Rafael
applied to-linus Created attachment 5497 [details]
patch vs 2.6.13-rc5
This patch makes polling mode the default again.
use "ec_burst=1" if you'd like to try burst mode,
but it has enough problems that it will not be
the default till they're addressed.
This patch went into linus's tree
after 2.6.13-rc5 -- closing.
As a follow up about bust-mode, could you try debug patch filed at bug 3851 Referring to Comment #37: Sorry for the delay. I have tested the patch at: http://bugzilla.kernel.org/attachment.cgi?id=5506&action=view I applied it to 2.6.13-rc5, booted the kernel without the ec_polling parameter, and it worked. Hi, I just updated the patch at bug 3851, please help testing. Thanks, Luming I'm currently testing the patch at: http://bugzilla.kernel.org/attachment.cgi?id=5574&action=view I have booted with ec_burst=1. It generally works, but I get "present rate" different to zero and the system is on AC power for a couple of hours. Referring to Comment #40: Now "present rate" is zero. It must have been a fluctuation then. *** Bug 1093 has been marked as a duplicate of this bug. *** *** Bug 4793 has been marked as a duplicate of this bug. *** *** Bug 4911 has been marked as a duplicate of this bug. *** *** Bug 4846 has been marked as a duplicate of this bug. *** *** Bug 5100 has been marked as a duplicate of this bug. *** |