Bug 3851

Summary: burst-mode EC
Product: ACPI
Reporter: Oleg I. Vdovikin (oleg)
Component: Power-Battery
Assignee: Luming Yu (luming.yu)
Status: CLOSED CODE_FIX
Severity: normal
CC: acpi-bugzilla, brian, dgege1, keresztg, OoberMick, romano.giannetti, syrjala, sziwan, trenn
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.9
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: dmesg -s40000 output
acpidmp output
dmidecode output
/proc/interrupts
lspci -vv
acpi=strict dmesg
dmesg from patched kernel
EC full SCI events queue processing
interrupt_based_ec-1 (This patch is written by Dmitry Torokhov. (against 2.6.10-rc2))
interrupt_based_ec-2 (This patch is written by Luming Yu )
FC3 2.6.9 kernel config
patch from comments #26 + #27 applied to linux-2.6.10
patch from comments #26 + #27 applied to linux-2.6.11-rc2
testing patch for solving gpe_disabled issue with burst-mode ec
Full syslog capture from boot to failing suspend/resume. Ec patch applied, gpe- patch NOT applied
burst mode ec debug-patch against (2.6.13-rc5)
update one against 2.6.13-rc5
update burst-ec patch against 2.6.13-rc6
patch against linux-2.6.13-rc6 ( against Lindent-ed ec.c )
patch against 2.6.13.2

Description Oleg I. Vdovikin 2004-12-02 12:10:14 UTC
Distribution: Fedora Core 3
Hardware Environment: HP Omnibook 6100
Software Environment: 2.6.9, acpi 20041105
Problem Description: Battery/charge status is no longer updated. It stays at the
same level all the time. It is still updated when the fan is turned on/off due to
thermal zone changes. 2.6.8 (acpi 20040326) works fine; 2.6.9 with acpi 20040816
and 20041105 does not work.

Steps to reproduce:
Boot and watch cat /proc/acpi/battery/BAT1/state
Comment 1 Oleg I. Vdovikin 2004-12-02 12:11:39 UTC
Created attachment 4196 [details]
dmesg -s40000 output
Comment 2 Oleg I. Vdovikin 2004-12-02 12:12:31 UTC
Created attachment 4197 [details]
acpidmp output
Comment 3 Oleg I. Vdovikin 2004-12-02 12:13:07 UTC
Created attachment 4198 [details]
dmidecode output
Comment 4 Oleg I. Vdovikin 2004-12-02 12:13:51 UTC
Created attachment 4199 [details]
/proc/interrupts
Comment 5 Oleg I. Vdovikin 2004-12-02 12:33:29 UTC
Created attachment 4200 [details]
lspci -vv
Comment 6 Luming Yu 2004-12-02 17:26:08 UTC
Please try boot option: 
 acpi=strict

Comment 7 Oleg I. Vdovikin 2004-12-02 23:05:49 UTC
Created attachment 4203 [details]
acpi=strict dmesg

Looks like QUERY_20 does not get executed anymore.
Still does not work.
Comment 8 Luming Yu 2004-12-03 02:48:08 UTC
Please try this patch
--- linux-2.6.10-rc2/drivers/acpi/battery.c     2004-11-15 09:28:17.000000000 +0800
+++ linux-2.6.10-rc2/drivers/acpi/battery.c.e   2004-12-03 18:41:17.690742800 +0800
@@ -469,6 +469,7 @@
                goto end;
        }

+       acpi_os_wait_events_complete (NULL);
        /* Battery Units */

        units = battery->flags.power_unit ? ACPI_BATTERY_UNITS_AMPS :
ACPI_BATTERY_UNITS_WATTS;
Comment 9 Luming Yu 2004-12-03 02:52:20 UTC
_BST just issues a poll request.
_Q9 will update PBST.

So we need to make sure any previous changes have been written to PBST.
Comment 10 Oleg I. Vdovikin 2004-12-03 03:38:36 UTC
Created attachment 4204 [details]
dmesg from patched kernel

I've rmmoded battery and then insmoded recompiled battery.ko
Still does not work, _Q09 does not get called. 
Also, this dmesg contains thermal changes, which triggered _Q09 and battery
state update in the middle.
Comment 11 Luming Yu 2004-12-03 04:19:00 UTC
Hmm, _BST calls the following method to request a poll,
but PSTA is never restored to zero.
So only the first call to CPOL has any effect.

I suspect this DSDT is NOT the original one.
Anyway, would you please try removing
" IF (LEqual(PSTA,0x00))"


Name (PSTA, 0x00)
Method (CPOL, 0, NotSerialized)
{
  If (LEqual (PSTA, 0x00))
  {
    If (ECOK ())
    {
      BPOL ()
      Store (0x01, PSTA)
    }
  }
}
                  
Comment 12 Luming Yu 2004-12-03 04:29:41 UTC
I want to know whether the battery can change status with the patched kernel.
Comment 13 Oleg I. Vdovikin 2004-12-03 04:33:53 UTC
The DSDT is original - FC3 does not make it easy to alter the DSDT. Do you mean
I've replaced it somehow?

In fact, it's too hard for me to recompile the entire kernel right now; I will try
this later. Do you have other ideas?

It seems LEqual(PSTA,0x00) just guards against re-entering the same code again.
Needless to say, Windows is working fine.

Also, why is 2.6.8 working fine with it?

Yes, everything is with the patched kernel.

I have a program which can directly access the EC and run SMBus transactions to
the battery. The funny thing is that while this program is running, _Q09 is fired
and the battery status is updated. Also, once the program has finished, it gets yet
another _Q09 (as expected) several minutes later. It looks like the EC is left in
an incorrect state by acpi.
Comment 14 Oleg I. Vdovikin 2004-12-03 04:38:10 UTC
Just googled for PSTA - there are number of similar things, especially with 
HP/Compaq:

Take a look at this:
http://acpi.sourceforge.net/dsdt/tables/Compaq/Presario_2100/Compaq-Presario_2100-KAM_1.42-custom.asl.gz
Comment 15 Oleg I. Vdovikin 2004-12-05 09:43:27 UTC
Well, spent a day playing with the DSDT. So, finally, the DSDT is fine.
Yes, PSTA is never set back to zero, but that is not needed at all - CPOL (which
is CHECK POLL) starts polling (it calls BPOL only once). After that, _Q09 (SMSL)
calls BPOL itself to start polling again.
I played with the EC from user mode and found that its state shows that it has
pending SCI events which are NOT handled. The problem is that _Q09 calls
several methods which cause _Q20 (actually 4 events), but
drivers/acpi/ec.c only serves ONE event at a time (per interrupt). So the EC starts
queueing events, and some of them never get handled. So, finally, I've changed
the event handling code, which now executes all SCI queries queued by the EC.
I have checked this with both 20041105 and 20040816. It finally started working.
It seems this was due to GPE code changes made several months ago.
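
To illustrate the difference between serving one query per interrupt and draining the queue, here is a minimal user-space sketch. It is not the attached patch: struct ec_model, ec_model_raise(), ec_model_query(), handle_one() and handle_all() are hypothetical names, and the FIFO depth of 8 is an assumption about this particular EC.

```c
#include <assert.h>
#include <stddef.h>

#define EC_QUEUE_DEPTH 8     /* assumed FIFO depth of the modelled EC */
#define EC_FLAG_SCI    0x20  /* SCI_EVT bit of the EC status byte */

/* Hypothetical model of the EC's pending-query FIFO; not a kernel structure. */
struct ec_model {
    unsigned char queue[EC_QUEUE_DEPTH];
    size_t head;   /* index of the oldest pending query */
    size_t count;  /* number of pending queries */
};

/* Queue a query code; returns 0 if the FIFO is full and the event is lost. */
static int ec_model_raise(struct ec_model *ec, unsigned char query)
{
    assert(ec != NULL);
    if (ec->count == EC_QUEUE_DEPTH)
        return 0;
    ec->queue[(ec->head + ec->count) % EC_QUEUE_DEPTH] = query;
    ec->count++;
    return 1;
}

/* The status byte reports SCI_EVT for as long as any query is pending. */
static unsigned char ec_model_status(const struct ec_model *ec)
{
    return ec->count ? EC_FLAG_SCI : 0;
}

/* Pop the oldest pending query (stands in for the query command); 0 = none. */
static unsigned char ec_model_query(struct ec_model *ec)
{
    unsigned char q;

    if (ec->count == 0)
        return 0;
    q = ec->queue[ec->head];
    ec->head = (ec->head + 1) % EC_QUEUE_DEPTH;
    ec->count--;
    return q;
}

/* One query per interrupt: anything else stays queued in the EC. */
static size_t handle_one(struct ec_model *ec, void (*run)(unsigned char))
{
    unsigned char q;

    if (!(ec_model_status(ec) & EC_FLAG_SCI))
        return 0;
    q = ec_model_query(ec);
    if (run)
        run(q);
    return 1;
}

/* Drain: keep querying until the EC stops reporting SCI_EVT. */
static size_t handle_all(struct ec_model *ec, void (*run)(unsigned char))
{
    size_t handled = 0;

    while (ec_model_status(ec) & EC_FLAG_SCI) {
        unsigned char q = ec_model_query(ec);
        if (run)
            run(q);
        handled++;
    }
    return handled;
}
```

With three queries raised (say 0x09, 0x20, 0x20), handle_one() leaves two of them pending until further interrupts arrive, while handle_all() empties the queue in one pass.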
Comment 16 Oleg I. Vdovikin 2004-12-05 09:44:49 UTC
Created attachment 4233 [details]
EC full SCI events queue processing

Please review and reply.
Comment 17 Oleg I. Vdovikin 2004-12-05 09:48:48 UTC
Also, just to clarify: _BST does NOT perform any queries, it just returns the cached
battery state, so the patch below is USELESS, do not apply it.

--- linux-2.6.10-rc2/drivers/acpi/battery.c     2004-11-15 09:28:17.000000000 +0800
+++ linux-2.6.10-rc2/drivers/acpi/battery.c.e   2004-12-03 18:41:17.690742800 +0800
@@ -469,6 +469,7 @@
                goto end;
        }

+       acpi_os_wait_events_complete (NULL);
        /* Battery Units */

        units = battery->flags.power_unit ? ACPI_BATTERY_UNITS_AMPS :
ACPI_BATTERY_UNITS_WATTS;
Comment 18 Luming Yu 2004-12-06 01:26:06 UTC
To comment #17,
  This patch is NOT trying to get the latest update of PBST; it just flushes
the workqueue before evaluating _BST, because some AML methods may be queued
on kacpid's workqueue, and they will change PBST.
   So, I think the patch just introduces some delay before the actual
battery status shows up. Otherwise, it is useful.
  
Comment 19 Luming Yu 2004-12-06 01:40:54 UTC
To comment #15,
I believe that every pending EC event gets its own SCI interrupt (1 to 1),
so I don't think any pending EC event will be lost. I believe they will be
queued onto the workqueue.

Comment 20 Luming Yu 2004-12-06 01:47:40 UTC
To comment #16,
We are implementing a pure interrupt-based EC access patch.
The biggest problem is that your patch is based on polling, which will
become obsolete.
Comment 21 Oleg I. Vdovikin 2004-12-06 02:03:07 UTC
Well, according to what I've found, the old code leaves unhandled events forever.
I've directly accessed the EC status and it shows that; more precisely, I've found
several events in the EC queue.
And it does not generate interrupts, simply because GPEs are disabled
during processing of a queued GPE.
Also, once I patched the code, several _Q20 (actually SMBus events) requests
appeared right after the _Q09 processing, so _Q09 seems to cause them. The old code's
logs showed the same _Q20 with large delays - so the real events were just
pushing them out of the queue, delaying their processing. And some events were
completely lost due to the limited EC queue size. One of the _Q09 events was not handled,
which is why the battery status was no longer updated.
Comment 22 Oleg I. Vdovikin 2004-12-06 02:08:13 UTC
To 15: so you think I'm kidding? ;-) Why does it not work then? And why does it
work with the 200403 version?
To 16: ok, fix this in the right way, probably interrupts are getting lost
somewhere...
Comment 23 Oleg I. Vdovikin 2004-12-07 04:49:38 UTC
Checked the GPE code and compared it to the old one.
It looks like there is one thing which could lead to losing interrupts.
drivers/acpi/events/evgpe.c now contains some new code, which is executed when
a GPE is enabled with acpi_ev_enable_gpe:
----------
        case ACPI_GPE_TYPE_RUNTIME:

                ACPI_SET_BIT (gpe_event_info->flags, ACPI_GPE_RUN_ENABLED);

                if (write_to_hardware) {
                        /* Clear the GPE (of stale events), then enable it */

                        status = acpi_hw_clear_gpe (gpe_event_info);
                        if (ACPI_FAILURE (status)) {
                                return_ACPI_STATUS (status);
                        }

                        /* Enable the requested runtime GPE */

                        status = acpi_hw_write_gpe_enable_reg (gpe_event_info);
                }
                break;
--------------
The addition is acpi_hw_clear_gpe. The EC uses an edge-triggered GPE, which is already
cleared at the beginning by acpi_ev_gpe_dispatch (well, a level-triggered GPE is
also cleared, at the end of acpi_ev_gpe_dispatch). Why does this code call
acpi_hw_clear_gpe again at the end of the processing? Alternatively, should the EC
disable/enable the GPE at all?

It looks like acpi_hw_clear_gpe could clear a pending GPE, so no SCI will be
generated when the GPE is enabled, which would cause the problems I have now.

Any ideas?
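
To make the suspected sequence concrete, here is a small user-space model of it. This is only a sketch: struct gpe_model and every function below are hypothetical names, not ACPICA code, and the model assumes that an SCI is raised whenever the status and enable bits are both set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-bit model of a GPE: a latched status bit plus an enable bit. */
struct gpe_model {
    bool status;   /* latched by an edge from the device */
    bool enabled;  /* enable bit written by the OS */
};

/* The device signals an event: status is latched even while the GPE is disabled. */
static void gpe_edge(struct gpe_model *g)
{
    assert(g != NULL);
    g->status = true;
}

/* Assumption of this model: SCI is asserted while status and enable are both set. */
static bool gpe_sci_asserted(const struct gpe_model *g)
{
    return g->status && g->enabled;
}

static void gpe_disable(struct gpe_model *g)
{
    g->enabled = false;
}

/* clear_first = true models an enable path that clears the status bit first. */
static void gpe_enable(struct gpe_model *g, bool clear_first)
{
    if (clear_first)
        g->status = false;
    g->enabled = true;
}

/*
 * Replays the sequence: the first event is dispatched (status cleared, GPE
 * disabled), a second event arrives while the query method is still running,
 * then the GPE is re-enabled. Returns true if the second event raises an SCI.
 */
static bool second_event_delivered(bool clear_first)
{
    struct gpe_model g = { false, true };

    gpe_edge(&g);          /* first event */
    g.status = false;      /* dispatcher clears the edge-triggered status */
    gpe_disable(&g);       /* handler disables the GPE */
    gpe_edge(&g);          /* second event arrives while disabled */
    gpe_enable(&g, clear_first);
    return gpe_sci_asserted(&g);
}
```

In this model second_event_delivered(false) reports the second event, while second_event_delivered(true) silently drops it, which matches the behaviour suspected above.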
Comment 24 Oleg I. Vdovikin 2004-12-07 12:44:37 UTC
Well, played with the kernel again. So, per my previous message, the problem is the
change in the behaviour of acpi_ev_enable_gpe, so acpi_enable_gpe is also
affected. The old EC code does not expect the GPE to be cleared by the enable_gpe
call (this seems to be invalid for edge-triggered events anyway).
I've tried two patches: the first one just removes acpi_hw_clear_gpe from
acpi_ev_enable_gpe, restoring the old behaviour:

--- linux-2.6.9/drivers/acpi/events/evgpe.c~    2004-12-07 21:14:18.070948976 +0300
+++ linux-2.6.9/drivers/acpi/events/evgpe.c     2004-12-07 21:15:03.404057296 +0300
@@ -216,10 +216,12 @@
                if (write_to_hardware) {
                        /* Clear the GPE (of stale events), then enable it */

+#if 0
                        status = acpi_hw_clear_gpe (gpe_event_info);
                        if (ACPI_FAILURE (status)) {
                                return_ACPI_STATUS (status);
                        }
+#endif

                        /* Enable the requested runtime GPE */


And it started working without patching drivers/acpi/ec.c. acpi_hw_clear_gpe is
actually eating events, leading to no SCI. Well, this could probably
break other things which rely on the new behaviour.

So I've also tried removing the enable/disable GPE calls from the EC code (this looks
reasonable to me; at least I cannot imagine why this code is still needed),
leaving acpi_ev_enable_gpe as is:

--- linux-2.6.9/drivers/acpi/ec.c~      2004-12-07 22:03:05.939845248 +0300
+++ linux-2.6.9/drivers/acpi/ec.c       2004-12-07 22:09:48.164697776 +0300
@@ -378,7 +378,11 @@
        acpi_evaluate_object(ec->handle, object_name, NULL, NULL);

 end:
+#if 0
        acpi_enable_gpe(NULL, ec->gpe_bit, ACPI_NOT_ISR);
+#else
+       return;
+#endif
 }

 static u32
@@ -391,7 +393,9 @@
        if (!ec)
                return ACPI_INTERRUPT_NOT_HANDLED;

+#if 0
        acpi_disable_gpe(NULL, ec->gpe_bit, ACPI_ISR);
+#endif

        status = acpi_os_queue_for_execution(OSD_PRIORITY_GPE,
                acpi_ec_gpe_query, ec);

And yes, this patch also solves missing SCI problem. It started working for me.

Please review this...
Comment 25 Luming Yu 2004-12-08 02:07:18 UTC
Thanks for this info. It makes the problem clear.
Comment 26 Luming Yu 2004-12-08 02:16:13 UTC
Created attachment 4246 [details]
interrupt_based_ec-1
(This patch is written by Dmitry Torokhov. (against 2.6.10-rc2))

This patch is written by Dmitry Torokhov.
Comment 27 Luming Yu 2004-12-08 02:22:56 UTC
Created attachment 4247 [details]
interrupt_based_ec-2
(This patch is written by Luming Yu )

Please apply this patch to linux-2.6.10-rc2 (patched with 4246: This patch is
written by Dmitry Torokhov. (against 2.6.10-rc2))
Comment 28 Luming Yu 2004-12-08 02:27:22 UTC
I just attached two patches for pure interrupt-based EC access.
Would you give them a try? I'm not sure they can solve the problem, but this patch set
uses EC burst mode. I'm sure it will change something. :-)

Notes:
First, you need to apply the patch from comment #26, then the patch from comment #27.
Comment 29 Oleg I. Vdovikin 2004-12-08 02:35:13 UTC
I'm a bit confused with these patches, looks like they're just adding other 
things, but GPE is still disabled/enabled causing missing SCIs... Anyway, I 
will give them a try, if they will apply to 2.6.9 tree.
Comment 30 Oleg I. Vdovikin 2004-12-08 09:47:32 UTC
Build fails with

drivers/acpi/ec.c: In function `acpi_ec_gpe_query':
drivers/acpi/ec.c:515: error: label at end of compound statement
make[2]: *** [drivers/acpi/ec.o] Error 1
make[1]: *** [drivers/acpi] Error 2
make: *** [drivers] Error 2
make: *** Waiting for unfinished jobs....

you need to put return...
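
For reference, a minimal sketch of the construct gcc 3.4 complains about and one way around it. query_sketch() is a hypothetical function written for illustration, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function shaped like a handler that jumps to a trailing label. */
static void query_sketch(int have_event, int *ran)
{
    assert(ran != NULL);
    *ran = 0;

    if (!have_event)
        goto end;

    *ran = 1;   /* stands in for evaluating the query method */

end:
    ;           /* a label must be followed by a statement; an empty one will do */
}
```

Writing "end:" directly before the closing brace triggers the "label at end of compound statement" error shown above; adding the empty statement (or a plain "return;") after the label avoids it.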
Comment 31 Oleg I. Vdovikin 2004-12-08 10:37:42 UTC
Well, changed patch to fix build error and looks like it fixed my problems.
Battery status is now updated. You may probably need to review bug 3853 as well.
Comment 32 Oleg I. Vdovikin 2004-12-08 23:48:40 UTC
Another problem: with these two patches my Omnibook does not poweroff. The last 
messages are:

Shutdown: hda
Power down.
acpi_power_off called

and it stays turned on.
Comment 33 Luming Yu 2004-12-09 18:38:56 UTC
I just downloaded the 2.6.9 base kernel, and the patch set applied cleanly.
I also built the patched kernel successfully. I didn't see the error you have.

And poweroff works fine on my T42 and ASUS 2400NE laptops.
Comment 34 Oleg I. Vdovikin 2004-12-09 23:35:43 UTC
Well, which compiler are you using? I have gcc 3.4.2 from Fedora Core 3.

As for power_off - it happens consistently for me. It looks like you need an HP/Compaq
laptop for testing; it's quite different from IBM laptops.
Comment 35 Oleg I. Vdovikin 2004-12-09 23:46:24 UTC
Regarding the compile error, I just googled - take a look at this:
http://gcc.gnu.org/ml/gcc-patches/2004-01/msg01212.html

Just a thought regarding power_off - you may probably need to use the same 
kernel config as I have.
Comment 36 Oleg I. Vdovikin 2004-12-09 23:48:54 UTC
Created attachment 4251 [details]
FC3 2.6.9 kernel config

Kernel config from FC3
Comment 37 Oleg I. Vdovikin 2004-12-10 12:46:43 UTC
Played with the patches... With only the first patch left in, power_off does not work and
things become terrible: the battery is no longer detected, power events are not
handled, and probably the thermal zone does not work either. So, it looks like Dmitry should
review his changes...
Comment 38 Luming Yu 2004-12-13 02:53:49 UTC
Could you narrow the problem down (like bug 3842)?
I need to know exactly where the hang is.
Comment 39 Oleg I. Vdovikin 2004-12-13 10:50:21 UTC
Well, 3842 is a compilation bug, not a power off bug. Which one do you mean?
Comment 40 Luming Yu 2004-12-13 17:34:57 UTC
Oops, it should be bug 3642 
Comment 41 Len Brown 2004-12-16 19:36:29 UTC
marking as RESOLVED b/c proposed patches in
comment #26 and comment #27 need testing
Comment 42 Oleg I. Vdovikin 2004-12-17 01:57:37 UTC
I wonder why this bug is marked as resolved. One of the patches has
compile errors, another one introduces another bug with power off...
I have not had time yet to debug this code...
Comment 43 Luming Yu 2004-12-17 02:30:13 UTC
  Just b/c the patches need wide testing. And Len only reviews patches marked
as resolved.   As for the compile issue, I will update it, or you can do it.
  As for the power-off issue, we need to root-cause the problem, that is, where
the hang is, before drawing any conclusions. Maybe you can open a new tracker for this.
  Anyway, thanks for your testing and debugging.
Comment 44 Oleg I. Vdovikin 2004-12-19 06:50:17 UTC
Well, it hangs trying to

	status = acpi_evaluate_object (NULL, METHOD_NAME__PTS, &arg_list, NULL);

in the hwsleep.c

    Method (_PTS, 1, NotSerialized)
    {
        Or (\_SB.PCI0.ISA0.MFLG, 0x10, \_SB.PCI0.ISA0.MFLG)
        If (LNot (LLess (Arg0, 0x04)))
        {
            If (ECOK ())
            {
                Store (0x00, \_SB.PCI0.ISA0.EC0.EQBF)
            }
        }

        If (LEqual (Arg0, 0x03))
        {
            If (LEqual (\_SB.PCI0.ISA0.DKTP, 0x03))
            {
                Or (Not (\_SB.PCI0.HUB.FDS.CEVN), 0x40, \_SB.PCI0.HUB.FDS.CEVN)
                If (ECOK ())
                {
                    If (\_SB.PCI0.ISA0.EC0.EQBF)
                    {
                        Store (0x01, \_SB.QBST)
                        Store (0x00, \_SB.PCI0.ISA0.EC0.EQBF)
                        Sleep (0x32)
                    }
                }
            }
        }

        If (LEqual (Arg0, 0x04))
        {
            If (LEqual (\_SB.PCI0.ISA0.SPR0.EJX, 0x04))
            {
                Store (0x00, \_SB.PCI0.ISA0.SPR0.EJX)
            }
            Else
            {
                \_SB.PCI0.ISA0.HPSS (0x0C, 0x00)
            }

            Store (\_SB.PCI0.ISA0.DKTP, \_SB.DCTM)
        }

        If (ECOK ())
        {
            Store (Arg0, \_SB.PCI0.ISA0.EC0.PTSV)
        }

        If (LEqual (Arg0, 0x05))
        {
            Store (0x01, \_SB.PCI0.ISA0.Z000)
        }
    }
Comment 45 Oleg I. Vdovikin 2004-12-19 08:30:33 UTC
Neither of the suggested patches from bug 3642 helps me. Still hangs at _PTS...
Comment 46 Oleg I. Vdovikin 2004-12-19 13:44:20 UTC
Luming Yu, I also noticed that with your patch the battery status is updated for some
time, but then stops as before, and the EC has events queued for processing, so it looks
like your patch just tries to work around the bug (and takes more time to break
things), not to fix it.
Do I need to reopen the bug?

Below is a snippet from my test program accessing the EC directly; it looks like the EC has
a queue of 8 queries, and it's filled with _Q20.
---
[root@omnibook ~]# ./ectest
status=28, query=20
[root@omnibook ~]# ./ectest
status=28, query=20
[root@omnibook ~]# ./ectest
status=28, query=20
[root@omnibook ~]# ./ectest
status=28, query=20
[root@omnibook ~]# ./ectest
status=28, query=20
[root@omnibook ~]# ./ectest
status=28, query=20
[root@omnibook ~]# ./ectest
status=28, query=20
[root@omnibook ~]# ./ectest
status=08, query=20
[root@omnibook ~]# ./ectest
status=08, query=00
[root@omnibook ~]#
---
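
As a reading aid for the status values above, here is a small decoding sketch. It assumes ectest prints the EC status byte in hex and uses the status bit layout from the ACPI specification; ec_sci_pending() and ec_format_status() are hypothetical helpers and not part of ectest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* EC status register bits as laid out in the ACPI specification. */
#define EC_STATUS_OBF     0x01  /* output buffer full */
#define EC_STATUS_IBF     0x02  /* input buffer full */
#define EC_STATUS_CMD     0x08  /* last write was a command, not data */
#define EC_STATUS_BURST   0x10  /* burst mode active */
#define EC_STATUS_SCI_EVT 0x20  /* an SCI event (query) is pending */
#define EC_STATUS_SMI_EVT 0x40  /* an SMI event is pending */

/* Non-zero if the EC still has a query waiting to be fetched. */
static int ec_sci_pending(unsigned char status)
{
    return (status & EC_STATUS_SCI_EVT) != 0;
}

/*
 * Write the names of the set flags into buf, each followed by a space.
 * Returns the value of snprintf(), i.e. the length of the full string.
 */
static int ec_format_status(unsigned char status, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s%s%s%s%s%s",
                    (status & EC_STATUS_OBF)     ? "OBF "     : "",
                    (status & EC_STATUS_IBF)     ? "IBF "     : "",
                    (status & EC_STATUS_CMD)     ? "CMD "     : "",
                    (status & EC_STATUS_BURST)   ? "BURST "   : "",
                    (status & EC_STATUS_SCI_EVT) ? "SCI_EVT " : "",
                    (status & EC_STATUS_SMI_EVT) ? "SMI_EVT " : "");
}
```

Under these assumptions, status=28 decodes to CMD plus SCI_EVT (a query is still pending), and status=08 decodes to CMD only, which is consistent with the queue running empty at the end of the run.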
Comment 47 Luming Yu 2004-12-20 18:03:24 UTC
To comment 46,
  The execution of _Qxx is scheduled on a deferred workqueue. So you
cannot expect the _Qxx method to be executed immediately after the EC query request
is triggered.
  To judge whether this is a real bug, I need to know for how long you cannot see
the battery status, with and without my patch set. (During testing, please don't
use a program that directly accesses the EC hardware.)

  
Comment 48 Oleg I. Vdovikin 2004-12-21 00:51:15 UTC
I can't understand what exactly you want. Would you like to know how long it
takes for the battery status to stop updating? If so, then it's like the following:
2.6.8 always works fine; 2.6.9 never updates the battery status unless some other
events occur (such as the fan turning on/off, but that also stops after a
while); with your patches it can work for some time, but then it stops working.
Yesterday, I charged the battery fully and the status updated from 10% to 100%
with no problem. I will try to discharge it today and let you know if it works.

Yes, I never normally use programs that directly access the EC. I wrote ectest specifically
to check things and to try to diagnose the problem: why the battery status was
not updated within 2 hours (with your patches), showing the same charge level
all the time.
Comment 49 Stephen Mollett 2004-12-28 14:59:06 UTC
As an additional bit of information, the two interrupt_based_ec patches fix the  
same problem (battery status "stuck") on an IBM ThinkPad 240 running 2.6.10. 
(One minor tweak was necessary to handle an extra #include line but that was 
all.) 
 
Whether the root cause is the same, I don't know; if you want me to post kernel  
messages, etc., let me know. (Follows on from attempts to solve problems  
described in bug 2443.)  
Comment 50 Ville Syrjala 2005-01-26 10:30:11 UTC
I have an OmniBook 6000 which suffers from this as well. It affects ac, battery,
display switching and display brightness. The last kernel that worked correctly
was 2.6.7-mm4.

I can see the effect perfectly with the display brightness. If I first increase
the brightness a bit and then try to decrease it it still increases a bit more
before actually decreasing. It looks like 8 decrease button presses get stuck
somewhere. I can also usually make the ac, battery and display switch events
happen immediately by playing with the display brightness. But even that stops
working sometimes.

I just tried the interrupt_based_ec patches on 2.6.10-mm2 and while they seem to
fix the ac and battery status they cause other problems.
- Display brightness reacts without the 8 event delay but now there is a big
time delay between events. It takes ~3 seconds to go from dimmest to brightest
while without this patch it took < 1 second. It is uncomfortably slow now.
- Display switch button works exactly two times. First press turns off the LCD,
second brings it back. But after the second press everything stops. No more ac
or battery status updates, display switching or brightness control. From
/proc/interrupts I can see that no more acpi interrupts are generated.
- System no longer powers off.
Comment 51 Len Brown 2005-02-04 06:21:35 UTC
Created attachment 4514 [details]
patch from comments #26 + #27 applied to linux-2.6.10
Comment 52 Luming Yu 2005-02-04 22:03:05 UTC
Created attachment 4516 [details]
patch from comments #26 + #27 applied to linux-2.6.11-rc2
Comment 53 Ville Syrjala 2005-03-06 06:36:42 UTC
I tested 2.6.11-rc5-mm1 with the latest patch on my OB6000 and it seems to work
very well. AC and battery status work and display switching no longer causes
problems. Great job everyone. The only remaining problem is the slowness of the
brightness controls. I can live with it but it would be nice to get it back to
2.6.7-mm4 / Windows speed ie. < 1 second vs. the current > 3 seconds to go from
one end to the other.
Comment 54 Thomas Renninger 2005-03-11 03:27:03 UTC
Ville: Have you compiled with ACPI_DEBUG? Try without it; it should be about ten times faster...
Comment 55 Ville Syrjala 2005-03-11 06:32:17 UTC
No, I don't use ACPI_DEBUG.
Comment 56 Luming Yu 2005-03-14 18:44:06 UTC
*** Bug 4124 has been marked as a duplicate of this bug. ***
Comment 57 Luming Yu 2005-03-14 19:01:04 UTC
*** Bug 4015 has been marked as a duplicate of this bug. ***
Comment 58 Luming Yu 2005-03-14 19:03:00 UTC
*** Bug 3853 has been marked as a duplicate of this bug. ***
Comment 59 Luming Yu 2005-03-14 19:13:09 UTC
*** Bug 2845 has been marked as a duplicate of this bug. ***
Comment 60 Romano Giannetti 2005-03-18 03:18:48 UTC
I can confirm that this fixes my bug #4124, so please push it to Linus.
I have applied it to 2.6.11 vanilla, and I had to change line 519 of
ec.c from end: to end:; (note the semicolon), otherwise gcc 3.4.1 will barf on it.
Thank you very much.
Comment 61 Len Brown 2005-03-18 22:10:57 UTC
applied to acpi-test tree 
Comment 62 Romano Giannetti 2005-03-20 07:01:10 UTC
Sorry to answer to myself in this thread, but I noticed a quite strange
effect of the above patch. It's better than before, but something fishy is
still going on.

If I query acpi state:

[root@rukbat tmp]# acpi -V
     Battery 1: charging, 96%, 04:38:55 until charged
     Thermal 1: ok, 51.0 degrees C
  AC Adapter 1: on-line

all is ok, although quite a bit slower than before: now the command takes 3 seconds
of elapsed time, while before it was around 1.5 seconds.

[root@rukbat tmp]# time acpi -V
     Battery 1: charging, 96%, 04:32:25 until charged
     Thermal 1: ok, 52.0 degrees C
  AC Adapter 1: on-line
0.00user 0.51system 0:02.66elapsed 19%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+163minor)pagefaults 0swaps

That's not a problem. The real problem is, if I do the same thing on another
window, while acpi -V is waiting, I have:

(0)rukbat:~% acpi -V
     Battery 1: charged, 0%
     Thermal 1: ok, 52.0 degrees C
  AC Adapter 1: on-line

with completely bogus values. I noticed it because sometimes, when I run acpi -V,
gkrellm triggers a low battery alarm.

I have the feeling that something is off. Could someone with more
knowledge than me please check the "8 event queue" comment in this bug,
http://bugme.osdl.org/show_bug.cgi?id=3851#c46 (maybe there is something there).

Thanks,
                Romano
Comment 63 Romano Giannetti 2005-03-29 00:06:07 UTC
Hi!

This is to report an issue with 2.6.11 and ACPI battery/ac. The summary is:
ACPI battery with a preemptive kernel does not work, while the same kernel with
no preempt works ok.

The details: 

The working kernel is 2.6.11 with the patch from the acpi-devel list to fix
acpi keys (not working otherwise). See for a description 
http://bugme.osdl.org/show_bug.cgi?id=4124

The data on the non-preemptive kernel (boot messages, config, etc) is here:
http://www.dea.icai.upco.es/romano/linux/br/config-nop/laptop-config.html
This works, apart for the little glitch that doing two "acpi -V" at the same
time, one of them give "0" for battery charge, while the other works ok. 
Suspend/resume works perfectly with this script:
http://www.dea.icai.upco.es/romano/linux/br/config-nop/rg_suspend_script_nowait

The complete configuration for the preemptive kernel is here: 
http://www.dea.icai.upco.es/romano/linux/br/config-p2/laptop-config.html

The problem with the preemptive kernel is that after a while the ac/battery
ACPI readings stop working. The errors occur at seemingly random times.
The first time it happened at boot, but I do not have a log of it. The second time
it happened at resume time, and I have captured the "oops" (really, scheduling
while atomic) from this (in the following lines). After that, battery/ac
status does not work anymore. If you look at the following dmesg, the battery
module loads but does not detect any battery in the slots.

The same information as above, after resume, is here: 
http://www.dea.icai.upco.es/romano/linux/br/after-res2/laptop-config.html

After that, I activated full ACPI debug (echo 0xFFFFFFFF >
/proc/acpi/debug_level) and tried to load the battery module again.  This
time the loading succeeded. The full syslog output (it's enormous) is
here:
http://www.dea.icai.upco.es/romano/linux/br/after-res2/syslog.txt.gz


A very similar thing happens with vanilla 2.6.11 with a preemptive kernel
(without the patch above). Notice, though, that the acpi key delay bug is
resolved by simply activating preempt, as Mr. Shaohua said in
http://bugme.osdl.org/show_bug.cgi?id=4124#c8

Here is the "oops" log at resume (log starts at suspend time)

usbcore: deregistering driver usbhid
uhci_hcd 0000:00:07.2: remove, state 1
usb usb1: USB disconnect, address 1
usb 1-2: USB disconnect, address 2
uhci_hcd 0000:00:07.2: USB bus 1 deregistered
uhci_hcd 0000:00:07.3: remove, state 1
usb usb2: USB disconnect, address 1
uhci_hcd 0000:00:07.3: USB bus 2 deregistered
usbcore: deregistering driver usbmouse
Stopping tasks: ================================= [...]
Freeing memory...  -\|/-\|/-\ [...]
PM: Attempting to suspend to disk.
PM: snapshotting memory.
swsusp: critical section: 
[nosave pfn 0x484]<7>[nosave pfn 0x485]swsusp: Need to copy 8934 pages
suspend: (pages needed: 8934 + 512 free: 56600)
[nosave pfn 0x484]<7>[nosave pfn 0x485]<7>PM: Image restored successfully.
scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c0121035>] __mod_timer+0x1c5/0x1f0
 [<c0396acd>] schedule_timeout+0x5d/0xb0
 [<c0121ac0>] process_timeout+0x0/0x10
 [<c0121eaf>] msleep+0x2f/0x40
 [<c024b080>] pci_set_power_state+0x190/0x1d0
 [<c024b1c8>] pci_enable_device_bars+0x18/0x40
 [<c024b20f>] pci_enable_device+0x1f/0x40
 [<d0ccf64c>] snd_via82xx_resume+0x1c/0x170 [snd_via82xx]
 [<c024b199>] pci_restore_state+0x39/0x50
 [<d0cacc79>] snd_card_pci_resume+0x49/0x76 [snd]
 [<c024d36c>] pci_device_resume+0x2c/0x40
 [<c02c79a8>] dpm_resume+0xa8/0xb0
 [<c02c79c1>] device_resume+0x11/0x20
 [<c0135268>] finish+0x8/0x40
 [<c01353c5>] pm_suspend_disk+0x75/0xc0
 [<c0133786>] enter_state+0x86/0x90
 [<c013379f>] software_suspend+0xf/0x20
 [<c0289d9a>] acpi_system_write_sleep+0x6a/0x84
 [<c015835c>] vfs_write+0x14c/0x160
 [<c0158441>] sys_write+0x51/0x80
 [<c01032b9>] sysenter_past_esp+0x52/0x75
ACPI: PCI interrupt 0000:00:07.5[C] -> GSI 5 (level, low) -> IRQ 5
scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c0120fb2>] __mod_timer+0x142/0x1f0
 [<c0396acd>] schedule_timeout+0x5d/0xb0
 [<c0121ac0>] process_timeout+0x0/0x10
 [<c0121eaf>] msleep+0x2f/0x40
 [<d0c338b6>] socket_shutdown+0x26/0x40 [pcmcia_core]
 [<d0c33dbf>] socket_resume+0xbf/0x130 [pcmcia_core]
 [<d0c6ec0e>] yenta_dev_resume+0xae/0xb0 [yenta_socket]
 [<d0c3310c>] pcmcia_socket_dev_resume+0x7c/0x90 [pcmcia_core]
 [<c024d36c>] pci_device_resume+0x2c/0x40
 [<c02c79a8>] dpm_resume+0xa8/0xb0
 [<c02c79c1>] device_resume+0x11/0x20
 [<c0135268>] finish+0x8/0x40
 [<c01353c5>] pm_suspend_disk+0x75/0xc0
 [<c0133786>] enter_state+0x86/0x90
 [<c013379f>] software_suspend+0xf/0x20
 [<c0289d9a>] acpi_system_write_sleep+0x6a/0x84
 [<c015835c>] vfs_write+0x14c/0x160
 [<c0158441>] sys_write+0x51/0x80
 [<c01032b9>] sysenter_past_esp+0x52/0x75
scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c0120fb2>] __mod_timer+0x142/0x1f0
 [<c0396acd>] schedule_timeout+0x5d/0xb0
 [<c0249088>] pci_bus_write_config_word+0x68/0xa0
 [<c0121ac0>] process_timeout+0x0/0x10
 [<c0121eaf>] msleep+0x2f/0x40
 [<d0c339ed>] socket_setup+0x3d/0x160 [pcmcia_core]
 [<d0c33d5f>] socket_resume+0x5f/0x130 [pcmcia_core]
 [<d0c6ec0e>] yenta_dev_resume+0xae/0xb0 [yenta_socket]
 [<d0c3310c>] pcmcia_socket_dev_resume+0x7c/0x90 [pcmcia_core]
 [<c024d36c>] pci_device_resume+0x2c/0x40
 [<c02c79a8>] dpm_resume+0xa8/0xb0
 [<c02c79c1>] device_resume+0x11/0x20
 [<c0135268>] finish+0x8/0x40
 [<c01353c5>] pm_suspend_disk+0x75/0xc0
 [<c0133786>] enter_state+0x86/0x90
 [<c013379f>] software_suspend+0xf/0x20
 [<c0289d9a>] acpi_system_write_sleep+0x6a/0x84
 [<c015835c>] vfs_write+0x14c/0x160
 [<c0158441>] sys_write+0x51/0x80
 [<c01032b9>] sysenter_past_esp+0x52/0x75
scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c0323b2e>] pci_write+0x3e/0x50
 [<c0120fb2>] __mod_timer+0x142/0x1f0
 [<c0396acd>] schedule_timeout+0x5d/0xb0
 [<c0121ac0>] process_timeout+0x0/0x10
 [<c0121eaf>] msleep+0x2f/0x40
 [<d0c33a83>] socket_setup+0xd3/0x160 [pcmcia_core]
 [<d0c33d5f>] socket_resume+0x5f/0x130 [pcmcia_core]
 [<d0c6ec0e>] yenta_dev_resume+0xae/0xb0 [yenta_socket]
 [<d0c3310c>] pcmcia_socket_dev_resume+0x7c/0x90 [pcmcia_core]
 [<c024d36c>] pci_device_resume+0x2c/0x40
 [<c02c79a8>] dpm_resume+0xa8/0xb0
 [<c02c79c1>] device_resume+0x11/0x20
 [<c0135268>] finish+0x8/0x40
 [<c01353c5>] pm_suspend_disk+0x75/0xc0
 [<c0133786>] enter_state+0x86/0x90
 [<c013379f>] software_suspend+0xf/0x20
 [<c0289d9a>] acpi_system_write_sleep+0x6a/0x84
 [<c015835c>] vfs_write+0x14c/0x160
 [<c0158441>] sys_write+0x51/0x80
 [<c01032b9>] sysenter_past_esp+0x52/0x75
scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c0323b2e>] pci_write+0x3e/0x50
 [<c0120fb2>] __mod_timer+0x142/0x1f0
 [<c0396acd>] schedule_timeout+0x5d/0xb0
 [<c0121ac0>] process_timeout+0x0/0x10
 [<c0121eaf>] msleep+0x2f/0x40
 [<d0c33929>] socket_reset+0x59/0xe0 [pcmcia_core]
 [<d0c33aa2>] socket_setup+0xf2/0x160 [pcmcia_core]
 [<d0c33d5f>] socket_resume+0x5f/0x130 [pcmcia_core]
 [<d0c6ec0e>] yenta_dev_resume+0xae/0xb0 [yenta_socket]
 [<d0c3310c>] pcmcia_socket_dev_resume+0x7c/0x90 [pcmcia_core]
 [<c024d36c>] pci_device_resume+0x2c/0x40
 [<c02c79a8>] dpm_resume+0xa8/0xb0
 [<c02c79c1>] device_resume+0x11/0x20
 [<c0135268>] finish+0x8/0x40
 [<c01353c5>] pm_suspend_disk+0x75/0xc0
 [<c0133786>] enter_state+0x86/0x90
 [<c013379f>] software_suspend+0xf/0x20
 [<c0289d9a>] acpi_system_write_sleep+0x6a/0x84
 [<c015835c>] vfs_write+0x14c/0x160
 [<c0158441>] sys_write+0x51/0x80
 [<c01032b9>] sysenter_past_esp+0x52/0x75
ACPI: PCI interrupt 0000:00:0e.0[A] -> GSI 9 (level, low) -> IRQ 9
scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c02f86a4>] do_rw_taskfile+0x254/0x2b0
 [<c02f8890>] task_no_data_intr+0x0/0xb0
 [<c0396206>] wait_for_completion+0x86/0xf0
 [<c0114460>] default_wake_function+0x0/0x20
 [<c0114460>] default_wake_function+0x0/0x20
 [<c02c8339>] __elv_add_request+0x99/0xe0
 [<c02f38cf>] ide_do_drive_cmd+0x11f/0x170
 [<c02f0623>] generic_ide_resume+0x93/0xc0
 [<c02f8890>] task_no_data_intr+0x0/0xb0
 [<c023cbc7>] kobject_get+0x17/0x20
 [<c02c79a8>] dpm_resume+0xa8/0xb0
 [<c02c79c1>] device_resume+0x11/0x20
 [<c0135268>] finish+0x8/0x40
 [<c01353c5>] pm_suspend_disk+0x75/0xc0
 [<c0133786>] enter_state+0x86/0x90
 [<c013379f>] software_suspend+0xf/0x20
 [<c0289d9a>] acpi_system_write_sleep+0x6a/0x84
 [<c015835c>] vfs_write+0x14c/0x160
 [<c0158441>] sys_write+0x51/0x80
 [<c01032b9>] sysenter_past_esp+0x52/0x75
Restarting tasks...<3>scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c011399e>] wake_up_process+0x1e/0x20
 [<c0133c78>] thaw_processes+0xe8/0x100
 [<c0135276>] finish+0x16/0x40
 [<c01353c5>] pm_suspend_disk+0x75/0xc0
 [<c0133786>] enter_state+0x86/0x90
 [<c013379f>] software_suspend+0xf/0x20
 [<c0289d9a>] acpi_system_write_sleep+0x6a/0x84
 [<c015835c>] vfs_write+0x14c/0x160
 [<c0158441>] sys_write+0x51/0x80
 [<c01032b9>] sysenter_past_esp+0x52/0x75
 done
scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c0158441>] sys_write+0x51/0x80
 [<c0103336>] work_resched+0x5/0x16
scheduling while atomic: really_suspend/0x00000001/4624
 [<c0396007>] schedule+0x467/0x520
 [<c011506a>] sys_sched_yield+0x5a/0x80
 [<c0164f02>] coredump_wait+0x32/0xa0
 [<c016505c>] do_coredump+0xec/0x22a
 [<c0113854>] activate_task+0x64/0x80
 [<c0113524>] task_rq_lock+0x14/0x20
 [<c0121fac>] free_uid+0x1c/0x80
 [<c01228d5>] __dequeue_signal+0x105/0x180
 [<c0122985>] dequeue_signal+0x35/0xd0
 [<c012491f>] get_signal_to_deliver+0x2df/0x350
 [<c01030dd>] do_signal+0x9d/0x170
 [<c012a4e8>] __kernel_text_address+0x28/0x40
 [<c0103755>] show_trace+0x35/0x90
 [<c0103755>] show_trace+0x35/0x90
 [<c0103336>] work_resched+0x5/0x16
 [<c0113c90>] finish_task_switch+0x30/0x90
 [<c0395e64>] schedule+0x2c4/0x520
 [<c01123f0>] do_page_fault+0x0/0x5de
 [<c01031e7>] do_notify_resume+0x37/0x3c
 [<c010335a>] work_notifysig+0x13/0x15
note: really_suspend[4624] exited with preempt_count 1
ALPS Touchpad (Glidepoint) detected
  Disabling hardware tapping
input: AlpsPS/2 ALPS TouchPad on isa0060/serio1
ACPI: Battery Slot [BAT1] (battery absent)
ACPI: Battery Slot [BAT2] (battery absent)
ACPI: AC Adapter [ACAD] (off-line)
USB Universal Host Controller Interface driver v2.2
ACPI: PCI interrupt 0000:00:07.2[D] -> GSI 9 (level, low) -> IRQ 9
uhci_hcd 0000:00:07.2: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
uhci_hcd 0000:00:07.2: irq 9, io base 0x1c00
uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:07.3[D] -> GSI 9 (level, low) -> IRQ 9
uhci_hcd 0000:00:07.3: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#2)
uhci_hcd 0000:00:07.3: irq 9, io base 0x1c20
uhci_hcd 0000:00:07.3: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
usb 1-2: new low speed USB device using uhci_hcd and address 2
input: USB HID v1.10 Mouse [USB Mouse] on usb-0000:00:07.2-2
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
usbcore: registered new driver usbmouse
drivers/usb/input/usbmouse.c: v1.6:USB HID Boot Protocol mouse driver
Comment 64 Luming Yu 2005-04-04 02:30:28 UTC
Created attachment 4846 [details]
testing patch for solving gpe_disabled issue with burst-mode ec

testing results on toshiba satellite M20:
/proc/acpi/battery/BAT0#time cat state
present:		 yes
capacity state: 	 ok
charging state: 	 charging
present rate:		 1500 mA
remaining capacity:	 4064 mAh
present voltage:	 15000 mV

real	0m0.023s
user	0m0.000s
sys	0m0.020s
Comment 65 Romano Giannetti 2005-04-04 04:01:21 UTC
I applied the patch to vanilla 2.6.12-rc1 with preempt enabled; for me it
applied with an offset and one trivial reject.
This patch has the following effects on my PC:

- it reverts the benefit of the "ec burst" patch (I mean, the suspend key is
delayed again)
- it does not cure the error: run acpi -V & acpi -V & : the second one returns
immediately with 0%, the first one returns after a couple of seconds with the
correct value;
- it has the same "scheduling while atomic" on resuming.

I have to ask a question: it all used to work in 2.6.9. Between it and
2.6.11-rc1 something nasty happened. 
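
The second symptom above (two overlapping readers, one of which gets a bogus value) could be checked with something like the following. This is only a minimal sketch: the function names read_concurrently and all_identical are made up for illustration, it uses threads inside one process rather than two separate acpi -V processes, and a mismatch can also simply mean the battery state changed between reads.

```python
from concurrent.futures import ThreadPoolExecutor


def read_concurrently(path, readers=2):
    """Read `path` from several threads at once and return each result."""
    def read_once(_):
        with open(path) as f:
            return f.read()

    # One worker per reader so the reads can overlap.
    with ThreadPoolExecutor(max_workers=readers) as pool:
        return list(pool.map(read_once, range(readers)))


def all_identical(results):
    """True if every reader saw the same text."""
    return len(set(results)) <= 1
```

On an affected machine one might call read_concurrently("/proc/acpi/battery/BAT1/state") and compare the returned strings by eye; the path is taken from the original report and may differ per machine.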

Comment 66 Luming Yu 2005-04-04 19:33:57 UTC
To comment#65, 
The testing patch is to solve a gpe-disabled-issue with burst-mode EC patch.
I'm very interested in how this small patch can delay suspend key on your box.

Could you revert this testing patch hunk by hunk to see which part could be
the root cause? Please note that the hunks at lines @431 and @469 should be
treated as one hunk.

Thanks,
Luming
Comment 67 Romano Giannetti 2005-04-05 07:51:06 UTC
Created attachment 4855 [details]
Full syslog capture from boot to failing suspend/resume. Ec patch applied, gpe- patch NOT| applied
Comment 68 Romano Giannetti 2005-04-05 07:54:44 UTC
I have to apologise. I reverted the patch and discovered that it was not the
problem; the last patch has no effect in my setup. I captured a full log of
suspend/resume (suspend starts around line 430 of the log) in case it is
useful. After resume a lot of things malfunction: the pcmcia modem says it's
busy, and ACPI battery and ac don't work (but they resume working after being
rmmod-ed and modprobe-d a couple of times...).
In 2.6.11 with the ec burst patch all is OK. Same config.
Comment 69 Len Brown 2005-04-22 20:09:35 UTC
applied luming's incremental patch from comment #64 to acpi-test tree 
Comment 70 Len Brown 2005-07-27 19:07:15 UTC
shipped in 2.6.13-rc3 -- closing
see subsequent patch in bug # 4665
Comment 71 Luming Yu 2005-08-04 00:25:06 UTC
Created attachment 5503 [details]
bust mode ec debug-patch against (2.6.13-rc5)

Burst-mode EC: please test this patch against 2.6.13-rc5.
Comment 72 Luming Yu 2005-08-04 11:12:59 UTC
Created attachment 5506 [details]
update one against 2.6.13-rc5
Comment 73 Luming Yu 2005-08-10 01:40:49 UTC
Created attachment 5574 [details]
update burs-ec patch against 2.6.13-rc6

Apply it to linux-2.6.13-rc6 and boot the kernel with ec_burst=1.

Known issue: I tested it on my T42 and several other machines, and everything
works. However, on the T42, if I change some kernel config options, Fn+F7
never turns the LCD light back on after turning it off (I'm investigating
it). Apart from this, everything seems to work.
Comment 74 Len Brown 2005-08-10 22:26:07 UTC
The patch in comment #73 seems to address the previous
serious performance degradation when using ec_burst=1.
As a performance test, I have a script that runs
cat /proc/acpi/battery/BAT0/state 100 times.

The T20 now gets 67ms/call both with ec_burst=0
or with ec_burst=1.

The T30 now gets 57ms/call with ec_burst=0
and 58ms/call with ec_burst=1.

As this is clearly a step forward, I'll
apply it to-akpm to enable broader testing.
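
The script itself is not attached to this bug. A rough equivalent of the test described above (read the battery state file 100 times and report the mean time per call) might look like the sketch below. The function name time_reads is hypothetical, and the numbers it produces are not directly comparable with the ms/call figures quoted above, since it reads the file in-process instead of spawning cat.

```python
import time


def time_reads(path, repeats=100):
    """Return the mean wall-clock time per read of `path`, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        # Open and read the whole file each time, like one `cat` invocation.
        with open(path) as f:
            f.read()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats
```

For example, time_reads("/proc/acpi/battery/BAT0/state") could be run once with ec_burst=0 and once with ec_burst=1 to compare the two modes.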
Comment 75 Luming Yu 2005-08-10 23:57:59 UTC
Created attachment 5599 [details]
patch against linux-2.6.13-rc6  ( against Lindent-ed ec.c )
Comment 76 Len Brown 2005-08-11 15:19:52 UTC
Included the above in the 2.6.12 ACPI patch:
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.12/acpi-20050729-2.6.12.patch.gz

as well as the to-akpm tree for inclusion in the -mm patch.
Comment 77 Luming Yu 2005-09-14 02:04:21 UTC
*** Bug 4070 has been marked as a duplicate of this bug. ***
Comment 78 Gustavo Noronha Silva 2005-09-26 10:50:48 UTC
Created attachment 6162 [details]
patch against 2.6.13.2

notice I'm no kernel hacker, so I cannot check that the patch is actually
correct; I manually applied the changes from the .rej generated while applying
the latest patch against 2.6.13.2; it builds, at least

(I have not booted yet to confirm that the problem goes away for me)
Comment 79 Gustavo Noronha Silva 2005-09-26 11:06:03 UTC
I tried it with 2.6.13.2; the battery was as dumb as it has been since 2.6.9.
I then tried with ec_burst=0 and ec_burst=1: same problem with 0, no battery
detected with 1. I then applied the patch manually (resulting in the patch I
submitted) and tried again with ec_burst=0 and 1: with 1 acpi said the battery
was absent, just like with the non-patched kernel; with 0 the usual problem
persisted.

any hints on where to go from here? my problem seems to be different from the
ones which were reported/fixed
Comment 80 Luming Yu 2005-09-27 06:57:57 UTC
To comment #79: please try linux-2.6.14-rc1 with ec_burst=0 and ec_burst=1.
Also try the patch at bug 4588. If you still have a problem, please open a new
bug. Thanks.
Comment 81 Len Brown 2006-05-18 21:58:57 UTC
this shipped back in 2.6.14
closed.