Bug 199981

Summary: BISECTED 5a8361f7ecce: question mark on battery icon - Asus Zenbook UX303UB
Product: ACPI
Reporter: step-ali (sunmooon15)
Component: Power-Battery
Assignee: Zhang Rui (rui.zhang)
Status: CLOSED CODE_FIX
Severity: blocking
CC: archlinux, charles.stanhope, dcpurton, irineosv, jwrdegoede, lenb, paulo.ulusu, rui.zhang, yu.c.chen
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 4.17
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: lsmod 4.16
lsmod 4.17
dmesg
acpidump
grep
run EC _REG explicitly

Description step-ali 2018-06-08 07:22:46 UTC
Created attachment 276383 [details]
lsmod 4.16

With 4.16 everything was working great; with 4.17 I get no keyboard, touchpad, or battery status.

My hardware is an Asus Zenbook UX303UB and the distro is Arch Linux, with the core repo kernel and linux-git.

I get this error in dmesg:

[ 52.633484] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
[ 52.633488] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000
[ 52.633492] pcieport 0000:00:1c.5: [ 0] Receiver Error (First)

lsmod output for 4.16 and 4.17 is attached.

Help please.
Comment 1 step-ali 2018-06-08 07:23:11 UTC
Created attachment 276385 [details]
lsmod 4.17
Comment 2 step-ali 2018-06-09 16:49:47 UTC
any help ??
Comment 3 Chen Yu 2018-06-11 02:10:23 UTC
A straightforward way is to do a git bisect to identify the bad commit, so that we can help figure out what the problem is.
https://git-scm.com/docs/git-bisect
Comment 4 step-ali 2018-06-11 17:40:01 UTC
When I run:

git bisect good 4.16.14

I get:

fatal: Needed a single revision
Bad rev input: 4.16.14

What should I do?
Comment 5 Zhang Rui 2018-06-12 01:19:37 UTC
First, I think you need to bisect the upstream kernel, which tags releases as v4.16, v4.17, and so on.
Something like:
git bisect start
git bisect good v4.16
git bisect bad v4.17
Comment 6 step-ali 2018-06-12 01:30:20 UTC
Yes, I got it working, but I found more than one bad commit. Is that okay?
Comment 7 Chen Yu 2018-06-12 01:33:52 UTC
(In reply to step-ali from comment #6)
> Yes, I got it working, but I found more than one bad commit. Is that okay?

There should only be one 'first bad commit' AFAIK. Please confirm that the bad bisect results are consistent, i.e. that the symptoms are the same each time: no keyboard, touchpad, or battery.
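
To illustrate why a bisect run converges on exactly one 'first bad commit', here is a minimal Python sketch of the binary search that git bisect performs. It assumes a linear history in which every commit after the regression is bad; the commit names, `first_bad_commit`, and the `boots_badly` callback are hypothetical stand-ins for building and booting each kernel, not part of git.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear history.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    once a commit is bad every later commit is bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad]


# Hypothetical history: c0 plays the role of v4.16, c9 the role of v4.17.
history = [f"c{i}" for i in range(10)]
regression_index = 6  # assumption made up for this demo
tested = []


def boots_badly(commit: str) -> bool:
    """Pretend to build and boot a kernel; record which commits were tried."""
    tested.append(commit)
    return history.index(commit) >= regression_index


result = first_bad_commit(history, boots_badly)
print(result, tested)
```

Several intermediate commits can test bad along the way, as in comment #6; only the boundary between the last good and the first bad commit is reported at the end.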
Comment 8 step-ali 2018-06-12 10:51:24 UTC
The first two commits were bad, the third one is good.

Any ideas?
Comment 9 step-ali 2018-06-12 10:58:14 UTC
Also, I had to build using gcc7; gcc8 was throwing errors.
Comment 10 step-ali 2018-06-12 19:01:46 UTC
5a8361f7ecceaed64b4064000d16cb703462be49 is the first bad commit
commit 5a8361f7ecceaed64b4064000d16cb703462be49
Author: Schmauss, Erik <erik.schmauss@intel.com>
Date:   Thu Feb 15 13:09:30 2018 -0800

    ACPICA: Integrate package handling with module-level code
    
    ACPICA commit 8faf6fca445eb7219963d80543fb802302a7a8c7
    
    This change completes the integration of the recent changes to
    package object handling with the module-level code support.
    
    For acpi_exec, the -ep flag is removed.
    
    This change allows table load to behave as if it were a method
    invocation. Before this, the definition block definition below would
    have loaded all named objects at the root scope. After loading, it
    would execute the if statements at the root scope.
    
    DefinitionBlock (...)
    {
      Name(OBJ1, 0)
    
      if (1)
      {
        Device (DEV1)
        {
          Name (_HID,0x0)
        }
      }
      Scope (DEV1)
      {
        Name (OBJ2)
      }
    }
    
    The above code would load OBJ1 to the namespace, defer the execution
    of the if statement and attempt to add OBJ2 within the scope of DEV1.
    Since DEV1 is not in scope, this would incur an AE_NOT_FOUND error.
    After this error is emitted, the if block is invoked and DEV1 and its
    _HID is added to the namespace.
    
    This commit changes the behavior to execute the if block in place
    rather than deferring it until all tables are loaded. The new
    behavior is as follows: insert OBJ1 in the namespace, invoke the if
    statement and add DEV1 and its _HID to the namespace, add OBJ2 to the
    scope of DEV1.
    
    Bug report links:
    Link: https://bugs.acpica.org/show_bug.cgi?id=963
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=153541
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=196165
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=192621
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=197207
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=198051
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=198515
    
    ACPICA repo:
    Link: https://github.com/acpica/acpica/commit/8faf6fca
    
    Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

:040000 040000 f219e9da26cbd8b2ef039583aad89d6e19a49d12 d1c678b85e4617b4cb2f10dbe27b73754e33831c M	drivers
:040000 040000 e19959d679e7bef5a4d6990ab3311735e0a2e6c1 660ed431fc1e61b9b764e676582e47a089429806 M	include
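
The behaviour change described in the commit message above can be modelled with a toy namespace loader. This is only a sketch under simplifying assumptions, not ACPICA code: the tuple encoding of the table, the `load` function, and the `defer_module_level_code` flag are all invented for illustration.

```python
# Toy model of the definition block from the commit message.
# ("name", path) creates an object, ("if", [ops]) is module-level code,
# ("scope", path, [ops]) opens an already existing scope.
TABLE = [
    ("name", "OBJ1"),
    ("if", [("name", "DEV1"), ("name", "DEV1._HID")]),
    ("scope", "DEV1", [("name", "DEV1.OBJ2")]),
]


def load(table, defer_module_level_code):
    """Load the toy table and return (namespace, errors)."""
    namespace, deferred, errors = [], [], []

    def run(ops):
        for op in ops:
            kind = op[0]
            if kind == "name":
                namespace.append(op[1])
            elif kind == "if":
                if defer_module_level_code:
                    deferred.append(op[1])  # old behaviour: run it later
                else:
                    run(op[1])              # new behaviour: run it in place
            elif kind == "scope":
                if op[1] not in namespace:
                    errors.append(f"AE_NOT_FOUND: {op[1]}")
                    continue
                run(op[2])

    run(table)
    for ops in deferred:
        run(ops)
    return namespace, errors


old_ns, old_errors = load(TABLE, defer_module_level_code=True)
new_ns, new_errors = load(TABLE, defer_module_level_code=False)
print(old_ns, old_errors)
print(new_ns, new_errors)
```

With deferral, the model reports the missing DEV1 scope and never creates OBJ2; with in-place execution, all four objects are created and no error is recorded.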
Comment 11 Chen Yu 2018-06-13 03:13:31 UTC
Thanks for testing. Could you confirm that this is the bad commit by:
1. checking the commit immediately preceding 5a8361f7ecceaed64b4064000d16cb703462be49,
    i.e. 7decc66df940fc0b128a642df9ac3d917f1b0c1f, which should be good
2. checking commit 5a8361f7ecceaed64b4064000d16cb703462be49 itself, which should be bad
Comment 12 step-ali 2018-06-13 16:20:14 UTC
yes it is
Comment 13 Erik Kaneda 2018-06-13 20:55:48 UTC
Could you get the full dmesg and acpidump of this machine?
Comment 14 step-ali 2018-06-13 21:28:51 UTC
Created attachment 276533 [details]
dmesg
Comment 15 step-ali 2018-06-13 21:29:22 UTC
Created attachment 276535 [details]
acpidump
Comment 16 step-ali 2018-06-13 23:43:23 UTC
When can I see the fix, please?
Comment 17 step-ali 2018-06-14 09:42:48 UTC
OK, the git kernel is fixed: keyboard and touchpad are working, but the battery icon has a question mark on it.
Comment 18 Erik Kaneda 2018-06-14 20:17:02 UTC
(In reply to step-ali from comment #17)
> OK, the git kernel is fixed: keyboard and touchpad are working, but the
> battery icon has a question mark on it.


Which kernel version are you talking about?
Comment 19 step-ali 2018-06-14 21:30:00 UTC
(In reply to Erik Schmauss from comment #18)
> (In reply to step-ali from comment #17)
> > OK, the git kernel is fixed: keyboard and touchpad are working, but the
> > battery icon has a question mark on it.
> 
> 
> Which kernel version are you talking about?

Linux aj 4.17.0-ga27fc14219f2
Comment 20 Erik Kaneda 2018-06-14 22:57:42 UTC
(In reply to step-ali from comment #19)
> (In reply to Erik Schmauss from comment #18)
> > (In reply to step-ali from comment #17)
> > > OK, the git kernel is fixed: keyboard and touchpad are working, but the
> > > battery icon has a question mark on it.

I don't know what the question mark indicates. Can you provide more information about what this means?
Comment 21 step-ali 2018-06-15 00:32:46 UTC
In the GNOME desktop, while charging, the battery icon used to have a lightning bolt on it, but now it has a question mark.
Comment 22 Erik Kaneda 2018-06-15 18:11:20 UTC
Please try to get more information about the question mark and what it means. Maybe you could try asking gnome developers.
Comment 23 Zhang Rui 2018-06-28 02:46:52 UTC
please attach the output of "grep . /sys/class/power_supply/*/*"
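
For anyone who prefers a script, a rough Python equivalent of that grep is sketched below. It assumes the usual sysfs layout /sys/class/power_supply/<device>/<attribute>; the function name `dump_power_supply` is made up for this example.

```python
from pathlib import Path


def dump_power_supply(root="/sys/class/power_supply"):
    """Return 'path:value' lines for every readable attribute file."""
    lines = []
    base = Path(root)
    if not base.is_dir():
        return lines
    for device in sorted(base.iterdir()):
        if not device.is_dir():
            continue
        for attr in sorted(device.iterdir()):
            if not attr.is_file():
                continue  # skip sub-directories such as power/
            try:
                text = attr.read_text(errors="replace")
            except OSError:
                continue  # some attributes cannot be read
            for line in text.splitlines():
                if line:  # like "grep .", skip empty lines
                    lines.append(f"{attr}:{line}")
    return lines


for entry in dump_power_supply():
    print(entry)
```

The output has the same path:value shape as the grep, so it can be attached to a report in the same way.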
Comment 24 step-ali 2018-06-28 06:17:08 UTC
Created attachment 276957 [details]
grep
Comment 25 Zhang Rui 2018-06-28 07:28:46 UTC
(In reply to step-ali from comment #24)
> Created attachment 276957 [details]
> grep

I assume this was captured on a 4.18-rc kernel, right?

What is the model name of your laptop?

Can the question mark problem be reproduced on a 4.15 kernel?

I have seen quite a lot of battery driver changes in 4.17-rc1 and 4.18-rc1. To make clear which set of changes makes the difference, I'd like to be clear about the behavior in 4.15, 4.16, 4.17 final, and the latest 4.18-rc.
Comment 26 step-ali 2018-06-28 10:27:54 UTC
(In reply to Zhang Rui from comment #25)
> (In reply to step-ali from comment #24)
> > Created attachment 276957 [details]
> > grep
> 
> I assume this was captured on a 4.18-rc kernel, right?
> 
> What is the model name of your laptop?
> 
> Can the question mark problem be reproduced on a 4.15 kernel?
> 
> I have seen quite a lot of battery driver changes in 4.17-rc1 and 4.18-rc1.
> To make clear which set of changes makes the difference, I'd like to be clear
> about the behavior in 4.15, 4.16, 4.17 final, and the latest 4.18-rc.

Yes, this is with 4.18; 4.17 doesn't even boot on the machine.

My laptop is an Asus Zenbook UX303UB.

4.14 LTS works properly on the machine, but the nvidia driver is broken on that kernel.

4.16 used to work properly.
Comment 27 Erik Kaneda 2018-10-12 23:10:07 UTC
Hi, can you try the latest 4.19-rc?
Comment 28 Zhang Rui 2018-11-20 07:08:34 UTC
@step-ali, can you please check the latest upstream kernel, say 4.20-rc?
Comment 29 Zhang Rui 2018-11-27 04:44:15 UTC
So, first of all, please make sure your kernel is built with CONFIG_ACPI_DEBUG=y, then boot with the kernel parameters "acpi.debug_level=0x420 acpi.debug_layer=0x05", and attach the full dmesg output after boot for both the good and the bad kernels.
Comment 30 Iri 2018-11-27 20:51:38 UTC
Hi, I have the same issue: a question mark appears in the battery icon when the laptop is fully charged, and if I click on the battery it is stuck at "estimating". This only happens when the laptop is fully charged and plugged in. While charging, all indications are correct. When unplugged, all indications are correct. It seems like a miscommunication issue. My kernel is 4.18.0-11-generic.
Comment 31 Iri 2018-11-27 20:52:32 UTC
This only happens on my Asus n550-jk laptop. It does not happen on my Dell laptop.
Comment 32 Zhang Rui 2018-12-03 06:07:49 UTC
please confirm if the problem is gone with the patch from https://bugzilla.kernel.org/show_bug.cgi?id=200011#c71
Comment 33 Iri 2018-12-03 09:18:23 UTC
Thank you Zhang. Can you please provide me with instructions on how to apply this patch, as I am new to Linux? It seems like I need to update some lines in some file, but I am not sure. Thanks (then I can report back on whether it worked).
Comment 34 Zhang Rui 2018-12-19 14:52:48 UTC
Created attachment 280095 [details]
run EC _REG explicitly

please apply this patch on top of upstream kernel, say 4.20-rc
and see if the problem still exists.
Comment 35 Zhang Rui 2018-12-27 15:26:48 UTC
*** Bug 201351 has been marked as a duplicate of this bug. ***
Comment 36 Zhang Rui 2018-12-27 15:27:02 UTC
*** Bug 200011 has been marked as a duplicate of this bug. ***
Comment 37 Zhang Rui 2018-12-27 15:54:43 UTC
*** Bug 201679 has been marked as a duplicate of this bug. ***
Comment 38 Zhang Rui 2019-01-10 09:44:32 UTC
here is the latest patch,
https://patchwork.kernel.org/patch/10753143/
please apply it on top of 5.0-rc1 and check if it works or not.
Comment 39 Jean-Marc Lenoir 2019-01-10 21:57:25 UTC
I have tested this patch on Linux 5.0-rc1 and 4.20.1, it works well.
Comment 40 Zhang Rui 2019-03-25 02:20:16 UTC
Fixed by the following commit:
commit b1c0330823fe842dbb34641f1410f0afa51c29d3
Author:     Rafael J. Wysocki <rafael.j.wysocki@intel.com>
AuthorDate: Wed Jan 9 00:34:37 2019 +0100
Commit:     Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CommitDate: Tue Jan 15 23:18:23 2019 +0100

    ACPI: EC: Look for ECDT EC after calling acpi_load_tables()
    
    Some systems have had functional issues since commit 5a8361f7ecce
    (ACPICA: Integrate package handling with module-level code) that,
    among other things, changed the initial values of the
    acpi_gbl_group_module_level_code and acpi_gbl_parse_table_as_term_list
    global flags in ACPICA which implicitly caused acpi_ec_ecdt_probe() to
    be called before acpi_load_tables() on the vast majority of platforms.
    
    Namely, before commit 5a8361f7ecce, acpi_load_tables() was called from
    acpi_early_init() if acpi_gbl_parse_table_as_term_list was FALSE and
    acpi_gbl_group_module_level_code was TRUE, which almost always was
    the case as FALSE and TRUE were their initial values, respectively.
    The acpi_gbl_parse_table_as_term_list value would be changed to TRUE
    for a couple of platforms in acpi_quirks_dmi_table[], but it remained
    FALSE in the vast majority of cases.
    
    After commit 5a8361f7ecce, the initial values of the two flags have
    been reversed, so in effect acpi_load_tables() has not been called
    from acpi_early_init() any more.  That, in turn, affects
    acpi_ec_ecdt_probe() which is invoked before acpi_load_tables() now
    and it is not possible to evaluate the _REG method for the EC address
    space handler installed by it.  That effectively causes the EC address
    space to be inaccessible to AML on platforms with an ECDT matching the
    EC device definition in the DSDT and functional problems ensue in
    there.
    
    Because the default behavior before commit 5a8361f7ecce was to call
    acpi_ec_ecdt_probe() after acpi_load_tables(), it should be safe to
    do that again.  Moreover, the EC address space handler installed by
    acpi_ec_ecdt_probe() is only needed for AML to be able to access the
    EC address space and the only AML that can run during acpi_load_tables()
    is module-level code which only is allowed to access address spaces
    with default handlers (memory, I/O and PCI config space).
    
    For this reason, move the acpi_ec_ecdt_probe() invocation back to
    acpi_bus_init(), from where it was taken away by commit d737f333b211
    (ACPI: probe ECDT before loading AML tables regardless of module-level
    code flag), and put it after the invocation of acpi_load_tables() to
    restore the original code ordering from before commit 5a8361f7ecce.
    
    Fixes: 5a8361f7ecce ("ACPICA: Integrate package handling with module-level code")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=199981
    Reported-by: step-ali <sunmooon15@gmail.com>
    Reported-by: Charles Stanhope <charles.stanhope@gmail.com>
    Tested-by: Charles Stanhope <charles.stanhope@gmail.com>
    Reported-by: Paulo Nascimento <paulo.ulusu@googlemail.com>
    Reported-by: David Purton <dcpurton@marshwiggle.net>
    Reported-by: Adam Harvey <adam@adamharvey.name>
    Reported-by: Zhang Rui <rui.zhang@intel.com>
    Tested-by: Zhang Rui <rui.zhang@intel.com>
    Tested-by: Jean-Marc Lenoir <archlinux@jihemel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
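
The ordering problem described in the fix can be reduced to a toy model. This is a deliberate simplification and not kernel code: the `ToyAcpi` class and `boot` helper are hypothetical, and the model captures only the one point made in the commit message, namely that the EC's _REG method can be evaluated only if the tables have been loaded before the ECDT probe runs.

```python
class ToyAcpi:
    """Very rough model of the boot steps named in the commit message."""

    def __init__(self):
        self.namespace_loaded = False
        self.ec_handler_installed = False
        self.ec_reg_evaluated = False

    def load_tables(self):
        # Stands in for acpi_load_tables(): the namespace, including the
        # EC device's _REG method, exists only after this step.
        self.namespace_loaded = True

    def ecdt_probe(self):
        # Stands in for acpi_ec_ecdt_probe(): installs the EC address space
        # handler and can run _REG only if the method is already present.
        self.ec_handler_installed = True
        if self.namespace_loaded:
            self.ec_reg_evaluated = True

    def aml_can_use_ec(self):
        # In this model, AML treats the EC as usable only after _REG has run.
        return self.ec_handler_installed and self.ec_reg_evaluated


def boot(order):
    """Run the named steps in order and report whether AML can reach the EC."""
    acpi = ToyAcpi()
    for step in order:
        getattr(acpi, step)()
    return acpi.aml_can_use_ec()


broken = boot(["ecdt_probe", "load_tables"])  # ordering after 5a8361f7ecce
fixed = boot(["load_tables", "ecdt_probe"])   # ordering restored by the fix
print(broken, fixed)
```

In this model the only difference between the two boots is the order of the two calls, which mirrors the commit's approach of moving the probe back after table loading rather than changing either function.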