Bug 200011
Description
Paulo Nascimento
2018-06-09 17:53:57 UTC
Could you provide cat /proc/acpi/wakeup and check if LID0 is enabled or not please? and please attach the acpidump output [pmmn@pn-macbook ~]$ inxi -F System: Host: pn-macbook Kernel: 4.17.0-2-MANJARO x86_64 bits: 64 Desktop: KDE Plasma 5.12.5 Distro: Manjaro Linux 17.1.10 Hakoila Machine: Type: Laptop System: Apple product: MacBookPro9,2 v: 1.0 serial: <root required> Mobo: Apple model: Mac-6F01561E16C75D06 v: MacBookPro9,2 serial: <root required> UEFI: Apple v: MBP91.88Z.00D3.B0C.1509111653 date: 09/11/2015 CPU: Topology: Dual Core model: Intel Core i5-3210M bits: 64 type: MT MCP L2 cache: 3072 KiB Speed: 2789 MHz min/max: 1200/3100 MHz Core speeds (MHz): 1: 2789 2: 1645 3: 2409 4: 2431 Graphics: Card-1: Intel 3rd Gen Core processor Graphics driver: i915 v: kernel Display: x11 server: X.Org 1.19.6 driver: intel unloaded: fbdev,modesetting,vesa resolution: 1280x800~60Hz OpenGL: renderer: Mesa DRI Intel Ivybridge Mobile v: 4.2 Mesa 18.1.1 Audio: Card-1: Intel 7 Series/C216 Family High Definition Audio driver: snd_hda_intel Sound Server: ALSA v: k4.17.0-2-MANJARO Network: Card-1: Broadcom Limited NetXtreme BCM57765 Gigabit Ethernet PCIe driver: tg3 IF: enp1s0f0 state: down mac: 68:5b:35:bc:93:7b Card-2: Broadcom Limited BCM4331 802.11a/b/g/n driver: bcma-pci-bridge IF-ID-1: wlp2s0b1 state: up mac: 60:03:08:ed:ae:c0 Drives: HDD Total Size: 698.65 GiB used: 94.16 GiB (13.5%) ID-1: /dev/sda vendor: ^APPLE model: APPLE HDD HTS545050A7E362 size: 465.76 GiB ID-2: /dev/sdb vendor: Seagate model: ST9250315AS size: 232.89 GiB Partition: ID-1: / size: 26.17 GiB used: 8.52 GiB (32.5%) fs: ext4 dev: /dev/sdb6 ID-2: /home size: 151.01 GiB used: 85.63 GiB (56.7%) fs: ext4 dev: /dev/sdb5 ID-3: swap-1 size: 4.42 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/sdb4 Sensors: System Temperatures: cpu: 52.0 C mobo: N/A Fan Speeds (RPM): N/A Info: Processes: 290 Uptime: 9m Memory: 7.70 GiB used: 997.6 MiB (12.6%) Shell: bash inxi: 3.0.10 [pmmn@pn-macbook ~]$ cat /proc/acpi/wakeup Device S-state Status Sysfs node P0P2 S3 *disabled pci:0000:00:01.0 PEG1 S3 *disabled EC S4 *disabled platform:PNP0C09:00 GMUX S3 *disabled pnp:00:03 HDEF S3 *disabled pci:0000:00:1b.0 RP01 S3 *disabled pci:0000:00:1c.0 GIGE S3 *disabled pci:0000:01:00.0 SDXC S3 *disabled pci:0000:01:00.1 RP02 S3 *disabled pci:0000:00:1c.1 ARPT S3 *disabled pci:0000:02:00.0 RP03 S3 *disabled pci:0000:00:1c.2 EHC1 S3 *enabled pci:0000:00:1d.0 EHC2 S3 *enabled pci:0000:00:1a.0 XHC1 S3 *enabled pci:0000:00:14.0 ADP1 S4 *disabled platform:ACPI0003:00 LID0 S4 *enabled platform:PNP0C0D:00 [pmmn@pn-macbook ~]$ dmesg | grep ACPI Attached as file ACPI.info Created attachment 276627 [details]
dmesg | grep ACPI
I didn't find acpidump in AUR you need to install acpidump tool and run "acpidump > acpidump.out" with root privilege. acpidump > acpidump.out Created attachment 276747 [details]
acpidump.out
please attach the dmesg output after resume in both 4.14 kernel that woken up by LID and 4.17 kernel that woken up by power button. Created attachment 276883 [details]
dmesg output kernel 4.14 LID0
Created attachment 276885 [details]
dmesg output kernel 4.17 power_button
can you please do bisect to find out which commit introduces the problem? (In reply to Zhang Rui from comment #14) > can you please do bisect to find out which commit introduces the problem? Not the bug reporter but I've got the same issue with a MacBookAir6,2. Resume on lid open stopped working after upgrading from 4.16.13 to 4.17.2. Bisect says: first bad commit: [5a8361f7ecceaed64b4064000d16cb703462be49] ACPICA: Integrate package handling with module-level code Also not the bug reporter, but I can confirm that the issue is also introduced in 5a8361f7ecceaed64b4064000d16cb703462be49 on a MacBookPro12,1. Could you test the latest 4.18 rc and post the dmesg if it still does not work? Thanks Sorry. But with this heat my Macbook died. Thank you very much for your efforts. You have to ask another Macbook user. Regards, ulusu (In reply to Paulo Nascimento from comment #18) > Sorry. But with this heat my Macbook died. Thank you very much for your > efforts. You have to ask another Macbook user. Sorry that this happened. Let's leave this one open for the other reporters. Jona and Andrey, can you try the latest rc? We've been submitting fixes for the module-level code change. Sorry, I was quite busy, I will build the latest RC today evening and report back. (In reply to Erik Schmauss from comment #19) > (In reply to Paulo Nascimento from comment #18) > > Sorry. But with this heat my Macbook died. Thank you very much for your > > efforts. You have to ask another Macbook user. > > Sorry that this happened. Let's leave this one open for the other reporters. > > Jona and Andrey, can you try the latest rc? We've been submitting fixes for > the module-level code change. Created attachment 277621 [details]
dmesg_output_kernel_4.19rc7_macbookair62.txt
$ dmesg > dmesg_output_kernel_4.19rc7_macbookair62.txt
Sorry about the delay, I've tried 4.19-rc7 and for me the issue still persists (Macbook Air 6,2). Please find dmesg output attached. I have built d72e90f33aa4709ebecc5005562f52335e106a60 (tag v4.18-rc6) a few minutes ago and the issue still persists on my MacbookPro12,1. My ACPI wakeup configuration is
> muellerjona@MacbookPro:~$ cat /proc/acpi/wakeup
> Device S-state Status Sysfs node
> PEG0 S3 *disabled
> EC S4 *disabled platform:PNP0C09:00
> HDEF S3 *disabled pci:0000:00:1b.0
> RP01 S3 *disabled pci:0000:00:1c.0
> RP02 S3 *disabled pci:0000:00:1c.1
> RP03 S3 *disabled pci:0000:00:1c.2
> ARPT S4 *enabled pci:0000:03:00.0
> RP05 S3 *disabled pci:0000:00:1c.4
> RP06 S3 *disabled pci:0000:00:1c.5
> SPIT S3 *disabled
> XHC1 S3 *disabled pci:0000:00:14.0
> ADP1 S4 *disabled platform:ACPI0003:00
> LID0 S4 *enabled platform:PNP0C0D:00
Find my dmesg output attached.
Created attachment 277623 [details]
Dmesg output
$ sudo dmesg > dmesg-4.18.0-rc6.txt
Hi, I'm not a LID expert but the dmesg outputs that both of you sent does not have an error like the following in comment 12: [ 172.092019] ACPI: EC: event unblocked ... [ 172.099293] ACPI Error: [\_SB_.PCI0.P0P2.GFX0] Namespace lookup failure, AE_NOT_FOUND (20170728/psargs-364) [ 172.099311] ACPI Error: Method parse/execution failed \_SB.PCI0.LPCB.EC._Q80, AE_NOT_FOUND (20170728/psparse-550) This error indicates that the EC tried to query a non-existent "device" called GFX0. The only difference that I see in the execution of AML is the above. I know that we thinkg that pointing to the module-level code change but I am curious about this missing error message. Rui, do you know why newer versions of the kernel do not have the above error message? Was there something significant that changed in GPE processing? Rui, another thought: we need to record dmesg output when the lid opens/system wakes up. This will allow us to understand what's going on in the AML. Do you know of any debugging options to dump dmesg contents as the lid is opening and the system is waking? (In reply to Erik Schmauss from comment #25) > Hi, > > I'm not a LID expert but the dmesg outputs that both of you sent does not > have an error like the following in comment 12: > > [ 172.092019] ACPI: EC: event unblocked > ... > [ 172.099293] ACPI Error: [\_SB_.PCI0.P0P2.GFX0] Namespace lookup failure, > AE_NOT_FOUND (20170728/psargs-364) > [ 172.099311] ACPI Error: Method parse/execution failed > \_SB.PCI0.LPCB.EC._Q80, AE_NOT_FOUND (20170728/psparse-550) > > This error indicates that the EC tried to query a non-existent "device" > called GFX0. The only difference that I see in the execution of AML is the > above. I know that we thinkg that pointing to the module-level code change > but I am curious about this missing error message. Rui, do you know why > newer versions of the kernel do not have the above error message? Was there > something significant that changed in GPE processing? first of all, GFX0 is not defined, thus we should get this error whenever _Q80 is evaluated. so, I guess this is because _Q80 is not invoked in new kernel at all. here is the output from 4.14 kernel. [ 172.092019] ACPI: EC: event unblocked ... [ 172.098258] pciehp 0000:05:03.0:pcie204: Timeout on hotplug command 0x1038 (issued 171423 msec ago) [ 172.099293] ACPI Error: [\_SB_.PCI0.P0P2.GFX0] Namespace lookup failure, AE_NOT_FOUND (20170728/psargs-364) [ 172.099311] ACPI Error: Method parse/execution failed \_SB.PCI0.LPCB.EC._Q80, AE_NOT_FOUND (20170728/psparse-550) [ 172.287432] OOM killer enabled. [ 172.287433] Restarting tasks ... done. and this is the output from 4.17 kernel [ 126.041792] pcieport 0000:05:06.0: quirk_apple_wait_for_thunderbolt+0x0/0xe0 took 38188 usecs [ 126.042372] ACPI: EC: event unblocked [ 126.048653] sd 0:0:0:0: [sda] Starting disk [ 126.048689] sd 1:0:0:0: [sdb] Starting disk [ 126.237365] OOM killer enabled. [ 126.237367] Restarting tasks ... done. We can see that EC event is unblocked earlier in 4.14 kernel, which results in _Q80 evaluated. So I think this error message is caused by the EC code changes, but I'm not sure if it is related with the lid problem or not. (In reply to Erik Schmauss from comment #26) > Rui, another thought: we need to record dmesg output when the lid > opens/system wakes up. This will allow us to understand what's going on in > the AML. Do you know of any debugging options to dump dmesg contents as the > lid is opening and the system is waking? no, here is the process of S3 wakeup 1. user opens lid 2. an SCI/GPE fired, and wakes up the system 3. system starts to run code from the waking vector (FACS) 4. all the GPEs are cleared 5. system is resumed. thus, for the GPE that wakes up system from S3, its AML/method handler is not invoked at all. for this particular issue, I'd suggest to introduce code to dump the GPE register status just before suspend and right after resume. This would help me to know if a specific GPE is enabled as wakeup GPE or not, and which GPE is fired during resume. I think Yu has a patch for this before, let me find it out. I have one of the affected by the bug laptops and the bug is in "NEEDINFO". I'm up for installing a patch with debugging output, just point me to it. I have the same bug on my MacBook Pro 9,2. I also have the battery bug (https://bugzilla.kernel.org/show_bug.cgi?id=200141), but only when I put the kernel parameter acpi_osi=!Darwin. I am able to solve these two bugs by applying the following patch (I just reverted the part of the faulty commit that causes the bug). But it's probably not a long term solution, because if I understand I have enabled an obsolete configuration which will be removed in the future. ``` --- a/include/acpi/acpixf.h +++ b/include/acpi/acpixf.h @@ -161,7 +161,7 @@ * NOTE, this is essentially obsolete and will be removed soon * (01/2018). */ -ACPI_INIT_GLOBAL(u8, acpi_gbl_group_module_level_code, FALSE); +ACPI_INIT_GLOBAL(u8, acpi_gbl_group_module_level_code, TRUE); /* * Optionally support module level code by parsing an entire table as @@ -169,7 +169,7 @@ * NOTE, this is essentially obsolete and will be removed soon * (01/2018). */ -ACPI_INIT_GLOBAL(u8, acpi_gbl_execute_tables_as_methods, TRUE); +ACPI_INIT_GLOBAL(u8, acpi_gbl_execute_tables_as_methods, FALSE); /* * Optionally use 32-bit FADT addresses if and when there is a conflict ``` (In reply to Jean-Marc Lenoir from comment #32) > I have the same bug on my MacBook Pro 9,2. I also have the battery bug > (https://bugzilla.kernel.org/show_bug.cgi?id=200141), but only when I put > the kernel parameter acpi_osi=!Darwin. > > I am able to solve these two bugs by applying the following patch (I just > reverted the part of the faulty commit that causes the bug). But it's > probably not a long term solution, because if I understand I have enabled an > obsolete configuration which will be removed in the future. Thanks for this information and yes, we are going to get rid of this sometime next year. We should get to the bottom of this before we do. Could you post your dmesg with and without the changes below? > > ``` > --- a/include/acpi/acpixf.h > +++ b/include/acpi/acpixf.h > @@ -161,7 +161,7 @@ > * NOTE, this is essentially obsolete and will be removed soon > * (01/2018). > */ > -ACPI_INIT_GLOBAL(u8, acpi_gbl_group_module_level_code, FALSE); > +ACPI_INIT_GLOBAL(u8, acpi_gbl_group_module_level_code, TRUE); > > /* > * Optionally support module level code by parsing an entire table as > @@ -169,7 +169,7 @@ > * NOTE, this is essentially obsolete and will be removed soon > * (01/2018). > */ > -ACPI_INIT_GLOBAL(u8, acpi_gbl_execute_tables_as_methods, TRUE); > +ACPI_INIT_GLOBAL(u8, acpi_gbl_execute_tables_as_methods, FALSE); > > /* > * Optionally use 32-bit FADT addresses if and when there is a conflict > ``` Created attachment 278909 [details]
dmesg without my patch
I don't know if it is useful for you, but I've suspended by closing lid and resumed (manually) my laptop just before saving dmesg.
Created attachment 278911 [details]
dmesg with my patch
I've also suspended (by closing lid) and resumed (by opening lid) my laptop before saving dmesg.
Thanks for doing this. I think I found the problem. The region initialization isn't adding the address range properly. This causes the warning below to NOT appear which might be causing your issue. I'll work on a fix and post a patch (hopefully) in a few hours. ACPI Warning: SystemIO range 0x0000000000000428-0x000000000000042F conflicts with OpRegion 0x0000000000000400-0x000000000000047F (\PMIO) (20180531/utaddress-213) ACPI: If an ACPI driver is available for this device, you should use it Created attachment 278923 [details]
Add operation region addresses to the global address list
Please give this a try
Created attachment 278925 [details]
dmesg with es-add-address patch
Now I can see the ACPI warning, but unfortunately it is not sufficient to solve the bug.
I'm posting my dmesg with your patch.
(In reply to Jean-Marc Lenoir from comment #38) > Created attachment 278925 [details] > dmesg with es-add-address patch > > Now I can see the ACPI warning, but unfortunately it is not sufficient to > solve the bug. Thanks for testing, I think we'll include something like this in upstream anyway. > > I'm posting my dmesg with your patch. Created attachment 278935 [details]
Stop clearing GPE's when
Does your kernel version have this patch? If so, could to try this patch in addition to the one I posted?
Yes, I already have this patch in my kernel. Ok. What happens when you remove the acpi_osi=!Darwin from the boot parameters with the patch from comment 37? Exactly the same thing than without the patch: the LID0 bug is still present, but the battery bug (reported as absent) is fixed. For information, I've just tested on Linux 4.19-rc6 and there is exactly the same result than my current kernel. (In reply to Jean-Marc Lenoir from comment #43) > Exactly the same thing than without the patch: the LID0 bug is still > present, but the battery bug (reported as absent) is fixed. How about the message regarding Ac Adapter (ADP1)? Is ADP1 on-line? > > For information, I've just tested on Linux 4.19-rc6 and there is exactly the > same result than my current kernel. Created attachment 278959 [details] Bob's _REG patch Could you try this patch on top of the patch from comment 37 without any OSI boot parameters? @Jean-Marc, can you post your acpidump? I think you might have different firmware than the original poster... > How about the message regarding Ac Adapter (ADP1)? Is ADP1 on-line?
I assume ADP1 is on-line if /sys/class/power_supply/ADP1/online contains 1. It's interesting, because when I boot with the adapter connected, this "file" contains 0. If I unplug and re-plug the adapter , it becomes 1.
If I use the kernel with my patch to revert the faulty commit, /sys/class/power_supply/ADP1/online contains 1 without the need to unplug and re-plug the adapter.
Created attachment 278961 [details]
dmesg with both patches
With both patches, it doesn't work better.
I'm posting my dmesg with these patches and without OSI boot parameters.
Created attachment 278963 [details]
acpidump
My acpidump, taken from the kernel with both patches and without OSI boot parameters (I don't know if it matters).
(In reply to Jean-Marc Lenoir from comment #47) > > How about the message regarding Ac Adapter (ADP1)? Is ADP1 on-line? > > I assume ADP1 is on-line if /sys/class/power_supply/ADP1/online contains 1. > It's interesting, because when I boot with the adapter connected, this > "file" contains 0. If I unplug and re-plug the adapter , it becomes 1. > > If I use the kernel with my patch to revert the faulty commit, > /sys/class/power_supply/ADP1/online contains 1 without the need to unplug > and re-plug the adapter. Ok. So this might be the next step. Your dmesg from comment 48 contains the line [ 7.763316] ACPI: AC Adapter [ADP1] (off-line) [ 7.779700] apple_gmux: gmux device not present but your dmesg with your patch contains [ 10.219707] ACPI: AC Adapter [ADP1] (on-line) [ 10.252324] apple_gmux: gmux device not present So the next step would be to get these two messages to display the same. I'll take a look at this. Thanks for your acpidump. It matches the other acpidump files posted to this bugzilla. I'll get a macbook pro tomorrow so I will (hopefully) be able to debug from my end. Stay tuned... Created attachment 278975 [details]
Load table at the same time regardless of module-level code option
Try this one. I found that the table load occurs at a different place in the code depending on the acpi_gbl_execute_tables_as_methods variable. This could be affecting the EC (which is tied to AC adapter and LID0)...
Method (_PSR, 0, NotSerialized) // _PSR: Power Source { Return (PWRS) /* \PWRS */ } so it is PWRS returns different values, causing AC Adapter to be online vs. offline. PWRS is set in three places 1. EC _REG method 2. EC _Q21 method 3. _WAK method and PWRS should always equal \_SB_.PCI0.LPCB.EC__.EACP according to the ASL. So there are two possibilities that PWRS could return different value, the first one is that OS behaves differently to evaluate _REG/_Q21/_WAK during boot, the second one is that EACP has different values. I'd suggest we verify the second one first, I will make a custom dsdt, which makes AC _PSR always return \_SB_.PCI0.LPCB.EC__.EACP directly. BTW, I also don't see why module level code matters here. @Erik, is there anyway that I can filter out the non-module-level-code from the dsdt.dsl, or else we don't know what module level code does on this platform. Created attachment 278977 [details] custom dsdt to return EACP for AC status please follow step 5,6,7 in the below page to override the dsdt and check if the problem still exists. https://01.org/linux-acpi/documentation/overriding-dsdt (In reply to Zhang Rui from comment #53) > Method (_PSR, 0, NotSerialized) // _PSR: Power Source > { > Return (PWRS) /* \PWRS */ > } > > so it is PWRS returns different values, causing AC Adapter to be online vs. > offline. PWRS is set in three places > 1. EC _REG method > 2. EC _Q21 method > 3. _WAK method > and PWRS should always equal \_SB_.PCI0.LPCB.EC__.EACP according to the ASL. > > So there are two possibilities that PWRS could return different value, > the first one is that OS behaves differently to evaluate _REG/_Q21/_WAK > during boot, the second one is that EACP has different values. > > I'd suggest we verify the second one first, I will make a custom dsdt, which > makes AC _PSR always return \_SB_.PCI0.LPCB.EC__.EACP directly. > > Hi Rui, > BTW, I also don't see why module level code matters here. The reporter says he can fix the LID0 issue by reverting to the old module-level code or to flip the flags stated in comment 32. One thing that is different between old module-level code support and new module-level code is the load time of the ECDT so comment 52 alters the ECDT load order to test if there are differences. > @Erik, is there > anyway that I can filter out the non-module-level-code from the dsdt.dsl, or > else we don't know what module level code does on this platform. We don't currently have a way but I think I can write something. This would be valuable to have in the future anyway. I've glanced at the reporter's acpi tables and I didn't see any module-level code... Great! The patch comment 52 fixes the two bugs (LID0 and battery status) for me, with and without the acpi_osi=!Darwin parameter. (In reply to Jean-Marc Lenoir from comment #56) > Great! The patch comment 52 fixes the two bugs (LID0 and battery status) for > me, with and without the acpi_osi=!Darwin parameter. Just for reference, could you post the dmesg with this patch? Created attachment 278999 [details] dmesg with patch comment 52 As I suspected, comment 52 patch is the wrong order of loading the ACPI tables and ECDT. It goes directly against the spec but somehow it works... It's likely that this apple platform may not be spec compliant. Unfortunately, this patch cannot go upstream because it goes against the spec (ECDT should be loaded before other ACPI tables).. However, I gave you credit on testing the patch in comment 37 and that is going upstream. From what you say, comment 52 fixes your issue so I am tempted to close this bug. However, Rui's suggestion in comment 54 (in addition to comment 37 and removing the osi flag from the boot parameters) might be a better approach. I'll let you decide if you want to take this approach. If you're fine with the patch in comment 52, let us know and I'll close the bug. Thanks for your effort in testing. I think it was a productive bug report. Ok. I read some informations about Rui's approach this week-end and I think it's a more elegant solution. In addition, with my distribution I am able to change the DSDT without recompiling the kernel (using these instructions: https://wiki.archlinux.org/index.php/DSDT#Using_a_CPIO_archive). So I would like to use this approach. I tried to compile the kernel including the DSDT.hex file but it failed with the error: ``` drivers/acpi/tables.c: In function 'acpi_os_table_override': drivers/acpi/tables.c:727:44: error: 'AmlCode' undeclared (first use in this function) *new_table = (struct acpi_table_header *)AmlCode; ^~~~~~~ ``` Do you have any idea what is wrong? Or, @Zhang Rui, can you send me the corresponding dsdt.dsl file? I will include it with my distribution method, without recompiling the kernel. (In reply to Jean-Marc Lenoir from comment #60) > Ok. I read some informations about Rui's approach this week-end and I think > it's a more elegant solution. In addition, with my distribution I am able to > change the DSDT without recompiling the kernel (using these instructions: > https://wiki.archlinux.org/index.php/DSDT#Using_a_CPIO_archive). So I would > like to use this approach. > > I tried to compile the kernel including the DSDT.hex file but it failed with > the error: > ``` > drivers/acpi/tables.c: In function 'acpi_os_table_override': > drivers/acpi/tables.c:727:44: error: 'AmlCode' undeclared (first use in this > function) > *new_table = (struct acpi_table_header *)AmlCode; > ^~~~~~~ > ``` I think yuo need to look in the DSDT.hex and change the name of the array to "AmlCode". > Do you have any idea what is wrong? Or, @Zhang Rui, can you send me the > corresponding dsdt.dsl file? I will include it with my distribution method, > without recompiling the kernel. Thanks. Now, /sys/class/power_supply/ADP1/online equals 1 just after reboot, without the need to unplug/re-plug the adapter. But it doesn't fix my battery and LID bugs. (In reply to Jean-Marc Lenoir from comment #62) > Thanks. Now, /sys/class/power_supply/ADP1/online equals 1 just after reboot, > without the need to unplug/re-plug the adapter. But it doesn't fix my > battery and LID bugs. Ok, I think this means that you might want to go with the patch then... After more investigation, we found errors that are similar to this laptop lid issue so this may not be a mac-specific issue and it is possible that it's an issue with the EC driver. Hi Jean-Marc, could you try 4.20-rc1? @Rui, I think this may be an issue with the EC similar to https://bugzilla.kernel.org/show_bug.cgi?id=201351 but I'm not sure if it's the same issue. I've assigned the bug for you to look at. We can't do anything about this bug from ACPICA perspective so the fix may have to come from the EC driver... Created attachment 279329 [details] dmesg 4.20_rc1 Linux 4.20-rc1 doesn't fix my issues (lid and battery) but now I can see the warning line in comment 36. I join my new dmesg. So, first of all, please make sure your kernel is built with CONFIG_ACPI_DEBUG=Y, and then boot with kernel parameter "acpi.debug_level=0x420 acpi.debug_layer=0x05", and attach the full dmesg output after boot, for both good and bad kernels. (In reply to Zhang Rui from comment #53) > Method (_PSR, 0, NotSerialized) // _PSR: Power Source > { > Return (PWRS) /* \PWRS */ > } > > so it is PWRS returns different values, causing AC Adapter to be online vs. > offline. PWRS is set in three places > 1. EC _REG method > 2. EC _Q21 method > 3. _WAK method > and PWRS should always equal \_SB_.PCI0.LPCB.EC__.EACP according to the ASL. > > So there are two possibilities that PWRS could return different value, > the first one is that OS behaves differently to evaluate _REG/_Q21/_WAK > during boot, the second one is that EACP has different values. > > I'd suggest we verify the second one first, I will make a custom dsdt, which > makes AC _PSR always return \_SB_.PCI0.LPCB.EC__.EACP directly. > > (In reply to Jean-Marc Lenoir from comment #62) > Thanks. Now, /sys/class/power_supply/ADP1/online equals 1 just after reboot, > without the need to unplug/re-plug the adapter. But it doesn't fix my > battery and LID bugs. So it means that EACP always returns the correct value. So we are either missing _REG evaluation, or missing the EC _Q21 event. Comments #66 will show when the EC _REG will be evaluated, and please do attach the test result. Created attachment 279681 [details]
acpi.debug_level=0x420 acpi.debug_layer=0x05 on good kernel
Created attachment 279683 [details]
acpi.debug_level=0x420 acpi.debug_layer=0x05 on bad kernel
for good kernel [ 0.546501] ACPI: EC: EC started [ 0.546501] ACPI: EC: interrupt blocked [ 0.546906] Running _REG methods for SpaceId EmbeddedControl [ 0.547078] Executing Method \_SB.PCI0.LPCB.EC._REG [ 0.547115] Initializing Region \GNVS [ 0.547131] utaddress-0072 ut_add_address_range : Added [GNVS] address range: 0x000000008AD40C10-0x000000008AD40D96 [ 0.547167] Initializing Region \_SB.PCI0.LPCB.EC.ECOR [ 0.547528] Initializing Region \_SB.PCI0.IGPU.IGDM [ 0.547549] utaddress-0072 ut_add_address_range : Added [IGDM] address range: 0x000000008AD15190-0x000000008AD1718F [ 0.547973] Executed 1 _REG methods for SpaceId EmbeddedControl [ 0.547976] ACPI: \: Used as first EC [ 0.547977] ACPI: \: GPE=0x17, EC_CMD/EC_SC=0x66, EC_DATA=0x62 [ 0.547978] ACPI: \: Used as boot ECDT EC to handle transactions and for bad kernel [ 0.540038] ACPI: EC: EC started [ 0.540038] ACPI: EC: interrupt blocked [ 0.540038] Running _REG methods for SpaceId EmbeddedControl [ 0.540038] Executed 0 _REG methods for SpaceId EmbeddedControl [ 0.540038] ACPI: \: Used as first EC [ 0.540038] ACPI: \: GPE=0x17, EC_CMD/EC_SC=0x66, EC_DATA=0x62 [ 0.540038] ACPI: \: Used as boot ECDT EC to handle transactions So we can see that, in the early stage, previously we can invoke EC _REG because the ACPI tables have been loaded, but now we can not. Problem is that we should re-install the EC address space handler, thus invokes EC _REG, in acpi_ec_add(), but I didn't find such information in the bad kernel. Let me look into the code and generate a debug patch later to see what is going wrong in acpi_ec_add(). Created attachment 279729 [details]
debug patch to force re-install EC address space handler
I think I have found the problem but I'm not sure how to fix it properly yet.
Please try this debug patch on top of newer kernel (the patch is made based on 4.20-rc2) and confirm the problem is gone.
Great! I can confirm this patch works on both 4.20-rc4 and 4.19.5 kernels. please check if the patch at https://bugzilla.kernel.org/show_bug.cgi?id=199981#c34 helps or not This patch fix the problem on Linux 4.20-rc7. *** Bug 200141 has been marked as a duplicate of this bug. *** *** This bug has been marked as a duplicate of bug 199981 *** here is the latest patch, https://patchwork.kernel.org/patch/10753143/ please apply it on top of 5.0-rc1 and check if it works or not. |