Bug 153541

Summary: Module level code - ACPI problem on Z170 chipset
Product: ACPI
Reporter: Dutch Guy (lucht_piloot)
Component: ACPICA-Core
Assignee: Lv Zheng (lv.zheng)
Status: CLOSED CODE_FIX
Severity: normal
CC: ingvarthorvald, lv.zheng, rui.zhang
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 4.8.0-rc2
Subsystem:
Regression: No
Bisected commit-id:
Attachments: output of acpidump
Namespace lock change proposal
[PATCH] ACPICA: Namespace: Add acpi_ns_get_node_unlocked()
[PATCH] ACPICA: Namespace: Fix dynamic table loading issues by tuning namespace/interpreter locks
[PATCH] ACPICA: Dispatcher: Fix a mutex issue for method auto serialization
[PATCH] ACPICA: Tables: Tune table mutex to be a leaf lock
[PATCH 1/8] ACPICA: Interpreter: Fix MLC issues by switching to new term_list grammar for table loading
[PATCH 2/8] ACPICA: Namespace: Add acpi_ns_get_node_unlocked()
[PATCH 3/8] ACPICA: Namespace: Fix dynamic table loading issues by tuning namespace/interpreter locks
[PATCH 4/8] ACPICA: Dispatcher: Fix a mutex issue for method auto serialization
[PATCH 5/8] ACPICA: Tables: Tune table mutex to be a leaf lock
[PATCH 6/8] ACPI 2.0 / AML: Enable correct ACPI subsystem initialization order for new table loading mode
[PATCH 7/8] ACPI 2.0 / AML: Improve module level execution by moving the If/Else/While execution to per-table basis
[PATCH 8/8] ACPI 2.0 / AML: Fix module level execution by correctly parsing table as TermList
DMESG -> kernel 4.8.0-RC4-NoPatch
DMESG -> kernel 4.8.0-RC4-RevertedCommit
DMESG -> kernel 4.8.0-RC4-PatchAfterMlc1
DMESG -> kernel 4.8.0-RC4-PatchAfterMlc2

Description Dutch Guy 2016-08-22 16:52:04 UTC
Created attachment 229701 [details]
output of acpidump

On my desktop consisting of the following config:
- i3-6100 CPU
- Asus Z170i Pro Gaming
- Ubuntu 16.04 64 bit

I get this error:
[    0.000023] ACPI: Core revision 20160422
[    0.018190] ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20160422/dswload-210)
[    0.018196] ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20160422/psobject-227)
[    0.018226] ACPI Exception: AE_NOT_FOUND, (SSDT:xh_rvp08) while loading table (20160422/tbxfload-227)
[    0.023736] ACPI Error: 1 table load failures, 8 successful (20160422/tbxfload-247)

And then later:
[    0.372731] ACPI: Added _OSI(Module Device)
[    0.372733] ACPI: Added _OSI(Processor Device)
[    0.372735] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.372736] ACPI: Added _OSI(Processor Aggregator Device)
[    0.373435] ACPI: Executed 25 blocks of module-level executable AML code
[    0.379425] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
[    0.381414] ACPI: Dynamic OEM Table Load:
[    0.381420] ACPI: SSDT 0xFFFF9590AD3B2800 000726 (v02 PmRef  Cpu0Ist  00003000 INTL 20120913)
[    0.381968] ACPI: \_PR_.CPU0: _OSC native thermal LVT Acked
[    0.382894] ACPI: Dynamic OEM Table Load:
[    0.382898] ACPI: SSDT 0xFFFF9590ACDD1000 00037F (v02 PmRef  Cpu0Cst  00003001 INTL 20120913)
[    0.383845] ACPI: Dynamic OEM Table Load:
[    0.383849] ACPI: SSDT 0xFFFF9590AD3B3000 0005AA (v02 PmRef  ApIst    00003000 INTL 20120913)
[    0.384548] ACPI: Dynamic OEM Table Load:
[    0.384551] ACPI: SSDT 0xFFFF9590ACE5C400 000119 (v02 PmRef  ApCst    00003000 INTL 20120913)
[    0.386356] ACPI : EC: EC started

I have tested it on 4.8.0-rc1, rc2, and on main (Linus's git repository), all with the same result.

I also read this thread:
https://bugzilla.kernel.org/show_bug.cgi?id=117671
However, in the kernels I tested, either this fix was not included, or it no longer works, or there is another problem in my case.
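
To compare boots, the error lines quoted above can be pulled out of a saved dmesg file with a small filter. This is only a sketch: acpi_errors is a hypothetical helper name, and the pattern assumes the "ACPI Error:" and "ACPI Exception:" prefixes seen in this log.

```shell
# acpi_errors: hypothetical helper, not part of any existing tool.
# Prints only the ACPI error/exception lines from a saved dmesg file.
acpi_errors() {
    grep -E 'ACPI (Error|Exception):' "$1"
}

# Example use (file name is an assumption):
#   dmesg > dmesg.txt
#   acpi_errors dmesg.txt
```

An empty result means no line matched the pattern; it does not by itself prove that every table loaded correctly.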
Comment 1 [account disabled by the administrator] 2016-08-22 20:22:36 UTC
Do the trees you're running have commit 3d4b7ae96d81dc8ed4ecd556118b632c2707ff08?
That is the commit for the fix you're mentioning, and it is currently in Linus's tree.
Comment 2 Lv Zheng 2016-08-23 03:08:22 UTC
The fix is in the kernel, but disabled by the following commit due to lock issues:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=00c611def8748a0a1cf1d31842e49b42dfdb3de1

We have internally submitted approaches for fixing the lock issues,
and are still waiting for ACPICA upstream to decide on them.

Thanks
Lv
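
One way to check whether a given commit is part of the tree you built is git merge-base --is-ancestor. A minimal sketch, assuming it is run inside the kernel git checkout; has_commit is a hypothetical helper name:

```shell
# has_commit: hypothetical helper. Reports whether commit $1 is
# reachable from HEAD in the current git repository.
has_commit() {
    if git merge-base --is-ancestor "$1" HEAD 2>/dev/null; then
        echo "present"
    else
        echo "absent"
    fi
}

# Example use with the two commits mentioned in this thread:
#   has_commit 3d4b7ae96d81dc8ed4ecd556118b632c2707ff08
#   has_commit 00c611def8748a0a1cf1d31842e49b42dfdb3de1
```

"present" only means the commit is in the history of HEAD. If both commits are present, the fix is in the tree but is disabled by the later commit, as described above.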
Comment 3 Dutch Guy 2016-08-23 08:51:12 UTC
Is there any way to track this proposal/commit on its way into the kernel?

Thanks
Comment 4 Lv Zheng 2016-08-24 02:26:54 UTC
Created attachment 229971 [details]
Namespace lock change proposal

They'll appear in the kernel soon, I think.
My proposal is attached.

My first revision was posted earlier on another Bugzilla entry.
However, I have improved it several times since.

We are working as a team to make sure the test coverage of the change is OK.
After that, it will be released.

Thanks
Lv
Comment 5 [account disabled by the administrator] 2016-08-24 02:34:53 UTC
Send me an actual patch when you have one that works, or attach it here, and I can give you comments if you wish.
Comment 6 Lv Zheng 2016-08-24 03:09:03 UTC
(In reply to ingvarthorvald from comment #5)
> Send me a actual patch when you have one that works or attach it here and I
> can give you comments or something if you wish.

Sounds good.
I'll post fixes here.

Thanks
Lv
Comment 7 Lv Zheng 2016-08-25 07:50:24 UTC
Created attachment 230111 [details]
[PATCH] ACPICA: Namespace: Add acpi_ns_get_node_unlocked()
Comment 8 Lv Zheng 2016-08-25 07:50:51 UTC
Created attachment 230121 [details]
[PATCH] ACPICA: Namespace: Fix dynamic table loading issues by tuning namespace/interpreter locks
Comment 9 Lv Zheng 2016-08-25 07:51:17 UTC
Created attachment 230131 [details]
[PATCH] ACPICA: Dispatcher: Fix a mutex issue for method auto serialization
Comment 10 Lv Zheng 2016-08-25 07:51:38 UTC
Created attachment 230141 [details]
[PATCH] ACPICA: Tables: Tune table mutex to be a leaf lock
Comment 11 Lv Zheng 2016-08-25 07:54:49 UTC
The fixes are posted. You also need to revert commit "00c611def8748a0a1cf1d31842e49b42dfdb3de1".

So you need to:

# git clone https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
# cd linux-pm
# git checkout linux-next
# git am <the posted patches>
# git revert 00c611def8748a0a1cf1d31842e49b42dfdb3de1

Then try the patched kernel.

Thanks
Lv
Comment 12 Dutch Guy 2016-08-25 08:50:04 UTC
Sounds good, I will try later today.
Do you want some logs from the PC I am using for testing?
(If so, please specify which logs you would like to have.)
Comment 13 Lv Zheng 2016-08-26 01:33:38 UTC
(In reply to Dutch Guy from comment #12)
> Sounds good, will try later today.
> Do you want some logs from the PC I am using to test
> (If so please specify which logs you would like to have)

Sure. For now, I only need the dmesg output before and after the actions mentioned in comment 11.

Thanks
Comment 14 Lv Zheng 2016-08-26 05:23:46 UTC
(In reply to Lv Zheng from comment #13)
> (In reply to Dutch Guy from comment #12)
> > Sounds good, will try later today.
> > Do you want some logs from the PC I am using to test
> > (If so please specify which logs you would like to have)
> 
> Sure, for now, I only need the dmesg output before/after the actions
> mentioned in comment 11.

That may not have been clear. Let me reword.

Test A:

I need you to upload the full dmesg output of the kernel boot log here before applying the patches and reverting the temporary fix.

1. build and boot your kernel
2. obtain the full dmesg output
# dmesg > dmesg-before.txt
3. upload dmesg-before.txt here

Test B:

I also need you to upload the full dmesg output of the kernel boot log here after applying the patches and reverting the temporary fix.

1. apply the patches and revert the temporary fix in your kernel tree
# git am <the posted patches>
# git revert 00c611def8748a0a1cf1d31842e49b42dfdb3de1
2. build and boot the kernel
3. obtain the full dmesg output
# dmesg > dmesg-after.txt
4. upload dmesg-after.txt here

If you fail to apply the fixes in step B.1, you can use the linux-pm.git/linux-next branch:
# git clone https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
# cd linux-pm
# git checkout linux-next
And redo tests A and B on top of this kernel repository.

Thanks in advance.
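
Once both logs from tests A and B exist, the ACPI errors that disappeared can be listed by stripping the timestamps and comparing the two files. This is a sketch under assumptions: strip_ts and acpi_fixed are hypothetical helper names, and the timestamp pattern assumes the default "[    0.018190] " dmesg prefix.

```shell
# strip_ts: hypothetical helper. Removes the leading "[  seconds.micros] "
# timestamp so that identical messages from two boots compare equal.
strip_ts() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' "$1"
}

# acpi_fixed: hypothetical helper. Prints ACPI error/exception lines
# that occur in the "before" log ($1) but not in the "after" log ($2).
acpi_fixed() {
    before=$(mktemp) || return 1
    after=$(mktemp) || return 1
    strip_ts "$1" | grep -E 'ACPI (Error|Exception):' | sort -u > "$before"
    strip_ts "$2" | grep -E 'ACPI (Error|Exception):' | sort -u > "$after"
    comm -23 "$before" "$after"
    rm -f "$before" "$after"
}

# Example use:
#   acpi_fixed dmesg-before.txt dmesg-after.txt
```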
Comment 15 Dutch Guy 2016-08-26 15:35:57 UTC
I am currently compiling the new kernel.
I have not downloaded many patches from kernel.org before. Is the easiest way to copy the link location of the attachment and then wget that link?
(It worked, but I am just checking whether it is the easiest way :))

I will post back when the kernel is built and installed.
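
The wget approach described here could be scripted roughly as follows. This is a sketch, not an official procedure: attachment_url and fetch_patches are hypothetical helper names, the output file names are an assumption, and the URL form is the attachment.cgi?id=... link used for attachments on this Bugzilla.

```shell
# attachment_url: hypothetical helper. Builds the download link for a
# Bugzilla attachment ID.
attachment_url() {
    printf 'https://bugzilla.kernel.org/attachment.cgi?id=%s\n' "$1"
}

# fetch_patches: hypothetical helper. Downloads each attachment ID given
# as an argument to attachment-<id>.patch in the current directory.
fetch_patches() {
    for id in "$@"; do
        wget -O "attachment-$id.patch" "$(attachment_url "$id")" || return 1
    done
}

# Example use with the four patches from comments 7 to 10:
#   fetch_patches 230111 230121 230131 230141
#   git am attachment-230111.patch attachment-230121.patch \
#          attachment-230131.patch attachment-230141.patch
```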
Comment 16 Dutch Guy 2016-08-26 15:46:19 UTC
I am getting this error during the kernel build:

  LD      drivers/ata/built-in.o
  CC [M]  net/802/p8022.o
Makefile:968: recipe for target 'drivers' failed
make[2]: *** [drivers] Error 2
make[2]: *** Waiting for unfinished jobs....
  CC [M]  net/802/psnap.o
  CC [M]  fs/nfs/nfstrace.o

I am using this tree:
# git clone https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
Comment 17 Dutch Guy 2016-08-26 16:30:09 UTC
4.8.0-RC3 also has some build errors, but these ones are connected to the patches:

 CC [M]  sound/drivers/mts64.o
  CC      drivers/acpi/acpica/tbdata.o
  CC      fs/ecryptfs/super.o
drivers/acpi/acpica/tbdata.c: In function ‘acpi_tb_load_table’:
drivers/acpi/acpica/tbdata.c:821:3: error: implicit declaration of function ‘acpi_ev_update_gpes’ [-Werror=implicit-function-declaration]
   acpi_ev_update_gpes(owner_id);
   ^
cc1: some warnings being treated as errors
scripts/Makefile.build:289: recipe for target 'drivers/acpi/acpica/tbdata.o' failed
make[5]: *** [drivers/acpi/acpica/tbdata.o] Error 1
scripts/Makefile.build:440: recipe for target 'drivers/acpi/acpica' failed
make[4]: *** [drivers/acpi/acpica] Error 2
scripts/Makefile.build:440: recipe for target 'drivers/acpi' failed
make[3]: *** [drivers/acpi] Error 2
Makefile:968: recipe for target 'drivers' failed
make[2]: *** [drivers] Error 2
make[2]: *** Waiting for unfinished jobs....
  CC      fs/ecryptfs/mmap.o
  CC [M]  sound/drivers/portman2x4.o
  CC      fs/ecryptfs/read_write.o
Comment 18 Lv Zheng 2016-08-29 01:03:23 UTC
It's my mistake.
I fixed this but forgot to post the update here.

You can add #include "acevents.h" to tbdata.c to fix this build issue.
Alternatively, I can post refreshed patches here.
The patches are now merged by ACPICA upstream, so it might be better to test the upstreamed version.

Thanks and best regards
Lv
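
Before rebuilding, it may help to check whether the include mentioned in this comment is already present in the file. A minimal sketch; has_include is a hypothetical helper name, and the path in the usage line assumes the command is run from the top of the kernel tree.

```shell
# has_include: hypothetical helper. Succeeds if file $1 contains the
# line fragment: #include "$2"
has_include() {
    grep -qF "#include \"$2\"" "$1"
}

# Example use:
#   has_include drivers/acpi/acpica/tbdata.c acevents.h \
#       || echo "tbdata.c does not include acevents.h yet"
```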
Comment 19 Dutch Guy 2016-08-29 06:27:46 UTC
I will do the test later this morning (on latest Linus RC kernel).
Which upstream git are you referring to (is that rafael/linux-pm.git)?
Comment 20 Lv Zheng 2016-08-29 07:39:22 UTC
Created attachment 231031 [details]
[PATCH 1/8] ACPICA: Interpreter: Fix MLC issues by switching to new term_list grammar for table loading

The upstreamed lock fixes are based on this.
Comment 21 Lv Zheng 2016-08-29 07:39:49 UTC
Created attachment 231041 [details]
[PATCH 2/8] ACPICA: Namespace: Add acpi_ns_get_node_unlocked()
Comment 22 Lv Zheng 2016-08-29 07:40:32 UTC
Created attachment 231051 [details]
[PATCH 3/8] ACPICA: Namespace: Fix dynamic table loading issues by tuning namespace/interpreter locks
Comment 23 Lv Zheng 2016-08-29 07:41:10 UTC
Created attachment 231061 [details]
[PATCH 4/8] ACPICA: Dispatcher: Fix a mutex issue for method auto serialization
Comment 24 Lv Zheng 2016-08-29 07:41:40 UTC
Created attachment 231071 [details]
[PATCH 5/8] ACPICA: Tables: Tune table mutex to be a leaf lock
Comment 25 Lv Zheng 2016-08-29 07:42:18 UTC
Created attachment 231081 [details]
[PATCH 6/8] ACPI 2.0 / AML: Enable correct ACPI subsystem initialization order for new table loading mode
Comment 26 Lv Zheng 2016-08-29 07:42:48 UTC
Created attachment 231091 [details]
[PATCH 7/8] ACPI 2.0 / AML: Improve module level execution by moving the If/Else/While execution to per-table basis
Comment 27 Lv Zheng 2016-08-29 07:43:18 UTC
Created attachment 231101 [details]
[PATCH 8/8] ACPI 2.0 / AML: Fix module level execution by correctly parsing table as TermList
Comment 28 Lv Zheng 2016-08-29 07:54:52 UTC
> I will do the test later this morning (on latest Linus RC kernel).
> What is the upstream git you are referencing to (is that the
> rafael/linux-pm.git)?

I uploaded the upstreamed patches. They are upstreamed here:
https://github.com/acpica/acpica
In order to use them with Linux, we need to "linuxize" these patches from ACPICA upstream.
So I posted the "linuxized" versions here, and they all contain the upstream URL and commit ID.

Then the test step could be:

Test A: the current code

I need you to upload the full dmesg output of the kernel boot log here before applying the patches and reverting the temporary fix.

1. build and boot your kernel
2. obtain the full dmesg output
# dmesg > dmesg-before.txt
3. upload dmesg-before.txt here

Test B: confirm "acpi_gbl_group_module_level_code = false"

I need you to upload the full dmesg output of the kernel boot log here for "acpi_gbl_group_module_level_code = false" mode.

1. apply the 1st set of patches, "the 1st set of patches" here means:
attachment 231031 [details]
attachment 231041 [details]
attachment 231051 [details]
attachment 231061 [details]
attachment 231071 [details]
attachment 231091 [details]
2. build and boot the kernel
3. obtain the full dmesg output
# dmesg > dmesg-after-mlc1.txt
4. upload dmesg-after-mlc1.txt here

Test C: confirm "acpi_gbl_parse_table_as_term_list = true"

I need you to upload the full dmesg output of the kernel boot log here for "acpi_gbl_parse_table_as_term_list = true" mode.

1. apply the 2nd set of patches on top of the 1st set of patches, "the 2nd set of patches" here means:
attachment 231081 [details]
attachment 231101 [details]
2. build and boot the kernel
3. obtain the full dmesg output
# dmesg > dmesg-after-mlc2.txt
4. upload dmesg-after-mlc2.txt here

If you fail to apply the fixes in step B.1/C.1, you can use the linux-pm.git/linux-next branch:
# git clone https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
# cd linux-pm
# git checkout linux-next
And redo the tests on top of this kernel repository.

Thanks in advance.
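
The two patch sets listed in this comment can be written down as a small lookup so that each test applies exactly the intended attachments. This is only a sketch: patch_ids and apply_set are hypothetical helper names, and the attachment-<id>.patch file names assume the patches were saved locally under those names. The IDs and their order are copied from the lists above.

```shell
# patch_ids: hypothetical helper. Prints the attachment IDs for the
# patch set used by test B (1st set) or test C (2nd set).
patch_ids() {
    case "$1" in
        B) echo "231031 231041 231051 231061 231071 231091" ;;
        C) echo "231081 231101" ;;
        *) return 1 ;;
    esac
}

# apply_set: hypothetical helper. Applies one patch set with git am,
# stopping at the first patch that fails to apply.
apply_set() {
    ids=$(patch_ids "$1") || return 1
    for id in $ids; do
        git am "attachment-$id.patch" || return 1
    done
}

# Example use inside the kernel tree:
#   apply_set B    # then build, boot, dmesg > dmesg-after-mlc1.txt
#   apply_set C    # then build, boot, dmesg > dmesg-after-mlc2.txt
```

Test C is applied on top of test B, so apply_set C is meant to be run in a tree where apply_set B has already succeeded.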
Comment 29 Dutch Guy 2016-08-29 09:49:10 UTC
Okay, I just compiled the version with the earlier 4 patches. But to do it correctly and completely, I will start over from your previous message.
I will report back when I have all 3 dmesg outputs.

I noticed that the 8 patches need the "revert" patch to be in place: if I revert it, applying them gives a patch error. So for tests B and C this patch is still in place and has not been removed. Do I still need to remove this patch? That is no longer clear to me.
# git revert 00c611def8748a0a1cf1d31842e49b42dfdb3de1

At the moment I am building 3 kernels:
kernel 4.8.0 RC4 with reverted 00c611def8748a0a1cf1d31842e49b42dfdb3de1
kernel 4.8.0 RC4 without reverted 00c611def8748a0a1cf1d31842e49b42dfdb3de1 & 5 patches
kernel 4.8.0 RC4 without reverted 00c611def8748a0a1cf1d31842e49b42dfdb3de1 & 8 patches

Please advise if this is not quite what you meant.
Comment 30 Dutch Guy 2016-08-29 17:24:34 UTC
Created attachment 231231 [details]
DMESG -> kernel 4.8.0-RC4-NoPatch
Comment 31 Dutch Guy 2016-08-29 17:25:07 UTC
Created attachment 231241 [details]
DMESG -> kernel 4.8.0-RC4-RevertedCommit
Comment 32 Dutch Guy 2016-08-29 17:26:08 UTC
Created attachment 231251 [details]
DMESG -> kernel 4.8.0-RC4-PatchAfterMlc1
Comment 33 Dutch Guy 2016-08-29 17:26:27 UTC
Created attachment 231261 [details]
DMESG -> kernel 4.8.0-RC4-PatchAfterMlc2
Comment 34 Dutch Guy 2016-08-29 17:29:16 UTC
This is DMESG from a fresh git clone, without ANY adjustments:
https://bugzilla.kernel.org/attachment.cgi?id=231231

This is DMESG from a fresh git clone, with commit reverted:
https://bugzilla.kernel.org/attachment.cgi?id=231241

This is DMESG from a fresh git clone, with 6 patches and NO commit reverted (so this is still in there!):
https://bugzilla.kernel.org/attachment.cgi?id=231251

This is DMESG from a fresh git clone, with 8 patches and NO commit reverted (so this is still in there!):
https://bugzilla.kernel.org/attachment.cgi?id=231261

Please tell me if you need more.
Kind regards.
Comment 35 Lv Zheng 2016-08-29 23:15:44 UTC
(In reply to Dutch Guy from comment #34)
> This is DMESG from a fresh git clone, without ANY adjustments:
> https://bugzilla.kernel.org/attachment.cgi?id=231231
> 
> This is DMESG from a fresh git clone, with commit reverted:
> https://bugzilla.kernel.org/attachment.cgi?id=231241
> 
> This is DMESG from a fresh git clone, with 6 patches and NO commit reverted
> (so this is still in there!):
> https://bugzilla.kernel.org/attachment.cgi?id=231251
> 
> This is DMESG from a fresh git clone, with 8 patches and NO commit reverted
> (so this is still in there!):
> https://bugzilla.kernel.org/attachment.cgi?id=231261
> 
> Please tell me if you need more.
> Kind regards.

The error:
[    0.018143] ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20160422/dswload-210)
[    0.018149] ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20160422/psobject-227)
[    0.018182] ACPI Exception: AE_NOT_FOUND, (SSDT:xh_rvp08) while loading table (20160422/tbxfload-227)
[    0.023682] ACPI Error: 1 table load failures, 8 successful (20160422/tbxfload-247)

seems to be gone in tests B and C.
The problem is fixed, so I'll mark the bug as resolved.

Thanks
Comment 36 Lv Zheng 2016-08-29 23:18:49 UTC
(In reply to Dutch Guy from comment #29)
> Okey, just compiled the version with the earlier 4 patches. But to do it
> correct/complete I will just start with your previous message.
> Will report back when I have all 3 dmesg outputs.
> 
> I noticed that the patches(8) need the "revert" patch to be in place, if I
> revert it, it will give a patch error. So for the test B & C this patch is
> still in place and not removed. Do I still need to remove this patch,
> because it is not yet so clear anymore?
> # git revert 00c611def8748a0a1cf1d31842e49b42dfdb3de1
> 

I posted the reversion commit here: attachment 231081 [details].
So if you followed comment 28 and applied this patch for test B/C, you don't have to revert "00c611def8748a0a1cf1d31842e49b42dfdb3de1".

> At the moment I am building 3 kernels:
> kernel 4.8.0 RC4 with reverted 00c611def8748a0a1cf1d31842e49b42dfdb3de1
> kernel 4.8.0 RC4 without reverted 00c611def8748a0a1cf1d31842e49b42dfdb3de1 &
> 5 patches

(not 6 patches?)

> kernel 4.8.0 RC4 without reverted 00c611def8748a0a1cf1d31842e49b42dfdb3de1 &
> 8 patches
> 

Could be sufficient.

> Please advice if this is not quite what you mentioned?

Thanks
Comment 37 Lv Zheng 2016-08-29 23:25:23 UTC
(In reply to Dutch Guy from comment #34)
> This is DMESG from a fresh git clone, without ANY adjustments:
> https://bugzilla.kernel.org/attachment.cgi?id=231231

I can see the error:
[    0.018143] ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20160422/dswload-210)
[    0.018149] ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20160422/psobject-227)
[    0.018182] ACPI Exception: AE_NOT_FOUND, (SSDT:xh_rvp08) while loading table (20160422/tbxfload-227)
[    0.023682] ACPI Error: 1 table load failures, 8 successful (20160422/tbxfload-247)


> 
> This is DMESG from a fresh git clone, with commit reverted:
> https://bugzilla.kernel.org/attachment.cgi?id=231241

I cannot see the error.
So it is fixed.
However the fix is not safe.

> 
> This is DMESG from a fresh git clone, with 6 patches and NO commit reverted
> (so this is still in there!):
> https://bugzilla.kernel.org/attachment.cgi?id=231251

I cannot see the error.
So it is fixed.
The test repo contains lock fixes, safer now.
However, acpi_gbl_group_module_level_code = false is not the right fix.

> 
> This is DMESG from a fresh git clone, with 8 patches and NO commit reverted
> (so this is still in there!):
> https://bugzilla.kernel.org/attachment.cgi?id=231261


I cannot see the error.
So it is fixed.
The test repo contains lock fixes, safer now.
acpi_gbl_parse_table_as_term_list = true is the right fix.

> 
> Please tell me if you need more.

No further tests are needed.
Thanks for the help.

Best regards
Comment 38 Dutch Guy 2016-08-30 05:07:48 UTC
(In reply to Lv Zheng from comment #36)
> > At the moment I am building 3 kernels:
> > kernel 4.8.0 RC4 with reverted 00c611def8748a0a1cf1d31842e49b42dfdb3de1
> > kernel 4.8.0 RC4 without reverted 00c611def8748a0a1cf1d31842e49b42dfdb3de1
> &
> > 5 patches
> 
> (not 6 patches?)

Correct, I meant 6

Thanks. I will wait for the inclusion of the fix in Linus's kernel (hopefully before the 4.8 release).
Comment 39 Lv Zheng 2016-12-12 07:16:04 UTC
We have changed our strategy and now track all ACPICA bugs on the ACPICA Bugzilla.
Please monitor this URL:
https://bugs.acpica.org/show_bug.cgi?id=963
I'll close this bug.