Bug 135691 - EC parallelism: parallel _Qxx evaluation problems
Summary: EC parallelism: parallel _Qxx evaluation problems
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: EC (show other bugs)
Hardware: Intel Linux
: P1 normal
Assignee: Lv Zheng
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-20 05:20 UTC by Lv Zheng
Modified: 2016-08-15 08:53 UTC (History)
2 users (show)

See Also:
Kernel Version: 4.3+
Tree: Mainline
Regression: Yes


Attachments
acpidump output (270.36 KB, text/plain)
2016-07-20 05:21 UTC, Lv Zheng
Details
dmidecode output (8.56 KB, text/plain)
2016-07-20 05:21 UTC, Lv Zheng
Details
The boot kernel log (108.17 KB, text/plain)
2016-07-20 05:22 UTC, Lv Zheng
Details
[PATCH] Disable tracer after the culprit is detected (500 bytes, patch)
2016-07-20 06:00 UTC, Lv Zheng
Details | Diff
/proc/cpuinfo (3.71 KB, text/plain)
2016-07-20 14:52 UTC, H Buus
Details
Dmesg output with 224351 patch (64.96 KB, text/plain)
2016-07-20 15:02 UTC, H Buus
Details
Dmesg with CONFIG_ACPI_DEBUG=y and acpi.trace_state=enabled (337.61 KB, text/plain)
2016-07-22 04:10 UTC, H Buus
Details
[PATCH] ACPI / EC: Reduce parallelism for EC _Qxx evaluations (1.70 KB, patch)
2016-07-22 09:42 UTC, Lv Zheng
Details | Diff
dmesg with ec-paral1.patch (4.77 MB, text/plain)
2016-07-22 21:58 UTC, H Buus
Details
[PATCH] ACPI / EC: Fix a regression by reducing parallel _Qxx evaluations (2.42 KB, patch)
2016-07-25 10:34 UTC, Lv Zheng
Details | Diff

Description Lv Zheng 2016-07-20 05:20:08 UTC
[1.] One line summary of the problem: 

acpi regression first bad commit 02b771b64b73226052d6e731a0987db3b47281e9

[2.] Full description of the problem/report: 

I found what I believe is a regression on my Gateway ne570 laptop. The problem is reproducible with kernels v4.7-rc7, v4.4.13 and v4.3. I found dmesg and kern.log were filling with messages like this:
Jul 10 17:01:10 buus-ne570 kernel: [ 57.741630] ACPI Error: Method reached maximum reentrancy limit (255) (20150818/dsmethod-341)
Jul 10 17:01:10 buus-ne570 kernel: [ 57.803775] ACPI Error: Method reached maximum reentrancy limit (255) (20150818/dsmethod-341)
Jul 10 17:01:10 buus-ne570 kernel: [ 57.866294] ACPI Error: Method reached maximum reentrancy limit (255) (20150818/dsmethod-341)
Jul 10 17:01:10 buus-ne570 kernel: [ 57.928403] ACPI Error: Method reached maximum reentrancy limit (255) (20150818/dsmethod-341)

With the latest stable kernel, 4.7-rc7, the ACPI error messages have changed to reflect a newer version, specifically:
Jul 16 12:59:18 buus-ne570 kernel: [   78.862942] ACPI Error: Method reached maximum reentrancy limit (255) (20160422/dsmethod-340)

These messages are repeated several times a second, and then pause for several seconds, then start again. I believe they are the cause of more sluggish behavior than with the older kernels. I have also found that if the machine is idle, there's a pause in the "ACPI Error" messages. I think it depends on how much writing to disk is going on, but that's just a gut feeling. I'm not quite sure how to verify that assumption, or if it would help diagnose the problem.

I believe the problem is a regression, because it did not occur with kernels v3.16.7, v3.19.8, or v4.2.8.

git bisecting the linux-stable kernel led me to this commit:

# first bad commit: [02b771b64b73226052d6e731a0987db3b47281e9] ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations

commit 02b771b64b73226052d6e731a0987db3b47281e9
Author: Lv Zheng <lv.zheng@intel.com>
Date: Wed Aug 12 11:12:02 2015 +0800

    ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations

    It is proven that Windows evaluates _Qxx handlers in a parallel way. This
    patch follows this fact, splits _Qxx evaluations from the NOTIFY queue to
    form a separate queue, so that _Qxx evaluations can be queued up on
    different CPUs rather than being queued up on a CPU0 bound queue.
    Event handling related callbacks are also renamed and sorted in this patch.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=94411
    Reported-and-tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
    Signed-off-by: Lv Zheng <lv.zheng@intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I can provide an annotated "git bisect log" associating commits from the linux-stable tree with the deb packages I built, installed, and tested to arrive at the "first bad commit".

[3.] Keywords (i.e., modules, networking, kernel):

[4.] Kernel version (from /proc/version): 
Linux version 4.7.0-040700rc7-generic (kernel@gloin) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1) ) #201607110032 SMP Mon Jul 11 04:34:25 UTC 2016

[5.] Output of Oops.. message (if applicable) with symbolic information resolved (see Documentation/oops-tracing.txt) 
Not applicable

[6.] A small shell script or example program which triggers the problem (if possible)
Not applicable
 
[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)

Linux buus-ne570 4.7.0-040700rc7-generic #201607110032 SMP Mon Jul 11 04:34:25 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

GNU C                   5.4.0
GNU Make                4.1
Binutils                2.26.1
Util-linux              2.27.1
Mount                   2.27.1
Module-init-tools       22
E2fsprogs               1.42.13
Jfsutils                1.1.15
Pcmciautils             018
PPP                     2.4.7
Linux C Library         2.23
Dynamic linker (ldd)    2.23
Linux C++ Library       6.0.21
Procps                  3.3.10
Net-tools               1.60
Kbd                     1.15.5
Console-tools           1.15.5
Sh-utils                8.25
Udev                    229
Wireless-tools          30
Modules Loaded          ac acer_wmi ahci ansi_cprng arc4 ath ath3k ath9k ath9k_common ath9k_hw autofs4 battery bluetooth bnep btbcm btintel btrtl btusb button ccm cdrom cfg80211 coretemp crc16 crc32c_intel crc32_pclmul crct10dif_pclmul cryptd ctr dm_mod drbg drm drm_kms_helper efi_pstore efivarfs efivars ehci_hcd ehci_pci evdev ext4 fat fjes fuse ghash_clmulni_intel hid hid_generic hidp hmac i2c_algo_bit i915 intel_cstate intel_powerclamp intel_rapl intel_rapl_perf irqbypass jbd2 jfs joydev kvm kvm_intel libahci libata libphy lp lpc_ich mac80211 mbcache media mei mei_me mfd_core mmc_core nls_cp437 nls_utf8 parport parport_pc ppdev pps_core psmouse ptp rfcomm rfkill scsi_mod sdhci sdhci_pci sd_mod serio_raw sg shpchp snd snd_hda_codec snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_core snd_hda_intel snd_hwdep snd_pcm snd_rawmidi snd_seq snd_seq_device snd_seq_midi snd_seq_midi_event snd_timer soundcore sparse_keymap sr_mod tg3 tpm tpm_tis usb_common usbcore uvcvideo vfat video videobuf2_core videobuf2_memops videobuf2_v4l2 videobuf2_vmalloc videodev wmi x86_pkg_temp_thermal xhci_hcd xhci_pci

[7.2.] Processor information (from /proc/cpuinfo):
Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz
I'll provide the full output if you want it.

[7.3.] Module information (from /proc/modules):
I can provide this if you want it.

[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
I can provide this if you want it.

[7.5.] PCI information ('lspci -vvv' as root)
I can provide this if you want it.

[7.6.] SCSI information (from /proc/scsi/scsi)
Doesn't exist.

[7.7.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):

I can provide acpidump output, not sure if I can attach emails sent to linux-acpi list, so I'm not attaching it now.

dmi.bios.date: 10/08/2013
dmi.bios.vendor: Insyde Corp.
dmi.bios.version: V2.06
dmi.board.asset.tag: Type2 - Board Asset Tag
dmi.board.name: EA50_CX
dmi.board.vendor: Gateway
dmi.board.version: V2.06
dmi.chassis.asset.tag: Acer Asset Tag String
dmi.chassis.type: 10
dmi.chassis.vendor: Insyde Corp.
dmi.chassis.version: V2.06
dmi.modalias: dmi:bvnInsydeCorp.:bvrV2.06:bd10/08/2013:svnGateway:pnNE570:pvrV2.06:rvnGateway:rnEA50_CX:rvrV2.06:cvnInsydeCorp.:ct10:cvrV2.06:
dmi.product.name: NE570
dmi.product.version: V2.06
dmi.sys.vendor: Gateway

[X.] Other notes, patches, fixes, workarounds:

This machine has the latest UEFI firmware available from the manufacturer, Insyde v2.06. FWIW, I believe the Gateway NE570 is a re-branded Acer Aspire E1-570.
Comment 1 Lv Zheng 2016-07-20 05:21:31 UTC
Created attachment 224321 [details]
acpidump output
Comment 2 Lv Zheng 2016-07-20 05:21:49 UTC
Created attachment 224331 [details]
dmidecode output
Comment 3 Lv Zheng 2016-07-20 05:22:05 UTC
Created attachment 224341 [details]
The boot kernel log
Comment 4 Lv Zheng 2016-07-20 05:42:41 UTC
This looks like a different issue that the EC one.
It's more about a bug of ACPICA's thread_count, not EC.
Let's first do some tests to capture the affected control method.
I'll paste a debug patch later.

Also please upload /proc/cpuinfo here.

Thanks and best regards
-Lv
Comment 5 Lv Zheng 2016-07-20 06:00:44 UTC
Created attachment 224351 [details]
[PATCH] Disable tracer after the culprit is detected

1. Please upload /proc/cpuinfo here.

2. Please apply this patch and build the kernel, then boot the kernel with:
    acpi.trace_state=enable
   And upload the dmesg output here.
   Hope the attached patch can help to reduce the output to obtain what we want.
   You can also tune the log buffer size with "log_buf_len=" kernel parameter.
   If you still cannot obtain the full boot log, please just upload the partial one and let me think different debugging approach.

Thanks and best regards
-Lv
Comment 6 H Buus 2016-07-20 14:52:36 UTC
Created attachment 224371 [details]
/proc/cpuinfo
Comment 7 H Buus 2016-07-20 15:02:47 UTC
Created attachment 224381 [details]
Dmesg output with 224351 patch

It just occurred to me, I did not CONFIG_ACPI_DEBUG in .config. Maybe I should have? If so, let me know and I will recompile and test.
Comment 8 Lv Zheng 2016-07-21 07:48:58 UTC
(In reply to H Buus from comment #7)
> Created attachment 224381 [details]
> Dmesg output with 224351 patch
> 
> It just occurred to me, I did not CONFIG_ACPI_DEBUG in .config. Maybe I
> should have? If so, let me know and I will recompile and test.

Yes, you should have CONFIG_ACPI_DEBUG enabled.
Sorry that I forgot to mentione.

The full documentation of this facility can be found at:
Documentation/acpi/method-tracing.txt.

Thanks and best regards
-Lv
Comment 9 H Buus 2016-07-22 04:10:51 UTC
Created attachment 224841 [details]
Dmesg with CONFIG_ACPI_DEBUG=y and acpi.trace_state=enabled
Comment 10 Lv Zheng 2016-07-22 06:49:03 UTC
I found 380 _Q47 Begin, but only 108 _Q47 End.
Thus the "reentrancy limit" error is triggered.
However I need to know why, let me think it first.

Thanks and best regards
-Lv
Comment 11 Lv Zheng 2016-07-22 09:42:21 UTC
Created attachment 224951 [details]
[PATCH] ACPI / EC: Reduce parallelism for EC _Qxx evaluations

Looks like the system_wq is not suitable during the period ACPICA is not improved.
Let me work this around using a special workqueue with parallel executions limitted.
Please try this patch to see if it can stop the regression.

Thanks
-Lv
Comment 12 H Buus 2016-07-22 21:58:53 UTC
Created attachment 225521 [details]
dmesg with ec-paral1.patch

That seems to work. The attached dmesg3.out is stitched together from two captures of dmesg output. When I grepped, I got these counts:
$ grep -c 'Begin.*_Q47' dmesg3.out
4380
$ grep -c 'End.*_Q47' dmesg3.out
4323

4380-4323=57 looks much better than 380-108=272

After running for over 1820 seconds, I disabled trace_state, especially since I know I have overflowed the dmesg log. That is with log_buf_len=4M.

I will keep it running, since the longest it has taken to trigger the regression was after 2140 seconds of uptime. I'll report back in a few hours.
Comment 13 H Buus 2016-07-23 05:59:41 UTC
Looking good. The kernel has been running for almost 7 hours, and still no "Method reached maximum reentrancy limit" errors.
Comment 14 Lv Zheng 2016-07-23 11:18:05 UTC
However this is still dependent on the number of online cpus.
Due to the ACPICA limitation, we could further make the max_active as 2 or configurable when this is upstreamed.
In the future, when ACPICA is improved, we can use a greater number.

Anyway, thanks for the report.
I'll refine the patch and send it to the upstream.
In the meanwhile, I'll mark this bug as resolved.

Best regards
-Lv
Comment 15 Lv Zheng 2016-07-25 10:31:28 UTC
(In reply to H Buus from comment #12)
> Created attachment 225521 [details]
> dmesg with ec-paral1.patch
> 
> That seems to work. The attached dmesg3.out is stitched together from two
> captures of dmesg output. When I grepped, I got these counts:
> $ grep -c 'Begin.*_Q47' dmesg3.out
> 4380
> $ grep -c 'End.*_Q47' dmesg3.out
> 4323
> 
> 4380-4323=57 looks much better than 380-108=272

We should use WQ_UNBOUND here.
Please help me to do one more test.
I'll post the patch later.
Comment 16 Lv Zheng 2016-07-25 10:34:05 UTC
Created attachment 225791 [details]
[PATCH] ACPI / EC: Fix a regression by reducing parallel _Qxx evaluations

Please try this patch and upload the dmesg output.
I want to know if Begin-End can be limited to a reasonable value.
If you still can see breakage.
Please modify /sys/devices/virtual/workqueue/kec_query/max_alive to help to find a best fit value.
Thanks in advance.

Best regards
-Lv
Comment 17 H Buus 2016-07-25 22:49:51 UTC
I won't be able to do this until Wednesday. I assume I should revert the ec-paral1.patch, right?
Comment 18 Lv Zheng 2016-07-25 23:31:57 UTC
(In reply to H Buus from comment #17)
> I won't be able to do this until Wednesday.

That's OK.

> I assume I should revert the ec-paral1.patch, right?

Yes. :)

Thanks
-Lv
Comment 19 Lv Zheng 2016-07-27 08:34:10 UTC
(In reply to H Buus from comment #17)
> I won't be able to do this until Wednesday. I assume I should revert the
> ec-paral1.patch, right?

No need to do further tests.
We just decided to use the version 1 for the Linux upstream.

Thanks
-Lv
Comment 20 H Buus 2016-07-27 15:21:03 UTC
OK. Let me know if you change your mind.
Comment 21 Lv Zheng 2016-08-15 08:53:50 UTC
Patch upstreamed:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e1191bd4f

Closing...

Thanks

Note You need to log in before you can comment on or make changes to this bug.