Bug 153511 - Boot EC requirement - Kernel 4.7 much longer boot compared to 4.4 and 4.6
Summary: Boot EC requirement - Kernel 4.7 much longer boot compared to 4.4 and 4.6
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: EC (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Lv Zheng
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-21 21:10 UTC by jw.hendy
Modified: 2018-04-08 05:49 UTC (History)
6 users (show)

See Also:
Kernel Version: 4.7
Subsystem:
Regression: No
Bisected commit-id:


Attachments
side by side dmesg comparision where biggest early discrepancies occur (6.26 KB, text/plain)
2016-08-21 21:10 UTC, jw.hendy
Details
arch linux-lts default kernel config (4.4.19) (171.80 KB, text/plain)
2016-08-29 16:11 UTC, jw.hendy
Details
arch standard kernel default config (currently 4.7.2) (178.38 KB, text/plain)
2016-08-29 16:13 UTC, jw.hendy
Details
full dmesg from initial boot 4.4.16 (63.90 KB, text/plain)
2016-08-29 16:30 UTC, jw.hendy
Details
full dmesg from initial boot 4.7.0 (64.45 KB, text/plain)
2016-08-29 16:30 UTC, jw.hendy
Details
dmesg from LTS kernel (4.4.19) with acpi.trace_state=enable (66.36 KB, text/plain)
2016-09-01 22:58 UTC, jw.hendy
Details
dmesg from standard kernel (4.7.2) with acpi.trace_state=enable (67.29 KB, text/plain)
2016-09-01 22:58 UTC, jw.hendy
Details
dmesg from LTS kernel (4.4.19) with acpi.trace_state=enable (407.57 KB, text/plain)
2016-09-02 04:12 UTC, jw.hendy
Details
dmesg from standard kernel (4.7.2) with acpi.trace_state=enable (434.79 KB, text/plain)
2016-09-02 04:13 UTC, jw.hendy
Details
arch PKGBUILD for building 4.7.2 from source (12.66 KB, application/octet-stream)
2016-09-02 15:24 UTC, jw.hendy
Details
the version of drivers/acpi/ec.c I'm working with (45.50 KB, text/x-csrc)
2016-09-02 15:27 UTC, jw.hendy
Details
Two patches from v4.8 + 4 proposed patches (backported for v4.7.2) (27.76 KB, patch)
2016-09-03 19:54 UTC, Peter Wu
Details | Diff
dmesg with acpi.trace_state=enable for 4.7.2 compiled with Peter Wu's patches applied (415.03 KB, text/plain)
2016-09-06 23:29 UTC, jw.hendy
Details

Description jw.hendy 2016-08-21 21:10:33 UTC
Created attachment 229601 [details]
side by side dmesg comparision where biggest early discrepancies occur

Arch Linux recently rolled out 4.7.0, and I noticed my boot time increase dramatically. The lag/pause occurs just after selecting the desired kernel from my syslinux menu, and before it changes the modeline resolution and prompts for my cryptsetup root password.

I installed the linux-lts package for Arch (4.4.16) and compared the dmesg outputs, though the speed change was noticed from the last version of 4.6. It was more convenient to try 4.4 since there's a package for it (just want to rule out that it existed in 4.6). I've attached a file showing the three major discrepancies I see. Here's some quick comparisons for reference:

### kernel 4.4.16
$ systemd-analyze
Startup finished in 4.683s (kernel) + 7.625s (initrd) + 1.084s (userspace) = 13.393s

### kernel 4.7.0
$ systemd-analyze
Startup finished in 10.746s (kernel) + 10.518s (initrd) + 4.060s (userspace) = 25.325s


Biggest differences happen during these chunks of the boot process, both involving ACPI : EC. The first accounts for a ~1sec difference:

### kernel 4.4.16
[    0.279066] ACPI : EC: EC started
[    4.418531] ACPI: Interpreter enabled

### kernel 4.7.0
[    0.279066] ACPI : EC: EC started
[    4.418531] ACPI: Interpreter enabled

The second is a stop/start of EC which adds another 6sec to 4.7.0:

### kernel 4.4.16
[    4.528033] ACPI: Enabled 7 GPEs in block 00 to 3F         
[    4.528068] ACPI : EC: GPE = 0x16, I/O: command/status = 0 
[    4.528134] vgaarb: setting as boot device: PCI:0000:00:02

### kernel 4.7.0
[    5.711566] ACPI: Enabled 7 GPEs in block 00 to 3F        
[    5.711639] ACPI : EC: EC stopped                         
[    5.711667] ACPI : EC: GPE = 0x16, I/O: command/status = 0
[    5.711668] ACPI : EC: EC started                         
[   10.526295] vgaarb: setting as boot device: PCI:0000:00:02

I have a post on the Arch Linux forums about this as well (in case others comment there):
- https://bbs.archlinux.org/viewtopic.php?id=216096

I also found another user on Reddit reporting what appears to be similar:
- https://www.reddit.com/r/linuxquestions/comments/4yok1c/slow_kernel_boot_acpi_ec/

The laptop is an HP Zbook 15 running x86_64. I *believe* this is related to ACPI as that's where the first 1sec jump occurs, and the later big 6sec leap surrounds ACPI lines as well. That said, the first line *after* the time jump relates to vgaarb, so if you think the bug is really about that, please let me know and I can re-submit elsewhere.
Comment 1 Zhang Rui 2016-08-29 04:02:10 UTC
can you attach the kernel config files for the two kernels?
Comment 2 jw.hendy 2016-08-29 16:11:35 UTC
Created attachment 231181 [details]
arch linux-lts default kernel config (4.4.19)
Comment 3 jw.hendy 2016-08-29 16:13:21 UTC
Created attachment 231191 [details]
arch standard kernel default config (currently 4.7.2)
Comment 4 jw.hendy 2016-08-29 16:15:07 UTC
@Zhang Rui: Done. The kernel has updated to 4.7.2 (from 4.7.0) since I first experienced these issues, but the boot lag remains. If you'd like me to dig up the config for 4.7.0 I can do that; this was just easier.
Comment 5 jw.hendy 2016-08-29 16:30:28 UTC
Created attachment 231201 [details]
full dmesg from initial boot 4.4.16
Comment 6 jw.hendy 2016-08-29 16:30:49 UTC
Created attachment 231211 [details]
full dmesg from initial boot 4.7.0
Comment 7 Lv Zheng 2016-09-01 09:23:07 UTC
Need the comparable dmesgs with acpi.trace_state=enable boot parameter specified.

Thanks
Comment 8 jw.hendy 2016-09-01 17:37:35 UTC
Will add later today. Thanks for the advice.
Comment 9 jw.hendy 2016-09-01 22:58:08 UTC
Created attachment 231721 [details]
dmesg from LTS kernel (4.4.19) with acpi.trace_state=enable
Comment 10 jw.hendy 2016-09-01 22:58:45 UTC
Created attachment 231731 [details]
dmesg from standard kernel (4.7.2) with acpi.trace_state=enable
Comment 11 Lv Zheng 2016-09-02 00:02:56 UTC
"acpi.trace_state=enable" isn't working.
Please confirm if CONFIG_ACPI_DEBUG is set.
See Documentation/acpi/method-tracing.txt for details.

Thanks
Lv
Comment 12 jw.hendy 2016-09-02 01:47:52 UTC
Thanks for catching that. I haven't used the option and wasn't sure how to tell if it did anything new. I checked the attached configs and it's not set in the default arch package, so I'll have to compile both kernels after setting it.
Comment 13 jw.hendy 2016-09-02 04:12:59 UTC
Created attachment 231811 [details]
dmesg from LTS kernel (4.4.19) with acpi.trace_state=enable
Comment 14 jw.hendy 2016-09-02 04:13:29 UTC
Created attachment 231821 [details]
dmesg from standard kernel (4.7.2) with acpi.trace_state=enable
Comment 15 jw.hendy 2016-09-02 04:15:55 UTC
Updated previous dmesg files after re-compiling both current and LTS Arch kernels from ABS (https://wiki.archlinux.org/index.php/Arch_Build_System). All options left unchanged from bundled config other than CONFIG_ACPI_DEBUG=y. dmesg outputs definitely look different now!
Comment 16 Lv Zheng 2016-09-02 04:50:08 UTC
I have a patchset covering this issue.
This is in fact not an issue.
Just because _REG (which consumes time on this platform) is executed more times than the previous version.
However it really can be tuned not to execute so many times by checking if 2 EC devices are equivelent.

I'll Cc you when I send them to the community.
And ping you here after posting.

Thanks
Comment 17 Zhang Rui 2016-09-02 07:00:16 UTC
why not post here as early as possible? and maybe we can get one more tested-by when posting them to the list :p
Comment 18 Lv Zheng 2016-09-02 07:48:33 UTC
Because the issue reported here is a known and fixed issue in another bug.
For the fix of that bug, I've already prepared the patchset and is about to send them out.
Comment 19 Lv Zheng 2016-09-02 07:54:38 UTC
Hi, Hendy (BTW, could you let me know your full name?)

Please apply the following patches and try again:
https://patchwork.kernel.org/patch/9310505/
https://patchwork.kernel.org/patch/9310503/
https://patchwork.kernel.org/patch/9310509/
https://patchwork.kernel.org/patch/9310513/

Thanks
Comment 20 jw.hendy 2016-09-02 14:55:25 UTC
Sure thing. It's John Henderson. I'm going to apply them now and re-compile. Will let you know shortly and thanks for suck quick results!
Comment 21 jw.hendy 2016-09-02 15:23:59 UTC
Hmmm. Arch has a patchset that's bundled with the package, and I'm wondering if this could be affecting something. I'm going to attach ec.c for you, and the output of each patch is below. Let me know if I'm doing something incorrectly in applying. I looked manually at the 1-4 patch and it appears acpi_ec_dsdt_probe(void) doesn't match what the patch is looking for.

For reference, this is where the package pulls sources from:
- https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.7.tar.xz
- https://www.kernel.org/pub/linux/kernel/v4.x/patch-4.7.2.xz

I'll upload the PKGBUILD for reference as well.

##########

$ patch -p1 -i ../1-4-ACPI-EC-Cleanup-first_ec-boot_ec-code.patch 
patching file drivers/acpi/ec.c
Hunk #1 FAILED at 179.
Hunk #2 succeeded at 1222 (offset -6 lines).
Hunk #3 succeeded at 1293 (offset -6 lines).
Hunk #4 FAILED at 1379.
Hunk #5 succeeded at 1395 with fuzz 1 (offset -14 lines).
Hunk #6 succeeded at 1403 (offset -14 lines).
Hunk #7 succeeded at 1410 (offset -14 lines).
Hunk #8 succeeded at 1437 (offset -14 lines).
Hunk #9 FAILED at 1503.
Hunk #10 succeeded at 1593 with fuzz 2 (offset -14 lines).
Hunk #11 FAILED at 1616.
4 out of 11 hunks FAILED -- saving rejects to file drivers/acpi/ec.c.rej

$ patch -p1 -i ../2-4-ACPI-EC-Fix-a-memory-leakage-issue-in-acpi_ec_add.patch 
patching file drivers/acpi/ec.c
Hunk #1 succeeded at 1395 with fuzz 1 (offset -43 lines).
Hunk #2 succeeded at 1424 (offset -43 lines).

$ patch -p1 -i ../3-4-ACPI-EC-Fix-a-gap-that-ECDT-EC-cannot-handle-EC-events.patch 
patching file drivers/acpi/ec.c
Hunk #1 succeeded at 107 (offset -1 lines).
Hunk #2 succeeded at 1299 (offset -7 lines).
Hunk #3 succeeded at 1328 (offset -7 lines).
Hunk #4 succeeded at 1348 (offset -7 lines).
Hunk #5 FAILED at 1392.
Hunk #6 FAILED at 1414.
Hunk #7 FAILED at 1433.
Hunk #8 succeeded at 1413 (offset -43 lines).
Hunk #9 succeeded at 1428 (offset -43 lines).
Hunk #10 succeeded at 1446 (offset -43 lines).
Hunk #11 succeeded at 1489 with fuzz 2 (offset -45 lines).
Hunk #12 FAILED at 1543.
Hunk #13 FAILED at 1652.
5 out of 13 hunks FAILED -- saving rejects to file drivers/acpi/ec.c.rej
patching file drivers/acpi/internal.h
Hunk #1 succeeded at 182 (offset -3 lines).
patching file drivers/acpi/scan.c
Hunk #1 succeeded at 1959 with fuzz 2 (offset -85 lines).

$ patch -p1 -i ../4-4-ACPI-EC-Fix-issues-related-to-boot_ec.patch 
patching file drivers/acpi/ec.c
Hunk #1 succeeded at 1387 with fuzz 1 (offset -74 lines).
Hunk #2 succeeded at 1444 (offset -65 lines).
Hunk #3 succeeded at 1469 (offset -65 lines).
Hunk #4 succeeded at 1486 (offset -65 lines).
Hunk #5 FAILED at 1629.
Hunk #6 FAILED at 1642.
2 out of 6 hunks FAILED -- saving rejects to file drivers/acpi/ec.c.rej
Comment 22 jw.hendy 2016-09-02 15:24:46 UTC
Created attachment 231961 [details]
arch PKGBUILD for building 4.7.2 from source
Comment 23 jw.hendy 2016-09-02 15:27:35 UTC
Created attachment 231971 [details]
the version of drivers/acpi/ec.c I'm working with
Comment 24 Peter Wu 2016-09-03 16:58:40 UTC
(In reply to jw.hendy from comment #21)
> Hmmm. Arch has a patchset that's bundled with the package, and I'm wondering
> if this could be affecting something. I'm going to attach ec.c for you, and
> the output of each patch is below. Let me know if I'm doing something
> incorrectly in applying. I looked manually at the 1-4 patch and it appears
> acpi_ec_dsdt_probe(void) doesn't match what the patch is looking for.
> 
> For reference, this is where the package pulls sources from:
> - https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.7.tar.xz
> - https://www.kernel.org/pub/linux/kernel/v4.x/patch-4.7.2.xz

This "patchset" are just changes needed to update from 4.7.0 to 4.7.2 (stable). I was able to apply these four patches on top of v4.8-rc4, you might want to try that instead.
Comment 25 jw.hendy 2016-09-03 18:28:04 UTC
Not sure I follow. So if I'm on 4.7.2 these patches are already in place?
Comment 26 jw.hendy 2016-09-03 18:30:58 UTC
Oh. Sorry, I get it. The arch patches turn bring 4.7.0 up to 4.7.2. I guess I can try things on 4.8rc but I don't plan in running a release candidate kernel and the issue is with this kernel. Why not have them applied on 4.7.x?
Comment 27 Peter Wu 2016-09-03 19:54:48 UTC
Created attachment 232071 [details]
Two patches from v4.8 + 4 proposed patches (backported for v4.7.2)

The patch that is supposed to solve your issue is based of some ongoing development which is why it does not apply directly. I doubt that these will be backported, these will probably end up in v4.9 (I guess).

For testing the patches you can try to apply it on any version after v4.8-rc1, but if you really do not want to do that, I have attached a (compile-tested) backport based on v4.7.2 + fd6231e785f1 + bc539567ab52 + four patches from Lv.
Comment 28 jw.hendy 2016-09-06 23:28:54 UTC
Thanks Peter! I was able to apply that to the Arch package for 4.7.2 from ABS.

This appears to help significantly, and I'm back down to ~14sec. I attached an acpi traced dmesg for the patched version of 4.7.2 for inspection.

$ systemd-analyze 
Startup finished in 5.994s (kernel) + 7.424s (initrd) + 808ms (userspace) = 14.228s

$ systemd-analyze blame
          5.891s systemd-cryptsetup@luks\x2d7101e83b\x2d31c0\x2d4cdf\x2dbc07\x2
           202ms wicd.service
           180ms systemd-journald.service
            59ms systemd-journal-flush.service
            45ms systemd-logind.service
            42ms alsa-restore.service
            40ms systemd-udev-trigger.service
            39ms systemd-udevd.service
            35ms systemd-vconsole-setup.service
            35ms dev-hugepages.mount
            32ms systemd-modules-load.service
            31ms sys-kernel-debug.mount
            28ms kmod-static-nodes.service
            28ms polkit.service
            25ms systemd-tmpfiles-setup-dev.service
            24ms systemd-remount-fs.service
            19ms systemd-fsck@dev-disk-by\x2duuid-37441f68\x2d9d76\x2d45bc\x2db
            16ms systemd-fsck-root.service
            16ms user@1000.service
            15ms dev-mqueue.mount
            14ms tmp.mount

I'd say the changes here resolved nearly all or all of the issue, at least to the degree that if I'd updated to a version with this performance I wouldn't have suspected anything major going on like I did.

Thanks!
Comment 29 jw.hendy 2016-09-06 23:29:44 UTC
Created attachment 232301 [details]
dmesg with acpi.trace_state=enable for 4.7.2 compiled with Peter Wu's patches applied
Comment 30 Lv Zheng 2016-09-07 01:52:53 UTC
(In reply to jw.hendy from comment #28)
> Thanks Peter! I was able to apply that to the Arch package for 4.7.2 from
> ABS.
> 
> This appears to help significantly, and I'm back down to ~14sec. I attached
> an acpi traced dmesg for the patched version of 4.7.2 for inspection.
> 
> $ systemd-analyze 
> Startup finished in 5.994s (kernel) + 7.424s (initrd) + 808ms (userspace) =
> 14.228s
> 
> $ systemd-analyze blame
>           5.891s
> systemd-cryptsetup@luks\x2d7101e83b\x2d31c0\x2d4cdf\x2dbc07\x2
>            202ms wicd.service
>            180ms systemd-journald.service
>             59ms systemd-journal-flush.service
>             45ms systemd-logind.service
>             42ms alsa-restore.service
>             40ms systemd-udev-trigger.service
>             39ms systemd-udevd.service
>             35ms systemd-vconsole-setup.service
>             35ms dev-hugepages.mount
>             32ms systemd-modules-load.service
>             31ms sys-kernel-debug.mount
>             28ms kmod-static-nodes.service
>             28ms polkit.service
>             25ms systemd-tmpfiles-setup-dev.service
>             24ms systemd-remount-fs.service
>             19ms
> systemd-fsck@dev-disk-by\x2duuid-37441f68\x2d9d76\x2d45bc\x2db
>             16ms systemd-fsck-root.service
>             16ms user@1000.service
>             15ms dev-mqueue.mount
>             14ms tmp.mount
> 
> I'd say the changes here resolved nearly all or all of the issue, at least
> to the degree that if I'd updated to a version with this performance I
> wouldn't have suspected anything major going on like I did.
> 
> Thanks!

So marking this as resolve.
I'll rebase the patchset for the upstream.

Thanks for the help, Peter and John!
Comment 32 jw.hendy 2016-11-15 04:14:42 UTC
Thanks, Lv. I'm not familiar with how to know when those are brought into the kernel. Arch is on 4.8.7 now and I still have the issue (~23sec boot). What kernel is this targeted for? If you have a link of the process, feel free to save your words and just paste that!

I read a bit about submitting a patch and the review process, but I'm not clear if things tend to wait for a particular type of update (x.0 vs. 4.x, vs. 4.x.y).
Comment 33 Lv Zheng 2016-11-16 01:53:42 UTC
It's merged by upstream in v4.9-rc1.
So it should appear in some 4.9.x kernels.
Comment 34 jw.hendy 2016-11-16 02:50:58 UTC
Perfect -- can't wait, and thanks for your resolution.
Comment 35 Lv Zheng 2016-11-17 08:02:04 UTC
If you want to know the process...
TBH, I don't know distribution vendors' processes.
I just confirmed this via:
# git tag --contains <commit ID>
With the upstream linus tree.
Commit ID is "2a5708409e4e" in this case.
Comment 36 Lv Zheng 2017-01-03 01:50:31 UTC
*** Bug 191561 has been marked as a duplicate of this bug. ***
Comment 37 zKqri0 2018-04-08 05:49:54 UTC
I still have this bug with kernel 4.15 on Ubuntu and also when I last used Arch on 4.14 I had it too.

Note You need to log in before you can comment on or make changes to this bug.