Bug 219229

Summary: significant delays when secureboot is enabled after 6.10 kernel
Product: EFI
Reporter: Pengyu Ma (mapengyu)
Component: Boot
Assignee: EFI Virtual User (efi)
Status: RESOLVED CODE_FIX
Severity: normal
CC: ardb, jarkko, jejb, mario.limonciello
Priority: P3
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
6.11 kernel dmesg secureboot enabled
6.11 kernel dmesg secureboot disabled
6.8 kernel dmesg secureboot enabled
mainline 6.11-rc7 dmesg 17s boot
kernel config of 6.11-rc7

Description Pengyu Ma 2024-09-04 02:56:01 UTC
Created attachment 306811 [details]
6.11 kernel dmesg secureboot enabled

When Secure Boot is enabled,
the kernel boot time is ~20 seconds on 6.10 and later kernels.
It is ~7 seconds on the 6.8 kernel.

When Secure Boot is disabled,
the boot time is also ~7 seconds.

Reproduced on both AMD and Intel platforms, on ThinkPad X1 and T14.

The delay probably also causes an autologin failure and the micmute LED not being loaded on the AMD platform.

The 6.9 kernel was not tested because no signed kernel was found.
6.8, 6.10, and 6.11 were tested; the first bad version is 6.10.
Comment 1 Pengyu Ma 2024-09-04 02:57:06 UTC
Created attachment 306812 [details]
6.11 kernel dmesg secureboot disabled
Comment 2 Pengyu Ma 2024-09-04 02:57:28 UTC
Created attachment 306813 [details]
6.8 kernel dmesg secureboot enabled
Comment 3 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-09-05 10:33:09 UTC
You specified "IA64" as arch, that sounds wrong.

And were those tests performed with a vanilla kernel or a distro kernel?
Comment 4 Mario Limonciello (AMD) 2024-09-09 16:05:40 UTC
Yeah, I agree. I think what you need to do here is build a mainline upstream kernel, sign it, enroll your key into the BIOS, and then reproduce the issue.
Comment 5 Pengyu Ma 2024-09-10 02:33:13 UTC
commit 6519fea6fd372b2247a48d72dcb23e14de70b4ea
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Mon Apr 29 16:28:06 2024 -0400

    tpm: add hmac checks to tpm2_pcr_extend()
    
    tpm2_pcr_extend() is used by trusted keys to extend a PCR to prevent a
    key from being re-loaded until the next reboot.  To use this
    functionality securely, that extend must be protected by a session
    hmac.  This patch adds HMAC protection so tampering with the
    tpm2_pcr_extend() command in flight is detected.

The issue reproduces with the upstream kernel starting from 6.10-rc1.
The commit above, which landed in 6.10-rc1, causes this delay.
Comment 6 Pengyu Ma 2024-09-10 02:34:16 UTC
Created attachment 306846 [details]
mainline 6.11-rc7 dmesg 17s boot
Comment 7 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-09-10 05:09:17 UTC
Pengyu Ma, can I CC you when forwarding this report by mail? This would expose your email address to the public.
Comment 8 Pengyu Ma 2024-09-10 07:42:59 UTC
@Leemhuis,

Yes, please do, thanks.
Comment 9 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-09-10 09:21:34 UTC
(In reply to Pengyu Ma from comment #8)
> Yes, please do, thanks.

Forwarded: https://lore.kernel.org/regressions/92fbcc4c252ec9070d71a6c7d4f1d196ec67eeb0.camel@huaweicloud.com/T/#t
Comment 10 Mario Limonciello (AMD) 2024-09-10 18:06:59 UTC
I think that for now, until the direction is settled, it's best to use CONFIG_TCG_TPM2_HMAC=n to avoid this issue.
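
For anyone who wants to confirm how the option is set in a given kernel config, a minimal sketch follows. It assumes only the standard .config conventions (`CONFIG_X=y` for enabled, `# CONFIG_X is not set` for disabled). The helper name `kconfig_option_state` and the sample text are hypothetical and are not taken from the attached config.

```python
import re


def kconfig_option_state(config_text: str, option: str) -> str:
    """Return the assigned value ('y', 'm', ...) or 'n' if unset or absent."""
    assign = re.compile(rf"^{re.escape(option)}=(.*)$")
    for line in config_text.splitlines():
        line = line.strip()
        match = assign.match(line)
        if match:
            return match.group(1)
        if line == f"# {option} is not set":
            return "n"
    return "n"  # assumption: an option that never appears counts as disabled


# Hypothetical two-line config fragment, for illustration only.
sample = """\
CONFIG_TCG_TPM=y
# CONFIG_TCG_TPM2_HMAC is not set
"""
print(kconfig_option_state(sample, "CONFIG_TCG_TPM2_HMAC"))
```

In practice the text would be read from the build tree's .config file rather than from an inline string.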
Comment 11 jarkko 2024-09-14 11:07:31 UTC
(In reply to Pengyu Ma from comment #0)
> Created attachment 306811 [details]
> 6.11 kernel dmesg secureboot enabled
> 
> When secureboot is enabled,
> the kernel boot time is ~20 seconds after 6.10 kernel.
> it's ~7 seconds on 6.8 kernel version.
> 
> When secureboot is disabled,
> the boot time is ~7 seconds too.
> 
> Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
> 
> It probably caused autologin failure and micmute led not loaded on AMD
> platform.
> 
> 6.9 kernel version is not tested since not signed kernel found.
> 6.8, 6.10, 6.11 are tested, the first bad version is 6.10.

Can you attach your kernel config file? With that I can reverse engineer
the TPM call paths inside the kernel.
 
BR, Jarkko
Comment 12 Pengyu Ma 2024-09-15 11:01:13 UTC
Created attachment 306879 [details]
kernel config of 6.11-rc7

@Jarkko

Attached the kernel config which can reproduce the issue.

Thanks.
Comment 13 jarkko 2024-09-15 14:05:25 UTC
Thanks! I'll look into it eventually. Obviously we cannot get the exact same boot time with and without encryption.

However, I found a bottleneck in the implementation. The null ECC key is swapped after every single in-kernel TPM command. Even after creation it is immediately swapped, which does not make sense to me.

So I'm trying to make swapping lazy, so that most of the time TPM2_LoadContext would not be required if the kernel uses the TPM a lot. It might take over a week to get a PoC right for this, but I think I have an idea of how to approach it.
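
To illustrate why lazy swapping should help, here is a toy model that only counts context loads. It is not the kernel code and not the eventual patch: the class `ToyTpm`, the function `run_commands`, and the assumption that every command needs the null key loaded are all hypothetical simplifications. (The TPM 2.0 specification names the load command TPM2_ContextLoad.)

```python
from dataclasses import dataclass


@dataclass
class ToyTpm:
    """Counts context loads of the null key; does not talk to a real TPM."""
    key_loaded: bool = False
    context_loads: int = 0

    def load_null_key(self) -> None:
        # Only pay for a context load when the key is not already resident.
        if not self.key_loaded:
            self.context_loads += 1
            self.key_loaded = True

    def flush_null_key(self) -> None:
        self.key_loaded = False


def run_commands(n_commands: int, lazy: bool) -> int:
    """Return how many context loads n_commands would cost in this model."""
    tpm = ToyTpm()
    for _ in range(n_commands):
        tpm.load_null_key()       # model assumption: each command needs the key
        # ... the TPM command itself would run here ...
        if not lazy:
            tpm.flush_null_key()  # eager: swap the key out after every command
    return tpm.context_loads


eager_loads = run_commands(100, lazy=False)
lazy_loads = run_commands(100, lazy=True)
print(eager_loads, lazy_loads)
```

In this model the eager strategy pays one load per command, while the lazy one pays a single load for the whole run. A real implementation would also have to decide when the key must be flushed, which the sketch leaves out.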
Comment 14 jarkko 2024-09-15 14:57:58 UTC
(In reply to jarkko from comment #13)
> Thanks! I'll look into it eventually. Obviously we cannot get the exact same
> boot time with and without encryption.
> 
> However, I found a bottleneck in the implementation. Null ECC key is swapped
> after every single in-kernel TPM command. Even after creation it is
> immediately swapped, which does not make sense to me.
> 
> So I'm trying to make swapping lazy so that most of the time
> TPM2_LoadContext would not be required if kernel uses TPM a lot. Might take
> over a week to get PoC right for this but I think I have idea how to
> approach this.

I opened a branch for improvements. Just putting it here as a reference: https://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd.git/log/?h=hmac-v1

I'll create a patch set eventually based on this branch.
Comment 15 Pengyu Ma 2024-09-16 02:29:09 UTC
After applying the patches from comment 14, the boot time is ~15 seconds.
That is less than 20 seconds, but still much more than the 7 seconds with HMAC disabled.
Comment 16 jarkko 2024-09-20 11:53:51 UTC
(In reply to Pengyu Ma from comment #15)
> After applied the patches in Comment 14, the boot time is ~15 seconds.
> Less than 20 sec, but still much more than 7 sec when disabling HMAC.

Linking this here: https://lore.kernel.org/linux-integrity/20240918203559.192605-1-jarkko@kernel.org/

We can close this after the patch set has landed in the mainline tree. This will take a few weeks because I will send it between rc1 and rc2. Thanks for the productive co-op! Couldn't have moved this fast without your guidance.
Comment 17 jarkko 2024-09-21 13:11:44 UTC
Link to the performance figures:

https://lore.kernel.org/linux-integrity/CALSz7m3SXE3v-yB=_E3Xf5zCDv6bAYhjb+KHrnZ6J14ay2q9sw@mail.gmail.com/

I.e. now we have come down from 20 to less than 9 seconds, which is good enough.
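
As a rough worked check of the figures quoted in this thread, the sketch below compares the HMAC overhead before and after the patches. It assumes the ~7 second boot with HMAC disabled is a fair baseline for both measurements, and it treats 9 seconds as an upper bound; the variable names are illustrative only.

```python
baseline_s = 7.0   # boot time with HMAC disabled, as reported in this thread
before_s = 20.0    # boot time with HMAC enabled, before the patches
after_s = 9.0      # upper bound after the patches ("less than 9 seconds")

overhead_before = before_s - baseline_s  # extra seconds attributed to HMAC
overhead_after = after_s - baseline_s    # upper bound on the remaining overhead
reduction = 1 - overhead_after / overhead_before

print(f"overhead: {overhead_before:.0f}s -> under {overhead_after:.0f}s "
      f"(at least {reduction:.0%} less)")
```

Under these assumptions the added delay drops from about 13 seconds to under 2 seconds.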