Bug 215867

Summary: tboot suspend broken
Product: Platform Specific/Hardware
Reporter: Derek Dolney (z23)
Component: x86-64
Assignee: Derek Dolney (z23)
Status: RESOLVED CODE_FIX
Severity: normal
CC: bp, regressions, vincent.donnefort
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.12.0
Subsystem:
Regression: Yes
Bisected commit-id:

Description Derek Dolney 2022-04-21 02:07:42 UTC
I am using tboot (v1.10.5) to make use of Intel TXT, and everything worked fine with the Linux kernel 5.10 series. Starting with the 5.12 release candidates, the machine still boots properly, but suspend is broken. I am using a Lenovo T460p. Normally, when this machine suspends, the power button LED blinks 8 times and then the machine goes into a sleep state. With newer kernels, the power LED and caps lock LED blink, the CPU fan runs fast, and I can't get out of that state without a hard power-down.

I did a git bisect and found that commit 453e41085183980087f8a80dada523caf1131c3c is the one that breaks tboot + suspend-to-RAM. It is part of a series of CPU hotplug commits.

Just to be clear: if I build a kernel from the commit just before this one, I can suspend and resume, but if I build with this commit I cannot suspend; the laptop gets stuck with the power LED blinking. Let me also mention that, with the above commit applied, if I do not use tboot, I can suspend and resume OK. It is only when booting through tboot that I have suspend and resume problems.
Comment 1 Borislav Petkov 2022-04-21 09:26:00 UTC
Switching to mail because I can't CC the patch author on bugzilla.

Vincent, see below. It points to your commit:

453e41085183 ("cpu/hotplug: Add cpuhp_invoke_callback_range()")

@Derek, just to make sure: you're seeing this with the latest 5.17
kernel too, correct?

Thx.

On Thu, Apr 21, 2022 at 02:07:42AM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=215867
> 
>             Bug ID: 215867
>            Summary: tboot suspend broken
>            Product: Platform Specific/Hardware
>            Version: 2.5
>     Kernel Version: 5.12.0
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: x86-64
>           Assignee: platform_x86_64@kernel-bugs.osdl.org
>           Reporter: kernel@dolney.com
>         Regression: Yes
Comment 2 Vincent Donnefort 2022-04-21 11:20:08 UTC
Hi Derek,

Sorry you're having a problem with that. 

This patch only affects which steps are called. I see in tboot.c a registration for CPUHP_AP_X86_TBOOT_DYING. The problem might come from there.

I don't have your platform, but I tried with a dummy CPUHP_AP_X86_TBOOT_DYING callback and couldn't see any problem: the callback runs properly. Would you have a chance to compare the list of steps during hot-unplug, with and without the patch? You can use the cpuhp_* trace events for that.
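
For anyone following along, the comparison asked for above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes two text dumps of the trace buffer (one per kernel) captured with the cpuhp events enabled, and it assumes that cpuhp_enter lines look like "cpuhp_enter: cpu: 0001 target:   0 step: 190 (callback_name)". The function names parse_steps, diff_steps and report are hypothetical helpers; adjust the pattern if your kernel prints the event differently.

```python
import re

# ASSUMPTION: shape of a cpuhp_enter line in the trace dump, for example
#   ... cpuhp_enter: cpu: 0001 target:   0 step: 190 (some_callback)
# Adjust the pattern if the event is printed differently on your kernel.
ENTER_RE = re.compile(
    r"cpuhp_enter:\s*cpu:\s*(\d+)\s+target:\s*(\d+)\s+step:\s*(\d+)\s+\((\S+)\)"
)


def parse_steps(trace_text):
    """Return the ordered list of (cpu, step, callback) found in a trace dump."""
    steps = []
    for line in trace_text.splitlines():
        match = ENTER_RE.search(line)
        if match:
            cpu, _target, step, callback = match.groups()
            steps.append((int(cpu), int(step), callback))
    return steps


def diff_steps(good_text, bad_text):
    """Return (only_in_good, only_in_bad), each kept in trace order."""
    good = parse_steps(good_text)
    bad = parse_steps(bad_text)
    good_set, bad_set = set(good), set(bad)
    only_good = [s for s in good if s not in bad_set]
    only_bad = [s for s in bad if s not in good_set]
    return only_good, only_bad


def report(good_path, bad_path):
    """Print the steps that appear in only one of the two trace dump files."""
    with open(good_path) as g, open(bad_path) as b:
        only_good, only_bad = diff_steps(g.read(), b.read())
    for cpu, step, callback in only_good:
        print(f"only without patch: cpu {cpu} step {step} ({callback})")
    for cpu, step, callback in only_bad:
        print(f"only with patch:    cpu {cpu} step {step} ({callback})")
```

Calling report() with the paths of the two dumps prints any step that was entered on one kernel but not on the other, which is the kind of difference the question above is looking for.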
Comment 3 The Linux kernel's regression tracker (Thorsten Leemhuis) 2022-05-15 08:43:54 UTC
Hey everyone, did this regression fall through the cracks? Or was some progress made toward getting it fixed?
Comment 4 The Linux kernel's regression tracker (Thorsten Leemhuis) 2022-05-20 06:10:14 UTC
Hi Derek, this issue is still on my list of open regressions, but it seems there hasn't been any progress. Have you tried what Vincent asked for?

(In reply to Vincent Donnefort from comment #2)

> Would you have a chance to compare the steps list during
> hotunplug, with and without the patch? You can use the cpuhp_* trace
> events for that.

Or did you lose interest in this?
Comment 5 Derek Dolney 2022-05-20 12:07:04 UTC
We have been working on this off list. Vincent has sent me a patch and I should have time to test it this weekend. Will report back after that.
Comment 6 Borislav Petkov 2022-11-19 16:19:33 UTC
Assigning to Derek for that report.
Comment 7 Derek Dolney 2022-11-20 02:53:40 UTC
The patch is nearing acceptance; here is the latest version: https://lkml.org/lkml/2022/9/27/371
Comment 8 Derek Dolney 2022-12-06 01:27:46 UTC
Patch has been merged into git tip: https://lkml.org/lkml/2022/12/2/399
Comment 9 Borislav Petkov 2022-12-07 18:20:11 UTC
Sounds like this is done then. Closing.