Bug 218750 - Regression: S0ix sleep broken on 13th Gen Lenovo laptops
Summary: Regression: S0ix sleep broken on 13th Gen Lenovo laptops
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Sleep-Wake (show other bugs)
Hardware: Intel Linux
: P3 blocking
Assignee: acpi_power-sleep-wake
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-04-19 17:49 UTC by Mark Pearson
Modified: 2024-04-25 09:25 UTC (History)
3 users (show)

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
ACPI dump (562.16 KB, application/gzip)
2024-04-22 18:07 UTC, Mark Pearson
Details

Description Mark Pearson 2024-04-19 17:49:06 UTC
Hi,

S0ix sleep is broken on X1 Carbon G11 and X1 Yoga G8 platforms (Raptorlake CPU). 
Still checking if other platforms from that generation are impacted.

Bisect is pointing at:

[banther@x1c11 linux]$ git bisect bad
073237281a508ac80ec025872ad7de50cfb5a28a is the first bad commit
commit 073237281a508ac80ec025872ad7de50cfb5a28a
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Tue Feb 6 20:33:45 2024 +0100

    ACPI: PM: s2idle: Enable Low-Power S0 Idle MSFT UUID for non-AMD systems
    
    Systems based on Intel platforms that use the MSFT UUID for Low-Power S0
    Idle (LPS0) have started to ship, so allow the kernel to use the MSFT
    UUID in the non-AMD case too, but in that case make it avoid evaluating
    the same _DSM function for two different UUIDs and prioritize the MSFT
    one.
    
    While at it, combine two MSFT _DSM function mask checks in
    acpi_s2idle_restore_early() so as to make it reflect the
    acpi_s2idle_prepare_late() flow more closely and adjust the
    Modern Standby entry and exit comments slightly.
    
    Non-AMD systems that do not support MSFT UUID for Low-power S0 Idle are
    not expected to be affected by this change in any way.
    
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>

 drivers/acpi/x86/s2idle.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

I did some sanity checking and confirmed that if I force lps0_dsm_func_mask_microsoft = -EINVAL that S0ix works again.

Can this change be reverted while we figure out how to fix this please?

I'm happy to help with any testing, and can pull in Lenovo FW teams if FW level fixes are needed (with a note that those will take time, and I think this has to be reverted from 6.9 until the issue is root caused and understood).

I'll note that I'm not seeing any problems on the Meteorlake system I have - so it may be platform or CPU version specific. I've not had time to confirm

Thanks to Hans de Goede for notifying me of the issue in the 6.9-rc builds.

Mark
Comment 1 Mario Limonciello (AMD) 2024-04-22 02:24:39 UTC
I think you should attach an acpidump from an affected system.

Something notable about that commit that is different between Intel and AMD is the Intel case avoids evaluation of the same DSM with multiple UUIDs. Depending upon what the firmware actually does in the different cases perhaps that's the root of why this commit causes issues.

You could try to change it to behave like the way AMD does and let it evaluate both Microsoft and Intel paths if both are present instead of prioritizing to see if this theory has any legs.
Comment 2 Mark Pearson 2024-04-22 18:07:09 UTC
Created attachment 306196 [details]
ACPI dump

Thanks Mario for the suggestion.

acpidump attached
Comment 3 Mario Limonciello (AMD) 2024-04-22 18:13:24 UTC
Yup, try this.  If that works I'd say, go ahead and post a patch for Rafael to review.  It's better than a revert.

diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c
index cd84af23f7ea..3da2f075711e 100644
--- a/drivers/acpi/x86/s2idle.c
+++ b/drivers/acpi/x86/s2idle.c
@@ -488,21 +488,6 @@ static int lps0_device_attach(struct acpi_device *adev,
                rev_id = 1;
                lps0_dsm_func_mask = validate_dsm(adev->handle,
                                        ACPI_LPS0_DSM_UUID, rev_id, &lps0_dsm_guid);
-               if (lps0_dsm_func_mask > 0 && lps0_dsm_func_mask_microsoft > 0) {
-                       unsigned int func_mask;
-
-                       /*
-                        * Avoid evaluating the same _DSM function for two
-                        * different UUIDs and prioritize the MSFT one.
-                        */
-                       func_mask = lps0_dsm_func_mask & lps0_dsm_func_mask_microsoft;
-                       if (func_mask) {
-                               acpi_handle_info(adev->handle,
-                                                "Duplicate LPS0 _DSM functions (mask: 0x%x)\n",
-                                                func_mask);
-                               lps0_dsm_func_mask &= ~func_mask;
-                       }
-               }
        }

        if (lps0_dsm_func_mask < 0 && lps0_dsm_func_mask_microsoft < 0)
Comment 4 Rafael J. Wysocki 2024-04-23 19:45:48 UTC
Patch posted: https://lore.kernel.org/linux-acpi/12427214.O9o76ZdvQC@kreacher/

Note You need to log in before you can comment on or make changes to this bug.