Bug 211585 - Laptop hangs on second suspend unless a file is written to second disk - Intel Core i7-5500U Acer Aspire R13
Status: NEEDINFO
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Sleep-Wake
Hardware: Intel Linux
Importance: P1 normal
Assignee: Zhang Rui
Reported: 2021-02-06 01:21 UTC by Rob Smith
Modified: 2022-06-21 07:17 UTC
Kernel Version: 5.10.12-1
Regression: No

Description Rob Smith 2021-02-06 01:21:28 UTC
Description
===========
Acer R13 (R7-371t) with dual SSDs suspends successfully exactly once per boot
and then hangs on the second suspend: the screen goes black, but the keyboard
backlight stays on and the status LED stays blue instead of switching to
flashing orange as it does during a successful suspend. After a failed
suspend, the system eventually reboots.

After extensive trial and error, I have found that writing a file to the second SSD prior to suspend resolves the issue.

Investigation
=============
I have been running Ubuntu on the primary disk (Liteon) in my laptop for some
years without issue; suspend worked reliably. Windows 10 was installed on the
second disk (Samsung) and that too had no suspend issues.

I recently installed openSUSE Tumbleweed on the Samsung SSD, replacing Windows,
and found that suspending openSUSE would hang the system on the second
attempt. I tried the following to resolve it, without success:

+ Changing the boot order of the drives.
+ Running SMART self-tests on the drives (no issues found).
+ Booting to runlevel 3 and suspending.
+ Alternative methods of suspending (KDE menu, power button, lid close, systemctl suspend).
+ Switching X from the modesetting driver to the intel driver.
+ Checking the kernel logs for errors and removing potentially problematic modules.

Nothing appeared to have any effect. For a failed suspend, 'journalctl -kxb -1'
showed that the kernel never reached the filesystem sync step:

kernel: PM: suspend entry (deep)
-- Reboot --

Whereas a successful suspend (first attempt after boot) would show:

kernel: PM: suspend entry (deep)
kernel: Filesystems sync: 0.020 seconds
kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
...
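
The difference between the two logs above can be checked mechanically. The
following is a minimal sketch, assuming the kernel messages look like the ones
quoted in this report; classify_suspend is a hypothetical helper name, not part
of systemd or the kernel. It reads a kernel log on stdin and reports how far
the most recent suspend attempt got.

```shell
#!/bin/sh
# Sketch: classify a kernel log (stdin) by how far the last suspend attempt got.
# classify_suspend is a hypothetical helper; the matched strings are taken
# from the log excerpts quoted in this report.
classify_suspend() {
    awk '
        /PM: suspend entry/ { entered = 1; synced = 0 }  # a new attempt starts
        /Filesystems sync:/ { if (entered) synced = 1 }  # the sync step completed
        END {
            if (!entered)    print "no-suspend"
            else if (synced) print "reached-sync"
            else             print "hung-before-sync"
        }
    '
}

# Example use against the previous boot, the same log inspected above:
#   journalctl -k -b -1 --no-pager | classify_suspend
```

Only the last "suspend entry" line matters here, so a boot with one good
suspend followed by a hang is reported as hung-before-sync.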

Workaround
==========
I had read reports from other users with dual disks describing the same issue,
so I opened the laptop, removed the Liteon disk and moved the Samsung disk into
the first slot. I booted with just the Samsung disk installed. The suspend
issue no longer occurred.

I then opened the laptop again and re-installed the Liteon disk in the second
slot. I booted expecting the issue to be resolved but again the laptop hung on
the second suspend.

I experimented with attempting to suspend while the disk was unmounted or
mounted, occasionally writing a file to the disk to confirm that it was still
functioning. Eventually it dawned on me that a suspend would succeed if a file
had been written to the disk prior to suspend. The following script has
prevented the issue from occurring over many suspends.

/usr/lib/systemd/system-sleep/liteon-keepalive.sleep:

#!/bin/sh
# systemd runs this hook with $1 set to "pre" before suspend and "post" after resume.
if [ "${1}" = "pre" ]; then
  # Write the keepalive file before suspending.
  echo "we are suspending at $(date)..." > /mnt/data/.keepaliveonsuspend
elif [ "${1}" = "post" ]; then
  # Append a matching note after resume.
  echo "...and we are back from $(date)" >> /mnt/data/.keepaliveonsuspend
fi
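
One case the hook above does not guard against: if /mnt/data is not mounted,
the write lands on the root filesystem and never touches the other disk. A
minimal sketch of a guarded variant follows; touch_disk is a hypothetical
helper name, and /mnt/data is assumed to be the mount point used in this
report.

```shell
#!/bin/sh
# touch_disk DIR PHASE: append a timestamped line to DIR/.keepaliveonsuspend.
# Hypothetical helper; returns non-zero if DIR does not exist.
touch_disk() {
    dir=$1
    phase=$2
    [ -d "$dir" ] || return 1
    printf '%s at %s\n' "$phase" "$(date)" >> "$dir/.keepaliveonsuspend"
}

# In the sleep hook one might then write (assumes /mnt/data as above):
#   mountpoint -q /mnt/data && touch_disk /mnt/data "$1"
```

The mountpoint check is kept outside the helper so the helper can be exercised
against any directory.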

Hardware/Software Details
=========================

Acer Aspire R13 (R7-371t)
Bios: InsydeH2O Rev.5.0 v1.12
CPU: Intel Core i7-5500U @ 4x 3GHz [49.0°C]
GPU: Mesa DRI Intel(R) HD Graphics 5500 (BDW GT2)
RAM: 5860MiB / 7860MiB
Disks/SSDs:
[1:0:0:0]    disk    ATA      LITEON IT L8T-12 202   /dev/sda 
[3:0:0:0]    disk    ATA      Samsung SSD 850  1B6Q  /dev/sdb

Distributor ID: openSUSE
Description:    openSUSE Tumbleweed
Release:        20210203
OS: openSUSE 20210203
Kernel: x86_64 Linux 5.10.12-1-default

Conclusions
===========
It seems unlikely to me that the Liteon disk is faulty, as I have not had any other issues with it and it is passing SMART tests. I suspect that either:

+ The disk is not being properly re-initialised after suspend.
+ There is a BIOS or disk firmware issue (no updates are likely to be issued in future).

Comment 1 Zhang Rui 2022-06-21 07:17:10 UTC
What controller are these disks connected to?
Can you disable runtime PM for the second controller and see if the problem still exists?
But this doesn't seem to be a software problem to me anyway.
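
A hedged sketch of how both questions could be answered from sysfs, assuming
the disks sit behind a PCI storage controller. pci_addr_from_path is a
hypothetical helper name; it extracts the last PCI address from a resolved
sysfs device path. The commented commands use the standard power/control
attribute, where "auto" means runtime PM is allowed and "on" keeps the device
always powered.

```shell
#!/bin/sh
# pci_addr_from_path PATH: print the last PCI address (dddd:bb:dd.f) found
# among the components of a resolved sysfs path. Hypothetical helper.
pci_addr_from_path() {
    printf '%s\n' "$1" | tr '/' '\n' |
        grep -E '^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]$' |
        tail -n 1
}

# Example use for /dev/sda (repeat for sdb to see whether both disks share
# one controller):
#   addr=$(pci_addr_from_path "$(readlink -f /sys/block/sda)")
#   lspci -s "$addr"                                  # which controller is it?
#   cat "/sys/bus/pci/devices/$addr/power/control"    # "auto" or "on"
#   echo on | sudo tee "/sys/bus/pci/devices/$addr/power/control"
```

Writing "on" does not persist across reboots, so it has to be repeated before
each test of the second suspend.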
