Bug 215880 - Resume process hangs for 5-6 seconds starting sometime in 5.16
Summary: Resume process hangs for 5-6 seconds starting sometime in 5.16
Status: NEW
Alias: None
Product: SCSI Drivers
Classification: Unclassified
Component: Other (show other bugs)
Hardware: AMD Linux
: P1 normal
Assignee: scsi_drivers-other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-04-24 15:39 UTC by ericspero
Modified: 2022-10-07 00:15 UTC (History)
9 users (show)

See Also:
Kernel Version: 5.16
Tree: Mainline
Subsystem:
Regression: No


Attachments
dmesg after suspend-resume cycle using patched kernel (5.80 KB, text/plain)
2022-06-06 16:48 UTC, ericspero
Details
Script that enables kernel debugging options (3.81 KB, text/plain)
2022-06-07 23:51 UTC, Bart Van Assche
Details

Description ericspero 2022-04-24 15:39:22 UTC
Previously reported on the Gentoo bug tracker: https://bugs.gentoo.org/837500

Basic issue: 

Starting sometime in 5.16, my GUI freezes up during resume for 5-6 seconds: the screen is frozen, and it seems key presses and mouse movements are not registered. If I suspend from a tty, the 'loginctl suspend' command takes about 5-6 seconds to finish after resuming, however here my key presses are registered. 

It seems like the first thing it does on resume is to suspend the device, as if the system powered off too quickly or something.

I used git bisect to identify the commit where the problem begins: 

========
commit a19a93e4c6a98c9c0f2f5a6db76846f10d7d1f85
Author: Bart Van Assche <bvanassche@acm.org>
Date:   Wed Oct 6 14:54:51 2021 -0700

    scsi: core: pm: Rely on the device driver core for async power management
    
    Instead of implementing asynchronous resume support in the SCSI core, rely
    on the device driver core for resuming SCSI devices asynchronously.
    Instead of only supporting asynchronous resumes, also support asynchronous
    suspends.
    
    Link: https://lore.kernel.org/r/20211006215453.3318929-2-bvanassche@acm.org
    Cc: Alan Stern <stern@rowland.harvard.edu>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Hannes Reinecke <hare@suse.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

 drivers/scsi/hosts.c      |  1 +
 drivers/scsi/scsi.c       |  8 --------
 drivers/scsi/scsi_pm.c    | 44 ++------------------------------------------
 drivers/scsi/scsi_priv.h  |  4 +---
 drivers/scsi/scsi_scan.c  | 17 +++++++++++++++++
 drivers/scsi/scsi_sysfs.c |  1 +
 drivers/scsi/sd.c         |  1 -
 7 files changed, 22 insertions(+), 54 deletions(-)
========

For git bisect I followed the instructions provided on the gentoo wiki: https://wiki.gentoo.org/wiki/Kernel_git-bisect, which uses the linux-stable repository.

Here is a sample dmesg output when I do not experience the bug ('good') and when I do ('bad'). Nothing to me seems obviously wrong. You do see some 'extra' lines in the 'bad' case at the end, but these lines also appear in the 'good' case later on; they don't have enough time to appear in the 'good' case when the system resumes quickly as normal. There are xHCI errors related to USB buses, but they appear in both cases.

Good: https://dpaste.org/S90gh
Bad:  https://dpaste.org/9KQ6C

Hardware info:
AMD Ryzen 3600
AMD Radeon RX580 
AMD Radeon 270X (usually bound to vfio-pci, otherwise amdgpu)
ASUS Strix ROG B450-F (latest BIOS)
Seasonic Prime GX-750 (brand new)
I also have an M.2 NVME drive (boot), a SATA SSD, and one HDD.


I'm happy to try any troubleshooting steps.
Comment 1 ericspero 2022-04-24 15:44:58 UTC
Forgot to mention some software info:

OS: Gentoo
WM: sway
init: OpenRC
login manager: elogind
Comment 2 Bart Van Assche 2022-05-01 19:16:23 UTC
In the Gentoo bug report there are links to the dmesg output for the 'good' and 'bad' cases. In the 'bad' dmesg output I found the following:

ata2: link is slow to respond, please be patient (ready=0)

Does that message still appear if the following three patches are reverted on top of kernel v5.16:
* 9131bff6a9f1 ("scsi: core: pm: Only runtime resume if necessary").
* 1c9575326a4a ("scsi: sd: Rename sd_resume() into sd_resume_system()").
* a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management").

If so, Damien, can you please help with checking whether or not this is perhaps an ATA regression?
Comment 3 ericspero 2022-05-01 20:02:49 UTC
Thank you for the response.

Sorry that I didn't make this clear in the original post, but I do get this message in the 'good' case as well - I just did not wait long enough for it to appear before copying the dmesg output. In the 'bad' case, the 5-6 second pause I get on resume gives it enough time for it to appear before I call dmesg. 

Here is a dmesg output from a 'good' case (5.15.33) where I waited a few seconds after resuming: 

https://dpaste.org/rdvvD

I assume those patches are not present in 5.15.33. If you would still like me to try reverting these patches please let me know.

I'm also noticing that in both 'good' and 'bad' cases the first line of output I see in dmesg after starting the resume process is "Suspending system...". I waited 10-15 seconds before resuming, but there are no gaps between messages in the timestamps. This is not how it works on my laptop that uses plain old systemd, where I get the equivalent message "PM: suspend entry (s2idle)" before the system suspends.
Comment 4 Damien Le Moal 2022-05-02 12:40:21 UTC
Hmmm... The message comes from ata_wait_ready(), which as the name suggests waits for the link to become ready (PHY and device present detected). And if the link does not become ready after 5s, that message is printed but the wait still continues. And given that the link *does* become ready, it looks like the drive/adapter are simply slow to wake-up. What is on ata2 ? Is it an HDD or a SSD ? If it is an HDD, I am not suprised that it is slow. Need time to spin-up :)
Comment 5 ericspero 2022-05-02 13:35:59 UTC
Why yes, it is a 4TB HDD. When I unplug it and then do suspend-resume I do not get the 'Waiting for ata2' error, and I don't get the 5-6 second system hang either! 

The drive is probably around 5 years old now, but it seems to work fine, and a smartctl short test shows no errors.
Comment 6 Damien Le Moal 2022-05-02 13:40:18 UTC
HDD disk spinup can take more than 10s. That is not something strange. The HDD controller electronic will "boot" faster than that and may start replying to commands without waiting for the full spin-up to complete. However, depending on the HDD internal implementation, the HDD controller may have to wait for the spinup to complete first to access its metadata (drive information) stored on the disk platters. For such HDDs, the drive will not respond until it is fully spun-up.

All things considered, I would say that the message is "normal" and not to be worried about.
Comment 7 ericspero 2022-05-02 14:04:40 UTC
I see, good to know. Thanks for the explanation, and for the original pointer as well.
Comment 8 Bart Van Assche 2022-05-05 22:24:18 UTC
I'm working on a patch.
Comment 9 ericspero 2022-05-06 20:49:15 UTC
Ok. Please let me know if you need any testing done.
Comment 10 The Linux kernel's regression tracker (Thorsten Leemhuis) 2022-05-30 13:20:01 UTC
(In reply to Bart Van Assche from comment #8)
> I'm working on a patch.

Bart, was there any progress? From here it looks like there wasn't any to get this regression fixed, but maybe I just didn't look at the right place.
Comment 11 Bart Van Assche 2022-05-30 16:58:53 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #10)
> Bart, was there any progress? From here it looks like there wasn't any to
> get this regression fixed, but maybe I just didn't look at the right place.

The past two weeks I was on vacation. I plan to continue working on this bug soon.
Comment 12 Bart Van Assche 2022-06-06 00:50:50 UTC
Can anyone help with testing this kernel branch: https://github.com/bvanassche/linux/tree/sd-resume ? The following patches exist on that kernel branch on top of kernel v5.18:

      block: Export blk_status_to_str()
      scsi: core: Move the definition of SCSI_QUEUE_DELAY
      scsi: core: Retry after a delay if the device is becoming ready
      scsi: sd: Rework asynchronous resume support
      scsi: sd: Add asynchronous resume debugging code
Comment 13 ericspero 2022-06-06 16:48:08 UTC
Created attachment 301109 [details]
dmesg after suspend-resume cycle using patched kernel
Comment 14 ericspero 2022-06-06 16:52:13 UTC
Thanks for sharing the branch. I've just done a few tests which I think I've done correctly, but I'll provide the steps I followed just in case.

1. git clone https://github.com/bvanassche/linux.git into /usr/src/linux
2. git checkout sd-resume
3. make, etc. etc.

To test I called `loginctl suspend`, and then hit enter to resume. I did this both in and out of a graphical environment.

After the first suspend -> resume cycle the system comes back without any 5-6 second delay. Here is a snippet of a dmesg output when this is the case (also attached in previous message):

https://dpaste.org/v5tFk/raw

However, when I `loginctl suspend` again, my screen goes black and the system does not respond to any keyboard inputs that normally cause it to resume, but the system seems to stay powered on (all fans still running). It seems to behave as if it is in S2 sleep, except it can't be resumed.

I will try to fetch a log snippet of when it is in this un-resumeable state.
Comment 15 ericspero 2022-06-06 18:19:18 UTC
These are the only log entries I see after I attempt to `loginctl suspend` a second time during a single boot:

Jun 6 13:54:36 gentoo sudo[2098]: pam_unix(sudo:session): session closed for user root
Jun 6 13:54:43 gentoo polkitd[1946]: Registered Authentication Agent for unix-process:2105:14523 (system bus name :1.7 [/usr/bin/pkttyagent --notify-fd 6 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_CA.utf8)
Comment 16 Bart Van Assche 2022-06-07 02:38:27 UTC
Thank you for having tested the sd-resume branch and also for having shared these results. I will see whether I can reproduce the reported behavior in a VM.
Comment 17 Bart Van Assche 2022-06-07 21:47:01 UTC
Please retest with the new version of the sd-resume branch. The deadlock triggered by the previous version has been fixed as follows:

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 836e13dae4d5..88f2d4737b47 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3576,9 +3576,11 @@ static void sd_start_done(struct request *req, blk_status_t status)
        }
 
 start_failed:
-       mutex_lock(&sdev->state_mutex);
+       /*
+        * This function may be called from interrupt context so it is not
+        * allowed here to call mutex_lock(&sdev->state_mutex).
+        */
        scsi_device_set_state(sdev, SDEV_OFFLINE);
-       mutex_unlock(&sdev->state_mutex);
 }
Comment 18 ericspero 2022-06-07 23:40:32 UTC
I am unfortunately still seeing the same 'deadlock' after updating to the latest on the sd-resume branch. 

I tried it twice, the second time I wiped the git repo and started a new pull from scratch just to make sure I wasn't somehow rebuilding the old sd-resume.

And again it looks like nothing is logged when I do the second suspend.
Comment 19 Bart Van Assche 2022-06-07 23:50:39 UTC
How about comparing git commit IDs to make sure that we are testing the same kernel branch? The topmost commit of the sd-resume branch on my development system is as follows: 1d48600df88f ("scsi: sd: Add asynchronous resume debugging code"). If we are using the same kernel branch, please enable kernel debugging in the kernel configuration file by running the attached enable-kernel-debugging-options script.
Comment 20 Bart Van Assche 2022-06-07 23:51:30 UTC
Created attachment 301120 [details]
Script that enables kernel debugging options
Comment 21 ericspero 2022-06-08 00:15:34 UTC
git log shows the same commit:

commit 1d48600df88f2c14b68e4884ceddc2f2d3e5bb4f (HEAD -> sd-resume, origin/sd-resume)
Author: Bart Van Assche <bvanassche@acm.org>
Date:   Sun May 8 16:31:52 2022 -0700

    scsi: sd: Add asynchronous resume debugging code

    Signed-off-by: Bart Van Assche <bvanassche@acm.org>

I have run the attached script in the kernel source directory, which makes some changes to the .config file. It looks like some of the options in "enable_y_options" are not being set to y, which I assume is because that option is absent from the .config file. Just thought I would mention. 

I will compile this kernel and we will see how it goes.
Comment 22 ericspero 2022-06-08 00:43:50 UTC
It looks like running 'make oldconfig' after running the script introduced most of the necessary options to .config, and running the script again set them '=y'. Some, like CONFIG_PM_TRACE, were not added, and are still '=n'.

When I make and install this kernel it has a '-dbg+' suffix. 

Strangely, when I try to resume from 'loginctl suspend' when running this kernel, the system either reboots or does a cold boot. The first thing I see is my BIOS splash. Sorry, I wish I better able to understand what is going on.
Comment 23 Bart Van Assche 2022-06-08 01:53:34 UTC
The reboot or cold boot may indicate that a kernel crash happened in interrupt context. How about enabling netconsole to send the kernel messages over the network and capturing these messages on a second system with Wireshark?

A possible alternative is to apply the patches at the top of the sd-resume branch with git cherry-pick on top of a kernel version that is known to work on your setup, e.g. v5.16.20. I can prepare a branch that has the result of this process if that would help.
Comment 24 ericspero 2022-06-08 02:52:44 UTC
Thanks for the suggestions. I've not done either of these things before but both seem doable. It's getting late here so it will have to wait until tomorrow. 

I will try netconsole first, and then the cherry pick method. My usual kernel is 5.18.2, so it shouldn't be an issue with 5.18. If you think this means the cherry pick option won't be helpful then please let me know.
Comment 25 ericspero 2022-06-16 17:02:07 UTC
Sorry for the delay - I got busy with other things, and it took me a while to figure out how to set up netconsole.

I made sure I was up to date with the latest on your github repo. The kernel version appears to be 5.19 now. When attempting to resume from suspend, it still boots up cold.

As soon as I got netconsole to print messages to the other machine I suspended the sender (with the problematic kernel). Here are the logs it spits out. No further messages were emitted after attempting to resume from suspend.

https://dpaste.org/Lg1jc

Is it maybe time to try the other method of applying your patches on top of a known working kernel? I mentioned last time that my day-to-day kernel is pretty up-to-date: 5.18.2 (I am a couple patch releases behind).
Comment 26 jason600 2022-06-28 21:17:18 UTC
I have been experiencing the same problem since 5.16. I have a hd and ssd (boot from the ssd). Since 5.16 resume from suspend to ram is taking 6-8 seconds (with 5.15 and before, it was less then 2 seconds. 

I narrowed it down to the hd waiting for spinup to finish suspend. Hdparm had not effect, also unmounting the drive had no effect. The only was to fix it was to remove the hd (mounted on sda) with 

# echo 1 > /sys/block/sda/device/delete

Then it resumed in 2 seconds again, but I couldn’t access the hd again until reboot.

I installed the patches from https://github.com/bvanassche/linux/tree/sd-resume on Opensuse Tumbleweed kernel 5.18.6.

I can confirm the patches work, resume from suspend is now back to under 2 seconds, with no obvious problems.

Thank you very much.
Comment 27 Bart Van Assche 2022-06-28 22:30:05 UTC
Thanks for testing! The patches from the sd-resume branch have been posted on the linux-scsi mailing list. See also https://lore.kernel.org/linux-scsi/20220628222131.14780-1-bvanassche@acm.org/T/#t
Comment 28 Nils 2022-07-15 18:44:27 UTC
Possibly related:

https://bugzilla.kernel.org/show_bug.cgi?id=216087

"Only Toshiba MG08/MG09 SATA hard disks disappear after resume from S3 suspend (reproducible) - AMD Ryzen 5 PRO 4650G"

on an X570 board (0x7901).
Comment 29 Nils 2022-08-15 11:02:50 UTC
(In reply to Bart Van Assche from comment #27)
> Thanks for testing! The patches from the sd-resume branch have been posted
> on the linux-scsi mailing list. See also
> https://lore.kernel.org/linux-scsi/20220628222131.14780-1-bvanassche@acm.org/
> T/#t

When will the changes be merged into mainline please?
Comment 30 Bart Van Assche 2022-08-15 13:36:06 UTC
I think that commit 88f1669019bd ("scsi: sd: Rework asynchronous resume support") got merged in Linus' tree (kernel v6.0-rc1) about two days ago.
Comment 31 gzhqyz 2022-08-16 11:08:54 UTC
Hi, on my laptop (500GB SATA HDD, Disk model: WDC WD5000LPVX-2), commit 88f1669019bd cause system hang after resume from suspend and reverting the commit fixes my problem.
Comment 32 Damien Le Moal 2022-08-16 15:44:59 UTC
(In reply to gzhqyz from comment #31)
> Hi, on my laptop (500GB SATA HDD, Disk model: WDC WD5000LPVX-2), commit
> 88f1669019bd cause system hang after resume from suspend and reverting the
> commit fixes my problem.

Can you try kernel 6.0-rc1 ?
Comment 33 Nils 2022-08-16 16:10:26 UTC
(In reply to Bart Van Assche from comment #30)
> I think that commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
> support") got merged in Linus' tree (kernel v6.0-rc1) about two days ago.

Great, thanks.  Looking forward to testing it once my distributor ships it.
Comment 34 Damien Le Moal 2022-08-16 16:14:23 UTC
(In reply to Nils from comment #33)
> (In reply to Bart Van Assche from comment #30)
> > I think that commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
> > support") got merged in Linus' tree (kernel v6.0-rc1) about two days ago.
> 
> Great, thanks.  Looking forward to testing it once my distributor ships it.

The patch has a fixes tag, so it will get backported to the latest 5.19 stable too.
Comment 35 gzhqyz 2022-08-16 17:11:55 UTC
(In reply to Damien Le Moal from comment #32)
> (In reply to gzhqyz from comment #31)
> > Hi, on my laptop (500GB SATA HDD, Disk model: WDC WD5000LPVX-2), commit
> > 88f1669019bd cause system hang after resume from suspend and reverting the
> > commit fixes my problem.
> 
> Can you try kernel 6.0-rc1 ?

Hi, the issue I reported is found in 6.0-rc1. Here is the kernel I am using (without commit 88f1669019bd):

Linux archlinux 6.0.0-rc1-2-mainline #1 SMP Tue, 16 Aug 2022 03:18:24 +0000 x86_64 GNU/Linux
Comment 36 jason600 2022-08-25 20:01:41 UTC
(In reply to Bart Van Assche from comment #27)
> Thanks for testing! The patches from the sd-resume branch have been posted
> on the linux-scsi mailing list. See also
> https://lore.kernel.org/linux-scsi/20220628222131.14780-1-bvanassche@acm.org/
> T/#t

Hi Bart, just an update for you. I noticed this had been removed from the 6.0-rc1 for freezing after suspend.

I've been compiling my kernel with this fix on various 5.18 kernels (with opensuse tumbleweed), it has worked fine, no freezing on resume as others have mentioned.

Yesterday, I updated to 5.19.2 kernel, applied the fix, recompiled, and it froze after the first suspend. Rebooted and the same thing happened again. I recompiled the kernel with the fix, just to make sure i didn't mess it up, and the same happened again.

When you originally did this fix, you based it on 5.18, and indeed, it works fine on 5.18 for me. There were a lot of changes to the drivers/scsi/sd.c file for 5.19, presumably it was those changes that made this fix start freezing after suspend.

Perhaps you could check if the other people that experienced freezing were using either 5.19 or 6.0-rc1.

Hopefully, this information might help in finding the cause and re-fixing this issue.

Thanks for all you help
Comment 37 Bart Van Assche 2022-08-25 20:31:02 UTC
On 8/25/22 13:01, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=215880
> 
> --- Comment #36 from jason600 (jason600.groome@gmail.com) ---
> (In reply to Bart Van Assche from comment #27)
>> Thanks for testing! The patches from the sd-resume branch have been posted
>> on the linux-scsi mailing list. See also
>>
>> https://lore.kernel.org/linux-scsi/20220628222131.14780-1-bvanassche@acm.org/
>> T/#t
> 
> Hi Bart, just an update for you. I noticed this had been removed from the
> 6.0-rc1 for freezing after suspend.
> 
> I've been compiling my kernel with this fix on various 5.18 kernels (with
> opensuse tumbleweed), it has worked fine, no freezing on resume as others
> have
> mentioned.
> 
> Yesterday, I updated to 5.19.2 kernel, applied the fix, recompiled, and it
> froze after the first suspend. Rebooted and the same thing happened again. I
> recompiled the kernel with the fix, just to make sure i didn't mess it up,
> and
> the same happened again.
> 
> When you originally did this fix, you based it on 5.18, and indeed, it works
> fine on 5.18 for me. There were a lot of changes to the drivers/scsi/sd.c
> file
> for 5.19, presumably it was those changes that made this fix start freezing
> after suspend.
> 
> Perhaps you could check if the other people that experienced freezing were
> using either 5.19 or 6.0-rc1.

Multiple people reported issues with freezes during suspend with kernel 
v6.0-rc1. Please take a look at the following report: 
https://lore.kernel.org/all/dd6844e7-f338-a4e9-2dad-0960e25b2ca1@redhat.com/. 
It shows that if zoned ATA disks are present that blk_mq_freeze_queue() 
may be called from inside ata_scsi_dev_rescan() on the context of a work 
queue. ATA rescanning happens from inside the SCSI error handler. So 
there is potential for a lockup because of the following:
* Execution of the START command being postponed because the SCSI error 
handler is active.
* blk_mq_freeze_queue() waiting for the START command to finish.
* The START completion handler not being executed because it got queued 
on the same work queue as the ATA rescan work.

Unfortunately I do not know enough about the ATA core to proceed. I need 
help from an ATA expert.
Comment 38 damien.lemoal 2022-08-25 22:15:41 UTC
On 8/26/22 05:31, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=215880
> 
> --- Comment #37 from Bart Van Assche (bvanassche@acm.org) ---
> On 8/25/22 13:01, bugzilla-daemon@kernel.org wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=215880
>>
>> --- Comment #36 from jason600 (jason600.groome@gmail.com) ---
>> (In reply to Bart Van Assche from comment #27)
>>> Thanks for testing! The patches from the sd-resume branch have been posted
>>> on the linux-scsi mailing list. See also
>>>
>>>
>>> https://lore.kernel.org/linux-scsi/20220628222131.14780-1-bvanassche@acm.org/
>>> T/#t
>>
>> Hi Bart, just an update for you. I noticed this had been removed from the
>> 6.0-rc1 for freezing after suspend.
>>
>> I've been compiling my kernel with this fix on various 5.18 kernels (with
>> opensuse tumbleweed), it has worked fine, no freezing on resume as others
>> have
>> mentioned.
>>
>> Yesterday, I updated to 5.19.2 kernel, applied the fix, recompiled, and it
>> froze after the first suspend. Rebooted and the same thing happened again. I
>> recompiled the kernel with the fix, just to make sure i didn't mess it up,
>> and
>> the same happened again.
>>
>> When you originally did this fix, you based it on 5.18, and indeed, it works
>> fine on 5.18 for me. There were a lot of changes to the drivers/scsi/sd.c
>> file
>> for 5.19, presumably it was those changes that made this fix start freezing
>> after suspend.
>>
>> Perhaps you could check if the other people that experienced freezing were
>> using either 5.19 or 6.0-rc1.
> 
> Multiple people reported issues with freezes during suspend with kernel 
> v6.0-rc1. Please take a look at the following report: 
> https://lore.kernel.org/all/dd6844e7-f338-a4e9-2dad-0960e25b2ca1@redhat.com/. 
> It shows that if zoned ATA disks are present that blk_mq_freeze_queue() 
> may be called from inside ata_scsi_dev_rescan() on the context of a work 
> queue. ATA rescanning happens from inside the SCSI error handler. So 
> there is potential for a lockup because of the following:
> * Execution of the START command being postponed because the SCSI error 
> handler is active.
> * blk_mq_freeze_queue() waiting for the START command to finish.
> * The START completion handler not being executed because it got queued 
> on the same work queue as the ATA rescan work.
> 
> Unfortunately I do not know enough about the ATA core to proceed. I need 
> help from an ATA expert.

I will have a look today.
Comment 39 damien.lemoal 2022-08-26 07:00:38 UTC
On 8/26/22 07:15, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=215880
> 
> --- Comment #38 from damien.lemoal@opensource.wdc.com ---
> On 8/26/22 05:31, bugzilla-daemon@kernel.org wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=215880
>>
>> --- Comment #37 from Bart Van Assche (bvanassche@acm.org) ---
>> On 8/25/22 13:01, bugzilla-daemon@kernel.org wrote:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=215880
>>>
>>> --- Comment #36 from jason600 (jason600.groome@gmail.com) ---
>>> (In reply to Bart Van Assche from comment #27)
>>>> Thanks for testing! The patches from the sd-resume branch have been posted
>>>> on the linux-scsi mailing list. See also
>>>>
>>>>
>>>>
>>>> https://lore.kernel.org/linux-scsi/20220628222131.14780-1-bvanassche@acm.org/
>>>> T/#t
>>>
>>> Hi Bart, just an update for you. I noticed this had been removed from the
>>> 6.0-rc1 for freezing after suspend.
>>>
>>> I've been compiling my kernel with this fix on various 5.18 kernels (with
>>> opensuse tumbleweed), it has worked fine, no freezing on resume as others
>>> have
>>> mentioned.
>>>
>>> Yesterday, I updated to 5.19.2 kernel, applied the fix, recompiled, and it
>>> froze after the first suspend. Rebooted and the same thing happened again.
>>> I
>>> recompiled the kernel with the fix, just to make sure i didn't mess it up,
>>> and
>>> the same happened again.
>>>
>>> When you originally did this fix, you based it on 5.18, and indeed, it
>>> works
>>> fine on 5.18 for me. There were a lot of changes to the drivers/scsi/sd.c
>>> file
>>> for 5.19, presumably it was those changes that made this fix start freezing
>>> after suspend.
>>>
>>> Perhaps you could check if the other people that experienced freezing were
>>> using either 5.19 or 6.0-rc1.
>>
>> Multiple people reported issues with freezes during suspend with kernel 
>> v6.0-rc1. Please take a look at the following report: 
>>
>> https://lore.kernel.org/all/dd6844e7-f338-a4e9-2dad-0960e25b2ca1@redhat.com/. 
>> It shows that if zoned ATA disks are present that blk_mq_freeze_queue() 
>> may be called from inside ata_scsi_dev_rescan() on the context of a work 
>> queue. ATA rescanning happens from inside the SCSI error handler. So 
>> there is potential for a lockup because of the following:
>> * Execution of the START command being postponed because the SCSI error 
>> handler is active.
>> * blk_mq_freeze_queue() waiting for the START command to finish.
>> * The START completion handler not being executed because it got queued 
>> on the same work queue as the ATA rescan work.

Checking the code, the dev pm resume call chain for ATA look like this.

ata_port_resume() -> ata_port_request_pm() -> ata_port_schedule_eh() ->
scsi_error_handler() thread runs -> shost->transportt->eh_strategy_handler
== ata_scsi_error() -> ata_scsi_port_error_handler() ->
ata_eh_handle_port_resume() -> ap->ops->port_resume() ==
ahci_port_resume() for AHCI adapters.

There are no commands issued to the device by this chain, only
registers/port settings being changed. So this should always complete
quickly and in itself not be the reason for the START_STOP command issued
by the sd driver to get stuck.

After calling ata_eh_handle_port_resume(), ata_scsi_port_error_handler()
will trigger a reset through ata_std_error_handler() -> ata_do_eh() and
that will cause the device rescan in EH context. Device rescan will
definitely spin up the device so the START_STOP command that sd_resume()
issues seems rather useless...

As a hack, it would be good to try something like this:

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 29e2f55c6faa..1bc92c04f048 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1081,6 +1081,7 @@ int ata_scsi_dev_config(struct scsi_device *sdev,
struct ata_device *dev)
 	} else {
 		sdev->sector_size = ata_id_logical_sector_size(dev->id);
 		sdev->manage_start_stop = 1;
+		sdev->no_start_on_resume = 1;
 	}

 	/*
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 8f79fa6318fe..4c28ca4d038b 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3761,7 +3761,7 @@ static int sd_suspend_runtime(struct device *dev)
 static int sd_resume(struct device *dev)
 {
 	struct scsi_disk *sdkp = dev_get_drvdata(dev);
-	int ret;
+	int ret = 0;

 	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
 		return 0;
@@ -3770,7 +3770,8 @@ static int sd_resume(struct device *dev)
 		return 0;

 	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
-	ret = sd_start_stop_device(sdkp, 1);
+	if (!sdkp->device->no_start_on_resume)
+		ret = sd_start_stop_device(sdkp, 1);
 	if (!ret)
 		opal_unlock_from_suspend(sdkp->opal_dev);
 	return ret;
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 3113471ca375..92e141536c6c 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -192,6 +192,7 @@ struct scsi_device {
 	unsigned no_start_on_add:1;	/* do not issue start on add */
 	unsigned allow_restart:1; /* issue START_UNIT in error handler */
 	unsigned manage_start_stop:1;	/* Let HLD (sd) manage start/stop */
+	unsigned no_start_on_resume:1; /* Do not issue START_STOP_UNIT on resume */
 	unsigned start_stop_pwr_cond:1;	/* Set power cond. in START_STOP_UNIT */
 	unsigned no_uld_attach:1; /* disable connecting to upper level drivers */
 	unsigned select_no_atn:1;

I am not sure at all this is correct though. This actually may break other
suspend/resume flavors. If we could somehow synchronize scsi pm resume to
run *after* ata pm resume, all problems should go away I think.
Comment 40 jason600 2022-10-06 15:37:22 UTC
I have been using Damien Lemoal's patch for the last couple of days on kernel 5.19.8, it works fine for me, and no obvious issues.

I also tried in on kernel 6.0, again works fine and no obvious issues.

(In reply to damien.lemoal from comment #39)
> I am not sure at all this is correct though. This actually may break other
> suspend/resume flavors. If we could somehow synchronize scsi pm resume to
> run *after* ata pm resume, all problems should go away I think.

With regard to Damiens comments, and the problems with this patch last time it was submitted, I think it would be a good idea to have more people testing this patch before re-submitting it.
Comment 41 Damien Le Moal 2022-10-06 23:48:06 UTC
(In reply to jason600 from comment #40)
> I have been using Damien Lemoal's patch for the last couple of days on
> kernel 5.19.8, it works fine for me, and no obvious issues.
> 
> I also tried in on kernel 6.0, again works fine and no obvious issues.
> 
> (In reply to damien.lemoal from comment #39)
> > I am not sure at all this is correct though. This actually may break other
> > suspend/resume flavors. If we could somehow synchronize scsi pm resume to
> > run *after* ata pm resume, all problems should go away I think.
> 
> With regard to Damiens comments, and the problems with this patch last time
> it was submitted, I think it would be a good idea to have more people
> testing this patch before re-submitting it.

I did not submit that patch on the lists, because I am not convinced this is the right approach. It is a "hack" right now. As mentioned, the problem is that scsi and ata pm resume are running at the same time because the dev/pm code does not seem to know the dependency between the devices (struct dev). Would need to dig further into that first to see if a cleaner solution can be coded. Will try, but busy, so it may take some time. I first need to be able to recreate the issue on my systems. Will start with that :)
Comment 42 Bart Van Assche 2022-10-07 00:10:46 UTC
Has it been considered to use device_link_add() to enforce the order in which devices are resumed?
Comment 43 damien.lemoal 2022-10-07 00:15:49 UTC
On 10/7/22 09:10, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=215880
> 
> --- Comment #42 from Bart Van Assche (bvanassche@acm.org) ---
> Has it been considered to use device_link_add() to enforce the order in which
> devices are resumed?
> 

I am not very familiar with that API level and pm code in general. So I
cannot say. Will need to look into it.

Note You need to log in before you can comment on or make changes to this bug.