Bug 216413 - [BISECT INCLUDED] scsi/sd Rework asynchronous resume support breaks S2idle and S3 on several systems
Status: RESOLVED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: Intel Linux
Importance: P1 blocking
Assignee: Bart Van Assche
URL:
Keywords:
Depends on:
Blocks: 178231
Reported: 2022-08-25 21:24 UTC by Todd Brandt
Modified: 2022-08-29 22:37 UTC (History)
CC List: 4 users

See Also:
Kernel Version: 6.0.0-rc1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
otcpl-hp-x360-bsw-dmesg.txt (340.51 KB, text/plain)
2022-08-25 21:30 UTC, Todd Brandt
otcpl-hp-x360-system-info.txt (24.80 KB, text/plain)
2022-08-25 21:35 UTC, Todd Brandt
otcpl-dell-3493-icl-system-info.txt (26.45 KB, text/plain)
2022-08-25 21:38 UTC, Todd Brandt

Description Todd Brandt 2022-08-25 21:24:23 UTC
A commit in 6.0.0-rc1 has caused S2idle and S3 (freeze & mem) to completely hang the system on these 4 machines in our lab:

1) Clevo System76 Lemur 6
2) Lenovo Yoga 2 Pro
3) Dell Inspiron 3493
4) HP Pavilion x360

To reproduce the issue, boot kernel 6.0.0-rc1 or newer on any of these systems and run "sudo sleepgraph -m freeze" or "sudo sleepgraph -m mem". The system will then hang.
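
Before reproducing, it can help to confirm which sleep states the kernel advertises. The sketch below is illustrative and not part of the original report: it parses the contents of /sys/power/state and prints the sleepgraph command lines quoted above. The helper names (supported_modes, sleepgraph_command) are hypothetical, and the script deliberately only prints the commands, since running them suspends the machine.

```python
from pathlib import Path

# File exposed by the kernel that lists the supported system sleep states.
POWER_STATE = Path("/sys/power/state")


def supported_modes(state_text):
    """Split the contents of /sys/power/state (e.g. "freeze mem disk") into a list."""
    return state_text.split()


def sleepgraph_command(mode):
    """Build the command line from the report for a given mode ("freeze" or "mem")."""
    return ["sudo", "sleepgraph", "-m", mode]


if __name__ == "__main__":
    # Treat a missing file (e.g. inside a container) as "no modes advertised".
    text = POWER_STATE.read_text() if POWER_STATE.exists() else ""
    modes = supported_modes(text)
    for mode in ("freeze", "mem"):
        if mode in modes:
            # Only print the command; do not suspend from this script.
            print(" ".join(sleepgraph_command(mode)))
        else:
            print(f"{mode}: not advertised by this kernel")
```

Here "freeze" corresponds to S2idle and "mem" to S3, matching the two modes named in the report.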

I've bisected the problem to this specific commit:

88f1669019bd62b3009a3cebf772fbaaa21b9f38 is the first bad commit
commit 88f1669019bd62b3009a3cebf772fbaaa21b9f38
Author: Bart Van Assche <bvanassche@acm.org>
Date:   Thu Jun 30 12:57:03 2022 -0700

    scsi: sd: Rework asynchronous resume support
    
    For some technologies, e.g. an ATA bus, resuming can take multiple
    seconds. Waiting for resume to finish can cause a very noticeable delay.
    Hence this commit that restores the behavior from before "scsi: core: pm:
    Rely on the device driver core for async power management" for most SCSI
    devices.
    
    This commit introduces a behavior change: if the START command fails, do
    not consider this as a SCSI disk resume failure.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=215880
    Link: https://lore.kernel.org/r/20220630195703.10155-3-bvanassche@acm.org
    Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
    Cc: Ming Lei <ming.lei@redhat.com>
    Cc: Hannes Reinecke <hare@suse.de>
    Cc: John Garry <john.garry@huawei.com>
    Cc: ericspero@icloud.com
    Cc: jason600.groome@gmail.com
    Tested-by: jason600.groome@gmail.com
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

:040000 040000 dbd390c19cfddba2b559b06691404aee4c165384 54c7fa67e3a1605878999bdf1e39a95ca793238a M	drivers
Comment 1 Bart Van Assche 2022-08-25 21:27:26 UTC
A revert for that patch has been posted: https://lore.kernel.org/linux-scsi/20220816172638.538734-1-bvanassche@acm.org/
Comment 2 Todd Brandt 2022-08-25 21:30:55 UTC
Created attachment 301662 [details]
otcpl-hp-x360-bsw-dmesg.txt

dmesg log of the S3 suspend failure on the HP Pavilion x360
Comment 3 Todd Brandt 2022-08-25 21:32:02 UTC
One other thing to note: I tried removing this one commit from the very latest upstream 6.0.0-rc2 code, and that fixed the hang completely. So there's no doubt this one commit is the sole cause of the hang.
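
A local check like this one amounts to a git revert of the bisected commit on top of the newer tree. What follows is a minimal sketch, assuming a kernel git checkout in the current directory. The helper name revert_command is hypothetical, and it only builds the command rather than running it.

```python
import re

# Commit identified by the bisect in this report.
BAD_COMMIT = "88f1669019bd62b3009a3cebf772fbaaa21b9f38"


def revert_command(commit=BAD_COMMIT):
    """Return the git command that reverts `commit` without opening an editor."""
    # Accept only full 40-character SHA-1 hashes to avoid reverting the wrong commit.
    if not re.fullmatch(r"[0-9a-f]{40}", commit):
        raise ValueError(f"not a full commit hash: {commit!r}")
    return ["git", "revert", "--no-edit", commit]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run(..., check=True) to apply it.
    print(" ".join(revert_command()))
```

After rebuilding and booting the resulting kernel, rerunning the sleepgraph commands from the description shows whether the hang is gone.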
Comment 4 Todd Brandt 2022-08-25 21:35:52 UTC
Created attachment 301663 [details]
otcpl-hp-x360-system-info.txt

System info for the HP Pavilion x360
Comment 5 Todd Brandt 2022-08-25 21:38:40 UTC
Created attachment 301664 [details]
otcpl-dell-3493-icl-system-info.txt

Dell Inspiron 3493 system info
Comment 6 Todd Brandt 2022-08-25 21:46:48 UTC
(In reply to Bart Van Assche from comment #1)
> A revert for that patch has been posted:
> https://lore.kernel.org/linux-scsi/20220816172638.538734-1-bvanassche@acm.org/

When will this make it upstream? In 6.0.0-rc3 hopefully?
Comment 7 Bart Van Assche 2022-08-25 21:55:46 UTC
On 8/25/22 14:46, bugzilla-daemon@kernel.org wrote:
> When will this make it upstream? In 6.0.0-rc3 hopefully?

I'm not sure. This depends on the SCSI maintainer.
Comment 8 Todd Brandt 2022-08-25 22:12:13 UTC
(In reply to Bart Van Assche from comment #7)
> On 8/25/22 14:46, bugzilla-daemon@kernel.org wrote:
> > When will this make it upstream? In 6.0.0-rc3 hopefully?
> 
> I'm not sure. This depends on the SCSI maintainer.

OK, thank you. I'll keep monitoring and will close this bug when the revert lands upstream.
Comment 9 Todd Brandt 2022-08-29 22:37:04 UTC
Looks like the commit was successfully reverted in 6.0.0-rc3. S2idle works again on the HP Pavilion x360 and Dell Inspiron 3493.