Bug 198947 - Lenovo T440p don't wake up from suspend
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Sleep-Wake
Hardware: All Linux
Importance: P1 high
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-28 18:01 UTC by Frank
Modified: 2018-06-22 02:26 UTC

See Also:
Kernel Version: 4.15.4
Subsystem:
Regression: No
Bisected commit-id:


Attachments
The log of the bisect procedure. (5.02 KB, application/octet-stream)
2018-05-22 09:05 UTC, Frank

Description Frank 2018-02-28 18:01:38 UTC
Steps to Reproduce:
1. boot the kernel
2. close the laptop
3. open the laptop

Actual results:
The device does not wake up any more. Only a hard reset helps.

Expected results:
The device wakes up.

Additional info:
4.14.18 doesn't have this problem.
Firmware: 2.47 
Device Code: 20AN00C1GE
Comment 1 Frank 2018-03-02 08:31:28 UTC
Tested with 4.15.6 today, and it has the same problem.
Comment 2 Chen Yu 2018-03-06 17:17:25 UTC
Could you please help do a bisect?
Comment 3 Chen Yu 2018-03-12 02:01:53 UTC
ping..
Comment 4 Frank 2018-03-12 06:40:30 UTC
What information do you need?
Comment 5 Chen Yu 2018-03-30 03:27:57 UTC
(In reply to Frank from comment #4)
> What information do you need?
Please provide the output of
cat /sys/power/mem_sleep
on both 4.14.18 and 4.15.6.

Also, git bisect is a good way to find the offending commit, since there aren't too many versions between 4.14.18 and 4.15.6. But please test with the upstream kernel rather than the distribution one when bisecting.
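
For reference, /sys/power/mem_sleep lists the available suspend modes and marks the currently selected one with square brackets. A minimal Python sketch for pulling that out of the file's contents might look like the following; the helper name parse_mem_sleep is made up for illustration, and it assumes the usual space-separated format with one bracketed entry.

```python
def parse_mem_sleep(text):
    """Return (available_modes, active_mode) from /sys/power/mem_sleep content.

    The kernel marks the selected mode with square brackets,
    e.g. "s2idle [deep]". parse_mem_sleep is a hypothetical helper name.
    """
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]   # strip the brackets around the active mode
            active = token
        available.append(token)
    return available, active
```

On a running system it could be fed with the file contents, for example parse_mem_sleep(open("/sys/power/mem_sleep").read()).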
Comment 6 Frank 2018-03-30 09:45:14 UTC
After a detailed look at the device, it looks like it does not enter the sleep state correctly. With the working kernel, the LED on the cover starts flashing after entering the sleep state. But with the failing kernel, the LED doesn't flash after closing the cover.

Doing the same while the device is connected to the power supply works.
It only fails when running on battery power.

cat /sys/power/mem_sleep 
4.14.18:
	s2idle [deep]
4.15.12:
	s2idle [deep]
4.15.14(vanilla):
	s2idle [deep]	
4.16-rc7(vanilla):
	s2idle [deep]

Kernel 4.16-rc7 works on both battery and mains power.
All others except 4.14.18 and 4.16-rc7 fail on battery power.
Comment 7 Frank 2018-04-27 09:01:25 UTC
Today I tested 4.16.5 and it has the same problem.
The last working versions are 4.16-rc7 and 4.14.18.
Comment 8 Frank 2018-04-27 10:56:09 UTC
4.17-rc2 -> fails
4.14.37  -> works
Comment 9 Zhang Rui 2018-05-02 06:21:35 UTC
hmmm, so the problem is that
1. suspend always works well, with AC power, on any kernel
2. suspend breaks with battery power, on some kernel later than 4.14.18, then it gets fixed in 4.16-rc7, and then breaks again in 4.16.5 and 4.17-rc2?

First of all, I'm not quite convinced that it works in 4.16-rc7 but breaks in 4.16.5. Could you please try 4.16-rc7 and 4.16.0, and confirm whether each works or not? (Please use a vanilla kernel in both cases.)

If it is still true (works in 4.16-rc7, and breaks in 4.16.0), then please do a git bisect between these two releases to find out the offending commit.
Comment 10 Frank 2018-05-04 15:51:03 UTC
Hi Rui,
today I tested 4.16-rc7 and 4.16.0.
In both cases I used the vanilla one from kernel.org.
Both failed. I also retested 4.14.37.
The 4.14 line looks like the last working line.
What details can I provide to help?

I only see that the LCD goes black. But the CPU fan does not stop,
and the LED on the cover doesn't begin to flash. On a working kernel the fan stops after 2-5 seconds and the LED on the cover starts flashing slowly.

So it may also be that the sleep state is never fully entered.
Comment 11 Zhang Rui 2018-05-07 06:54:08 UTC
so please confirm if the problem
1. exists in 4.15-rc1
2. does not exist in 4.14.0
if yes, then please bisect between 4.14.0 and 4.15-rc1, to see which commit introduces the problem.
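
A bisect is essentially a binary search over the commits between a known-good and a known-bad revision: test the midpoint, mark it good or bad, and repeat on the remaining half. With git this is driven by `git bisect start`, `git bisect good <rev>` and `git bisect bad <rev>`. The Python sketch below only illustrates the search idea on a plain list; `first_bad`, the commit labels, and the `is_bad` callback are hypothetical, and it assumes a linear history in which every commit after the first bad one is also bad.

```python
def first_bad(commits, is_bad):
    """Return (first bad commit, number of commits tested).

    commits: list ordered oldest to newest; commits[0] is known good and
             commits[-1] is known bad (an assumption of this sketch).
    is_bad:  callback that tests one commit and returns True if it fails.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):
            bad = mid      # regression is at mid or earlier
        else:
            good = mid     # regression was introduced after mid
    return commits[bad], steps


# Hypothetical history: c0 (good) ... c15 (bad), regression introduced in c11.
history = [f"c{i}" for i in range(16)]
culprit, tested = first_bad(history, lambda c: int(c[1:]) >= 11)
print(culprit, tested)
```

Each round costs one kernel build and one suspend test, so the number of rounds grows roughly with the base-2 logarithm of the number of commits in the range.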
Comment 12 Frank 2018-05-12 09:20:41 UTC
Yes, that is the case.
Sorry for the stupid question, but how do I do the bisect between the kernels?
Comment 14 Frank 2018-05-21 09:16:07 UTC
Thanks for the link.
I will test the commits and report the "bad" one back.
But it will take a while, because I am very short on time.
Comment 15 Frank 2018-05-22 09:05:31 UTC
Created attachment 276129
The log of the bisect procedure.
Comment 16 Frank 2018-05-22 09:06:31 UTC
After one day of testing, this patch looks like the one that breaks it:
git bisect good | tee -a /var/tmp/bisect.log 
08810a4119aaebf6318f209ec5dd9828e969cba4 is the first bad commit
commit 08810a4119aaebf6318f209ec5dd9828e969cba4
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Wed Oct 25 14:12:29 2017 +0200

    PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
    
    The motivation for this change is to provide a way to work around
    a problem with the direct-complete mechanism used for avoiding
    system suspend/resume handling for devices in runtime suspend.
    
    The problem is that some middle layer code (the PCI bus type and
    the ACPI PM domain in particular) returns positive values from its
    system suspend ->prepare callbacks regardless of whether the driver's
    ->prepare returns a positive value or 0, which effectively prevents
    drivers from being able to control the direct-complete feature.
    Some drivers need that control, however, and the PCI bus type has
    grown its own flag to deal with this issue, but since it is not
    limited to PCI, it is better to address it by adding driver flags at
    the core level.
    
    To that end, add a driver_flags field to struct dev_pm_info for flags
    that can be set by device drivers at the probe time to inform the PM
    core and/or bus types, PM domains and so on on the capabilities and/or
    preferences of device drivers.  Also add two static inline helpers
    for setting that field and testing it against a given set of flags
    and make the driver core clear it automatically on driver remove
    and probe failures.
    
    Define and document two PM driver flags related to the direct-
    complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
    respectively, to indicate to the PM core that the direct-complete
    mechanism should never be used for the device and to inform the
    middle layer code (bus types, PM domains etc) that it can only
    request the PM core to use the direct-complete mechanism for
    the device (by returning a positive value from its ->prepare
    callback) if it also has been requested by the driver.
    
    While at it, make the core check pm_runtime_suspended() when
    setting power.direct_complete so that it doesn't need to be
    checked by ->prepare callbacks.
    
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Acked-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

:040000 040000 6f18a781ca7ee0501888a66532f0667f2926aeb1 440821a72777285dccc37d3a8254688bf4a24486 M	Documentation
:040000 040000 6aaceba7f5aae9368a1e6e287a1f56cb1326adbf 557c1672f5101aeae16ce6bda4969c42dd3321bb M	drivers
:040000 040000 bdc707f2a476baf517361c46ed28977cb30b6e1b 7c33fb89c953ad06a7b1c8b686d6b6a403aa509b M	include
Comment 17 Zhang Rui 2018-06-20 05:55:41 UTC
First, please confirm whether the problem still exists in the latest 4.18-rc kernel, as there are a couple of related fixes that I am aware of.
If yes, then please confirm that
1. the problem exists with this kernel
git checkout 08810a4119aaebf6318f209ec5dd9828e969cba4
2. the problem does not exist with this kernel
git checkout 08810a4119aaebf6318f209ec5dd9828e969cba4~1
Comment 18 Frank 2018-06-21 18:16:54 UTC
Hi Rui,
today I tested 4.18.0-rc1 and it does its job. :)
Comment 19 Zhang Rui 2018-06-22 02:26:30 UTC
good to know, bug closed.
