Bug 196853

Summary: Thinkpad S1 Yoga resumes from suspend after a few seconds
Product: Drivers Reporter: Zachary Warren (conflatulence)
Component: OtherAssignee: drivers_other
Status: NEW ---    
Severity: normal CC: conflatulence, srinivas.pandruvada
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 4.12.10 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: output of journalctl -k -b with some suspend/resume cycles
.config used for bisecting
output of lsmod
journactl -b -k with suspend and dynamic debugging for hid-sensor-trigger.c
journactl with changed hysteresis values
Journal -b -k with suspend and dyndbg turned on from boot
Just prevent opening some devices
user req open
Remove reports enable for raw I/F.
journalctl -k -b with second patch and 4 suspend resume cycles
A workaround patch

Description Zachary Warren 2017-09-07 06:02:25 UTC
Created attachment 258243 [details]
output of journalctl -k -b with some suspend/resume cycles

Thinkpad S1 Yoga resumes from suspend after 3-4 seconds in 4.12.10 (without user interaction). Doing a bisect, the problematic commit is fc7957b6cdd7 iio: hid-sensor-trigger: Fix the race with user space powering up sensors.

* I saw this a bit earlier in Fedora 26 because Fedora applied that fix before it got into the stable release (it's in kernel-4.12.8-300.fc26.src.rpm).

Output of journactl -k -b with a couple of suspend/resume cycles is attached.

Note if I disable XHC by echo XHC > /proc/acpi/wakeup the suspend also works normally. The normal wakeup output is below (XHC enabled).
cat /proc/acpi/wakeup 
Device	S-state	  Status   Sysfs node
P0P1	  S4	*disabled
XHC	  S3	*enabled   pci:0000:00:14.0
HDEF	  S4	*disabled  pci:0000:00:1b.0
RP01	  S4	*disabled  pci:0000:00:1c.0
PXSX	  S4	*disabled
RP03	  S4	*disabled  pci:0000:00:1c.2
PXSX	  S5	*disabled  pci:0000:04:00.0
PXSX	  S4	*disabled  pci:0000:05:00.0
		*disabled  platform:rtsx_pci_sdmmc.0
		*disabled  platform:rtsx_pci_ms.0
PEG0	  S4	*disabled
PEGP	  S4	*disabled
PEG1	  S4	*disabled
PEG2	  S4	*disabled
LID0	  S4	*enabled   platform:PNP0C0D:00

I can't see how to find more info about what causes the machine to wake up again. I tried to use pm_trace but it oopsed.
Comment 1 Zachary Warren 2017-09-07 06:06:32 UTC
Created attachment 258245 [details]
.config used for bisecting
Comment 2 Zachary Warren 2017-09-07 06:15:35 UTC
Created attachment 258247 [details]
output of lsmod
Comment 3 Srinivas Pandruvada 2017-09-07 16:14:52 UTC
Looks like the sensor hub is a wake device and is connected to USB.
What is the output of 
lsusb
Comment 4 Srinivas Pandruvada 2017-09-07 17:43:49 UTC
Also enable dynamic debug

echo 'file hid-sensor-trigger.c +p' >                              /sys/kernel/debug/dynamic_debug/control

If during S3, if the sensors are kept on means that you can wake the device by lifting or moving. 


In /sys/bus/iio/devices/iio:device*/ there should be a hysteresis attribute.
It can be increased to avoid wakeups. If you write a bigger value and readback that it is accepted, should work.
Comment 5 Zachary Warren 2017-09-07 23:19:04 UTC
Created attachment 258261 [details]
journactl -b -k with suspend and dynamic debugging for hid-sensor-trigger.c
Comment 6 Zachary Warren 2017-09-07 23:19:47 UTC
lsusb output
Bus 001 Device 004: ID 056a:00ec Wacom Co., Ltd TPCEC
Bus 001 Device 003: ID 0483:91d1 STMicroelectronics Sensor Hub
Bus 001 Device 002: ID 8087:8000 Intel Corp. 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 004: ID 04f2:b39f Chicony Electronics Co., Ltd 
Bus 002 Device 003: ID 04f3:0254 Elan Microelectronics Corp. 
Bus 002 Device 008: ID 22b8:2e24 Motorola PCS 
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Comment 7 Zachary Warren 2017-09-07 23:55:12 UTC
Created attachment 258263 [details]
journactl with changed hysteresis values
Comment 8 Zachary Warren 2017-09-08 00:00:23 UTC
I tried to change hysteresis values but it did not change the behaviour.

These are hysteresis values available:
devices]$ for dev in iio\:device* ; do echo $dev $(cat $dev/name) ; cat $dev/*hysteresis ; done
iio:device0 incli_3d
cat: 'iio:device0/in_incli_hysteresis': Invalid argument
iio:device1 als
cat: 'iio:device1/in_illuminance_hysteresis': Invalid argument
cat: 'iio:device1/in_intensity_hysteresis': Invalid argument
iio:device2 dev_rotation
cat: 'iio:device2/in_rot_hysteresis': Invalid argument
iio:device3 accel_3d
0.020000
iio:device4 gyro_3d
0.500000
iio:device5 magn_3d
20.000000
cat: 'iio:device5/in_rot_hysteresis': Invalid argument

I changed 0.02 to 1.0, 0.5 to 10.0 and 20 to 100. The log of suspend with these changes is attached. It is a continuation of the dyndbg log (no reboot in between). Interesting, the hysteresis values return to their default values after reset.
Comment 9 Srinivas Pandruvada 2017-09-08 00:20:08 UTC
Not sure how this commit would have caused this issue. All sensors are powered off before suspend.

Sep 08 09:03:41 alice kernel: HID_SENSOR HID-SENSOR-200041 set power_state 6 report_state 1
Sep 08 09:03:41 alice kernel: HID_SENSOR HID-SENSOR-200083 set power_state 6 report_state 1
Sep 08 09:03:41 alice kernel: HID_SENSOR HID-SENSOR-200073 set power_state 6 report_state 1
S

What is the output of
# systemctl status iio-sensor-proxy.service

I don't think that you have this service running.
Can you start and try?
Comment 10 Srinivas Pandruvada 2017-09-08 01:13:23 UTC
One more request:

Can you enable the dyanmic debug during kernel boot via boot param (edit grub command line). In this way i can also see the boot trace.

dyndbg="hid-sensor-trigger.c +p"
Comment 11 Zachary Warren 2017-09-08 03:34:15 UTC
Created attachment 258265 [details]
Journal -b -k with suspend and dyndbg turned on from boot

I had some trouble turning on the dyndbg from boot, I think because hid_sensor_trigger is a module. Using hid_sensor_trigger.dyndbg=... on the command line didn't seem to work either but adding a /etc/modprobe.d/hid_sensor_trigger.conf with the line "options hid_sensor_trigger dyndbg" seem to work.
Comment 12 Zachary Warren 2017-09-08 03:35:32 UTC
The iio-sensor-proxy service seems to be running:

systemctl status iio-sensor-proxy.service 
● iio-sensor-proxy.service - IIO Sensor Proxy service
   Loaded: loaded (/usr/lib/systemd/system/iio-sensor-proxy.service; static; vendor preset: disabled)
   Active: active (running) since Fri 2017-09-08 13:24:19 AEST; 10min ago
 Main PID: 698 (iio-sensor-prox)
    Tasks: 3 (limit: 4915)
   CGroup: /system.slice/iio-sensor-proxy.service
           └─698 /usr/sbin/iio-sensor-proxy

Sep 08 13:24:16 alice systemd[1]: Starting IIO Sensor Proxy service...
Sep 08 13:24:19 alice systemd[1]: Started IIO Sensor Proxy service.
Sep 08 13:24:20 alice iio-sensor-prox[698]: Failed to read float from /sys/devices/pci0000:00/0000:00:1d.0/usb1/1-1/1-1.7/1-1.7:1.0/0003:0483:91D1.0001/HID-SENSOR-200073.2.auto/iio:device1/in_scale
Sep 08 13:24:20 alice iio-sensor-prox[698]: Failed to read float from /sys/devices/pci0000:00/0000:00:1d.0/usb1/1-1/1-1.7/1-1.7:1.0/0003:0483:91D1.0001/HID-SENSOR-200073.2.auto/iio:device1/in_offset
Comment 13 Zachary Warren 2017-09-08 04:29:20 UTC
Sorry I've made missed some important info: When I did the bisect I had iio-sensor-proxy masked (I didn't do this intentionally, I had it masked from messing about with power settings a long time ago). I enabled it and saw the buggy behavior on 4.12.10 but didn't go back to the pre-fc7957 kernel to see how it behaved. I've done that test now and I think having iio-sensor-proxy enabled causes the wake-from-suspend problem even before fc7957. I'll do some more tests to confirm it but that is what I currently believe.
Comment 14 Srinivas Pandruvada 2017-09-08 06:01:24 UTC
Created attachment 258267 [details]
Just prevent opening some devices
Comment 15 Srinivas Pandruvada 2017-09-08 06:04:26 UTC
Created attachment 258269 [details]
user req open

Try this and let me know it helps.

As you said you have now issue even before this patch. The FW is broken as it is generating interrupts on powered off sensor.
I suggest try if there is any updated bios.
Comment 16 Zachary Warren 2017-09-08 06:54:49 UTC
OK I got a bios update but the problem remains. Applying user_req.diff the problem does not reproduce (great!). This is with iio-sensor-proxy masked. If I then start iio-sensor-proxy the problem reproduces again (that was expected).

So I think if I go back and find if/when suspend + iio-sensor-proxy was ever working on this machine I'll raise a new bug separate from this one.
Comment 17 Srinivas Pandruvada 2017-09-08 13:15:19 UTC
So we still need debug why USB never stops. Can you try this?

With the user_req.diff change applied and  the iio-sensor-proxy stopped,
Go to 
each
/sys/bus/iio/devices/iio:device*
and try cat attribute ending with *_raw

Now try to do suspend, I think you will have the same issue.

Then see if you can identify one iio:device which is causing issue.
Comment 18 Zachary Warren 2017-09-10 05:44:44 UTC
For devices dev_rotation, als and gyro_3d suspend works fine after cat _raw file.

For devices magn_3d, incl_3d and accel_3d the first suspend after cat of _raw wakes after a few seconds. Subsequent suspends (without another cat _raw) work fine.

Another thing (maybe not relevant), if I start iio-sensor-proxy and stop it again, suspend always wakes after a few seconds. I can see in the lsmod output that even after stopping iio-sensor-proxy some hid_sensor modules are still in use. I can't rmmod them. Looking at lsof I can't see what's keeping them open.

Here is lsmod after starting and stopping iio-sensor-proxy. Prior to starting iio-sensor-proy, all the 1s in the right column were 0s.
[zach@alice devices]$ lsmod | grep hid
hid_sensor_accel_3d    16384  1
hid_sensor_magn_3d     16384  1
hid_sensor_rotation    16384  0
hid_sensor_als         16384  1
hid_sensor_gyro_3d     16384  0
hid_sensor_incl_3d     16384  0
< other hid modules snipped off >
Comment 19 Srinivas Pandruvada 2017-09-11 17:25:55 UTC
The iio sensor proxy doesn't terminate well, so the modules will not be released.
To disable and test you can rename the iio-sensor-proxy executable in
/usr/lib/systemd/system/iio-sensor-proxy.service

On your platform the FW is broken for reports. It never stops. So really use iio-sensor-proxy need to remove xhci as wakeup device.

But I want to check how to fix the raw interface first. I am attaching another patch to try. Before applying remove the last patch I sent.
Please take dmesg and attach.
Comment 20 Srinivas Pandruvada 2017-09-11 17:26:59 UTC
Created attachment 258305 [details]
Remove reports enable for raw I/F.
Comment 21 Zachary Warren 2017-09-12 00:35:19 UTC
Created attachment 258313 [details]
journalctl -k -b with second patch and 4 suspend resume cycles

There are 4 suspend/resume cycles in this log.
Suspend 1: Suspend immediately after boot, suspend resumes without user interaction after a few seconds (no cat of _raw yet).
Suspend 2: Suspend works normally, I pressed the power button to resume.

Now I cat iio device accel_3d in_accel_raw

Suspend 3: Suspend resumes after a few seconds without user interaction.
Suspend 4: Suspend works normally, I pressed the power button to resume.
Comment 22 Srinivas Pandruvada 2017-09-12 17:41:42 UTC
Thanks for test and results.
With the change, I have disabled the reports notification for raw reads. This means that HID-sensor will only reply to a read from host.
Also all sensors are powered off before suspend. The read confirms the settings.

Then also you have a wakeup. I think this is caused by some other USB device.

Can you do try blacklistinh hid-sensor-hub which will also prevent loading any sensor modules?
 
You can blacklist from kernel command line also
modprobe.blacklist= hid_sensor_hub
Comment 23 Tatsuyuki Ishi 2017-09-13 10:02:55 UTC
I seems to have issues with suspension since 4.13 (I skipped 4.12 patch series). I'm on X1 Yoga (1st gen), and the symptom is that either the system refuses to enter suspend, or fails to wake up with keyboard lights on but power light blinking.

Am I countering the same issue? I also have the same iio-sensor-proxy error.
Comment 24 Zachary Warren 2017-09-13 13:31:53 UTC
Srinivas: I tried blacklist of hid_sensor_hub but it still seems to load and all the sensor modules as well. If I blacklist all the sensor modules then suspend is working fine every time (even with hid_sensor_hub module loaded). I have /etc/modprobe.d/hid_sensor_hub.conf with the following lines:
blacklist hid_sensor_magn_3d
blacklist hid_sensor_rotation
blacklist hid_sensor_gyro_3d
blacklist hid_sensor_accel_3d
blacklist hid_sensor_als
blacklist hid_sensor_incl_3d

The output of lsmod | grep hid
hid_sensor_hub         20480  0
hid_multitouch         20480  0

I would like to test this a bit more but have no time for a few days. 

Tatsuyuki: My suspicion is it's a different issue. One way to check is set up the hid_sensor_hub.conf as above, reboot and test suspend. If suspend is ok with blacklisted sensor modules, it may be the same. Also check output of lsusb, does it contain "STMicroelectronics Sensor Hub"?
Comment 25 Srinivas Pandruvada 2017-10-10 00:18:19 UTC
Created attachment 258781 [details]
A workaround patch

Please try the attached change. This should address your original issue. But if you start iio-sensor-proxy service or use cat to read raw values, you still have same issue as before. In that case only real option is to disable XHC as wake up device.

To apply this patch
- Remove all previous test patches (git reset --hard HEAD)
- Apply this patch (git am ..)
- Disable iio sensor proxy ( I would change the path name of iio-sensor-proxy bin file to some non existing). I see disable this service still gets enabled.

- Without any cat of raw data of sensors just to suspend/resume as you do.

Hope this will not cause any premature wakeup.

Once you confirm, I will post this patch to the mailing list. if it doesn't work, please enable dynamic debug as before and attach logs.
Comment 26 Zachary Warren 2017-10-11 09:45:36 UTC
Hi Srinivas,

Yes the patch is good, suspend works fine with iio-sensor-proxy masked/disabled (I didn't try any cat of devices), I tested on on 4.12.14.

Thanks!
Comment 27 Srinivas Pandruvada 2017-10-11 15:33:40 UTC
Thanks. I will post this patch to the mailing list.
Comment 28 Srinivas Pandruvada 2017-10-18 17:48:23 UTC
This change is applied to linux-next to appear in 4.15.
If there are no complains in 4.15, I will mark for stable kernel also.