Bug 202311 - mei_me intermittently over 2,000 ms on resume from suspend
Summary: mei_me intermittently over 2,000 ms on resume from suspend
Status: ASSIGNED
Alias: None
Product: Drivers
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Tomas Winkler
URL:
Keywords:
Depends on:
Blocks: 178231
Reported: 2019-01-17 01:08 UTC by Len Brown
Modified: 2020-01-08 15:50 UTC
CC List: 4 users

See Also:
Kernel Version: 4.20
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Example of mei_me slowness from CFL-H machine in S3 resume (1.05 MB, text/html)
2019-01-17 01:08 UTC, Len Brown
sleepgraph timeline with callgraph over mei_me (1.85 MB, text/html)
2019-01-17 02:40 UTC, Todd Brandt
expanded callgraph showing mei_reset (193.37 KB, image/png)
2019-01-17 16:25 UTC, Len Brown
issue.def (367 bytes, text/plain)
2019-04-25 15:50 UTC, Todd Brandt

Description Len Brown 2019-01-17 01:08:56 UTC
Created attachment 280543 [details]
Example of mei_me slowness from CFL-H machine in S3 resume

On several machines, including a Dell XPS13 9360,
the mei_me device sometimes takes over 2,000 ms to resume.
All of resume waits for this device to complete
when this happens.

In endurance tests of 2,000 suspend/resume cycles,
mei_me behaves this way about 20 times (about 1% of the time).

This is seen on both suspend to mem (ACPI S3) and also s2idle.
Comment 1 Todd Brandt 2019-01-17 02:40:47 UTC
Created attachment 280545 [details]
sleepgraph timeline with callgraph over mei_me

sleepgraph timeline with callgraph enabled, only mei_me callgraphs are shown.
Comment 2 Todd Brandt 2019-01-17 02:42:38 UTC
The issue is occurring in the mei_reset (see callgraph timeline):

mei_me_hw_start [mei_me] (2005.518 ms @ 7098.274)

it's waiting for a schedule call.
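
To spot such outliers across many suspend/resume cycles, callgraph entries in the format shown above can be filtered by duration. The following is a minimal sketch, assuming the entries are available as plain text lines of the form `name [module] (duration ms @ start)`. The names `parse_entry` and `slow_entries` and the `threshold_ms` parameter are hypothetical and are not part of sleepgraph.

```python
import re

# Matches lines such as: mei_me_hw_start [mei_me] (2005.518 ms @ 7098.274)
# Assumption: this is the only line layout we need to handle.
ENTRY_RE = re.compile(
    r"^\s*(?P<func>\S+)\s+\[(?P<module>[^\]]+)\]\s+"
    r"\((?P<dur>\d+(?:\.\d+)?) ms @ (?P<start>\d+(?:\.\d+)?)\)\s*$"
)


def parse_entry(line):
    """Return (function, module, duration_ms, start_ms), or None if no match."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    return (m["func"], m["module"], float(m["dur"]), float(m["start"]))


def slow_entries(lines, module="mei_me", threshold_ms=2000.0):
    """Collect entries of the given module whose duration exceeds threshold_ms."""
    found = []
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry[1] == module and entry[2] > threshold_ms:
            found.append(entry)
    return found
```

Calling `slow_entries` on the lines of a saved callgraph text would return only the mei_me calls that ran longer than the threshold, which makes it easy to count how many cycles in an endurance run were affected.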
Comment 3 Tomas Winkler 2019-01-17 09:31:23 UTC
Can you please provide BIOS and CSE FW version (cat /sys/class/mei/mei0/fw_ver)
Thanks
Comment 4 Len Brown 2019-01-17 16:25:36 UTC
Created attachment 280571 [details]
expanded callgraph showing mei_reset

If you expand the callgraph in comment #1
this is what you see.
Comment 5 Todd Brandt 2019-01-17 16:27:34 UTC
$ cat /sys/class/mei/mei0/fw_ver
0:12.0.22.1310
0:12.0.22.1310
0:12.0.22.1310
Comment 6 Todd Brandt 2019-01-17 16:29:39 UTC
As for BIOS version, from the timeline's dmesg header:
# sysinfo | man:Intel Corporation | plat:CoffeeLake H DDR4 RVP | cpu:Genuine Intel(R) CPU 0000 @ 2.10GHz | bios:CNLSFWR1.R00.X181.B00.1812130202 | numcpu:16 | memsz:8053184 | memfr:3353116
Comment 7 Tomas Winkler 2019-02-11 14:49:38 UTC
Can you try disabling the TPM via BIOS to see if this helps?
Comment 8 Todd Brandt 2019-04-25 15:50:14 UTC
Created attachment 282531 [details]
issue.def
Comment 9 Todd Brandt 2019-10-08 20:08:52 UTC
I just verified that the TPM Security BIOS switch has been disabled from the beginning, i.e. this one (it looks like the picture, but is not greyed out):

https://kbimg.dell.com/library/KB/DELL_ORGANIZATIONAL_GROUPS/DELL_GLOBAL/Content%20Team/Grey%20TPM%20Lat%207350%201.png
Comment 10 Len Brown 2019-11-15 03:12:37 UTC
Still an issue in Linux 5.4-rc7

Dell XPS 9360 with the latest BIOS, running with SETUP set to "factory defaults", which has the TPM disabled (per above).

/sys/class/dmi/id:
bios_date:05/26/2019
bios_vendor:Dell Inc.
bios_version:2.12.0
board_name:0839Y6

/sys/class/mei/mei0:
dev_state:ENABLED
fw_status:94000245
fw_status:82218506
fw_status:00000030
fw_status:00684004
fw_status:00001F01
fw_status:47C00BC9
fw_ver:0:11.8.65.3590
fw_ver:0:11.8.65.3590
fw_ver:0:11.5.1.1006
hbm_ver:2.0
hbm_ver_drv:2.1
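
For anyone collecting the same data from other machines, a listing in the `name:value` layout above can be produced with a short script. This is a minimal sketch, assuming the attribute files are readable as text and hold one value per line. The helper name `dump_attrs` is hypothetical, and the attribute names passed to it are simply the ones shown in the listing above.

```python
from pathlib import Path


def dump_attrs(directory, names):
    """Return 'name:value' lines for each readable attribute file in directory.

    Multi-line attributes (such as fw_ver above) yield one output line per value.
    Attributes that are missing or unreadable are skipped silently.
    """
    lines = []
    base = Path(directory)
    for name in names:
        try:
            text = (base / name).read_text()
        except OSError:
            continue
        for value in text.splitlines():
            lines.append(f"{name}:{value}")
    return lines
```

For example, `dump_attrs("/sys/class/mei/mei0", ["dev_state", "fw_status", "fw_ver", "hbm_ver", "hbm_ver_drv"])` would collect the MEI attributes, and the same call with `/sys/class/dmi/id` and the BIOS attribute names would collect the DMI data.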
Comment 11 Len Brown 2020-01-08 15:35:10 UTC
Hi Tomas, what can we do to debug this?
Comment 12 Tomas Winkler 2020-01-08 15:50:20 UTC
As far as I understand, this is not a kernel driver issue but a fw one: the fw is busy with some bookkeeping task.
I've provided all the data to the fw team. As you pointed out in the issue description, this happens about 1% of the time, so I'm trying to understand whether that reflects the priority of this issue in real life.
