Bug 76211 - [hsw] i915 resume latency 909 ms
Summary: [hsw] i915 resume latency 909 ms
Status: RESOLVED MOVED
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel) (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: intel-gfx-bugs@lists.freedesktop.org
URL: https://bugs.freedesktop.org/show_bug...
Keywords:
Depends on:
Blocks:
 
Reported: 2014-05-14 18:09 UTC by Len Brown
Modified: 2017-10-17 12:44 UTC (History)
4 users (show)

See Also:
Kernel Version: 3.15-rc1
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
0001-drm-i915-don-t-wait-for-power-cycle-when-waiting-for.patch (3.24 KB, patch)
2014-05-14 18:09 UTC, Len Brown
Details | Diff
avoid full hotplug detect on resume (1.73 KB, patch)
2014-05-15 19:24 UTC, Jesse Barnes
Details | Diff
make crtc enable asynchronous (11.09 KB, patch)
2014-05-15 19:24 UTC, Jesse Barnes
Details | Diff
screen image of i915 kernel fault from Dell XPS11 (4.98 MB, image/jpeg)
2014-05-24 01:53 UTC, Len Brown
Details
current analyze_suspend output (272.98 KB, text/html)
2016-10-18 04:30 UTC, Len Brown
Details

Description Len Brown 2014-05-14 18:09:40 UTC
Created attachment 136101 [details]
0001-drm-i915-don-t-wait-for-power-cycle-when-waiting-for.patch

On a Dell XPS11 convertable (Intel HSW-YLT),

i915 resume from S3 got slower in 3.15-rc1
Linux 3.14: 400ms 
Linux 3.15-rc1: 1200ms

If the attached patch is reverted, 400ms resume time is restored.

Unfortunately, the attached patch also had the dramatic
_benefit_ of reducing i915 suspend latency on the same XPS11
to under 150ms from over 1300ms.

The numbers are different when measured on a Lenovo Yoga2 Pro HSW-ULT,
but the trend is the same.  The patch increases resume time
to about 700ms from under 400ms, and the patch reduces
suspend time  to under 300ms from over 1100ms.

Obviously, it would be ideal to capture the suspend speed
benefit of this patch without having the resume cost.  However,
if we have to pick between suspend and resume performance,
resume performance is more important.

Note that it is easy to graphically measure resume latency with this tool:
https://01.org/suspendresume

if analyze_suspend.py is invoked with the -f argument, it also
produces an ftrace that can show exactly where the time goes.
In the case of 3.15-rc1 on the Dell XPS11:

intel_edp_panel_on takes > 1000ms
  wait_panel_power_cycle takes 800ms
  wait_panel_status takes 200ms
Comment 1 Jani Nikula 2014-05-15 11:32:03 UTC
The regressing commit is

commit 1a5ef5b7b4ee4a9514ca10057b08d978431a03e2
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date:   Thu Dec 19 14:29:43 2013 -0200

    drm/i915: don't wait for power cycle when waiting for power off
Comment 2 Jani Nikula 2014-05-15 11:58:24 UTC
(In reply to Len Brown from comment #0)
> if analyze_suspend.py is invoked with the -f argument, it also
> produces an ftrace that can show exactly where the time goes.
> In the case of 3.15-rc1 on the Dell XPS11:
> 
> intel_edp_panel_on takes > 1000ms
>   wait_panel_power_cycle takes 800ms
>   wait_panel_status takes 200ms

Please share these figures with the regressing commit reverted, with the time hogs there.
Comment 3 Jesse Barnes 2014-05-15 19:24:07 UTC
Created attachment 136261 [details]
avoid full hotplug detect on resume

This one may be sufficient.  If not, try the next one (or just try the next one for fun, should make things really fast).
Comment 4 Jesse Barnes 2014-05-15 19:24:27 UTC
Created attachment 136271 [details]
make crtc enable asynchronous
Comment 5 Len Brown 2014-05-19 22:47:07 UTC
Re: patch in comment #3

No measurable difference.

resume in 1246ms becomes 1233ms

Same time-consumers as above.
Here's a copy/paste of the anayze_suspend.py ftrace output,
sorry the indenting is not retained.

I'll see if i can attach the whole compressed output.

0000:00:02.0 resume_general (1212.094 ms @ 141.618 to 1353.712)
dpm_run_callback (1212.093 ms @ 141.618)
initcall_debug_start (0.023 ms @ 141.619)
pm_dev_dbg (0.000 ms @ 141.644)
pci_pm_resume (1212.052 ms @ 141.645)
pci_has_legacy_pm_support (0.000 ms @ 141.646)
pci_pm_default_resume (0.003 ms @ 141.647)
i915_pm_resume [i915] (1212.045 ms @ 141.651)
dev_get_drvdata (0.000 ms @ 141.651)
i915_resume [i915] (1212.043 ms @ 141.653)
__i915_drm_thaw [i915] (1212.041 ms @ 141.653)
mutex_lock (0.001 ms @ 141.654)
i915_gem_restore_gtt_mappings [i915] (88.342 ms @ 141.656)
mutex_unlock (0.000 ms @ 229.999)
i915_restore_state [i915] (0.710 ms @ 230.000)
intel_opregion_setup [i915] (6.124 ms @ 230.710)
intel_init_pch_refclk [i915] (0.073 ms @ 236.835)
drm_mode_config_reset [drm] (0.001 ms @ 236.909)
mutex_lock (0.001 ms @ 236.910)
i915_gem_init_hw [i915] (4.437 ms @ 236.911)
mutex_unlock (0.000 ms @ 241.349)
drm_irq_install [drm] (0.197 ms @ 241.349)
intel_modeset_init_hw [i915] (0.469 ms @ 241.546)
drm_modeset_lock_all [drm] (0.004 ms @ 242.016)
intel_modeset_setup_hw_state [i915] (1071.893 ms @ 242.020)
haswell_get_pipe_config [i915] (0.011 ms @ 242.020)
haswell_get_pipe_config [i915] (0.013 ms @ 242.031)
haswell_get_pipe_config [i915] (0.012 ms @ 242.045)
intel_ddi_setup_hw_pll_state [i915] (0.000 ms @ 242.058)
intel_ddi_get_hw_state [i915] (0.006 ms @ 242.058)
intel_ddi_get_hw_state [i915] (0.005 ms @ 242.064)
intel_ddi_get_hw_state [i915] (0.005 ms @ 242.070)
intel_ddi_connector_get_hw_state [i915] (0.008 ms @ 242.075)
intel_ddi_connector_get_hw_state [i915] (0.008 ms @ 242.084)
intel_ddi_connector_get_hw_state [i915] (0.008 ms @ 242.092)
intel_ddi_connector_get_hw_state [i915] (0.009 ms @ 242.101)
gen6_read32 [i915] (0.002 ms @ 242.110)
hsw_write32 [i915] (0.003 ms @ 242.112)
intel_crtc_update_dpms [i915] (0.001 ms @ 242.115)
intel_dump_pipe_config [i915] (0.001 ms @ 242.117)
gen6_read32 [i915] (0.002 ms @ 242.119)
hsw_write32 [i915] (0.003 ms @ 242.121)
intel_crtc_update_dpms [i915] (0.001 ms @ 242.124)
intel_dump_pipe_config [i915] (0.001 ms @ 242.126)
gen6_read32 [i915] (0.002 ms @ 242.127)
hsw_write32 [i915] (0.003 ms @ 242.129)
intel_crtc_update_dpms [i915] (0.001 ms @ 242.132)
intel_dump_pipe_config [i915] (0.001 ms @ 242.133)
ilk_wm_get_hw_state [i915] (0.031 ms @ 242.135)
i915_redisable_vga [i915] (0.007 ms @ 242.167)
__intel_set_mode [i915] (1071.285 ms @ 242.174)
kmem_cache_alloc_trace (0.009 ms @ 242.174)
kmem_cache_alloc_trace (0.005 ms @ 242.183)
drm_mode_copy [drm] (0.000 ms @ 242.188)
drm_mode_copy [drm] (0.000 ms @ 242.189)
drm_mode_set_crtcinfo [drm] (0.000 ms @ 242.190)
drm_mode_set_crtcinfo [drm] (0.000 ms @ 242.191)
intel_ddi_compute_config [i915] (0.041 ms @ 242.191)
intel_pipe_has_type [i915] (0.000 ms @ 242.233)
hsw_crtc_supports_ips [i915] (0.000 ms @ 242.233)
intel_dump_pipe_config [i915] (0.001 ms @ 242.234)
drm_calc_timestamping_constants [drm] (0.000 ms @ 242.236)
drm_object_property_set_value [drm] (0.000 ms @ 242.237)
haswell_modeset_global_resources [i915] (0.039 ms @ 242.237)
drm_vblank_pre_modeset [drm] (0.004 ms @ 242.277)
haswell_crtc_mode_set [i915] (0.141 ms @ 242.281)
drm_vblank_post_modeset [drm] (0.001 ms @ 242.422)
intel_ddi_mode_set [i915] (0.001 ms @ 242.424)
haswell_crtc_enable [i915] (1071.018 ms @ 242.425)
intel_set_cpu_fifo_underrun_reporting [i915] (0.001 ms @ 242.426)
intel_ddi_pre_enable [i915] (1035.452 ms @ 242.428)
intel_ddi_get_encoder_port [i915] (0.000 ms @ 242.428)
intel_edp_panel_on [i915] (1015.037 ms @ 242.429)
edp_have_panel_power [i915] (0.003 ms @ 242.429)
wait_panel_power_cycle [i915] (806.108 ms @ 242.433)
msecs_to_jiffies (0.000 ms @ 242.433)
schedule_timeout_uninterruptible (806.101 ms @ 242.434)
wait_panel_status [i915] (0.005 ms @ 1048.536)
_pp_ctrl_reg [i915] (0.000 ms @ 1048.541)
ironlake_get_pp_control [i915] (0.003 ms @ 1048.542)
hsw_write32 [i915] (0.003 ms @ 1048.545)
gen6_read32 [i915] (0.002 ms @ 1048.549)
wait_panel_status [i915] (208.914 ms @ 1048.552)
hsw_write32 [i915] (0.005 ms @ 1257.467)
intel_dp_sink_dpms [i915] (1.040 ms @ 1257.473)
intel_dp_start_link_train [i915] (11.334 ms @ 1258.514)
intel_dp_complete_link_train [i915] (8.031 ms @ 1269.849)
intel_ddi_enable_pipe_clock [i915] (0.002 ms @ 1277.882)
ironlake_pfit_enable [i915] (0.000 ms @ 1277.885)
intel_crtc_load_lut [i915] (1.639 ms @ 1277.886)
intel_ddi_set_pipe_settings [i915] (0.007 ms @ 1279.526)
intel_ddi_enable_transcoder_func [i915] (0.008 ms @ 1279.534)
intel_update_watermarks [i915] (0.086 ms @ 1279.543)
intel_enable_pipe [i915] (0.028 ms @ 1279.630)
intel_enable_ddi [i915] (0.505 ms @ 1279.659)
intel_opregion_notify_encoder [i915] (0.002 ms @ 1280.165)
intel_enable_primary_hw_plane [i915] (17.246 ms @ 1280.168)
intel_enable_planes [i915] (0.002 ms @ 1297.415)
intel_crtc_update_cursor [i915] (0.000 ms @ 1297.417)
hsw_enable_ips [i915] (15.990 ms @ 1297.418)
mutex_lock (0.001 ms @ 1313.409)
intel_update_fbc [i915] (0.031 ms @ 1313.411)
mutex_unlock (0.000 ms @ 1313.443)
kfree (0.007 ms @ 1313.445)
kfree (0.007 ms @ 1313.452)
__intel_set_mode [i915] (0.080 ms @ 1313.461)
__intel_set_mode [i915] (0.077 ms @ 1313.542)
intel_modeset_check_state [i915] (0.292 ms @ 1313.620)
drm_modeset_unlock_all [drm] (0.004 ms @ 1313.914)
intel_hpd_init [i915] (0.030 ms @ 1313.918)
drm_sysfs_hotplug_event [drm] (0.156 ms @ 1313.950)
intel_opregion_init [i915] (38.115 ms @ 1314.107)
console_trylock (0.005 ms @ 1352.222)
queue_work_on (0.031 ms @ 1352.228)
mutex_lock (1.432 ms @ 1352.259)
mutex_unlock (0.000 ms @ 1353.692)
intel_runtime_pm_put [i915] (0.001 ms @ 1353.693)
drm_kms_helper_poll_enable [drm_kms_helper] (0.000 ms @ 1353.696)
__suspend_report_result (0.000 ms @ 1353.698)
initcall_debug_report (0.013 ms @ 1353.699)
Comment 6 Len Brown 2014-05-19 22:50:45 UTC
bugzilla accepts 5MB attachments, but the compressed trace is 9MB
Comment 7 Len Brown 2014-05-19 23:08:58 UTC
the patch in comment #4 did not apply to upstream Linux.
What should I apply it to?
Comment 8 Jesse Barnes 2014-05-19 23:14:40 UTC
It was based off the drm-intel-nightly branch from git://anongit.freedesktop.org/drm-intel, but that was last week, we've probably had 100 patches land since then...
Comment 9 Len Brown 2014-05-20 05:11:13 UTC
right, it doesn't apply to today's drm-inte-nightly either
Comment 10 Len Brown 2014-05-21 20:34:34 UTC
drm-intel-nightly doesn't boot.
So even if you patch did apply there, that branch isn't useful.

Also, the fix to this bug must be with a patch that applies to Linux-3.15-rc.
So if you can send a patch that applies to upstream Linux, I'll be more
than happy to test it.
Comment 11 Jesse Barnes 2014-05-22 20:19:13 UTC
Can you try the fast-modeset branch of my kernel.org tree?
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-intel
Comment 12 Len Brown 2014-05-23 01:40:42 UTC
tested:

git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-intel
fast-modeset branch, as of top commit:

commit 895d2bde2fc067dcc73edfa4857cd6b088fceabb
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Thu May 15 12:18:41 2014 -0700

    drm/i915: make CRTC enable/disable asynchronous

it panics on boot.
Comment 13 Len Brown 2014-05-23 02:16:28 UTC
> it panics on boot.

not always.
sometimes system freeze, sometime system reset.

once I got it to boot and was able to measure suspend/resume.

i915 suspend: 115ms

This is in parallel with the Lynx Point Audio, which is currently 224ms

i915 resume: invisible!

I ran this back-to-back with analyze_syspend -x2, and did not detect any deferred i915 resume delaying a subsequent suspend.  (what happens if a suspend request occurs when your deferred resume has not completed?)

So this patch looks like a very interesting prototype, but very unstable.
Comment 14 Jesse Barnes 2014-05-23 15:27:04 UTC
Yeah I don't defer the whole suspend, just the mode set part.  And that gets synchronized later on (or at suspend if we suspend again that fast), but as part of a sync_work() call rather than a sync on the whole resume.

For the panics, do you have any logs?  I didn't see panics here on my MBA when testing, but I probably have a different config than you.
Comment 15 Len Brown 2014-05-23 22:18:42 UTC
unfortunately, no logs on the panics, and death
seemed to come in a variety of ways...

They seemed to happen when the screen was not printing characters
and then when things stopped all the only thing on the screen
was something about i915 switching back to text mode -- presumably
because of the panic that had since scrolled off the screen.

drm-intel-nightly has died on boot all my recent attempts to use it --
so you might be inheriting something bad from drm-intel tree.
If you can re-base base your patch on upstream, then apples/apples
may be more reliable and objective.

If there are any knobs to turn that may make the failure more
visible, I'll be happy to turn them.  I can also try this tree
on another box.
Comment 16 Len Brown 2014-05-23 22:54:39 UTC
Lenovo Yoga2 Pro had no problems booting the same kernel binary
(built from kernel in comment #11)
that the Dell XPS11 struggled with.
I rebooted the Yoga2 5 times w/ no issues.

suspend/resume also worked fine on the Yoga2.
i915 supsend time was 277ms, resume time not visible
(/me wonders if we should add trace points to this patch
 so that analyze_suspend can see when the i915 does this work).

Though stable, the Yoga2 277ms is considerably slower than on the Dell XPS11,
which suspended in 115ms.  Further, this 277ms is the long-pole
in suspend on the Yoga2, with the next slowest parallel device
taking 25ms.
Comment 17 Len Brown 2014-05-24 01:53:01 UTC
Created attachment 137281 [details]
screen image of i915 kernel fault from Dell XPS11

The Lenovo had vt.handoff=7 on the cmdline and the Dell did not,
so I slapped that on the Dell cmdline and it booted, suspended/resumed,
but didn't reboot.

The good news is that the resulting crash showed on the screen -- image
attached.  Hopefully it gives a clue.
Comment 18 Len Brown 2014-05-24 02:03:29 UTC
On the XPS11, in about 10 boot requests, I get 3-4 successful boots,
a couple of resets, and twice the stack trace attached in comment #17.

If you can't read it in the image, the fault address is 0x00..000088
Comment 19 Ville Syrjala 2014-05-24 11:29:53 UTC
(In reply to Len Brown from comment #18)
> On the XPS11, in about 10 boot requests, I get 3-4 successful boots,
> a couple of resets, and twice the stack trace attached in comment #17.
> 
> If you can't read it in the image, the fault address is 0x00..000088

commit 46470fc932ac8a0e8317a220b3f4ea4ed903338e
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed May 21 19:01:06 2014 +0300

    drm/i915: Add null state batch to active list

should fix the oops.
Comment 20 Len Brown 2014-05-29 04:14:54 UTC
I extracted comment #19's 46470fc932a from drm-intel tree,
and applied it on top of jbarnes branch in comment #11.

On the Lenovo Yoga2 Pro, it works fine -- though it worked fine before.

On the Del XPS11,
the system is no longer bringing up the GUI -- just the background
and the X cursor -- so it seems that the crashes hosed the graphics
config on this system somewhat.  I'm going to re-test after a re-install.
Comment 21 Len Brown 2014-05-29 04:49:36 UTC
On fresh install of Ubuntu 14.04 on the Dell XPS11,
the patch from comment #19 applied on top of the tree
from comment #11 has booted without problems 5 times in 5 attempts.
Comment 22 Jani Nikula 2014-08-14 11:19:40 UTC
Jesse, what's the status with the patches?
Comment 23 Jesse Barnes 2014-08-14 15:44:10 UTC
Daniel rejected them based on the assumption that we'd have async atomic mode setting at some point.
Comment 24 Jani Nikula 2014-08-15 08:13:03 UTC
Well then, what's the status of this bug?
Comment 25 Jesse Barnes 2014-08-15 16:24:25 UTC
We need to re-measure, but iirc the regression was caused by us honoring panel timings, so it was really fixing a bug.

We can keep this bug open to track the resume performance in general, or just close it.  The resume time should be fixed up after we land the async atomic stuff.
Comment 26 Jani Nikula 2015-01-28 14:00:53 UTC
(In reply to Jesse Barnes from comment #25)
> We need to re-measure, but iirc the regression was caused by us honoring
> panel timings, so it was really fixing a bug.
> 
> We can keep this bug open to track the resume performance in general, or
> just close it.  The resume time should be fixed up after we land the async
> atomic stuff.

Closing. The power sequencing code has changed so much in the past months that any measurements here are not valid anymore. Feel free to reopen with current data.
Comment 27 Len Brown 2016-10-18 04:30:52 UTC
Created attachment 241901 [details]
current analyze_suspend output

This problem still exists in Linux-4.8.0, though the delay
on this machine is now 909 ms, rather than 1,200 ms.

Attached, please find the analzye_suspend.py output as run
on the same laptop as above, installed with Ubuntu 16.10
and its 4.8.0-22-generic kernel.

Curiously, i915 resume invokes intel_hdmi_detect FOUR TIMES --
twice in a kthread, and again twice in-line.  This costs 160ms
each time it is invoked.
Comment 28 Jani Nikula 2017-04-03 15:31:50 UTC
This bug has fallen below the radar, not least because we've consolidated all bug handling to the freedesktop.org bugzilla. Created a new bug there, and resolving this as moved.

https://bugs.freedesktop.org/show_bug.cgi?id=100540

Note You need to log in before you can comment on or make changes to this bug.