Bug 60637 - [ivb hsw] DP link training errors on DPMS off (crtc disable)
Summary: [ivb hsw] DP link training errors on DPMS off (crtc disable)
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: intel-gfx-bugs@lists.freedesktop.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-07-27 14:42 UTC by da_audiophile
Modified: 2015-10-07 10:06 UTC
CC List: 7 users

See Also:
Kernel Version: 3.18.1
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
complete dmesg (80.72 KB, text/plain)
2013-07-27 14:42 UTC, da_audiophile
dmesg with debugging enabled (3.12.0-git) (181.25 KB, text/plain)
2013-11-12 20:35 UTC, da_audiophile
dmesg under 3.17-rc6 (77.76 KB, text/plain)
2014-09-23 23:11 UTC, da_audiophile
dmesg with drm.debug=14 kernel flag (208.10 KB, text/plain)
2014-09-24 19:28 UTC, da_audiophile
dmesg with debugging enabled on haswell (250.33 KB, text/plain)
2014-12-17 22:46 UTC, da_audiophile

Description da_audiophile 2013-07-27 14:42:31 UTC
Created attachment 107025
complete dmesg

Since updating to the 3.10 series, the following error is recorded in dmesg, and it corresponds to my monitor entering a suspend state.  For the purposes of this bug report, I rebooted, logged in and allowed xscreensaver to put the monitor to sleep.  Settings per xscreensaver-demo:

[X] Power Management Enabled
 Standby after 10 min
 Suspend after 11 min
 Off after 12 min

Note that the first error is thrown slightly after 11 min and the subsequent error slightly after 12 min.  I suspect these correspond to my settings above.  I have no idea why it is then repeated at 1 h 48 m of uptime though.

% dmesg | grep train 
[  670.187674] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[  730.617317] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 6531.587198] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 6531.784460] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
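
The bracketed value at the start of each line is seconds since boot, so converting it to minutes makes the comparison with the xscreensaver timeouts easier. A minimal sketch, assuming the dmesg timestamp format shown above; `uptime_minutes` is a hypothetical helper name, not part of any tool:

```python
import re

# Matches the leading "[  670.187674]" seconds-since-boot stamp of a dmesg line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

def uptime_minutes(lines):
    """Return the timestamp of each stamped line, in minutes since boot."""
    minutes = []
    for line in lines:
        match = STAMP.match(line)
        if match:
            minutes.append(float(match.group(1)) / 60.0)
    return minutes

log = [
    "[  670.187674] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting",
    "[  730.617317] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting",
    "[ 6531.587198] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting",
]
for m in uptime_minutes(log):
    print(f"{m:.1f} min since boot")
```

The same function can be fed the lines of a saved dmesg file instead of the inline list.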

My CPU is an Intel i7-3770k (Ivy) and I am using the integrated HD 4000 graphics together with the distro provided driver packages: intel-dri libva-intel-driver xf86-video-intel.  I am using 'sna' acceleration per /etc/X11/xorg.conf.d/20-intel.conf.

Glad to provide additional logs/info upon request.
Comment 1 da_audiophile 2013-07-27 16:43:42 UTC
I have further evidence that the errors are tied to the suspend state.  At 2 h 32 m of uptime, I locked the screen thus starting the clock to suspend the monitor.  The following is the dmesg output after leaving the system alone.

Note that 2 h 32 m is 9120 sec.

[ 9880.920473] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 9941.357420] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting

The first error occurs approx 13 min after I locked the screen; the next error occurs approx 1 min after the first.
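
The arithmetic behind those two intervals can be checked as follows. This is only a sketch; it takes the 2 h 32 m lock time as exact, although the comment gives it as an approximate figure:

```python
LOCK_TIME_S = 2 * 3600 + 32 * 60  # 2 h 32 m of uptime, in seconds

errors_s = [9880.920473, 9941.357420]  # timestamps quoted in the comment above

# Minutes between locking the screen and each error.
since_lock_min = [(t - LOCK_TIME_S) / 60 for t in errors_s]
# Seconds between the two errors.
gap_s = errors_s[1] - errors_s[0]

print([round(m, 1) for m in since_lock_min], round(gap_s, 1))
```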
Comment 2 Jani Nikula 2013-10-09 13:20:43 UTC
We have a bunch of DP link training fixes in drm-intel-nightly branch of [1], please try that.

[1] git://people.freedesktop.org/~danvet/drm-intel
Comment 3 Jani Nikula 2013-10-09 13:23:55 UTC
Also you'll get much more info about what's going on by enabling the drm.debug=0xe module parameter. That might require you to increase the log_buf_len kernel parameter to capture everything if the problem takes a long time to reproduce.
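
For reference, drm.debug is a bitmask. A small sketch of how 0xe decodes, assuming the debug category bits of the 3.x-era DRM core (core 0x1, driver 0x2, KMS 0x4, PRIME 0x8); `decode_drm_debug` is an illustrative helper, not a kernel or distro tool:

```python
# Debug category bits as defined in the 3.x-era DRM core (assumption: later
# kernels define further bits that are not listed here).
DRM_DEBUG_BITS = {
    0x1: "core",
    0x2: "driver",
    0x4: "kms",
    0x8: "prime",
}

def decode_drm_debug(value):
    """Return the names of the debug categories enabled by a drm.debug value."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if value & bit]

print(decode_drm_debug(0xE))
print(0xE == 14)
```

Since 0xe and 14 are the same number, either spelling requests the same set of messages.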
Comment 4 Jani Nikula 2013-11-11 12:50:09 UTC
Ping for testing and info.
Comment 5 da_audiophile 2013-11-11 16:30:55 UTC
Sorry, slipped off my radar.

Forgive my ignorance, but after cloning that repo, what package(s) am I building from its contents?  I am running Arch Linux and am very familiar with the Arch PKGBUILD files; I have the following packages installed relative to this ticket:

*intel-dri
*libva-intel-driver
*xf86-video-intel

Again, I am unsure what the git repo source is providing me.
Comment 6 Daniel Vetter 2013-11-11 19:09:15 UTC
(In reply to da_audiophile from comment #5)
> Again, I am unsure what the git repo source is providing me.

The linux kernel.
Comment 7 da_audiophile 2013-11-11 19:22:01 UTC
@Daniel - How far ahead of 3.12.0 is the repo to which you directed me?
Comment 8 Daniel Vetter 2013-11-11 19:29:37 UTC
roughly 750 commits right now. It's all for drm drivers though, otherwise it's vanilla 3.12.0.
Comment 9 da_audiophile 2013-11-11 19:40:22 UTC
OK.  I will build and report back.  Am I OK leaving the 'stable' version of the video drivers installed for the purposes of testing?

*intel-dri 9.2.2-1
*libva-intel-driver 1.2.1-1
*xf86-video-intel 2.21.15-1

FYI - I am using a slightly modified version of the official Arch PKGBUILD for the kernel for building.[1]  This should be fine but may omit some files from the build package.

1. https://projects.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/linux
Comment 10 Daniel Vetter 2013-11-11 19:55:14 UTC
(In reply to da_audiophile from comment #9)
> OK.  I will build and report back.  Am I OK leaving the 'stable' version of
> the video drivers installed for the purposes of testing?

The kernel is massively backwards compatible, so upgrading only the kernel should always Just Work.
Comment 11 da_audiophile 2013-11-11 22:29:50 UTC
OK... I am still getting these:

[  970.469103] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  970.472966] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  970.482809] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  970.486663] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  970.496017] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  970.499852] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  970.509184] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  970.509369] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 4039.178108] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.181971] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.185828] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.189680] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.193531] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.197370] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.201209] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.201393] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 4039.287484] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.291317] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.295191] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.299012] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.302829] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.306650] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.310465] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 4039.310647] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting

I read back and saw your advice to "enabling drm.debug=0xe module parameter" which I think goes into /etc/modprobe.d/i915.conf like this:
 options drm.debug=0xe

Am I correct?
Comment 12 da_audiophile 2013-11-12 00:09:48 UTC
Rebooting with the above in /etc/modprobe.d/i915.conf doesn't seem to make dmesg any more verbose; perhaps I did it wrong?

% cat /etc/modprobe.d/i915.conf
options drm.debug=0xe

[  676.946879] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  676.950648] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  676.959818] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  676.963587] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  676.972775] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  676.976534] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  676.985717] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  676.985894] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[  737.297275] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  737.301138] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  737.310967] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  737.314812] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  737.324150] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  737.327978] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  737.336845] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  737.337028] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 5736.860092] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5736.863967] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5736.873832] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5736.877686] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5736.887528] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5736.891367] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5736.900725] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5736.900909] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 5737.681930] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5737.687582] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5737.693234] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5737.698883] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5737.704528] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5737.710171] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5737.715800] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[ 5737.715980] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 5738.599802] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
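
The excerpts in this thread follow a repeating pattern: a run of "too many voltage retries" errors from intel_dp_start_link_train, closed by a single "failed to train DP" from intel_dp_complete_link_train. A sketch for summarizing a long dmesg along those lines, assuming the line format shown above; `summarize_bursts` and the 0.5 s grouping threshold are arbitrary choices made for this example:

```python
import re
from collections import Counter

# "[  676.946879] [drm:intel_dp_start_link_train] *ERROR* ..." -> (seconds, function)
LINE = re.compile(r"^\[\s*(\d+\.\d+)\] \[drm:(\w+)\] \*ERROR\*")

def summarize_bursts(lines, max_gap_s=0.5):
    """Group DRM error lines into bursts and count errors per function.

    A new burst starts whenever the gap to the previous error exceeds
    max_gap_s (an arbitrary threshold chosen for this sketch).
    """
    bursts = []
    last = None
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # not a DRM *ERROR* line
        stamp, func = float(match.group(1)), match.group(2)
        if last is None or stamp - last > max_gap_s:
            bursts.append((stamp, Counter()))
        bursts[-1][1][func] += 1
        last = stamp
    return bursts

sample = [
    "[  676.946879] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up",
    "[  676.985717] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up",
    "[  676.985894] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting",
    "[  737.297275] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up",
]
for start, counts in summarize_bursts(sample):
    print(f"burst at {start:.1f} s: {dict(counts)}")
```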
Comment 13 Daniel Vetter 2013-11-12 08:40:48 UTC
Usually distros load i915.ko already from the initramfs, so you need to rebuild that one. Also in the modprobe file it should be

options drm debug=0xe

instead of the drm.debug=0xe (which is for the kernel bootline).
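
After rebuilding the initramfs and rebooting, one way to confirm that the option took effect is to read the value back from sysfs. A hedged sketch, assuming the parameter is exposed at /sys/module/drm/parameters/debug as module parameters normally are; `read_drm_debug` is a hypothetical helper:

```python
from pathlib import Path

# Assumed sysfs location of the drm module's "debug" parameter.
DRM_DEBUG_PARAM = Path("/sys/module/drm/parameters/debug")

def read_drm_debug(path=DRM_DEBUG_PARAM):
    """Return the current drm debug mask as an int, or None if unavailable."""
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return None  # file missing or unreadable
    # int(..., 0) accepts both decimal ("14") and hex ("0xe") spellings.
    return int(text, 0)

if __name__ == "__main__":
    value = read_drm_debug()
    print("drm debug mask:", "unavailable" if value is None else hex(value))
```

A result of 0xe would indicate the option was picked up; 0x0 would suggest the module was loaded before the modprobe configuration was applied.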
Comment 14 Jani Nikula 2013-11-12 09:21:31 UTC
And please attach the complete dmesg.
Comment 15 da_audiophile 2013-11-12 20:05:31 UTC
OK... I changed the line per your suggestion (replacing the '.' with a space) and I did rebuild my images.  I will post the complete dmesg shortly.
Comment 16 da_audiophile 2013-11-12 20:35:22 UTC
Created attachment 114421
dmesg with debugging enabled (3.12.0-git)
Comment 17 da_audiophile 2013-11-12 20:35:59 UTC
OK... I have attached the dmesg with debugging enabled... LOTS of lines in there :)
Comment 18 Jani Nikula 2014-08-14 11:31:14 UTC
We seem to have ignored this bug a bit. Please try the latest kernels and report back.
Comment 19 da_audiophile 2014-08-14 18:56:48 UTC
Can do.  Just updated to 3.16.1 but I have experienced the same problems under 3.15.9 so I will report back.
Comment 20 da_audiophile 2014-08-14 21:40:23 UTC
Nope.  Issue persists.  Shall I boot with the debug options and post the output again?  A few lines from the normal dmesg:

[10404.394235] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10404.398134] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10404.408964] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10404.412833] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10404.422685] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10404.426542] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10404.435919] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10404.436102] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[10405.445554] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10405.451263] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10405.456919] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10405.462566] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10405.468195] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10405.473817] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10405.479446] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10405.479627] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[10406.729871] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10406.735484] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[10406.741089] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
Comment 21 da_audiophile 2014-09-23 21:25:23 UTC
I am getting the same errors under 3.17-rc6.  Please advise.

[  516.036560] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  516.042288] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  516.048000] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  516.053701] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  516.059399] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  516.065085] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  516.070770] [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
[  516.070953] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
Comment 22 Rodrigo Vivi 2014-09-23 23:04:42 UTC
Please attach the full dmesg here.
Comment 23 da_audiophile 2014-09-23 23:11:46 UTC
Created attachment 151631
dmesg under 3.17-rc6
Comment 24 Jani Nikula 2014-09-24 12:05:08 UTC
(In reply to Rodrigo Vivi from comment #22)
> Please attach the full dmesg here.

With drm.debug=14 module parameter please.
Comment 25 da_audiophile 2014-09-24 19:27:32 UTC
Sure, attached; this one was captured after booting into 3.16.3.
Comment 26 da_audiophile 2014-09-24 19:28:02 UTC
Created attachment 151821
dmesg with drm.debug=14 kernel flag
Comment 27 da_audiophile 2014-09-28 21:46:59 UTC
I have tried with a BIOS option relating to iGPU power savings both on and off, but I get the same results.  The BIOS calls it "Render Standby."  Any suggestions?
Comment 28 da_audiophile 2014-12-17 22:46:05 UTC
Running 3.18.1 now using a Haswell CPU and experiencing the same errors.

CPU: i7-4790K
Motherboard chipset: Z97

I am glad to help debug and have attached a new dmesg with debugging enabled on the new hardware.
Comment 29 da_audiophile 2014-12-17 22:46:32 UTC
Created attachment 161071
dmesg with debugging enabled on haswell
Comment 30 Jani Nikula 2015-01-29 09:38:32 UTC
Do you have the same monitor with both ivb and hsw?
Comment 31 da_audiophile 2015-01-29 21:23:44 UTC
Yes, it is an Asus PB278Q[1] which needs to be connected via the DisplayPort 1.2 port rather than HDMI or DVI.  I don't know if that's relevant but wanted to make sure it is documented.  I just updated to 3.18.4 and am finding the same errors.

I can hook up the ivb to another monitor as it is no longer my primary WS and see if I can reproduce if you feel that would be helpful.

1. http://www.asus.com/us/Monitors_Projectors/PB278Q/
Comment 32 da_audiophile 2015-02-03 14:57:18 UTC
I should add that I just now experienced the same dmesg errors but without the screen going into standby at all.  Here I powered up the machine with the monitor off while I made some coffee... upon turning the monitor on, and logging in, I noticed the timestamp in the dmesg errors (~4 min) which was about the time I came into the office and switched the monitor on.

[   12.160460] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   12.160488] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
[  221.267000] [drm:intel_dp_start_link_train] *ERROR* too many full retries, give up
[  221.274440] [drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[  221.274449] [drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun
[  221.276336] [drm:intel_dp_start_link_train] *ERROR* too many full retries, give up
[  221.285532] [drm:intel_dp_start_link_train] *ERROR* too many full retries, give up
[  221.294636] [drm:intel_dp_start_link_train] *ERROR* too many full retries, give up
[  221.303589] [drm:intel_dp_start_link_train] *ERROR* too many full retries, give up
[  221.312449] [drm:intel_dp_start_link_train] *ERROR* too many full retries, give up
[  221.321306] [drm:intel_dp_start_link_train] *ERROR* too many full retries, give up
[  221.321499] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
Comment 33 jamespatterson 2015-02-23 20:34:09 UTC
I get this all the time. I have to disable dpms otherwise it takes a few minutes before I can console cycle enough to convince the monitor to come back on.

[ 1635.507939] [drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[ 1635.507947] [drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun

Workaround is: xset -dpms at graphical login and remember to always login after a boot!
Comment 34 jamespatterson 2015-02-23 20:35:43 UTC
i7-4770T 3.18.7-200.fc21.x86_64 sorry for the noise.
Comment 35 da_audiophile 2015-03-09 13:10:31 UTC
I wanted to update this report.  Summary: I believe the cause of these errors is related to the DisplayPort connection.  I have verified this on two different systems with different DisplayPort cables.  There is nothing wrong with the hardware.  The errors just seem to occur on systems connected via DisplayPort.

Recall from comment #31:

> it is an Asus PB278Q[1] which needs to be connected via the DisplayPort 1.2 port
> rather than HDMI or DVI.  I don't know if that's relevant but wanted to make
> sure it is documented.

I recently connected the same hardware up via HDMI and have been symptom free since.  I hope this helps you narrow down the root cause of the issue.  I am glad to try any patches with the display port or HDMI port.
Comment 36 vollekannehoschi 2015-03-10 18:15:49 UTC
I have the same problem. Here is my info:
Intel Core i7-4790K CPU
Monitor: LG 34UM95-P
HDMI with 3440x1440@50Hz works. DisplayPort with 3440x1440@60Hz gives me these errors on every DPMS wake up:
Mär 10 16:48:43 maxxgubbl kernel: [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
Mär 10 16:48:43 maxxgubbl kernel: [drm:intel_dp_complete_link_train] *ERROR* failed to start channel equalization

and the monitor doesn't show anything. Only switching rates with xrandr brings the display back up again.
Comment 37 da_audiophile 2015-03-10 19:32:42 UTC
OK, so that's a 3rd strike against the display port.  Do any of the upstream kernel devs have hardware with a display port so they can replicate/debug?
Comment 38 Jani Nikula 2015-10-07 10:06:42 UTC
Long time no updates, closing.

If the problem persists with latest kernels, please file a bug at the freedesktop.org bugzilla [1], referencing this bug. Thank you.

[1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel
