Bug 74931

Summary: [drm:intel_dp_start_link_train] *ERROR* too many voltage retries, give up
Product: Drivers
Component: Video(DRI - Intel)
Status: RESOLVED MOVED
Severity: low
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.14.1
Subsystem:
Regression: No
Bisected commit-id:
URL: https://bugs.freedesktop.org/show_bug.cgi?id=70117
Reporter: Felipe Contreras (felipe.contreras)
Assignee: Imre Deak (imre.deak)
CC: cosmopull, intel-gfx-bugs, lopeonline+kernelbugzilla, pepe.romero
Attachments: Full log with drm.debug=0xe

Description Felipe Contreras 2014-04-27 20:42:09 UTC
I get this when booting up and when the machine wakes up. It has been happening for many releases now. Intel HD Graphics 4000.
Comment 1 Imre Deak 2014-04-28 09:47:10 UTC
Hi Felipe! Could you attach a full dmesg log with drm.debug=0xe kernel option set?
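
For reference, drm.debug is a bitmask of log categories. The sketch below decodes a mask into category names, assuming the bit assignments used by kernels of this period (core 0x01, driver 0x02, KMS 0x04, PRIME 0x08); the helper name describe_drm_debug is made up for illustration and is not part of any kernel or libdrm API.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative helper (hypothetical name): list the categories enabled
 * by a drm.debug mask. The bit values are an assumption based on the
 * DRM_UT_* categories of kernels from around 3.14.
 */
int describe_drm_debug(unsigned int mask, char *buf, size_t len)
{
    static const struct {
        unsigned int bit;
        const char *name;
    } cats[] = {
        { 0x01, "core" },
        { 0x02, "driver" },
        { 0x04, "kms" },
        { 0x08, "prime" },
    };
    int count = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(cats) / sizeof(cats[0]); i++) {
        if (!(mask & cats[i].bit))
            continue;
        /* Append the name, separated by a space after the first entry. */
        size_t used = strlen(buf);
        snprintf(buf + used, len - used, "%s%s",
                 count ? " " : "", cats[i].name);
        count++;
    }
    return count; /* number of categories enabled in the mask */
}
```

Under those assumptions, 0xe selects the driver, KMS and PRIME categories and leaves out the noisier core messages.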

I guess, since you didn't mention it, that your screen still comes up OK despite the error. That's possible: the error you reported shows that DP clock recovery (CR) fails, but in this case the driver still continues the link training with the last attempted CR settings. We should still figure out why CR fails.
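
To make the failure mode concrete, here is a rough, self-contained model of a clock recovery retry loop. It is not the i915 code: the names (sink_reply, clock_recovery, stubborn_sink, improving_sink) and both limits are assumptions chosen for illustration. The idea it shows is that the source gives up once the sink keeps asking for the same voltage swing without ever locking, and the caller is left with the last attempted setting.

```c
#include <stdbool.h>

/* Assumed limits, for illustration only. */
#define MAX_SAME_VOLTAGE_TRIES 5
#define MAX_TOTAL_TRIES        32

/* Hypothetical sink response to one training attempt. */
struct sink_reply {
    bool cr_ok;          /* did clock recovery lock? */
    int requested_swing; /* swing level the sink asks for next */
};

typedef struct sink_reply (*sink_fn)(int swing);

/*
 * Returns true if CR locked. *last_swing always holds the last swing
 * level that was attempted, which is what later training stages would
 * carry on with even after a failure.
 */
bool clock_recovery(sink_fn sink, int *last_swing)
{
    int swing = 0;
    int same_tries = 0;

    for (int attempt = 0; attempt < MAX_TOTAL_TRIES; attempt++) {
        struct sink_reply r = sink(swing);

        *last_swing = swing;
        if (r.cr_ok)
            return true;

        if (r.requested_swing == swing) {
            /* Sink keeps requesting the same voltage: count it. */
            if (++same_tries == MAX_SAME_VOLTAGE_TRIES)
                return false; /* "too many voltage retries, give up" */
        } else {
            same_tries = 0;
        }
        swing = r.requested_swing;
    }
    return false;
}

/* Example sink that never locks and never asks for a different swing. */
struct sink_reply stubborn_sink(int swing)
{
    struct sink_reply r = { false, swing };
    return r;
}

/* Example sink that locks once the swing level reaches 2. */
struct sink_reply improving_sink(int swing)
{
    struct sink_reply r = { swing >= 2, swing + 1 };
    return r;
}
```

With stubborn_sink the loop gives up after five attempts at swing level 0; with improving_sink it locks at level 2.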
Comment 2 Felipe Contreras 2014-04-28 18:40:14 UTC
(In reply to Imre Deak from comment #1)
> Hi Felipe!

Hi :)

> Could you attach a full dmesg log with drm.debug=0xe kernel
> option set?

Sure, done.
 
> I guess, since you didn't mention it, that your screen still comes up OK
> despite the error. That's possible: the error you reported shows that DP
> clock recovery (CR) fails, but in this case the driver still continues the
> link training with the last attempted CR settings. We should still figure
> out why CR fails.

Yes, the screen comes up OK. Earlier I had a problem resuming from sleep while in X, but that seems to be fixed now.
Comment 3 Felipe Contreras 2014-04-28 18:40:52 UTC
Created attachment 134071 [details]
Full log with drm.debug=0xe
Comment 4 Imre Deak 2014-04-30 13:37:40 UTC
Could you try the latest drm-intel-nightly kernel:

git://anongit.freedesktop.org/drm-intel [drm-intel-nightly branch]

If that doesn't work, could you try the patch below on top of it? It's just a wild guess. We force the panel VDD during clock recovery (which fails), but we don't force it during channel equalization (which succeeds). I don't see why we would need to force VDD, since turning on the panel should provide the necessary power. So let's try doing the CR without forcing VDD too.

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index a3cb52f..9524ae6 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -1912,9 +1912,9 @@ static void intel_enable_dp(struct intel_encoder *encoder)
 
 	intel_edp_panel_vdd_on(intel_dp);
 	intel_dp_sink_dpms(intel_dp, DRM_MODE_DPMS_ON);
-	intel_dp_start_link_train(intel_dp);
 	intel_edp_panel_on(intel_dp);
 	edp_panel_vdd_off(intel_dp, true);
+	intel_dp_start_link_train(intel_dp);
 	intel_dp_complete_link_train(intel_dp);
 	intel_dp_stop_link_train(intel_dp);
 }
Comment 5 Felipe Contreras 2014-04-30 17:48:32 UTC
(In reply to Imre Deak from comment #4)
> Could you try the latest drm-intel-nightly kernel:
> 
> git://anongit.freedesktop.org/drm-intel [drm-intel-nightly branch]

That didn't work.

> If that doesn't work, could you try the patch below on top of it?

That did work :)

I tried that both on drm-intel-nightly and 3.14.1, works fine on both.
Comment 6 Felipe Contreras 2014-05-05 01:19:39 UTC
So... Are you going to push this patch?
Comment 7 Imre Deak 2014-05-05 08:25:35 UTC
(In reply to Felipe Contreras from comment #6)
> So... Are you going to push this patch?

First, thanks for trying the above things.

This candidate fix is based only on my guess, so before posting some form of it I'd like to better understand what goes wrong with a forced VDD CR and if/how this change affects other platforms (as we prefer keeping things platform independent in general). I hope to come up with something during this week and update this ticket.
Comment 8 Imre Deak 2014-05-05 10:48:50 UTC
My bad, I only just noticed that
https://bugs.freedesktop.org/show_bug.cgi?id=70117
is already tracking the same issue, with the same guess-patch fix from Jani. I've let him know about your report here, so I'm moving the tracking of this over to fdo.
Comment 9 Andrey 2014-08-29 20:51:09 UTC
I have the same error in dmesg on an Asus UX32VD laptop running on the Nvidia 620GT (it also has built-in Intel HD4000 graphics, via Optimus). The main problem is that this error causes X server freezes, as "ber4444" reported here: http://askubuntu.com/questions/458359/drmintel-dp-start-link-train-too-many-voltage-retries-give-up# (I replied there as "Demontager" too).
When the laptop is running on Intel HD graphics, there are no such freezes.
Comment 10 Luca Graziani 2015-07-02 07:36:34 UTC
I have just installed the latest LTS version of Ubuntu Desktop, 14.04.2 amd64, and I have the same problem.
Like Andrey, I have an Asus UX32VD laptop, and when the X server freezes I must press Ctrl+Alt+F1 and then Ctrl+Alt+F7 to "unfreeze" the system.
I have installed the tested NVIDIA binary driver, version 331.113, from the Ubuntu drivers list.
Comment 11 lopeonline+kernelbugzilla 2016-04-28 07:17:48 UTC
(In reply to Imre Deak from comment #4)

I was experiencing this bug with Intel graphics and DisplayPort.

Problem detailed symptoms:
My screens were turning off regularly. On good days they would go black for about 2 seconds, on bad days they would turn off completely and I had to disable and re-enable them.

I was running an Ubuntu 3.14.##### kernel.
I recompiled the 3.13 kernel with your patch applied, and it works BEAUTIFULLY.

I still get this warning, though:
[drm] Wrong MCH_SSKPD value: 0x########

From what I've read about it, the BIOS is apparently setting an invalid value there, which seems to cause a buffer underrun, which in turn causes the Displaylink device to turn off.

I'm trying a 4.x kernel now, I don't know if my system will run with it. But I see they've totally reworked the driver source, so I'm not applying your patch to the new kernel. I'll see how it goes.
Comment 12 Jani Nikula 2016-04-28 08:20:23 UTC
(In reply to lopeonline+kernelbugzilla from comment #11)
> I'm trying a 4.x kernel now, I don't know if my system will run with it. But
> I see they've totally reworked the driver source, so I'm not applying your
> patch to the new kernel. I'll see how it goes.

Please go for v4.5 or preferably v4.6-rc5, and please do follow up at https://bugs.freedesktop.org/show_bug.cgi?id=70117.