Bug 13453 - high latency when disabling LVDS output
Status: RESOLVED CODE_FIX
Product: Drivers
Classification: Unclassified
Component: Video (DRI - Intel)
Hardware: All
OS: Linux
Importance: P1 normal
Assigned To: Jesse Barnes
Depends on:
Blocks:
Reported: 2009-06-04 09:31 UTC by James
Modified: 2010-12-16 14:08 UTC (History)
3 users

See Also:
Kernel Version: 2.6.29.4
Tree: Fedora
Regression: No


Attachments
Output from strace -r xset dpms force off (16.01 KB, text/plain)
2009-06-04 09:31 UTC, James
Details
remove panel power delays (724 bytes, patch)
2009-12-02 19:00 UTC, Jesse Barnes
Details | Diff
dmesg with the problem (53.42 KB, text/plain)
2009-12-04 19:13 UTC, James
Details
Use msleep around panel status checks (705 bytes, patch)
2009-12-04 19:24 UTC, Jesse Barnes
Details | Diff
syslog excerpt from soft lockup with second patch (59.04 KB, text/plain)
2009-12-14 14:51 UTC, James
Details
Patch to add msleep(1) between LVDS status checks for 2.6.36. (865 bytes, patch)
2010-10-25 15:43 UTC, James
Details | Diff

Description James 2009-06-04 09:31:44 UTC
Created attachment 21746 [details]
Output from strace -r xset dpms force off

Running xset dpms force off to turn off the display causes a spike of
kernel-mode CPU activity and around 1 second of system-wide latency (repeating
sound from the audio buffer, etc.).

The following kernel, xorg driver and libdrm versions are installed:
kernel-2.6.29.4-167.fc11.x86_64
xorg-x11-drv-intel-2.7.0-7.fc11.x86_64
libdrm-2.4.6-7.fc11.x86_64

Attached is the output from strace -r xset dpms force off.
Comment 1 James 2009-06-16 13:14:45 UTC
Still present in kernel-2.6.29.5-186.fc11.x86_64.
Comment 2 James 2009-08-15 20:24:27 UTC
No improvement with kernel-2.6.30.5-28.rc2.fc11.x86_64.
Comment 3 James 2009-10-22 13:00:52 UTC
Still present with:

kernel-2.6.31.1-56.fc12.x86_64
xorg-x11-drv-intel-2.9.0-2.fc12.x86_64
Comment 4 Jesse Barnes 2009-12-02 19:00:19 UTC
Created attachment 23999 [details]
remove panel power delays

Does the problem go away with this patch applied?  We should probably make these loops a bit friendlier to CPU usage.
Comment 5 James 2009-12-02 19:04:30 UTC
I'll give it a go and report back. I should add that enabling full kernel pre-emption largely eliminates the latency (over voluntary pre-emption), but there's still a spike in CPU usage.
Comment 6 James 2009-12-03 21:37:17 UTC
OK, so far I've not seen any of the tell-tale spikes of kernel CPU usage with the patch applied. I'll try again later with a kernel built with voluntary pre-emption to see if the latency issue is fixed.
Comment 7 James 2009-12-04 19:13:49 UTC
Created attachment 24025 [details]
dmesg with the problem

Got an unhappiness with the patch applied today; screen went off and wouldn't turn back on:

INFO: task i915/0:151 blocked for more than 120 seconds.

See the attachment for more.
Comment 8 Jesse Barnes 2009-12-04 19:24:20 UTC
Created attachment 24026 [details]
Use msleep around panel status checks

Well, a hung screen is one possible side effect of this patch (certain operations require the panel to be off; that patch disables any waiting for that condition, which could cause trouble).

Can you give this one a try?  I don't *think* we're holding any spinlocks at this point so it should be safe.
Comment 9 James 2009-12-04 19:34:46 UTC
Should this patch be applied to the original intel_lvds.c or on top of the other one?
Comment 10 Jesse Barnes 2009-12-04 19:54:23 UTC
To the original, it obsoletes the other patch.
Comment 11 Jesse Barnes 2009-12-10 22:44:14 UTC
Any news?
Comment 12 James 2009-12-10 23:00:31 UTC
I'm currently using a kernel with the patch from comment #8. It seems to work and I've not encountered any problems with it so far.
Comment 13 James 2009-12-11 14:26:45 UTC
One thing I might have neglected to mention (I assumed it was the same problem, hence the original summary) is that a similar thing happens when the screensaver starts to kick in: specifically, at the start of the fade-out there is a spike of CPU activity and sometimes audio latency, and it is not fixed by the patch above. (I believe it also occurs when just running xrandr with no arguments.)

[If this should be filed separately, let me know.]
Comment 14 James 2009-12-14 14:50:57 UTC
Oh dear, caught an unhappiness:

Dec 14 14:27:06 localhost kernel: BUG: soft lockup - CPU#0 stuck for 61s! [Xorg:1504]

Dec 14 14:27:06 localhost kernel: Call Trace:
Dec 14 14:27:06 localhost kernel: [<ffffffffa0085122>] ? i9xx_crtc_dpms+0x216/0x39b [i915]
Dec 14 14:27:06 localhost kernel: [<ffffffffa00852d4>] ? intel_crtc_dpms+0x2d/0xec [i915]
Dec 14 14:27:06 localhost kernel: [<ffffffffa0061a47>] ? drm_helper_connector_dpms+0x200/0x209 [drm_kms_helper]
Dec 14 14:27:06 localhost kernel: [<ffffffffa0034c4b>] ? drm_mode_connector_property_set_ioctl+0x108/0x175 [drm]
Dec 14 14:27:06 localhost kernel: [<ffffffffa0034b43>] ? drm_mode_connector_property_set_ioctl+0x0/0x175 [drm]
Dec 14 14:27:06 localhost kernel: [<ffffffffa002af4f>] ? drm_ioctl+0x237/0x2f3 [drm]
Dec 14 14:27:06 localhost kernel: [<ffffffff811e75ec>] ? inode_has_perm+0x7a/0x90
Dec 14 14:27:06 localhost kernel: [<ffffffff811216c0>] ? vfs_ioctl+0x6f/0x87
Dec 14 14:27:06 localhost kernel: [<ffffffff81121bd6>] ? do_vfs_ioctl+0x482/0x4c8
Dec 14 14:27:06 localhost kernel: [<ffffffff81121c72>] ? sys_ioctl+0x56/0x79
Dec 14 14:27:06 localhost kernel: [<ffffffff81011d72>] ? system_call_fastpath+0x16/0x1b

Syslog excerpt attached below.
Comment 15 James 2009-12-14 14:51:28 UTC
Created attachment 24178 [details]
syslog excerpt from soft lockup with second patch
Comment 16 Jesse Barnes 2010-02-05 21:37:03 UTC
Just to be sure, can you gdb your i915.o module and do a list i9xx_crtc_dpms+0x216?  That should give us the actual line number where the hang is occurring.

Another option would be to add a timeout to the loops for the panel status, and just return after a second or so.
Comment 17 Jesse Barnes 2010-02-18 23:10:27 UTC
I think this is a dupe.

*** This bug has been marked as a duplicate of bug 15015 ***
Comment 18 James 2010-02-18 23:23:48 UTC
(In reply to comment #16)
> Just to be sure, can you gdb your i915.o module and do a list
> i9xx_crtc_dpms+0x216?  That should give us the actual line number where the
> hang is occurring.
> 
> Another option would be to add a timeout to the loops for the panel status, and
> just return after a second or so.

Sorry that I missed this. I don't have that kernel any more, but I'll apply the second patch and rebuild and see if I can provoke the issue.
Comment 19 James 2010-08-17 23:03:41 UTC
I've decided to re-visit this bug (finally). I've been applying the patch from comment #8 to recent kernels (2.6.34.4), along with the fbc-disable-timeout patch from bug 15015, with success. In particular, I see no DPMS- or screensaver-fade-related latency (or the accompanying CPU usage spike) on X3100 graphics. However, I suspect that the latency at the start of the screensaver fade is still an issue on X4500 graphics on the CRT port; I'll have to test further. I've not seen any further soft lockups.

Given that it's the comment #8 patch in this bug which fixes it, is it right that this bug should have been closed as a duplicate of bug 15015 (which, as I understand it, resolves the soft lockups)? Also, will the patch from this bug make it into mainline?
Comment 20 James 2010-08-30 13:41:58 UTC
Following on from Comment #19: I have now seen a CPU spike/audio latency on X3100 graphics as well, when the screensaver fade-out kicks in. This is on F13's 2.6.36.6 with the comment #8 patch applied. (Perhaps I should file another bug for this.)
Comment 21 James 2010-10-23 16:04:22 UTC
I've been running kernels with the patch from comment #8 without problem for quite some time now, but I have to apply it manually to the distro kernel each time to resolve the latency issue. Any chance of it being pushed upstream?
Comment 22 James 2010-10-25 15:43:26 UTC
Created attachment 35012 [details]
Patch to add msleep(1) between LVDS status checks for 2.6.36.

I'm now testing kernel 2.6.36, to which the patch from comment #8 no longer applies. However, I have inspected the source code and changed the last argument of wait_for to 1 (which I believe is morally the same as what I was applying before). It also seems to work (no CPU spike).

Patch attached.
Comment 23 Chris Wilson 2010-12-16 14:08:40 UTC
It's not a stability fix, so I'm not going to propose it for stable. It is fixed in mainline, if you want it backported then you need to convince the stable maintainers.
