Created attachment 21746 [details]
Output from strace -r xset dpms force off
Running xset dpms force off to turn off the display causes a spike of kernel-mode
CPU activity and around 1 second of system-wide latency (repeating sound from
the audio buffer, etc.).
The following kernel, xorg driver and libdrm versions are installed:
Attached is the output from strace -r xset dpms force off.
Still present in kernel-220.127.116.11-186.fc11.x86_64.
No improvement with kernel-18.104.22.168-28.rc2.fc11.x86_64.
Still present with:
Created attachment 23999 [details]
remove panel power delays
Does the problem go away with this patch applied? We should probably make these loops a bit friendlier to CPU usage.
I'll give it a go and report back. I should add that enabling full kernel pre-emption largely eliminates the latency (over voluntary pre-emption), but there's still a spike in CPU usage.
OK, so far I've not seen any of the tell-tale spikes of kernel CPU usage with the patch applied. I'll try again later with a kernel built with voluntary pre-emption to see if the latency issue is fixed.
Created attachment 24025 [details]
dmesg with the problem
Got an unhappiness with the patch applied today; screen went off and wouldn't turn back on:
INFO: task i915/0:151 blocked for more than 120 seconds.
See the attachment for more.
Created attachment 24026 [details]
Use msleep around panel status checks
Well, a hung screen is one possible side effect of this patch (certain operations require the panel to be off; that patch disables any waiting for that condition, which could cause trouble).
Can you give this one a try? I don't *think* we're holding any spinlocks at this point so it should be safe.
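For readers following along, the idea of sleeping between panel status checks instead of spinning can be sketched in userspace C roughly as below. This is not the attached patch: fake_panel, fake_panel_ready, sleep_ms and wait_panel_sleeping are hypothetical names made up for illustration, and nanosleep() stands in for the kernel's msleep().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the panel status register: it reports
 * "ready" once it has been polled ready_after times. */
struct fake_panel {
    unsigned polls;
    unsigned ready_after;
};

static bool fake_panel_ready(struct fake_panel *p)
{
    return ++p->polls >= p->ready_after;
}

/* Userspace analogue of msleep(): give up the CPU for ms milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (ms % 1000) * 1000000L,
    };
    nanosleep(&ts, NULL);
}

/* Poll the status until it reports ready, sleeping between checks
 * rather than busy-waiting. Returns the number of polls performed. */
static unsigned wait_panel_sleeping(struct fake_panel *p, long interval_ms)
{
    while (!fake_panel_ready(p))
        sleep_ms(interval_ms);
    return p->polls;
}
```

Note that this loop has no upper bound on how long it waits; if the status never changes, it sleeps and polls forever.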
Should this patch be applied to the original intel_lvds.c or on top of the other one?
To the original, it obsoletes the other patch.
I'm currently using a kernel with the patch from comment #8. It seems to work and I've not encountered any problems with it so far.
One thing I might have neglected to mention (I assumed it was the same problem, hence the original summary) is that a similar thing happens when the screensaver starts to kick in: specifically, at the start of the fade-out there is a spike of CPU activity and sometimes audio latency, and this is not fixed by the patch above. (I believe it also occurs when just running xrandr with no arguments.)
[If this should be filed separately, let me know.]
Oh dear, caught an unhappiness:
Dec 14 14:27:06 localhost kernel: BUG: soft lockup - CPU#0 stuck for 61s! [Xorg:1504]
Dec 14 14:27:06 localhost kernel: Call Trace:
Dec 14 14:27:06 localhost kernel: [<ffffffffa0085122>] ? i9xx_crtc_dpms+0x216/0x
Dec 14 14:27:06 localhost kernel: [<ffffffffa00852d4>] ? intel_crtc_dpms+0x2d/0xec [i915]
Dec 14 14:27:06 localhost kernel: [<ffffffffa0061a47>] ? drm_helper_connector_dpms+0x200/0x209 [drm_kms_helper]
Dec 14 14:27:06 localhost kernel: [<ffffffffa0034c4b>] ? drm_mode_connector_property_set_ioctl+0x108/0x175 [drm]
Dec 14 14:27:06 localhost kernel: [<ffffffffa0034b43>] ? drm_mode_connector_property_set_ioctl+0x0/0x175 [drm]
Dec 14 14:27:06 localhost kernel: [<ffffffffa002af4f>] ? drm_ioctl+0x237/0x2f3 [drm]
Dec 14 14:27:06 localhost kernel: [<ffffffff811e75ec>] ? inode_has_perm+0x7a/0x90
Dec 14 14:27:06 localhost kernel: [<ffffffff811216c0>] ? vfs_ioctl+0x6f/0x87
Dec 14 14:27:06 localhost kernel: [<ffffffff81121bd6>] ? do_vfs_ioctl+0x482/0x4c8
Dec 14 14:27:06 localhost kernel: [<ffffffff81121c72>] ? sys_ioctl+0x56/0x79
Dec 14 14:27:06 localhost kernel: [<ffffffff81011d72>] ? system_call_fastpath+0x16/0x1b
Syslog excerpt attached below.
Created attachment 24178 [details]
syslog excerpt from soft lockup with second patch
Just to be sure, can you gdb your i915.o module and do a list i9xx_crtc_dpms+0x216? That should give us the actual line number where the hang is occurring.
Another option would be to add a timeout to the loops for the panel status, and just return after a second or so.
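A minimal userspace sketch of the timeout idea, assuming a generic condition callback in place of the real panel status read. All names here (now_ms, wait_for_cond, never_ready, countdown_ready) are hypothetical and not taken from the i915 driver; the point is only that the loop gives up with an error once a deadline passes instead of waiting indefinitely.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <stdbool.h>
#include <time.h>

/* Current monotonic time in milliseconds. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Poll cond(arg) until it returns true or timeout_ms has elapsed,
 * sleeping interval_ms between checks.
 * Returns 0 on success and -ETIMEDOUT if the deadline passes first. */
static int wait_for_cond(bool (*cond)(void *), void *arg,
                         long timeout_ms, long interval_ms)
{
    long long deadline = now_ms() + timeout_ms;

    while (!cond(arg)) {
        if (now_ms() > deadline)
            return -ETIMEDOUT;
        struct timespec ts = {
            .tv_sec = interval_ms / 1000,
            .tv_nsec = (interval_ms % 1000) * 1000000L,
        };
        nanosleep(&ts, NULL);
    }
    return 0;
}

/* Hypothetical condition that never becomes true (a "stuck" panel). */
static bool never_ready(void *arg)
{
    (void)arg;
    return false;
}

/* Hypothetical condition that becomes true after *arg checks. */
static bool countdown_ready(void *arg)
{
    int *remaining = arg;
    return --*remaining <= 0;
}
```

With a one-second timeout, a panel that never reports the expected status would cost the caller about a second and an error return rather than a soft lockup.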
I think this is a dupe.
*** This bug has been marked as a duplicate of bug 15015 ***
(In reply to comment #16)
> Just to be sure, can you gdb your i915.o module and do a list
> i9xx_crtc_dpms+0x216? That should give us the actual line number where the
> hang is occurring.
> Another option would be to add a timeout to the loops for the panel status,
> just return after a second or so.
Sorry that I missed this. I don't have that kernel any more, but I'll apply the second patch and rebuild and see if I can provoke the issue.
I've decided to re-visit this bug (finally). I've been applying the patch from comment #8 to recent kernels (22.214.171.124) AND the fbc-disable-timeout patch from bug 15015, with success. In particular, I see no DPMS- or screensaver-fade-related latency (or the accompanying CPU usage spike) on X3100 graphics. However, I suspect that the latency at the start of screensaver-fade is still an issue on X4500 graphics on the CRT port; I'll have to test further. I've not seen any further soft-lockups.
Given that it's the comment #8 patch in this bug which fixes it, is it right that this bug should have been closed as a duplicate of bug 15015 (which, as I understand it, resolves the soft-lockups)? Also, will the patch from this bug make it into mainstream?
Following on from Comment #19: I have now seen a CPU spike/audio latency on X3100 graphics as well, when the screensaver fade-out kicks in. This is on F13's 126.96.36.199 with the comment #8 patch applied. (Perhaps I should file another bug for this.)
Been running kernels with patch from Comment #8 without problem for quite some time now, but have to manually apply to distro kernel each time to resolve latency issue. Any chance of it being pushed upstream?
Created attachment 35012 [details]
Patch to add msleep(1) between LVDS status checks for 2.6.36.
I'm now testing kernel 2.6.36, for which the patch of Comment #8 no longer applies. However, I have inspected the source code, and changed the last argument of wait_for to 1 (which I believe is morally the same as what I was applying before). It also seems to work (no CPU spike).
It's not a stability fix, so I'm not going to propose it for stable. It is fixed in mainline; if you want it backported, you need to convince the stable maintainers.