Bug 41892 - KMS locking sucks, causing set_cursor/pageflip stalls due to connector probing
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: intel-gfx-bugs@lists.freedesktop.org
Reported: 2011-08-28 15:53 UTC by past.art2841
Modified: 2013-01-21 22:29 UTC

Regression: No

Attachments
sudo lspci -vvv (24.39 KB, text/plain)
2011-08-28 15:53 UTC, past.art2841

Description past.art2841 2011-08-28 15:53:56 UTC
Created attachment 70692
sudo lspci -vvv

After a kernel update I noticed a strange problem with the screen: as long as I don't move the cursor, most things work fine, but while I am moving the cursor the screen freezes for a moment every 5-15 seconds, and then everything works fine again. I rebuilt the kernel with latencytop support and found that the latency comes from drm_mode_cursor_ioctl, which takes about 750-1000 ms (in the "good-working" kernel it takes 20-25 ms). I used "git bisect" to find the exact point where the problem was introduced:

bad is "a6360dd37e1a144ed11e6548371bade559a1e4df" (tag v2.6.39-rc3)
good is "0ce790e7d736cedc563e1fb4e998babf5a4dbc3d" (tag v2.6.39-rc1)

The result was "7f58aabc369014fda3a4a33604ba0a1b63b941ac": using kernel source from this commit causes the problem, using previous ("9f01b25048ad12b5d71f4f7d3b62ef737639a08d") does not.


System description (with "good-working" kernel):
# uname -a
Linux LeX-laptop 2.6.39-rc1-ARCH-00002-geccaca2-dirty #14 SMP PREEMPT Sun Aug 28 18:38:20 MSK 2011 x86_64 Genuine Intel(R) CPU U4100 @ 1.30GHz GenuineIntel GNU/Linux

# cat /proc/version
Linux version 2.6.39-rc1-ARCH-00002-geccaca2-dirty (lex@LeX-PC) (gcc version 4.6.1 20110819 (prerelease) (GCC) ) #14 SMP PREEMPT Sun Aug 28 18:38:20 MSK 2011

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-linux-git root=/dev/sda3 ro max_loop=8 ipv6.disable=1 i915.modeset=1

# sudo lspci -vvv
(see attachment)

The machine is a laptop: Samsung x420-fa02ru.
Comment 1 Daniel Vetter 2012-03-24 23:32:04 UTC
In the meantime we've disabled the gmbus controller (due to a few other issues) and merged a patch which should speed up EDID reads (which I presume is what's blocking cursor updates for you, because the regular hotplug code needs to take the same locks). Hence please retest with 3.3.

Furthermore, we've re-enabled gmbus for 3.4 (because the bug that led to it being disabled finally got fixed), so it'd be awesome if you could try a 3.4-rc kernel and check whether these issues crop up for you again.
Comment 2 past.art2841 2012-03-25 13:29:13 UTC
I tried a kernel built from git commit e22057c8599373e5caef0bc42bdb95d2a361ab0d.

As far as I can see, the issue is still present. I used the default configuration with CONFIG_LATENCYTOP=y; latencytop shows a maximum latency of up to 250 ms for drm_mode_cursor_ioctl when I move the cursor.

For previous kernel versions I wrote a patch that partially works around the symptom: I simply commented out the mutex_lock(&dev->mode_config.mutex); call and the matching unlock calls in the drm_mode_cursor_ioctl function. The latency decreased after that. I tried the same change with the latest version available from git, but it didn't work: the maximum latency is still up to 250 ms according to latencytop.

PS: As for the bisection I posted above, it seems to have been incorrect. I didn't write about it earlier because this site was down.

Thank you.
Comment 3 Jesse Barnes 2012-04-18 22:01:49 UTC
This should be partially fixed now that Eugeni's fixes to the i2c probing have landed.  But the real fix is to split our locks better so we can update the cursor without being blocked by output detection.
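
To illustrate the lock split described above, here is a minimal user-space sketch using pthreads. It is not kernel code and not the patch that was eventually merged. The struct fake_dev, its crtc_mutex field, and all function names are hypothetical stand-ins; only the name mode_config_mutex mirrors the dev->mode_config.mutex mentioned in this report. The sketch uses pthread_mutex_trylock so that a cursor update that would have slept behind the probe is reported as a failure instead of actually blocking, which keeps the example deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a DRM device; NOT the kernel's struct drm_device. */
struct fake_dev {
    pthread_mutex_t mode_config_mutex; /* one big lock shared by probing and cursor code */
    pthread_mutex_t crtc_mutex;        /* assumed finer-grained lock for cursor state only */
    int cursor_x, cursor_y;
};

static void fake_dev_init(struct fake_dev *dev)
{
    int rc = pthread_mutex_init(&dev->mode_config_mutex, NULL);
    assert(rc == 0);
    rc = pthread_mutex_init(&dev->crtc_mutex, NULL);
    assert(rc == 0);
    (void)rc; /* silence unused warning when built with NDEBUG */
    dev->cursor_x = 0;
    dev->cursor_y = 0;
}

/* Connector probing (e.g. a slow EDID read) holds the big lock for its whole duration. */
static void probe_begin(struct fake_dev *dev)
{
    pthread_mutex_lock(&dev->mode_config_mutex);
}

static void probe_end(struct fake_dev *dev)
{
    pthread_mutex_unlock(&dev->mode_config_mutex);
}

/* Move the cursor only if `lock` is free right now.
 * Returns false in the case where a real ioctl would have stalled. */
static bool cursor_move_under(struct fake_dev *dev, pthread_mutex_t *lock,
                              int x, int y)
{
    if (pthread_mutex_trylock(lock) != 0)
        return false; /* lock is held, e.g. by an ongoing probe */
    dev->cursor_x = x;
    dev->cursor_y = y;
    pthread_mutex_unlock(lock);
    return true;
}

/* Behaviour as reported: the cursor path needs the same lock as probing. */
static bool cursor_move_big_lock(struct fake_dev *dev, int x, int y)
{
    return cursor_move_under(dev, &dev->mode_config_mutex, x, y);
}

/* Behaviour after a lock split: the cursor path only needs its own lock. */
static bool cursor_move_split_lock(struct fake_dev *dev, int x, int y)
{
    return cursor_move_under(dev, &dev->crtc_mutex, x, y);
}
```

Calling probe_begin() and then cursor_move_big_lock() makes the cursor update fail for as long as the probe holds the lock, which corresponds to the stall seen in latencytop. cursor_move_split_lock() succeeds in the same situation because it never touches the lock held by the probe. Build with -pthread.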
Comment 4 Daniel Vetter 2012-11-13 12:01:04 UTC
Ok, I've just declared that this is the master bug for tracking locking woes in our modeset code, specifically that connector probing can cause pageflip/cursor movement to stall.

Adjusting summary accordingly.
Comment 5 Daniel Vetter 2013-01-21 22:29:42 UTC
Locking rework to fix this is now merged into drm-next:

commit 735dc0d1e29329ff34ec97f66e130cce481c9607
Merge: bac4b7c 20c60c3
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Jan 21 07:44:58 2013 +1000

    Merge branch 'drm-kms-locking' of git://people.freedesktop.org/~danvet/drm-intel into drm-next

Please test and reopen if you still experience stalls.
