Bug 54691
Summary: | Error enabling power savings on Ironlake chip since 3.8.0 | ||
---|---|---|---|
Product: | Drivers | Reporter: | Coacher (itumaykin+kernel) |
Component: | Video(DRI - Intel) | Assignee: | intel-gfx-bugs (intel-gfx-bugs) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | daniel, intel-gfx-bugs, ketetefid |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | >= 3.8.0 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
lspci -vvv
enable interrupts earlier when resuming
backported patch
2nd try at a backport |
I've done bisecting between 3.7.10 and 3.8.0, here is the result:

3e9605018ab3e333d51cc90fccfde2031886763b is the first bad commit
commit 3e9605018ab3e333d51cc90fccfde2031886763b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Nov 27 16:22:54 2012 +0000

drm/i915: Rearrange code to only have a single method for waiting upon the ring

Replace the wait for the ring to be clear with the more common wait for the ring to be idle. The principle advantage is one less exported intel_ring_wait function, and the removal of a hardcoded value.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

:040000 040000 aeb1a24182824a9da2090842ec3e521400037b0e 25ab310832eb71fc0d3ef9d41bfb1d949247a455 M drivers

git bisect log output (JIC):

git bisect start
# good: [356d8c6fb2a7cf49e836742738a8b9a47e77cfea] Linux 3.7.10
git bisect good 356d8c6fb2a7cf49e836742738a8b9a47e77cfea
# bad: [19f949f52599ba7c3f67a5897ac6be14bfcb1200] Linux 3.8
git bisect bad 19f949f52599ba7c3f67a5897ac6be14bfcb1200
# good: [29594404d7fe73cd80eaa4ee8c43dcc53970c60e] Linux 3.7
git bisect good 29594404d7fe73cd80eaa4ee8c43dcc53970c60e
# good: [dadfab4873256d2145640c0ce468fcbfb48977fe] Merge tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
git bisect good dadfab4873256d2145640c0ce468fcbfb48977fe
# bad: [992956189de58cae9f2be40585bc25105cd7c5ad] efi: Fix the build with user namespaces enabled.
git bisect bad 992956189de58cae9f2be40585bc25105cd7c5ad
# good: [2b8318881ddbcb67c5e8d2178b42284749442222] Merge tag 'fbdev-for-3.8' of git://gitorious.org/linux-omap-dss2/linux
git bisect good 2b8318881ddbcb67c5e8d2178b42284749442222
# bad: [3c2e81ef344a90bb0a39d84af6878b4aeff568a2] Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
git bisect bad 3c2e81ef344a90bb0a39d84af6878b4aeff568a2
# good: [640631d04cd2cfbb4792d6a8fc5fcab14ee273a5] drm/exynos: use sgt instead of pages for framebuffer address
git bisect good 640631d04cd2cfbb4792d6a8fc5fcab14ee273a5
# good: [aed606e3bc1f10753254db308d3fd8c053c41328] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
git bisect good aed606e3bc1f10753254db308d3fd8c053c41328
# good: [221392c3ad0432e39fd74a349364f66cb0ed78f6] sched: numa: Fix build error if CONFIG_NUMA_BALANCING && !CONFIG_TRANSPARENT_HUGEPAGE
git bisect good 221392c3ad0432e39fd74a349364f66cb0ed78f6
# good: [01ce113ca5b18aea4c97dea62287394ca4f8ad7f] drm/exynos: modify wait_for_vblank of fimd
git bisect good 01ce113ca5b18aea4c97dea62287394ca4f8ad7f
# good: [2f3f24061c5c489074ad492bf694a5a76ebd8fc5] Merge branch 'exynos-drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next
git bisect good 2f3f24061c5c489074ad492bf694a5a76ebd8fc5
# bad: [633cf8f5056c3e72158e4dbc387b3d65926d2d55] drm/i915: Don't allow ring tail to reach the same cacheline as head
git bisect bad 633cf8f5056c3e72158e4dbc387b3d65926d2d55
# good: [45e2b5f640b3766da3eda48f6c35f088155c06f3] drm/i915: force restore on lid open
git bisect good 45e2b5f640b3766da3eda48f6c35f088155c06f3
# bad: [3e9605018ab3e333d51cc90fccfde2031886763b] drm/i915: Rearrange code to only have a single method for waiting upon the ring
git bisect bad 3e9605018ab3e333d51cc90fccfde2031886763b
# good: [b662a0663230853fccdfceeda5db031f5d4b657c] drm/i915: Simplify flushing activity on the ring
git bisect good b662a0663230853fccdfceeda5db031f5d4b657c

Created attachment 94361 [details]
enable interrupts earlier when resuming
Please test the attached patch, thanks.
(In reply to comment #4)
> Created an attachment (id=94361) [details]
> enable interrupts earlier when resuming
>
> Please test the attached patch, thanks.

I would be most thankful if you would kindly provide a patch for 3.8.1 as well. I definitely could change the patch to suit the kernel but I was too afraid of doing that. I am sorry for my lack of expertise. I would provide immediate testing, since I am affected by this.

patch -p1 </home/tefid/Downloads/i915_pm.patch --dry-run
patching file drivers/gpu/drm/i915/i915_drv.c
Hunk #1 succeeded at 486 (offset -7 lines).
Hunk #2 FAILED at 567.
1 out of 2 hunks FAILED -- saving rejects to file drivers/gpu/drm/i915/i915_drv.c.rej

Created attachment 94391 [details]
backported patch
Some other patches are required, hopefully I've picked up all the right ones. Cumulative diff attached.
The patch from the last comment made an unbootable kernel. Unfortunately, my configuration got lost and I am in the process of building a new kernel. Is the patch completely OK? It does not produce any messages -- just an immediate black screen, and a hard reset is needed.

Hm, I haven't tested the backport at all besides with the compiler ... Can you first check whether the first patch works on top of 3.9-rc1, then we'll worry about the backport?

I confirm that with the suggested patch ("enable interrupts earlier when resuming" one) on top of kernel-3.9-rc1 the problem is fixed.

@Kete Tefid can you confirm this so I can mark this as resolved?

(In reply to comment #9)
> I confirm that with the suggested patch ("enable interrupts earlier when
> resuming" one) on top of kernel-3.9-rc1 the problem is fixed.

Out of curiosity I've tested pure 3.9.0-rc1 and the problem is gone too. It seems that it was already fixed by some other change.

(In reply to comment #10)
> (In reply to comment #9)
> > I confirm that with the suggested patch ("enable interrupts earlier when
> > resuming" one) on top of kernel-3.9-rc1 the problem is fixed.
>
> Out of curiosity I've tested pure 3.9.0-rc1 and the problem is gone too.
> Seems
> that it was fixed already by some other change.

Sorry for the misinformation. This issue is still present in vanilla 3.9.0-rc1 and the patch mentioned above really does fix it. On 3.8.0 this issue was triggered upon every boot on my machine, but on 3.9.0-rc1 a suspend/resume cycle is needed, which is why I thought it was gone completely. My bad.

Fix is merged into drm-intel-fixes now:

commit 58db81c51c080595995219f591ca834e7352731c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Mar 5 09:50:58 2013 +0100

drm/i915: enable irqs earlier when resuming

It is a bit unclear to me why 3.8 also regressed, since all the seemingly relevant commits are only in 3.9-rc1.

Created attachment 94541 [details]
2nd try at a backport
I've found the prerequisite patch I think and added it to the backport. Please test this on top of 3.8.
(In reply to comment #12)
> It is a bit unclear to me why 3.8 also regressed, since all the seemingly
> relevant commits are only in 3.9-rc1.

I don't know either, but on my machine 3.8.0 is also affected.

Yeah, the commit which regressed things is actually in 3.8 already. I've added a cc: stable to the fix and will forward it asap:

commit 15239099d7a7a9ecdc1ccb5b187ae4cda5488ff9
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Mar 5 09:50:58 2013 +0100

drm/i915: enable irqs earlier when resuming

Thank you for your help!

Hello everyone. I haven't tested fresh 3.8.x kernels lately because there is also another bug that's really annoying and I was running 3.7.10, but I've finally decided to give 3.8.4 a try. Surprisingly, this bug is there. After a little investigation I found out that at some point a patch named "drm/i915: enable irqs earlier when resuming" was introduced to the 3.8 branch, and this is merely the patch attached above, but recently a revert of this patch was also merged into the 3.8 branch AND is included in 3.8.4. That's why this bug has resurfaced. The commit message of the revert says: "It caused problems in the 3.8-stable series, but 3.9-rc is just fine." So this revert is breaking things again on the 3.8.4 kernel and doesn't belong in the 3.8 branch as stated in the commit message. Please revert this revert, i.e. leave the changes made with the patch "drm/i915: enable irqs earlier when resuming".

Nope, ilk rc6 is known broken, disabled by default and this patch caused at least 5 different other bugs (and tons more "me, too" reports). Since the init sequence is this messy I've essentially given up on fixing this for 3.8, since 3.9-rc kernels work correctly. Everything should work as expected with default options on 3.8.

Ok, thank you for the clarification. Will wait till 3.9 then. |
Created attachment 94311 [details]
lspci -vvv

After upgrading the kernel to 3.8.0, the following stack trace is in my dmesg upon every boot:

[22762.594676] ------------[ cut here ]------------
[22762.594700] WARNING: at drivers/gpu/drm/i915/i915_gem.c:1021 __wait_seqno+0x4fd/0x530 [i915]()
[22762.594701] Hardware name: Aspire 1830T
[22762.594728] Modules linked in: ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables bnep bluetooth dm_mod uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev ums_realtek hid_logitech_dj i915 intel_agp iTCO_wdt acer_wmi iTCO_vendor_support ath9k ath9k_common ath9k_hw ath atl1c mac80211 cfg80211 sparse_keymap drm_kms_helper intel_gtt video wmi
[22762.594731] Pid: 13103, comm: kworker/u:0 Tainted: G W 3.8.1-ACER-1830T #1
[22762.594732] Call Trace:
[22762.594743] [<ffffffff8103907b>] ? warn_slowpath_common+0x7b/0xc0
[22762.594753] [<ffffffffa01dc5dd>] ? __wait_seqno+0x4fd/0x530 [i915]
[22762.594766] [<ffffffffa0211501>] ? __i915_update_gfx_val+0x31/0xf0 [i915]
[22762.594778] [<ffffffffa02141a4>] ? i915_update_gfx_val+0x24/0x40 [i915]
[22762.594788] [<ffffffffa01dcd5f>] ? i915_add_request+0x2bf/0x2e0 [i915]
[22762.594800] [<ffffffffa0214a2f>] ? intel_enable_gt_powersave+0x49f/0x870 [i915]
[22762.594813] [<ffffffffa01fc2aa>] ? intel_modeset_init_hw+0x3a/0x50 [i915]
[22762.594820] [<ffffffffa01c91c9>] ? __i915_drm_thaw+0xd9/0xf0 [i915]
[22762.594828] [<ffffffffa01c96f4>] ? i915_resume+0x74/0xd0 [i915]
[22762.594834] [<ffffffff812d18a0>] ? pci_pm_thaw+0x80/0x80
[22762.594842] [<ffffffff8137d3b6>] ? dpm_run_callback.isra.5+0x36/0x80
[22762.594845] [<ffffffff8137d6d6>] ? device_resume+0xa6/0x140
[22762.594849] [<ffffffff8137d784>] ? async_resume+0x14/0x40
[22762.594853] [<ffffffff810630f7>] ? async_run_entry_fn+0xa7/0x1c0
[22762.594858] [<ffffffff81055d07>] ? process_one_work+0x117/0x530
[22762.594861] [<ffffffff81053119>] ? need_to_create_worker+0x9/0x20
[22762.594865] [<ffffffff81055a8f>] ? manage_workers+0x1df/0x2a0
[22762.594867] [<ffffffff81063050>] ? async_schedule+0x10/0x10
[22762.594871] [<ffffffff81056562>] ? worker_thread+0x182/0x490
[22762.594874] [<ffffffff810645bf>] ? __wake_up_common+0x4f/0x80
[22762.594878] [<ffffffff810563e0>] ? rescuer_thread+0x280/0x280
[22762.594881] [<ffffffff8105b9e3>] ? kthread+0xb3/0xc0
[22762.594884] [<ffffffff81060000>] ? hrtimers_resume+0x10/0x50
[22762.594887] [<ffffffff8105b930>] ? kthread_freezable_should_stop+0x60/0x60
[22762.594891] [<ffffffff81564b2c>] ? ret_from_fork+0x7c/0xb0
[22762.594922] [<ffffffff8105b930>] ? kthread_freezable_should_stop+0x60/0x60
[22762.594924] ---[ end trace 5bb601912bd1588a ]---
[22762.594927] [drm:ironlake_enable_rc6] *ERROR* failed to enable ironlake power power savings

There was no such problem with 3.7.*. My notebook is an Acer 1830T with an Ironlake chip running Gentoo amd64. I've attached lspci output, and if you need more info I am ready to provide it.