Bug 42907 - [SNB] 3.3.0-rc5+git: WARNING: at drivers/gpu/drm/i915/i915_irq.c:652 ironlake_irq_handler+0x4ea/0x500()
Status: CLOSED CODE_FIX
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: All Linux
Importance: P1 normal
Assigned To: Ben Widawsky
Duplicates: 43107
Depends on:
Blocks: 42644
Reported: 2012-03-11 19:35 UTC by Maciej Rutecki
Modified: 2012-07-01 10:31 UTC (History)

See Also:
Kernel Version: 3.3.0-rc5+git
Tree: Mainline
Regression: Yes

Description Maciej Rutecki 2012-03-11 19:35:06 UTC
Subject    : 3.3.0-rc5+git: WARNING: at drivers/gpu/drm/i915/i915_irq.c:652 ironlake_irq_handler+0x4ea/0x500()
Submitter  : Soeren Sonnenburg <sonne@debian.org>
Date       : 2012-03-05 23:59
Message-ID : 1330991976.9223.16.camel@no
References : http://marc.info/?l=linux-kernel&m=133099242810332&w=2

This entry is being used for tracking a regression from 3.2. Please don't
close it until the problem is fixed in the mainline.
Comment 1 Daniel Vetter 2012-03-18 16:40:56 UTC
We've had sightings of this before 3.2 and tried to fix it for 3.2. Evidently there's still something not quite right in the logic, but afaics this does not smell like a regression.

While I have the attention of the regression tracking team, can someone please look at:

https://bugzilla.kernel.org/show_bug.cgi?id=42762
Comment 2 Ben Widawsky 2012-03-30 02:40:15 UTC
(In reply to comment #0)
> Subject    : 3.3.0-rc5+git: WARNING: at drivers/gpu/drm/i915/i915_irq.c:652
> ironlake_irq_handler+0x4ea/0x500()
> Submitter  : Soeren Sonnenburg <sonne@debian.org>
> Date       : 2012-03-05 23:59
> Message-ID : 1330991976.9223.16.camel@no
> References : http://marc.info/?l=linux-kernel&m=133099242810332&w=2
> 
> This entry is being used for tracking a regression from 3.2. Please don't
> close it until the problem is fixed in the mainline.

Would it be possible to bisect this? Daniel's fix first went into v3.2-rc1 and has been there ever since. This logic shouldn't have changed much since then.
Comment 3 Daniel Vetter 2012-04-15 10:07:23 UTC
*** Bug 43107 has been marked as a duplicate of this bug. ***
Comment 4 Jesse Barnes 2012-04-18 21:11:53 UTC
Maciej, any update?
Comment 5 Maciej Rutecki 2012-04-19 18:56:05 UTC
I have no new information.

Regards
Comment 6 Jesse Barnes 2012-04-19 19:05:12 UTC
Any chance you can bisect like Ben asked?
Comment 7 Patryk Rządziński 2012-04-30 17:34:03 UTC
Hello,

I started seeing strange behavior the moment I disabled PM Runtime in the kernel (vanilla-3.3.3). During boot, progress gets stuck for anywhere from 30 seconds to a few minutes, moments after init starts. Please note the timestamps in the log below.

[    2.957943] Freeing unused kernel memory: 424k freed
[    2.959940] Freeing unused kernel memory: 756k freed
[   26.700741] ------------[ cut here ]------------
[   26.700754] WARNING: at drivers/gpu/drm/i915/i915_irq.c:652 0xffffffff81308c22()
[   26.700761] Hardware name: Dell System Vostro 3750
[   26.700765] Missed a PM interrupt
[   26.700769] Modules linked in:
[   26.700778] Pid: 0, comm: swapper/0 Not tainted 3.3.3 #1
[   26.700783] Call Trace:
[   26.700787]  <IRQ>  [<ffffffff8107058b>] ? 0xffffffff8107058b
[   26.700800]  [<ffffffff81070685>] ? 0xffffffff81070685
[   26.700806]  [<ffffffff8108839e>] ? 0xffffffff8108839e
[   26.700812]  [<ffffffff81308c22>] ? 0xffffffff81308c22
[   26.700818]  [<ffffffff810cb79a>] ? 0xffffffff810cb79a
[   26.700833]  [<ffffffff810cb8e1>] ? 0xffffffff810cb8e1
[   26.700835]  [<ffffffff810ce7ff>] ? 0xffffffff810ce7ff
[   26.700837]  [<ffffffff81037625>] ? 0xffffffff81037625
[   26.700839]  [<ffffffff81037533>] ? 0xffffffff81037533
[   26.700841]  [<ffffffff81589dee>] ? 0xffffffff81589dee
[   26.700843]  [<ffffffff81096691>] ? 0xffffffff81096691
[   26.700845]  [<ffffffff81076260>] ? 0xffffffff81076260
[   26.700847]  [<ffffffff810aa0ef>] ? 0xffffffff810aa0ef
[   26.700849]  [<ffffffff8158b8dc>] ? 0xffffffff8158b8dc
[   26.700851]  [<ffffffff81037695>] ? 0xffffffff81037695
[   26.700853]  [<ffffffff8107663e>] ? 0xffffffff8107663e
[   26.700855]  [<ffffffff810501f8>] ? 0xffffffff810501f8
[   26.700857]  [<ffffffff8158b09e>] ? 0xffffffff8158b09e
[   26.700858]  <EOI>  [<ffffffff8126cd70>] ? 0xffffffff8126cd70
[   26.700862]  [<ffffffff8126cd4f>] ? 0xffffffff8126cd4f
[   26.700864]  [<ffffffff81437481>] ? 0xffffffff81437481
[   26.700866]  [<ffffffff81034125>] ? 0xffffffff81034125
[   26.700868]  [<ffffffff818748e0>] ? 0xffffffff818748e0
[   26.700870]  [<ffffffff81874000>] ? 0xffffffff81874000
[   26.700872]  [<ffffffff8187421a>] ? 0xffffffff8187421a
[   26.700875] ---[ end trace b7fe085284267851 ]---

I think this is the same issue. If you disagree, please feel free to delete this comment. I hope it points you in the right direction for resolving it.
Comment 8 Jesse Barnes 2012-06-20 20:08:58 UTC
I have a new theory that this message is bogus due to our two-level interrupt scheme. Our IIR can hold up to two events, so if we get two PM-related interrupts in rapid succession (before masking or acking them), we'll go through the mask/ack code and on the next interrupt will read out the queued value, which may be the same as the one we just received.

So unless there are bad effects from this warning, I'd say we should just remove it, or somehow handle the queued events better.
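
The theory above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the i915 driver code: the classes ToyIIR and ToyDriver, the one-deep backlog per bit, and the rule that the warning fires when a freshly read bit is already pending in software are all hypothetical stand-ins for the behavior described in this comment.

```python
class ToyIIR:
    """Hypothetical interrupt identity register that can hold two events per bit."""

    def __init__(self):
        self.iir = 0     # events currently visible to the handler
        self.queued = 0  # second-level events waiting behind an already-set bit
        self.imr = 0     # masked bits: new events on these bits are not latched

    def raise_event(self, bit):
        if self.imr & bit:
            return                 # masked, nothing is latched
        if self.iir & bit:
            self.queued |= bit     # first slot is busy, use the backlog slot
        else:
            self.iir |= bit

    def ack(self, bits):
        # Clearing a bit promotes any queued event for that bit into the IIR.
        self.iir &= ~bits
        promoted = self.queued & bits
        self.queued &= ~promoted
        self.iir |= promoted


class ToyDriver:
    """Hypothetical handler: warn if a newly read bit is still pending in software."""

    def __init__(self, hw):
        self.hw = hw
        self.pending = 0   # bits handed to deferred work but not yet processed
        self.warnings = []

    def irq_handler(self):
        iir = self.hw.iir
        if self.pending & iir:
            self.warnings.append("Missed a PM interrupt")
        self.pending |= iir
        self.hw.imr |= iir   # mask until the deferred work has run
        self.hw.ack(iir)

    def deferred_work(self):
        handled = self.pending
        self.pending = 0
        self.hw.imr &= ~handled  # unmask again
        return handled


def simulate(events, bit=0x1):
    """Raise `events` back-to-back events, then service the IIR until it is empty."""
    hw = ToyIIR()
    drv = ToyDriver(hw)
    for _ in range(events):
        hw.raise_event(bit)      # all arrive before the handler masks or acks
    while hw.iir:
        drv.irq_handler()
    drv.deferred_work()
    return drv.warnings
```

In this model, simulate(1) produces no warning, while simulate(2) produces one even though no event was lost: the second handler run only reads out the event that was queued before the mask took effect.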
Comment 9 Florian Mickler 2012-07-01 09:42:55 UTC
A patch referencing this bug report has been merged in Linux v3.5-rc5:

commit 58bf8062d0b293b8e1028e5b0342082002886bd4
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Jun 21 14:55:22 2012 +0200

    drm/i915: rip out the PM_IIR WARN
