Bug 12152

Summary: Huge wakeups number from i915
Product: Drivers
Reporter: Rafael J. Wysocki (rjw)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: CLOSED CODE_FIX
Severity: normal
CC: bgamari, corsac, luis6674, tom
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.28-rc6-00209-g6a12141
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 11808
Attachments: dmesg log from 2.6.27
dmesg log from 2.6.28
dmesg log from 2.6.28 after resume from hibernation

Description Rafael J. Wysocki 2008-12-03 14:10:15 UTC
Subject    : Huge wakeups number from i915
Submitter  : "Yves-Alexis Perez" <corsac@debian.org>
Date       : 2008-12-02 16:48
References : http://marc.info/?l=linux-acpi&m=122823656702994&w=4

This entry is being used for tracking a regression from 2.6.27.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2008-12-07 23:27:50 UTC
On Monday, 8 of December 2008, Yves-Alexis Perez wrote:
> On Mon, 2008-12-08 at 07:24 +0100, Yves-Alexis Perez wrote:
> > On Sun, 2008-12-07 at 15:12 -0800, Arjan van de Ven wrote:
> > > 
> > > > > > at least in some of the cases where this has been seen the cause is
> > > > > > the following:
> > > > > The i915 DRM driver used to do polling for completion, busy
> > > > > waiting. It moved to be interrupt driven, which is usually better
> > > > > for power, but it will show up as more wakeups in powertop....
> > > > 
> > > > IOW, this is not a regression?
> > > 
> > > I don't know about this specific case (not enough information) but for
> > > the case I described it's not a regression. Going to interrupt driven
> > > from busy waiting is an improvement not a regression :)
> > 
> > Well, several thousand or more interrupts really seems like a
> > regression :). But it seems that's the same thing as the “IRQ
> > spinning” (there was a thread on dri-devel about that).
> > 
> > It seems fixed with a patch from Matthew Garrett applied to
> > drm-intel/for-airlied but I don't think this has been applied to Linus
> > master.
> 
> And it seems the same thing as
> https://bugs.freedesktop.org/show_bug.cgi?id=18609
Comment 2 Eric Anholt 2008-12-09 19:02:24 UTC
Arjan's comment here is accurate, if the reporter isn't actually talking about the issue in 18609 (there was no indication in the original message that that was the case).
Comment 3 Yves-Alexis Perez 2008-12-09 22:28:13 UTC
I am the reporter, and yes, it is at least related to #18609. Under load, IRQ 16 is lost (“nobody cared”, etc.). Then huge wakeup numbers appear in powertop, along with huge power consumption (because of interrupt flooding or something like that?)
Comment 4 Eric Anholt 2008-12-10 09:55:11 UTC
The nobody cared message is what needs to be reported.  Reporting the pain that follows that just confuses things.

4/4 other reporters of this issue I've talked to have confirmed that the fix that just got pulled to linus master fixes it, so hopefully it does for you too.
Comment 5 Yves-Alexis Perez 2008-12-10 10:30:31 UTC
As I said (and this is in the initial message that opened this bug), yes, the patch reverting the MSI stuff fixes the problem.
Comment 6 Rafael J. Wysocki 2008-12-10 12:48:23 UTC
OK, which commit in Linus' tree is this?
Comment 7 Yves-Alexis Perez 2008-12-10 12:51:05 UTC
The “fixing” commit? Not sure, I don't run Linus's tree atm.
Comment 8 Tomas Carnecky 2008-12-10 13:19:40 UTC
In linux-2.6.git it's this commit:

commit b60678a75d44fa9d5969f79781bd856ad5858609
Author: Keith Packard <keithp@keithp.com>
Date:   Mon Dec 8 11:12:28 2008 -0800

    drm/i915: Disable the GM965 MSI errata workaround.
    
    Since applying the fix suggested by the errata (disabling MSI), we've had
    issues with interrupts being stuck on despite IIR being 0 on GM965 hardware.
    Most reporters of the issue have confirmed that turning MSI back on fixes
    things, and given the difficulties experienced in getting reliable MSI working
    on Linux, it's believable that the errata was about software issues and not
    actual hardware issues.
    
    Signed-off-by: Dave Airlie <airlied@redhat.com>
Comment 9 Alberto Gonzalez 2008-12-30 11:41:28 UTC
I'm not sure if what I'm seeing is the same, but it looks similar:

With 2.6.27.10 I get about 8-10 wakeups per second on an idle system. With 2.6.28 I get about 100 more due to:

<interrupt> : uhci_hcd:usb1, i915@pci:0000:00:02.0

The interesting thing is that if I hibernate the system and then resume it, those wakeups go away and I'm back to 8-10 per second.
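
For anyone who wants to cross-check a figure like this without powertop, here is a minimal sketch that estimates per-IRQ interrupt rates from two snapshots of /proc/interrupts. It is only an approximation: powertop's wakeup count is not the same thing as a raw interrupt count. The function names (parse_interrupts, interrupt_rates, sample) are made up for this illustration, and the parser assumes the usual layout of a CPU header row followed by "label: per-CPU counts, description" rows.

```python
import time


def parse_interrupts(text):
    """Return {irq_label: (total_count, description)} from /proc/interrupts text."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        # Split on the first colon only; descriptions may contain colons too.
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for tok in fields[:ncpu]:
            if not tok.isdigit():
                break  # some rows (e.g. ERR) carry fewer counters than CPUs
            counts.append(int(tok))
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), desc)
    return table


def interrupt_rates(before, after, seconds):
    """Return {irq_label: (interrupts_per_second, description)} between two snapshots."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    rates = {}
    for label, (count, desc) in new.items():
        if label in old:
            rates[label] = ((count - old[label][0]) / seconds, desc)
    return rates


def sample(seconds=5.0, path="/proc/interrupts"):
    """Take two snapshots `seconds` apart on a live Linux system and return the rates."""
    with open(path) as f:
        before = f.read()
    time.sleep(seconds)
    with open(path) as f:
        after = f.read()
    return interrupt_rates(before, after, seconds)
```

Calling sample() on an idle system before and after a hibernate/resume cycle, and looking at the row whose description mentions i915, should show whether the extra interrupts really come from that shared line.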

This is an old Pentium 4 desktop with i845GL chipset. The kernels used are the stock ones from Arch Linux (pretty vanilla kernels, and I don't even know how to compile my own anyway). I guess I could attach the dmesg from .27, .28 and .28 after hibernation in case someone can see anything interesting there.

Thanks.
Comment 10 Alberto Gonzalez 2008-12-30 11:44:50 UTC
Created attachment 19547
dmesg log from 2.6.27
Comment 11 Alberto Gonzalez 2008-12-30 11:45:19 UTC
Created attachment 19548
dmesg log from 2.6.28
Comment 12 Alberto Gonzalez 2008-12-30 11:45:50 UTC
Created attachment 19549
dmesg log from 2.6.28 after resume from hibernation
Comment 13 Eric Anholt 2008-12-30 15:27:28 UTC
Please open a new bug for your new issue.
Comment 14 Alberto Gonzalez 2008-12-31 12:27:59 UTC
I opened a new one here:

http://bugzilla.kernel.org/show_bug.cgi?id=12337

Thanks.