Bug 25412

Summary: bisected: ACPI SCI interrupt storm - Dell Studio XPS 435MT, i7-940
Product: ACPI
Reporter: Len Brown (lenb)
Component: Config-Interrupts
Assignee: Rafael J. Wysocki (rjw)
Status: CLOSED CODE_FIX
Severity: normal
CC: acpi-bugzilla, florian, maciej.rutecki
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.36-rc1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 16444
Attachments: /proc/interrupts
grep . /sys/firmware/acpi/interrupts/*
acpidump.out
patch vs. 2.6.36
ACPICA: Disable GPEs during initialization
dmesg
debug patch used in comment #14, for reference
lin-ming + rafael debug patch
dmesg resulting from above

Description Len Brown 2010-12-21 22:45:51 UTC

    
Comment 1 Len Brown 2010-12-21 22:47:12 UTC
Created attachment 41162 [details]
/proc/interrupts
Comment 2 Len Brown 2010-12-21 22:47:58 UTC
Created attachment 41172 [details]
grep . /sys/firmware/acpi/interrupts/*
Comment 3 Len Brown 2010-12-21 22:48:33 UTC
Created attachment 41182 [details]
acpidump.out
Comment 4 Len Brown 2010-12-21 22:51:11 UTC
 lspci
00:00.0 Host bridge: Intel Corporation 5520/5500/X58 I/O Hub to ESI Port (rev 12)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 12)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 12)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 12)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 12)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 12)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 12)
00:14.3 PIC: Intel Corporation 5520/5500/X58 I/O Hub Throttle Registers (rev 12)
00:19.0 Ethernet controller: Intel Corporation 82567LF-2 Gigabit Network Connection
00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
00:1c.1 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 2
00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 IDE interface: Intel Corporation 82801JI (ICH10 Family) 4 port SATA IDE Controller #1
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
00:1f.5 IDE interface: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller #2
02:00.0 FireWire (IEEE 1394): VIA Technologies, Inc. VT6315 Series Firewire Controller
04:00.0 VGA compatible controller: nVidia Corporation G72 [GeForce 7300 LE] (rev a1)
Comment 5 Len Brown 2010-12-25 01:10:44 UTC
caused by:

commit 9874647ba1bdf3e1af25e079070a00676f60f2f0
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Thu Jul 8 00:43:36 2010 +0200

    ACPI / ACPICA: Do not execute _PRW methods during initialization

which shipped in 2.6.36-rc1 (marking this as a regression).
failure still present in 2.6.37-rc6

in 2.6.35 and before the regression:

/sys/firmware/acpi/interrupts/gpe1B:       0    disabled
/sys/firmware/acpi/interrupts/gpe1C:       0    invalid
/sys/firmware/acpi/interrupts/gpe1D:       0    disabled

upon the regression:

/sys/firmware/acpi/interrupts/gpe1B: 1085701   enabled
/sys/firmware/acpi/interrupts/gpe1C:       0   invalid
/sys/firmware/acpi/interrupts/gpe1D: 1085702   enabled
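
For reference, the before/after comparison above can be automated. The following is a minimal sketch, not part of the original report: it assumes the `name: count status` line format shown in this comment, and the helper names (`parse_counters`, `newly_firing`) are made up for illustration.

```python
import re

# Matches one line of `grep . /sys/firmware/acpi/interrupts/*` output in the
# format shown in this report, e.g.
#   /sys/firmware/acpi/interrupts/gpe1B: 1085701   enabled
# The status word is optional (the "error" and "gpe_all" lines have none).
LINE_RE = re.compile(
    r"^(?:.*/)?(?P<name>\w+):\s*(?P<count>\d+)(?:\s+(?P<status>\w+))?\s*$"
)

# Sample data copied from this comment.
BEFORE = """\
/sys/firmware/acpi/interrupts/gpe1B:       0    disabled
/sys/firmware/acpi/interrupts/gpe1C:       0    invalid
/sys/firmware/acpi/interrupts/gpe1D:       0    disabled
"""

AFTER = """\
/sys/firmware/acpi/interrupts/gpe1B: 1085701   enabled
/sys/firmware/acpi/interrupts/gpe1C:       0   invalid
/sys/firmware/acpi/interrupts/gpe1D: 1085702   enabled
"""


def parse_counters(text):
    """Return {name: (count, status)} for every line that matches LINE_RE."""
    counters = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            counters[m.group("name")] = (int(m.group("count")), m.group("status"))
    return counters


def newly_firing(before, after):
    """Names that are enabled in `after` and whose count grew since `before`."""
    return sorted(
        name
        for name, (count, status) in after.items()
        if status == "enabled" and count > before.get(name, (0, None))[0]
    )


if __name__ == "__main__":
    print(newly_firing(parse_counters(BEFORE), parse_counters(AFTER)))
```

On a live system the two snapshots would come from reading the sysfs files under a good and a bad kernel; the regular expression may need adjusting if the running kernel prints extra fields.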
Comment 6 Len Brown 2010-12-25 03:19:10 UTC
Created attachment 41592 [details]
patch vs. 2.6.36

Rafael's patch,
ACPI: Execute _PRW for devices reported as inactive or not present

makes this problem stop, but not before over 600 interrupts are serviced.
Comment 7 Len Brown 2010-12-25 03:20:38 UTC
interrupts with patch from comment #6 applied.
Note gpe1B and gpe1D:

grep . /sys/firmware/acpi/interrupts/*
/sys/firmware/acpi/interrupts/error:       0
/sys/firmware/acpi/interrupts/ff_gbl_lock:       0   enabled
/sys/firmware/acpi/interrupts/ff_pmtimer:       0   invalid
/sys/firmware/acpi/interrupts/ff_pwr_btn:       0   enabled
/sys/firmware/acpi/interrupts/ff_rt_clk:       0   disabled
/sys/firmware/acpi/interrupts/ff_slp_btn:       0   invalid
/sys/firmware/acpi/interrupts/gpe00:       0   invalid
/sys/firmware/acpi/interrupts/gpe01:       0   invalid
/sys/firmware/acpi/interrupts/gpe02:       0   invalid
/sys/firmware/acpi/interrupts/gpe03:       0   disabled
/sys/firmware/acpi/interrupts/gpe04:       0   disabled
/sys/firmware/acpi/interrupts/gpe05:       0   disabled
/sys/firmware/acpi/interrupts/gpe06:       0   invalid
/sys/firmware/acpi/interrupts/gpe07:       0   invalid
/sys/firmware/acpi/interrupts/gpe08:       0   invalid
/sys/firmware/acpi/interrupts/gpe09:       0   disabled
/sys/firmware/acpi/interrupts/gpe0A:       0   enabled
/sys/firmware/acpi/interrupts/gpe0B:       0   disabled
/sys/firmware/acpi/interrupts/gpe0C:       0   disabled
/sys/firmware/acpi/interrupts/gpe0D:       0   disabled
/sys/firmware/acpi/interrupts/gpe0E:       0   disabled
/sys/firmware/acpi/interrupts/gpe0F:       0   invalid
/sys/firmware/acpi/interrupts/gpe10:       0   invalid
/sys/firmware/acpi/interrupts/gpe11:       0   invalid
/sys/firmware/acpi/interrupts/gpe12:       0   invalid
/sys/firmware/acpi/interrupts/gpe13:       0   invalid
/sys/firmware/acpi/interrupts/gpe14:       0   invalid
/sys/firmware/acpi/interrupts/gpe15:       0   invalid
/sys/firmware/acpi/interrupts/gpe16:       0   invalid
/sys/firmware/acpi/interrupts/gpe17:       0   invalid
/sys/firmware/acpi/interrupts/gpe18:       0   invalid
/sys/firmware/acpi/interrupts/gpe19:       0   invalid
/sys/firmware/acpi/interrupts/gpe1A:       0   invalid
/sys/firmware/acpi/interrupts/gpe1B:     344   disabled
/sys/firmware/acpi/interrupts/gpe1C:       0   invalid
/sys/firmware/acpi/interrupts/gpe1D:     344   disabled
/sys/firmware/acpi/interrupts/gpe1E:       0   invalid
/sys/firmware/acpi/interrupts/gpe1F:       0   invalid
/sys/firmware/acpi/interrupts/gpe20:       0   disabled
/sys/firmware/acpi/interrupts/gpe21:       0   invalid
/sys/firmware/acpi/interrupts/gpe22:       0   invalid
/sys/firmware/acpi/interrupts/gpe23:       0   invalid
/sys/firmware/acpi/interrupts/gpe24:       0   invalid
/sys/firmware/acpi/interrupts/gpe25:       0   invalid
/sys/firmware/acpi/interrupts/gpe26:       0   invalid
/sys/firmware/acpi/interrupts/gpe27:       0   invalid
/sys/firmware/acpi/interrupts/gpe28:       0   invalid
/sys/firmware/acpi/interrupts/gpe29:       0   invalid
/sys/firmware/acpi/interrupts/gpe2A:       0   invalid
/sys/firmware/acpi/interrupts/gpe2B:       0   invalid
/sys/firmware/acpi/interrupts/gpe2C:       0   invalid
/sys/firmware/acpi/interrupts/gpe2D:       0   invalid
/sys/firmware/acpi/interrupts/gpe2E:       0   invalid
/sys/firmware/acpi/interrupts/gpe2F:       0   invalid
/sys/firmware/acpi/interrupts/gpe30:       0   invalid
/sys/firmware/acpi/interrupts/gpe31:       0   invalid
/sys/firmware/acpi/interrupts/gpe32:       0   invalid
/sys/firmware/acpi/interrupts/gpe33:       0   invalid
/sys/firmware/acpi/interrupts/gpe34:       0   invalid
/sys/firmware/acpi/interrupts/gpe35:       0   invalid
/sys/firmware/acpi/interrupts/gpe36:       0   invalid
/sys/firmware/acpi/interrupts/gpe37:       0   invalid
/sys/firmware/acpi/interrupts/gpe38:       0   invalid
/sys/firmware/acpi/interrupts/gpe39:       0   invalid
/sys/firmware/acpi/interrupts/gpe3A:       0   invalid
/sys/firmware/acpi/interrupts/gpe3B:       0   invalid
/sys/firmware/acpi/interrupts/gpe3C:       0   invalid
/sys/firmware/acpi/interrupts/gpe3D:       0   invalid
/sys/firmware/acpi/interrupts/gpe3E:       0   invalid
/sys/firmware/acpi/interrupts/gpe3F:       0   invalid
/sys/firmware/acpi/interrupts/gpe_all:     688
/sys/firmware/acpi/interrupts/sci:     344
/sys/firmware/acpi/interrupts/sci_not:       0
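
A long listing like the one above is easier to read when reduced to its non-zero counters. The sketch below is illustrative only and assumes the same line format as the listing; `parse_line` and `summarize` are hypothetical helper names. It also cross-checks the per-GPE counts against `gpe_all`.

```python
import re

# A few lines copied from the listing above; the all-zero entries are omitted.
SAMPLE = """\
/sys/firmware/acpi/interrupts/gpe1B:     344   disabled
/sys/firmware/acpi/interrupts/gpe1C:       0   invalid
/sys/firmware/acpi/interrupts/gpe1D:     344   disabled
/sys/firmware/acpi/interrupts/gpe_all:     688
/sys/firmware/acpi/interrupts/sci:     344
/sys/firmware/acpi/interrupts/sci_not:       0
"""


def parse_line(line):
    """Split 'path/name: count [status]' into (name, count, status)."""
    path, _, rest = line.partition(":")
    fields = rest.split()
    name = path.rsplit("/", 1)[-1]
    status = fields[1] if len(fields) > 1 else None
    return name, int(fields[0]), status


def summarize(lines):
    """Collect non-zero GPE counters and the totals reported by the kernel."""
    # Skip blank lines and anything without a colon (e.g. the grep command).
    rows = [parse_line(l) for l in lines if l.strip() and ":" in l]
    counts = {name: count for name, count, _ in rows}
    # Only the individual gpeXX entries, not the gpe_all aggregate.
    per_gpe = {
        n: c for n, c in counts.items() if re.fullmatch(r"gpe[0-9A-F]{2}", n)
    }
    return {
        "nonzero": {n: c for n, c in per_gpe.items() if c},
        "gpe_sum": sum(per_gpe.values()),
        "gpe_all": counts.get("gpe_all"),
        "sci": counts.get("sci"),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE.splitlines()))
```

For the listing above, the per-GPE counts add up to `gpe_all` (344 + 344 = 688), which is twice the `sci` count of 344. That is consistent with both GPEs being handled on each SCI, although the counters alone do not prove it.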
Comment 8 Rafael J. Wysocki 2010-12-25 13:04:52 UTC
So, it looks like they are enabled by the BIOS?

ACPICA shouldn't enable them before we call acpi_update_gpes(), but the
GPEs are already found as "wakeup" at that point.

Hmm.
Comment 9 Rafael J. Wysocki 2010-12-25 13:42:16 UTC
Created attachment 41602 [details]
ACPICA: Disable GPEs during initialization

Does the attached patch, in addition to the patch from comment #6, help?
Comment 10 Len Brown 2010-12-25 23:22:27 UTC
Yes, when the patch in comment #9 is added on top of the patch
in comment #6, the interrupt storm goes away totally.

/sys/firmware/acpi/interrupts/gpe1B:       0   disabled
/sys/firmware/acpi/interrupts/gpe1C:       0   invalid
/sys/firmware/acpi/interrupts/gpe1D:       0   disabled

marking bug as RESOLVED, as there is a working patch available.
Comment 11 Rafael J. Wysocki 2010-12-26 11:51:32 UTC
The patch from comment #9 has been posted with a changelog:
https://lkml.org/lkml/2010/12/26/15
Comment 12 Rafael J. Wysocki 2010-12-26 11:52:50 UTC
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
Patch : https://lkml.org/lkml/2010/12/26/15
Patch : https://bugzilla.kernel.org/attachment.cgi?id=41592
Comment 14 Len Brown 2011-02-11 20:38:00 UTC
Created attachment 47382 [details]
dmesg

dmesg from 2.6.37-rc1 + Lin Ming's SCI/GPE debug patch
Comment 15 Len Brown 2011-02-11 20:43:36 UTC
Created attachment 47392 [details]
debug patch used in comment #14, for reference
Comment 16 Len Brown 2011-02-11 22:47:53 UTC
Created attachment 47422 [details]
lin-ming + rafael debug patch
Comment 17 Len Brown 2011-02-11 22:51:06 UTC
Created attachment 47432 [details]
dmesg resulting from above
Comment 18 Rafael J. Wysocki 2011-02-12 08:36:32 UTC
Hmm, both dmesg outputs from comment #17 and from comment #14 look basically
the same and they seem to mean that the relevant GPEs aren't enabled during
acpi_init() or earlier.

Are you still seeing the issue from comment #5 with the kernel from comment
#14 (ie. with the debug patch applied)?
Comment 19 Rafael J. Wysocki 2011-02-12 08:43:06 UTC
Ah, and with the patch from comment #6 applied?
Comment 20 Florian Mickler 2011-03-30 21:58:10 UTC
Should this be reopened?
Comment 21 Rafael J. Wysocki 2011-03-30 22:01:50 UTC
No.
Comment 22 Florian Mickler 2011-03-30 22:51:38 UTC
Good. :)

I was going through my bugzilla mail to tie up loose ends.