Bug 6217 - ACPI event that doesn't have a handler is not disabled correctly
Summary: ACPI event that doesn't have a handler is not disabled correctly
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Interrupts
Hardware: i386 Linux
Importance: P2 normal
Assignee: Lin Ming
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-03-12 09:44 UTC by Thomas W. Larsen
Modified: 2008-07-16 15:39 UTC

See Also:
Kernel Version: 2.6.15
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Original DSDT (206.81 KB, text/plain)
2006-03-15 10:58 UTC, Thomas W. Larsen
ACPIdump (121.90 KB, text/plain)
2006-03-15 11:24 UTC, Thomas W. Larsen
Dmesg output (14.14 KB, text/plain)
2006-03-15 12:16 UTC, Thomas W. Larsen
lspci -vv output (18.07 KB, text/plain)
2006-03-15 12:17 UTC, Thomas W. Larsen
cat /proc/acpi/debug_level (1.28 KB, text/plain)
2007-03-07 07:26 UTC, Manuel Mayer
dmesg | grep ACPI (10.82 KB, text/plain)
2007-03-07 07:29 UTC, Manuel Mayer
/var/log/messages (69.29 KB, application/x-tar)
2007-03-07 07:40 UTC, Manuel Mayer
patch: disable gpe if no handler exists (686 bytes, patch)
2007-10-31 00:24 UTC, Zhang Rui
patch vs 2.6.24-rc2 (1.78 KB, patch)
2007-11-20 10:46 UTC, Len Brown
acpidump output on this machine(katie) (122.49 KB, text/plain)
2008-04-03 16:14 UTC, Damián Viano
full dmesg with acpi debug features on (71.82 KB, text/plain)
2008-04-03 16:16 UTC, Damián Viano
proposed patch (461 bytes, patch)
2008-04-03 18:25 UTC, Damián Viano
patch mentioned at comment #27 (4.16 KB, patch)
2008-06-24 02:19 UTC, Lin Ming

Description Thomas W. Larsen 2006-03-12 09:44:09 UTC
Most recent kernel where this bug did not occur: NA
Distribution: Gentoo
Hardware Environment: Quanta KN1 based notebook (915GM/PM, nvidia gf 6600)
Software Environment: Gentoo / 2.6.15
Problem Description:
Most ACPI functionality seems to run fine except when an external monitor is
connected. All processes freeze, and syslog is spammed with:
"ACPI Error (evgpe-0688): No handler or method for GPE[17], disabling event [20060127]"
Whenever the monitor is disconnected, the kernel resumes normal operation.
I tried rewriting the DSDT to the best of my (very limited) knowledge. Got rid
of the 4 errors, but this did not affect the issue.

DSDT original (in hex format) can be found at this url:
http://www.stoneinnovation.no/zicada/dsdt.hex
Comment 1 Thomas W. Larsen 2006-03-13 06:55:49 UTC
I tried commenting out any gfx handling code I could find in the DSDT; this did
not affect the issue. I also tried commenting out the ACPI error handler in
evgpe.c. This stopped syslog from getting spammed, but as soon as the monitor
was connected, the system slowed down and the temperature rose (temperature-up
debug message in dmesg). Linux reported 0 load while this was occurring.
It seems to me that the evgpe.c code is unsuccessful in disabling the GPE[17]
event, yet there is no output in dmesg about this being attempted. Is it a race
condition? I can supply log files and output from other tools when
I get home later today if needed.
Comment 2 Thomas W. Larsen 2006-03-15 10:58:12 UTC
Created attachment 7580 [details]
Original DSDT

Added the original DSDT.
Comment 3 Robert Moore 2006-03-15 11:06:37 UTC
There is no _L17 or _E17 method in the DSDT, so this GPE will probably not be 
handled. It should be disabled after this, though.

Please post the acpidump for the machine, the FADT contains the GPE info.
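
For readers unfamiliar with the naming: GPE control methods live under the \_GPE scope and are named _Lxx (level-triggered) or _Exx (edge-triggered), where xx is the GPE number in hexadecimal. A minimal C sketch of that mapping follows; gpe_method_name is a hypothetical helper for illustration only, not an ACPICA function.

```c
#include <assert.h>

/*
 * Build the 4-character method name for a GPE number (0x00..0xFF).
 * Hypothetical helper; "out" must have room for 5 bytes including the NUL.
 */
static void gpe_method_name(unsigned gpe_number, int edge_triggered, char out[5])
{
    static const char hex[] = "0123456789ABCDEF";

    assert(gpe_number <= 0xFF);
    out[0] = '_';
    out[1] = edge_triggered ? 'E' : 'L';    /* trigger type */
    out[2] = hex[(gpe_number >> 4) & 0xF];  /* high hex digit */
    out[3] = hex[gpe_number & 0xF];         /* low hex digit */
    out[4] = '\0';
}
```

For GPE 0x17 this yields "_L17" or "_E17", the two names searched for in the DSDT above.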
Comment 4 Thomas W. Larsen 2006-03-15 11:24:09 UTC
Created attachment 7581 [details]
ACPIdump

Added acpidump output
Comment 5 Robert Moore 2006-03-15 12:11:13 UTC
entire dmesg would help also
Comment 6 Thomas W. Larsen 2006-03-15 12:16:35 UTC
Created attachment 7582 [details]
Dmesg output

Added complete dmesg output.
Comment 7 Thomas W. Larsen 2006-03-15 12:17:47 UTC
Created attachment 7583 [details]
lspci -vv output

Added lspci -vv output.
Comment 8 Thomas W. Larsen 2006-03-16 10:46:19 UTC
Tried adding a dummy _L17 GPE method that does nothing. This gets rid of the
errors, but loads the CPU quite heavily if the external monitor is connected.
Comment 9 Robert Moore 2006-03-16 12:55:16 UTC
I can't discover how the GPE 0x17 is being enabled. You may have to enable 
debug tracing during ACPI initialization so we can see what GPEs are being 
enabled, and why.
Comment 10 Robert Moore 2006-05-12 14:42:35 UTC
Need trace or we will close this.
Comment 11 Adrian Bunk 2006-07-10 13:16:11 UTC
Please reopen this bug if:
- it is still present in kernel 2.6.17 and
- you can provide the requested information.
Comment 12 Manuel Mayer 2007-03-07 05:40:05 UTC
Distribution: Kubuntu
Hardware Environment: same: Quanta KN1 based notebook, Geforce Go 6600
Software Environment: Kubuntu / 2.6.17-11, also tested with newest stable
version (2.6.20.1)

The error is still present when connecting an external monitor:
[  239.460000] ACPI Error (evgpe-0711): No handler or method for GPE[17], disabling event [20060707]
It loops until the monitor is disconnected.

Since I'm pretty new to Linux (but learning fast and willing to help ;-)), if you
tell me how to enable debug tracing during ACPI initialization, I'll deliver the
trace.

thanks for your help.
Comment 13 Manuel Mayer 2007-03-07 07:26:48 UTC
Created attachment 10639 [details]
cat /proc/acpi/debug_level

Okay, hope I did it right: I compiled kernel 2.6.20.1 with CONFIG_ACPI_DEBUG=y.

This attachment shows the activated debug_level. Please tell me if I have to
enable other debug options as well.
Comment 14 Manuel Mayer 2007-03-07 07:29:17 UTC
Created attachment 10640 [details]
dmesg | grep ACPI

(after restarting computer without external monitor attached). ACPI_DEBUG
information should be in there. Please remember I'm new to this and just trying
to provide the right information.
Comment 15 Manuel Mayer 2007-03-07 07:40:00 UTC
Created attachment 10641 [details]
/var/log/messages

/var/log/messages

After rebooting the machine (without the external monitor plugged in), I plugged
in the external monitor and the looping error message popped up (no event handler).
Comment 16 Zhang Rui 2007-10-23 23:42:40 UTC
Hi, Manuel,
Do you still have this problem in the latest kernel release, say 2.6.23.1?
Can you please do the following test in the latest kernel:
echo 0x04 >/sys/module/acpi/parameters/debug_layer
echo 0x8800001f > /sys/module/acpi/parameters/debug_level
connect the external monitor and attach the dmesg output.
Comment 17 Zhang Rui 2007-10-31 00:24:12 UTC
Created attachment 13358 [details]
patch: disable gpe if no handler exists

From the acpidump you attached,
CRTC and CRTS in the PMIO OpRegion stand for the status and enable bits of GPE 0x17. GPE 0x17 is enabled in the PCI0._INI method,
and it is never disabled again even if no GPE handler exists for this GPE.
This patch disables the GPE if no handler for it is detected.
Comment 18 Len Brown 2007-11-20 10:46:37 UTC
Created attachment 13654 [details]
patch vs 2.6.24-rc2

This version of Rui's patch is applied to the ACPI tree.
Comment 19 Len Brown 2007-11-20 10:51:03 UTC
Thomas, Manuel,
It would be great if one of you could confirm
that this patch fixes the problem.

thanks,
-Len
Comment 20 Len Brown 2008-02-10 14:19:38 UTC
shipped in Linux-2.6.24-git22
closed
Comment 21 Damián Viano 2008-04-03 12:46:50 UTC
I'm seeing this issue on a CTL KN1 computer. I've reproduced it with the latest linux-acpi git tree (1192aeb) and also tried 2.6.24-git22, so I guess the problem never got fixed and the bug should be reopened.

I'm willing to help with anything to debug this.
Comment 22 Damián Viano 2008-04-03 16:14:27 UTC
Created attachment 15599 [details]
acpidump output on this machine(katie)

made with acpidump 20071116
Comment 23 Damián Viano 2008-04-03 16:16:49 UTC
Created attachment 15600 [details]
full dmesg with acpi debug features on

This is the full dmesg output from the mentioned kernel (built from the acpi git tree), configured with oldconfig from a Debian sid 2.6.24 .config and with ACPI debug enabled. After boot I plugged in the monitor for a second.
Comment 24 Damián Viano 2008-04-03 18:25:26 UTC
Created attachment 15601 [details]
proposed patch

I think I found it. The problem seems to be that acpi_ev_disable_gpe() checks whether the GPE that fired is enabled before disabling it, and even returns AE_OK when it skips the disable. So I removed the check, making it always disable the requested GPE, which should be OK IIUC.

This indeed fixes the problem, leaving only one "ACPI Error (evgpe-0710): No handler or method for GPE[17], disabling event [20070126]" message in the logs and preventing the IRQ storm.

I've also tried moving the acpi_ev_update_gpe_enable_masks call before the check, but it didn't make any difference, so I think this is the way to go, unless there is a good reason to leave the check there.
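
To illustrate the failure mode described above, here is a simplified C model, not the actual ACPICA code. The struct, the 32-bit register width, and all function names are assumptions made up for this sketch. It shows how a disable routine that trusts the OS's cached enable state becomes a no-op once AML has set the hardware bit on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one GPE enable register plus the OS's cached view. */
struct gpe_state {
    uint32_t hw_enable;     /* bits actually set in the hardware register */
    uint32_t cached_enable; /* bits the OS believes it has enabled */
};

static uint32_t gpe_bit(unsigned gpe)
{
    assert(gpe < 32);
    return UINT32_C(1) << gpe;
}

/* Firmware (AML) sets the enable bit directly; the OS cache is not updated. */
static void aml_enable(struct gpe_state *s, unsigned gpe)
{
    s->hw_enable |= gpe_bit(gpe);
}

/* Disable with the shortcut: trust the cache and skip the register write. */
static void disable_with_check(struct gpe_state *s, unsigned gpe)
{
    if (!(s->cached_enable & gpe_bit(gpe)))
        return; /* "already disabled" according to the cache */
    s->cached_enable &= ~gpe_bit(gpe);
    s->hw_enable &= ~gpe_bit(gpe);
}

/* Disable unconditionally: possibly a redundant write, but always effective. */
static void disable_always(struct gpe_state *s, unsigned gpe)
{
    s->cached_enable &= ~gpe_bit(gpe);
    s->hw_enable &= ~gpe_bit(gpe);
}
```

In this model, calling aml_enable(&s, 0x17) followed by disable_with_check(&s, 0x17) leaves the hardware bit set, so the event keeps firing. disable_always(&s, 0x17) clears it.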
Comment 25 Zhang Rui 2008-04-04 19:06:46 UTC
hah, patch is already available for upstream kernel, please refer to
http://marc.info/?l=linux-acpi&m=120511546326164&w=2
and 
https://bugzilla.redhat.com/show_bug.cgi?id=251744

thanks for your effort, Damian, :)
Comment 26 Len Brown 2008-04-30 20:20:07 UTC
patch in comment #24 shipped in Linux-2.6.25-git16

commit 51ae796f7fa1d8034252628572053f477bc29913
Author: Damián Viano <des@debian.org>
Date:   Tue Apr 29 03:32:25 2008 -0400

    ACPICA: always disable GPE when requested
    
    acpi_ev_disable_gpe() has an optimization where it doesn't disable
    a GPE that it "doesn't have to".  Unfortunately, it can get tricked
    by AML that scribbles on register state behind its back.  So when asked
    to disable a GPE, simply do it -- a redundant register write
    in the common case is a fair price to pay to be bomb-proof
    for the rare cases.
    
    http://bugzilla.kernel.org/show_bug.cgi?id=6217
    
    Signed-off-by: Damián Viano <des@debian.org>
    Acked-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>
Comment 27 Robert Moore 2008-06-17 15:21:12 UTC
17 June 2008. Implemented another change for the GPE disable. We now perform a read-change-write of the enable register instead of simply writing out the cached enable mask. This will prevent inadvertent enabling of GPEs if a rogue GPE is received during initialization (before GPE handlers are installed.)
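
The difference between the two approaches can be sketched in C as follows. This is a simplified model under stated assumptions, not the patch itself: the struct and function names are hypothetical, and a single 8-bit enable register stands in for the hardware (GPE 0x17 would be bit 7 of the third such register).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one 8-bit GPE enable register. */
struct gpe_reg {
    uint8_t hw_enable;     /* current register contents */
    uint8_t cached_enable; /* mask the OS intends to enable later */
};

/*
 * Write out the cached mask: clear the bit in the cache, then write the
 * whole mask. Side effect: every other bit in the cache becomes enabled
 * in hardware, even if it is too early for that.
 */
static void disable_by_writing_cached_mask(struct gpe_reg *r, unsigned bit_index)
{
    assert(bit_index < 8);
    r->cached_enable &= (uint8_t)~(1u << bit_index);
    r->hw_enable = r->cached_enable;
}

/* Read-change-write: touch only the one bit, leave everything else alone. */
static void disable_by_read_change_write(struct gpe_reg *r, unsigned bit_index)
{
    uint8_t value;

    assert(bit_index < 8);
    value = r->hw_enable;                  /* read */
    value &= (uint8_t)~(1u << bit_index);  /* change */
    r->hw_enable = value;                  /* write */
}
```

Starting from hw_enable = 0x80 (a rogue bit 7) and cached_enable = 0x03, the first function leaves hw_enable at 0x03, enabling bits 0 and 1 early. The second leaves it at 0x00.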
Comment 28 Lin Ming 2008-06-24 02:19:46 UTC
Created attachment 16595 [details]
patch mentioned at comment #27
Comment 29 Len Brown 2008-06-24 19:01:25 UTC
re: patch in comment #28

scripts/checkpatch.pl
ERROR: trailing whitespace
#41: FILE: drivers/acpi/events/evgpe.c:281:
+  $

ERROR: code indent should use tabs where possible
#42: FILE: drivers/acpi/events/evgpe.c:282:
+ ^I/*$

ERROR: code indent should use tabs where possible
#43: FILE: drivers/acpi/events/evgpe.c:283:
+ ^I * Even if we don't know the GPE type, make sure that we always$

ERROR: code indent should use tabs where possible
#44: FILE: drivers/acpi/events/evgpe.c:284:
+ ^I * disable it. low_disable_gpe will just clear the enable bit for this$

ERROR: code indent should use tabs where possible
#45: FILE: drivers/acpi/events/evgpe.c:285:
+ ^I * GPE and write it. It will not write out the current GPE enable mask,$

WARNING: line over 80 characters
#46: FILE: drivers/acpi/events/evgpe.c:286:
+ 	 * since this may inadvertently enable GPEs too early, if a rogue GPE has

ERROR: code indent should use tabs where possible
#46: FILE: drivers/acpi/events/evgpe.c:286:
+ ^I * since this may inadvertently enable GPEs too early, if a rogue GPE has$

ERROR: code indent should use tabs where possible
#47: FILE: drivers/acpi/events/evgpe.c:287:
+ ^I * come in during ACPICA initialization - possibly as a result of AML or$

ERROR: code indent should use tabs where possible
#48: FILE: drivers/acpi/events/evgpe.c:288:
+ ^I * other code that has enabled the GPE.$

ERROR: code indent should use tabs where possible
#49: FILE: drivers/acpi/events/evgpe.c:289:
+ ^I */$

ERROR: code indent should use tabs where possible
#50: FILE: drivers/acpi/events/evgpe.c:290:
+ ^Istatus = acpi_hw_low_disable_gpe(gpe_event_info);$

ERROR: code indent should use tabs where possible
#51: FILE: drivers/acpi/events/evgpe.c:291:
+ ^Ireturn_ACPI_STATUS(status);$

WARNING: braces {} are not necessary for single statement blocks
#82: FILE: drivers/acpi/hardware/hwgpe.c:77:
+	if (!gpe_register_info) {
+		return (AE_NOT_EXIST);
+	}

ERROR: return is not a function, parentheses are not required
#83: FILE: drivers/acpi/hardware/hwgpe.c:78:
+		return (AE_NOT_EXIST);

WARNING: braces {} are not necessary for single statement blocks
#90: FILE: drivers/acpi/hardware/hwgpe.c:85:
+	if (ACPI_FAILURE(status)) {
+		return (status);
+	}

ERROR: return is not a function, parentheses are not required
#91: FILE: drivers/acpi/hardware/hwgpe.c:86:
+		return (status);

ERROR: return is not a function, parentheses are not required
#106: FILE: drivers/acpi/hardware/hwgpe.c:101:
+	return (status);

ERROR: "foo * bar" should be "foo *bar"
#119: FILE: drivers/acpi/hardware/hwgpe.c:119:
+acpi_hw_write_gpe_enable_reg(struct acpi_gpe_event_info * gpe_event_info)

total: 15 errors, 3 warnings, 103 lines checked
Comment 30 Len Brown 2008-06-24 19:09:17 UTC
whelp, git am --whitespace=strip
fixed 11 of the whitespace errors, we'll get the rest
when we lindent ACPICA.

applied to acpi-test
Comment 31 Adrian Bunk 2008-07-16 15:39:58 UTC
in Linus' tree as commit e38e8a0743b0e996a8a3fbea8908fe75a84f02c7
