Bug 16440 - HP CQ60:EC GPE storm detected, transactions will use polling mode
Summary: HP CQ60:EC GPE storm detected, transactions will use polling mode
Status: CLOSED UNREPRODUCIBLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: EC
Hardware: All Linux
Importance: P1 normal
Assignee: Lan Tianyu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-07-23 02:05 UTC by bugzillakernelorg
Modified: 2013-05-05 12:52 UTC
CC List: 4 users

See Also:
Kernel Version: 2.6.38
Subsystem:
Regression: No
Bisected commit-id:


Attachments
acpidump from same machine (205.25 KB, text/plain)
2010-07-23 02:05 UTC, bugzillakernelorg

Description bugzillakernelorg 2010-07-23 02:05:43 UTC
Created attachment 27215
acpidump from same machine

this also occurred on 2.6.35-rc5

the laptop tends to do this after some time (hours, days) and it acts quite clunky after it occurs.

nothing out of the ordinary is being done at the time these things occur; the only strange thing on my laptop is an out of control firefox session that keeps the machine very busy, in both disk and cpu.

i don't think this happened at all before https://bugzilla.kernel.org/show_bug.cgi?id=15344 (and my keymap is still busted, dunno if the EC has a role in that at all)
Comment 1 amw.kernel 2010-07-30 11:32:09 UTC
Got this too in 2.6.34 #1 SMP x86_64 - all the time

Jul 30 20:05:46 presto kernel: ACPI: Power Button [PWRF]
Jul 30 20:05:46 presto kernel: Monitor-Mwait will be used to enter C-1 state
Jul 30 20:05:46 presto kernel: Monitor-Mwait will be used to enter C-2 state
Jul 30 20:05:46 presto kernel: Monitor-Mwait will be used to enter C-3 state
Jul 30 20:05:46 presto kernel: Marking TSC unstable due to TSC halts in idle
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:cooling_device0
Jul 30 20:05:46 presto kernel: Switching to clocksource hpet
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:cooling_device1
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:2E851DA7-D053-495F-9DFA-1A4AD62E6A86
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:71436D3B-FBDD-4C72-BCB8-435BFE0D64F9
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:05901221-D566-11D1-B2F0-00A0C9062910
Jul 30 20:05:46 presto kernel: ACPI: WMI: Mapper loaded
Jul 30 20:05:46 presto kernel: PM: Adding info for platform:regulatory.0
Jul 30 20:05:46 presto kernel: cfg80211: Calling CRDA to update world regulatory domain
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:timer
Jul 30 20:05:46 presto kernel: ACPI: EC: GPE storm detected, transactions will use polling mode
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:seq
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:acpi_video0
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:cooling_device2
Jul 30 20:05:46 presto kernel: acpi device:03: registered as cooling_device2
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:acpi_video0
Jul 30 20:05:46 presto kernel: PM: Adding info for No Bus:input5
J
Comment 2 Zhang Rui 2010-10-22 03:15:06 UTC
does the problem still exist in the latest upstream kernel, say 2.6.35 or 2.6.36-rc?
Comment 3 bugzillakernelorg 2010-10-23 05:57:01 UTC
it happens very rarely even on the version i reported, if i ever see it again on .35 or .36 (or heaven forbid, years from now) i'll reopen
Comment 4 bugzillakernelorg 2010-10-23 05:58:22 UTC
nevermind, didn't see the intervening comment; sorry
Comment 5 bugzillakernelorg 2011-05-06 19:49:49 UTC
Linux krang 2.6.38-9-generic #43-Ubuntu SMP Thu Apr 28 15:23:06 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

[113357.085125] ACPI: EC: GPE storm detected, transactions will use polling mode

since this is a bios/firmware thing on this particular computer, what can i do to help? the only practical downside after this occurs is that the machine hiccups/stalls when the backlight changes, and it's really obnoxious since it fades.

if someone could give me a pointer to how to pick apart the firmware to see what runs on the EC i could take a crack at it
Comment 6 Zhang Rui 2012-01-18 02:14:00 UTC
It's great that kernel bugzilla is back.

can you please verify if the problem still exists in the latest upstream
kernel?
Comment 7 bugzillakernelorg 2012-01-19 21:45:17 UTC
it typically takes a _very_ long time to happen. from what i understood looking at the code, when the regular, preferred way of interacting with the EC fails to work properly even once, it permanently goes into polling mode, where it waits up to 500ms for each operation.

what would i be testing? that the EC still malfunctions occasionally or that the new kernel can recover from this situation?

i'm confident that i can reproduce it given time on the new kernel, provided it still degrades to polling only.
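
A toy model of the behaviour described in this comment, as a minimal sketch rather than the kernel's actual code: the class name, the `storm_threshold` value and the return strings are all hypothetical, and only the "up to 500ms" figure and the sticky fallback come from the comment above.

```python
from dataclasses import dataclass


@dataclass
class ToyEC:
    """Illustrative stand-in for an EC driver; not the real ACPI EC code."""

    storm_threshold: int = 8   # assumed value, chosen only for illustration
    max_wait_ms: int = 500     # "up to 500ms for each operation" per the comment
    polling: bool = False      # once set, it is never cleared again

    def transaction(self, spurious_interrupts: int) -> str:
        """Run one transaction and report which path handled it."""
        if self.polling:
            # Degraded path: every operation may now block up to max_wait_ms.
            return "polling"
        if spurious_interrupts > self.storm_threshold:
            # Too many interrupts that did not advance the transaction:
            # treat it as a storm and fall back to polling for good.
            self.polling = True
            return "storm"
        return "interrupt"
```

In this model a single bad transaction is enough to leave every later one on the slow path, which would match the stall seen on each backlight change.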
Comment 8 bugzillakernelorg 2012-01-19 21:47:54 UTC
i thought the last time it happened was about 2 weeks ago, but it must have been longer; it's rotated out of my logs. it would be a nonissue if it didn't hitch the machine for 500ms every time the backlight changed in any way
Comment 9 Lan Tianyu 2013-04-09 09:51:09 UTC
Hi, does this bug still exist in the latest upstream kernel?
Comment 10 Lan Tianyu 2013-04-16 02:59:35 UTC
ping ...
Comment 11 Zhang Rui 2013-04-25 08:00:18 UTC
what is the model name of your box?
Comment 12 bugzillakernelorg 2013-04-28 11:02:08 UTC
re: #9
i can't reasonably run an upstream kernel at the moment, it's too much of a long shot for it to even happen, and it takes a very long time, and something transparently corrupted a bunch of files last time (i think that was 3.4.0)

for the foreseeable future i'll be using 2.6.38; this can be closed and reopened when that changes, if it helps.

re: #11
model information:

Handle 0x0001, DMI type 1, 27 bytes
System Information
	Manufacturer: Hewlett-Packard
	Product Name: Compaq Presario CQ60 Notebook PC
	Version: PCID
	Serial Number: 2CE845
	UUID: E149FD20-B20C-11DD-AB17-DE4F0208
	Wake-up Type: Power Switch
	SKU Number: FT237EA#ABE
	Family: 103C_5335KV

Handle 0x0002, DMI type 2, 16 bytes
Base Board Information
	Manufacturer: Hewlett-Packard
	Product Name: 3612
	Version: 09.67
	Serial Number: 2CE845
	Asset Tag: Base Board Asset Tag
	Features:
		Board is a hosting board
		Board is replaceable
	Location In Chassis: Base Board Chassis Location
	Chassis Handle: 0x0003
	Type: Motherboard
	Contained Object Handles: 0

last 4 digits removed from identifying bits

semirambling probably not important to the bug bits below:

when i filed the initial bug i really thought the controller could recover and stop polling after a while, but from my poking around last time it happened it appeared to go insane in a semi-permanent (save for power cycle) way

i also grepped my logs and apparently it hasn't even happened since sometime in 2011, nothing notable has changed; i don't think i even reset the bios, but i have changed how i use the backlight, it's inhibited almost all the time and i manually blank it so the fading doesn't happen
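
The log check mentioned above could look something like the sketch below. The function name and regexes are assumptions made for illustration; the message text and the `[seconds.micros]` dmesg prefix are taken from the log lines quoted earlier in this report.

```python
import re

# Message text as it appears in the kernel logs quoted in this report.
STORM_RE = re.compile(r"ACPI: EC: GPE storm detected")
# dmesg-style prefix, e.g. "[113357.085125]" (seconds since boot).
UPTIME_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def storm_events(lines):
    """Return one entry per storm message found in `lines`.

    Each entry is the uptime in seconds when the line has a dmesg-style
    prefix, or None for syslog-style lines that only carry a wall-clock date.
    """
    events = []
    for line in lines:
        if STORM_RE.search(line):
            match = UPTIME_RE.match(line)
            events.append(float(match.group(1)) if match else None)
    return events
```

Feeding it the lines of a rotated log file (for example via `open(path)`) gives a quick count of how often the EC has degraded, and roughly how long after boot.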

last time i tried to investigate the bug i set up a script to change the backlight to random levels for hours, and it still hadn't triggered after about 8. everything weird i've ever had happen on linux is when it's deep in swap and there's not much free memory to work with, so the throughput is very low. i wouldn't be surprised if every problem i've ever experienced is just from an unlikely page getting touched in something at the wrong time, slightly changing the timing. pulseaudio goes a bit nuts under pressure too
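
A rough sketch of that kind of backlight stress script, not the original one: it assumes the standard sysfs backlight layout (a device directory holding `max_brightness` and `brightness` files), and the function name and parameters are made up for illustration.

```python
import random
import time
from pathlib import Path


def random_backlight_walk(dev_dir, steps, delay_s=1.0, rng=None):
    """Write `steps` random brightness levels to a sysfs-style backlight dir.

    `dev_dir` is assumed to contain `max_brightness` (readable) and
    `brightness` (writable), as under /sys/class/backlight/<device>/.
    Returns the list of levels written, in order.
    """
    rng = rng or random.Random()
    dev = Path(dev_dir)
    max_level = int((dev / "max_brightness").read_text().strip())
    written = []
    for _ in range(steps):
        # Level 0 may blank the panel entirely on some machines.
        level = rng.randint(0, max_level)
        (dev / "brightness").write_text(f"{level}\n")
        written.append(level)
        time.sleep(delay_s)
    return written
```

Run as root against the real device directory, with a large `steps` value, this changes the level for hours; checking the kernel log afterwards for the GPE storm message shows whether the run triggered it.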
Comment 13 Lan Tianyu 2013-05-05 12:52:42 UTC
Thanks for your feedback. Since there is no way to reproduce the bug, I'm marking this as unreproducible. Please feel free to reopen the bug if it happens again. It would be better to test on the latest upstream kernel, because this bug may already have been fixed.
