Bug 10919
Description
Rafael J. Wysocki
2008-06-15 04:16:11 UTC
Created attachment 16526 [details]
dmesg
seems like dmesg got lost in my mail somewhere, here it is again.
> ACPI: EC: GPE storm detected, disabling EC GPE
please check if last patch in 10724 bug report helps.
Created attachment 16648 [details]
dmesg from 2.6.26-rc8 with patch
I assume you mean the patch from comment 36 in bug 10724. I tried it on top of 2.6.26-rc8 and it seems to fix the issue. However I see some of these strange messages in dmesg now (full dmesg is attached):
ACPI: EC: non-query interrupt received, switching to interrupt mode
And I just tried an unpatched 2.6.26-rc8 kernel and it still has the issue.
This bug is still present in the 2.6.26-rc9 kernel.
Handled-By : Alexey Starikovskiy <astarikovskiy@suse.de>
Created attachment 16806 [details]
disable gpe during while pending query
This patch should make switch silent...
Alexey, your patch does work fine so far. Thanks.
Good, let's mark it resolved. Thanks for report and testing!
Created attachment 16865 [details]
output of dmesg
Hm, looks like I was too fast reporting this as fixed. With your patch, dimming the display does indeed work without problems, but I'm getting this message again:
ACPI: EC: non-query interrupt received, switching to interrupt mode
I've checked my logs and it really didn't appear after a few boots when I was reporting this as fixed. Now it's there again. I'm still running the same kernel.
I'll attach my dmesg output in case it might be useful
Maximilian,
This could only happen if you applied previous version of the patch.
With the last patch the offending printk is guarded with the check for GPE_STORM bit, which is set once and never cleared:
    if (printk_ratelimit() &&
        !test_bit(EC_FLAGS_GPE_STORM, &ec->flags));
            pr_info(PREFIX "non-query interrupt received,"
                    " switching to interrupt mode\n");
please make sure you have this check.
(In reply to comment #12)
> Maximilian,
> This could only happen if you applied previous version of the patch.
> With the last patch the offending printk is guarded with the check for
> GPE_STORM bit, which is set once and never cleared:
>
>     if (printk_ratelimit() &&
>         !test_bit(EC_FLAGS_GPE_STORM, &ec->flags));
>             pr_info(PREFIX "non-query interrupt received,"
>                     " switching to interrupt mode\n");
> please make sure you have this check.
The check isn't guarding the printk though. It's guarding an empty statement, thanks to a stray semicolon :-).
Created attachment 16866 [details]
same patch, semicolon removed
Thanks,
Here is an updated patch
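For anyone following along, the stray-semicolon problem discussed above can be reproduced outside the kernel. This is only a minimal userspace sketch, not the code from the attachment: log_message(), log_calls, report_buggy() and report_fixed() are hypothetical names made up for illustration, and log_message() merely counts calls where the driver would call pr_info().

```c
#include <stdbool.h>

/* Hypothetical stand-in for pr_info(): count how often we would log. */
static int log_calls;

static void log_message(void)
{
    log_calls++;
}

/* Buggy shape: the ';' right after the condition is the whole body of
 * the if, so the call below is not guarded by the condition at all. */
static void report_buggy(bool storm_seen)
{
    if (!storm_seen)
        ;               /* empty statement left behind by the stray ';' */
    log_message();      /* runs on every call, storm or not */
}

/* Fixed shape: with the semicolon gone, the call is the body of the if. */
static void report_fixed(bool storm_seen)
{
    if (!storm_seen)
        log_message();
}
```

Calling report_buggy(true) still bumps log_calls, while report_fixed(true) leaves it alone, which matches the symptom of the message showing up even after the storm bit was set.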
*** Bug 11089 has been marked as a duplicate of this bug. ***
Better, but it's still noisy - I get printk_ratelimit() messages with no accompanying message! printk_ratelimit() needs to go last:
    if (!test_bit(EC_FLAGS_GPE_STORM, &ec->flags) &&
        printk_ratelimit())
            pr_info(PREFIX "non-query interrupt received,"
                    " switching to interrupt mode\n");
Created attachment 16869 [details]
silence EC mode switch, 3rd attempt
Hope it works now...
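The ordering point above comes down to short-circuit evaluation of && combined with a rate limiter that has side effects. The following is a rough userspace sketch under stated assumptions: toy_ratelimit(), budget and suppressed are invented for illustration and only loosely mimic the idea that every call to a limiter such as printk_ratelimit() touches its state. It is not the kernel implementation.

```c
#include <stdbool.h>

/* Toy rate limiter (hypothetical): each call either spends one unit of
 * budget or is counted as a suppressed message. */
static int budget = 2;
static int suppressed;

static bool toy_ratelimit(void)
{
    if (budget > 0) {
        budget--;
        return true;
    }
    suppressed++;
    return false;
}

/* Limiter first: it is consulted even when the storm flag means that
 * nothing would be printed, so its state changes for no message. */
static bool should_log_limiter_first(bool storm_seen)
{
    return toy_ratelimit() && !storm_seen;
}

/* Limiter last: && short-circuits, so the limiter is only consulted
 * when a message is actually about to be printed. */
static bool should_log_limiter_last(bool storm_seen)
{
    return !storm_seen && toy_ratelimit();
}
```

With the storm flag set, repeated calls to should_log_limiter_first() drain the budget and run up the suppressed count although nothing is logged; should_log_limiter_last() leaves the limiter untouched in the same situation.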
hm, still not good. I'm using your patch (comment #17) with 2.6.26 final and I get a kernel Oops quite early in the boot process, i.e. before the framebuffer, so I had to disable the fb to actually see it. It doesn't happen with an unpatched 2.6.26 kernel. If it's useful I can try to get a digicam and take a picture of the Oops, but I don't have one here right now.
Could you write down just the function names?
here is some output I wrote down:
BUG: unable to handle kernel NULL pointer dereference at 00000001
[...]
EIP is at acpi_ec_gpe_handler
[...]
    acpi_av_gpe_dispatch
    acpi_ev_gpe_detect
    acpi_ev_sci_xrupt_handler
    acpi_irq
    handle_IRQ_event
    handle_level_irq
    do_IRQ
    default_idle
    ktime_get
    default_idle
    default_idle
    common_interrupt
    default_idle
    default_idle
    default_idle
    cpu_idle
    start_kernel
    unknown_bootoption
Created attachment 16880 [details]
4th attempt
Thanks for listing... It's my fault again, please check new patch.
Thanks, using that patch everything looks fine for now. Created attachment 16926 [details]
Incremental fix: re-enable interrupts in acpi_ec_wait
Alexey: I retested your patch (v4) on my laptop and I noticed "missing confirmations, switch off interrupt mode". This fix-up makes it go away for me.
Alan, no, this is wrong -- by this you enable GPE too early.
Could you please make a log of ec.c with #define DEBUG uncommented and this string?
Created attachment 16929 [details]
Dmesg with patch V4 applied showing "missing confirmations"
Sorry, that was the wrong patch. I only meant to send the first hunk, adding two lines to re-enable the GPE in acpi_ec_wait, as per the description. I can't see anything wrong with that - but I've been wrong before :-).
Here's the dmesg as requested. With #define DEBUG uncommented, I don't get a GPE storm on boot - so I trigger it manually by pressing some hotkeys (which auto-repeat).
Created attachment 16931 [details]
wake up regardless of mode
hm, maybe this would help?
Created attachment 16948 [details]
Incremental fix: storm workaround also requires polling mode
No, that doesn't help.
I think if you're going to disable interrupts temporarily during queries (or while queries are pending), then you've got to switch to polling mode as well. Like this. You can't wait for interrupts during a query if you've disabled them :-).
Any news here? Anything where I can help?
This bug is still present in 2.6.27-rc2
Created attachment 17671 [details]
Patch 1/4 : Don't issue the burst disable command if EC exits the burst mode
Created attachment 17672 [details]
Patch 2/4: Clear the query_pending bit only after processing EC notification event
Created attachment 17673 [details]
Patch 3/4: Simplify EC working flowchart and always enable EC GPE
Created attachment 17674 [details]
patch 4/4: Add some udelay in EC GPE handler to avoid EC GPE interrupt storm
Hi, Maximillian
Do you have opportunity to try the attached four patches on the latest kernel(2.6.27-rc5) and see whether the system can work well?
It will also be great if you can try the patch in bug9998,#c76 on the 2.6.27-rc5 kernel.
Thanks.
> ------- Comment #33 from yakui.zhao@intel.com 2008-09-07 23:05 -------
> Hi, Maximillian
> Do you have opportunity to try the attached four patches on the latest
> kernel(2.6.27-rc5) and see whether the system can work well?
>
> It will also be great if you can try the patch in bug9998,#c76 on the
> 2.6.27-rc5 kernel.
> Thanks.
I tried both, your 4 patches and the patch from bug9998 and both do work fine with my laptop so far.
Created attachment 18045 [details]
patch vs 2.6.27-rc7
This version of Alexey's patch has been applied to the acpi test tree.
If it doesn't work for you, please let us know.
thanks,
-Len
I tried the patch and it does work fine. Thanks for the help to everybody.
FYI: It also cures EC GPE storm on Thinkpad X20 that appeared in 2.6.27 (was OK in 2.6.26 and is OK in 2.6.27+ the patch).
shipped in linux-2.6.28-rc1
closed
commit 7c6db4e050601f359081fde418ca6dc4fc2d0011
Author: Alexey Starikovskiy <astarikovskiy@suse.de>
Date: Thu Sep 25 21:00:31 2008 +0400
    ACPI: EC: do transaction from interrupt context