Latest working kernel version: 2.6.25-9
Earliest failing kernel version: 2.6.27
Distribution: Arch
Hardware Environment: Compal IFL90

After a recent Arch kernel upgrade to 2.6.27, kacpid takes between 20% and 50% of my CPU cycles once the computer is put under load. I upgraded my kernel to 2.6.28-rc6 to see whether the problem is still present, and it is. Downgrading the kernel back to 2.6.25-9 has solved the problem for now, and everything works normally. I will attach my acpid.log and also the output of grep . /sys/firmware/acpi/interrupts/*.
Created attachment 19185 [details] My acpid.log
Created attachment 19186 [details] Output of grep . /sys/firmware/acpi/interrupts/*
From the contents of /sys/firmware/acpi/interrupts/* it seems that GPE 1C is triggered at a very high frequency. Will you please attach the output of acpidump and lspci -vxxx? As this is a regression, will you please use git bisect to identify which commit causes it? Thanks.
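For anyone reading along, the counter check described above can be sketched roughly as follows. This is only an illustration: the helper names parse_interrupt_counts and busiest_gpe are made up here, the sample numbers are invented, and the line format (path, colon, count, optional flags) is an assumption about what grep prints for these files on this kernel.

```python
import re


def parse_interrupt_counts(grep_output):
    """Map each counter name (e.g. 'gpe1C', 'sci') to its integer count."""
    counts = {}
    for line in grep_output.splitlines():
        # Assumed format: "<path>/<name>:<count> [flags]"; other lines are skipped.
        match = re.match(r"\s*(?:.*/)?([^/:\s]+):\s*(\d+)", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def busiest_gpe(counts):
    """Return (name, count) of the individual GPE with the highest count, or None."""
    # Keep only per-GPE counters such as gpe1C; gpe_all and sci are totals.
    gpes = {name: count for name, count in counts.items()
            if re.fullmatch(r"gpe[0-9A-Fa-f]{2}", name)}
    if not gpes:
        return None
    name = max(gpes, key=gpes.get)
    return name, gpes[name]


if __name__ == "__main__":
    # Invented sample data, NOT taken from the attachment in this report.
    sample = (
        "/sys/firmware/acpi/interrupts/gpe02:      10   enabled\n"
        "/sys/firmware/acpi/interrupts/gpe1C:  250000   enabled\n"
        "/sys/firmware/acpi/interrupts/gpe_all:  250010\n"
        "/sys/firmware/acpi/interrupts/sci:  250010\n"
    )
    print(busiest_gpe(parse_interrupt_counts(sample)))
```

Running the grep twice a few seconds apart and comparing the counts would give a rate rather than a total, which is probably the more useful number for spotting a storm.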
Just a wild guess: try limiting the bisect to drivers/acpi/ec.c to see whether the changes in that file are related to this regression...
I apologize for my slow action on this. I'm really busy this week and next week, but after that I should have the time to hunt down the change and provide the info needed for an actual fix.
I've looked at the problem a little more and found kernel 2.6.27.7 to be the first kernel version with the bug. I have yet to git bisect to find the commit, but I should be able to do that sometime within the next week.
ping Jori. :)
Sorry for my slowness! I was much busier during the holiday season than I ever expected. I am back to work bisecting as of now.
I'm still new to the use of git bisect and I have a couple questions. When I start bisecting, I use the current git tree as bad. The bug arose between 2.6.27.6 and 2.6.27.7. Is there any way to narrow my search to commits between just those two versions? If not, what should I use as the first good version?
Hmm, as you are using the stable kernel, you first need to git clone the stable tree. Then run:
git bisect start
git bisect bad v2.6.27.7
git bisect good v2.6.27.6
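In case it helps to see what the bisect does between those two tags, here is a minimal sketch of the idea. It is not how git implements it (real history is a graph, not a list); first_bad_commit and is_bad are hypothetical names, and the sketch assumes a linear, oldest-to-newest list whose first entry is known good and whose last entry is known bad.

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad(commit) is true.

    Assumes commits is ordered oldest to newest, commits[0] is known good
    and commits[-1] is known bad (the roles of 'git bisect good' and
    'git bisect bad'), and that every commit after the first bad one is
    also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        # In practice this step is "build the kernel, boot it, test it".
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]
```

Each test halves the remaining range, so the number of kernels to build grows only with the logarithm of the number of commits between the two versions.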
Finally found the commit. Thanks for bearing with me while I figured this out. It turns out I had cloned the wrong tree.
Created attachment 19704 [details] git bisect log output
Hi, Jori
Thanks for the git-bisect. From the output, the first bad commit is:
>commit d09277432f84ae0c8588032518e1ff7842ef5606
>Author: Alexey Starikovskiy <astarikovskiy@suse.de>
>Date: Sun Nov 9 19:01:06 2008 +0300

ACPI: EC: lower interrupt storm treshold
http://bugzilla.kernel.org/show_bug.cgi?id=11892

Thanks for your work.

Hi, Rui
How about assigning this bug to the EC category? Thanks.
right. let's see if Alexey has some ideas. :)
Guys, why gpe_all != sci (comment #2)? Jori, please enable '#define DEBUG' and post dmesg.
Alexey, Can you please be a little more descriptive about how and where to enable '#define DEBUG'? This is the first kernel bug I've worked on and I'm unfamiliar with a lot of this stuff. Thanks.
Please open <kernel-source>/drivers/acpi/ec.c in a text editor and uncomment the '#define DEBUG' statement at the beginning of the file. Save and compile the kernel as usual.
Created attachment 19724 [details] dmesg output with '#define DEBUG' uncommented in ec.c
The laptop that experiences this problem died a couple of weeks ago, but I finally got it back today. I compiled the latest kernel from Torvalds' git tree, and I still experience the same problem. Has there been any progress toward finding a fix? I'd really like to run a later kernel than 2.6.27.6.
Is there anything else I can do to help find a fix?
I have some additional info that may be of some use. I recently booted the ubuntu jaunty alpha4 livecd to see if the same issue occurs. Although the kernel used is 2.6.28, the problem isn't present. Does the ubuntu kernel patchset change anything in ec.c?
Hello all! I have the same laptop model and I'm thinking of switching from Ubuntu to Arch, so I'm kind of concerned about this bug. I'm currently running Ubuntu Hardy, but I should be switching to Arch pretty soon (as soon as I get enough free time for a full migration). I've been running Arch (2.6.28-ARCH) on virtualbox-ose and it doesn't seem to be an issue in there, but well, it's virtualized... Anyway, just let me know if there's anything I can do to help.
Yes, one thing you could try is to check if changing udelay(...) in drivers/acpi/ec.c to msleep(1) helps. This was changed back and forth several times already, and always there was a reason... Now I start to think that MSI users (who require udelay()) should have different EC driver...
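For readers unfamiliar with the two calls: udelay() busy-waits, while msleep() puts the caller to sleep and lets the CPU do other work. The following is only a userspace analogy of that trade-off, not the ec.c code; poll_busy and poll_sleeping are made-up names, and whether this accounts for the kacpid load seen in this report is an assumption.

```python
import time


def poll_busy(duration):
    """Spin until duration seconds have passed; return CPU seconds consumed."""
    cpu_start = time.process_time()
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        pass  # busy-wait: the CPU stays occupied for the whole wait
    return time.process_time() - cpu_start


def poll_sleeping(duration, interval=0.001):
    """Sleep in short steps until duration seconds have passed; return CPU seconds consumed."""
    cpu_start = time.process_time()
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        time.sleep(interval)  # give up the CPU between checks
    return time.process_time() - cpu_start


if __name__ == "__main__":
    print("busy-wait CPU seconds:", poll_busy(0.2))
    print("sleeping  CPU seconds:", poll_sleeping(0.2))
```

The price of sleeping is latency: each wait lasts at least one sleep interval, which may be why some hardware needs the shorter busy-wait.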
Francisco, what BIOS version are you using? This bug only occurs for me when the CPU is under load. Kacpid takes over the CPU as soon as GNOME loads, but for testing purposes I've just been using the stress program found in the Arch repositories to put load on the CPU. I'm interested to see whether the bug can be duplicated or whether it's just a problem with my configuration, so post as soon as you're able to test a full installation.
Alexey, I am happy to report that I tried your suggestion, and I have the 2.6.28 kernel running with no problem. It seems we've found the fix.
Created attachment 20719 [details] separate MSI delays Please check if this patch works?
I applied the patch to the 2.6.29 kernel and it works like a charm. Thanks for finding a solution.
Hello again. I'm running BIOS version 1.13. I should be installing Arch this weekend, so by Sunday I should have something to report. I'm glad to hear that there is a solution. Thank you for finding it! Hope it gets merged soon.
Francisco
I have finally installed Arch. I have everything configured under Gnome and got a 100% usable system. My uname -a reads:
Linux MegalaptopII 2.6.28-ARCH #1 SMP PREEMPT Tue Mar 17 07:22:53 CET 2009 x86_64 Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz GenuineIntel GNU/Linux
I am experiencing no issues with acpi or CPU usage. I'm guessing it must be specific to your config... Hope this info can be of some help.
I just compiled the latest arch kernel sans patch, and I still experience the issue. I wonder what the difference is between my machine and yours. Alexey, any chance that this patch will be merged soon?
Len Brown thinks it is 2.6.31 material...
Owo! 2.6.31 will still take a while, I guess... I don't know the difference between our machines. You can compare your specs with the ones I posted. Maybe it's CPU specific?
Linux arch 2.6.29-ARCH #1 SMP PREEMPT Sat Apr 4 20:53:21 CDT 2009 x86_64 Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz GenuineIntel GNU/Linux
The CPUs are different, but it's only a clock speed difference. I don't think that would affect this, but who knows?
It is indeed strange. But it might in fact be the difference. Maybe someone else with a Compal IFL90 could give some feedback too. In the meantime I tried out the "stress" program in Arch you mentioned and I still have no CPU usage issues. Using the latest kernel now (2.6.29-ARCH) and still no issues. Btw, I'm using 'top' to check on the CPU usage.
Now that 2.6.30 has been released, could this be patched for the next kernel release?
Alexey, what's the status of this bug and the MSI patches?
No activity in this bug report for 2 months. Please re-open if this is still an issue in the latest stable kernel.