Bug 13155
Summary: | random freeze when EC interrupts are enabled | ||
---|---|---|---|
Product: | ACPI | Reporter: | Vitus Jensen (vjensen) |
Component: | EC | Assignee: | Zhang Rui (rui.zhang) |
Status: | CLOSED INSUFFICIENT_DATA | ||
Severity: | normal | CC: | lenb, rui.zhang |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.30-rc3 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
bootlog from unmodified 2.6.29-gentoo-r1
patch to prohibit EC interrupts
2.6.30-gentoo-r5, powertop output
2.6.30-gentoo-r5, dmesg
2 dumps of /proc/interrupts, 10 seconds interval
linux-acpi, 2.6.33-rc2, dmesg
hack GPEs away
A port of the Embedded Controller driver v2.0 to 2.6.36 |
Description
Vitus Jensen
2009-04-23 08:33:21 UTC
Created attachment 21091 [details]
bootlog from unmodified 2.6.29-gentoo-r1
bootlog showing EC interrupts getting enabled during boot
Created attachment 21092 [details]
patch to prohibit EC interrupts
A minimal patch against 2.6.30-rc3 to remove the freezes. Instead of disabling EC interrupts it just sets EC_FLAGS_GPE_STORM.
Please check if the last patch from #12949 changes the situation.

I tried to apply and edit http://bugzilla.kernel.org/attachment.cgi?id=21105 to 2.6.29-rc3 but failed. Using 2.6.30-rc3 as base I just had to remove 2 set_flags() lines and got a bootable kernel, but it still shows freezes after some time. The patched kernel shows some 2000 occurrences of acpi interrupts in /proc/interrupts, but I'm in doubt whether to report more details because the patch wasn't applied cleanly. Please advise.

Closed due to inactivity for 3 months. Please re-open if this is still a problem in 2.6.30.stable.

Created attachment 22945 [details]
2.6.30-gentoo-r5, powertop output
Running 2.6.30-gentoo-r5 the machine no longer freezes. At least not during the last 30 minutes of browsing and kernel compile. But it produces ~80000 ints/s prohibiting any means of powersave.
ACPI events (AC on/off), sleepbutton, suspend/resume all work.
Created attachment 22946 [details]
2.6.30-gentoo-r5, dmesg
Created attachment 22974 [details]
2 dumps of /proc/interrupts, 10 seconds interval
OK, so it's not 80000 ints/s but 80000 wakeups per second. In 2.6.27 (wireless-testing plus ec_intr patch) it's about 40/s.
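The interrupts-versus-wakeups distinction above can be checked by diffing two /proc/interrupts dumps taken a known interval apart. The following is a minimal sketch, not the method used in the attachment: the function names are made up for illustration, and the two sample dumps are hypothetical values, not the contents of attachment 22974.

```python
def parse_interrupts(text):
    """Map each IRQ label to its count summed over all CPU columns."""
    totals = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the "CPU0 CPU1 ..." header line has no colon
        count = 0
        for token in rest.split():
            if not token.isdigit():
                break  # per-CPU counters end where the chip/device names begin
            count += int(token)
        totals[label.strip()] = count
    return totals


def interrupt_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both dumps."""
    first = parse_interrupts(before)
    second = parse_interrupts(after)
    return {irq: (second[irq] - first[irq]) / seconds
            for irq in second if irq in first}


# Hypothetical dumps taken 10 seconds apart (illustrative numbers only).
BEFORE = """\
           CPU0
  0:     100000   IO-APIC-edge      timer
  9:       1000   IO-APIC-fasteoi   acpi
"""
AFTER = """\
           CPU0
  0:     102500   IO-APIC-edge      timer
  9:       1200   IO-APIC-fasteoi   acpi
"""

for irq, rate in sorted(interrupt_rates(BEFORE, AFTER, 10).items()):
    print(f"IRQ {irq}: {rate:.1f}/s")
```

A low rate on the acpi line combined with a high wakeup count in powertop would suggest the wakeups come from somewhere other than the ACPI interrupt itself, which is the situation described in this comment.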
Tried commit 8aeb0a352af7eb26863e53c203eeb852fd4590c3 from the acpi-test branch at kernel.org but this shows the very same picture: acpi events work, no freezes so far, but around 80000 wakeups/s. Is this still a related issue to the freezes? Or should I create a new bugzilla entry?

Vitus, from comment #8, you don't have any acpi interrupts during the period, and overall 1200 interrupts is quite low.

Updating the git repository to 2.6.31acpi-ge56d953 fixed the wakeups, only ~50/s now (20% acpi interrupts). There are some issues with that kernel but EC isn't one of them. So close this bug.

Because I needed a current kernel I tried current gentoo's 2.6.32-gentoo-r1 and sadly found the same freezes as always. Retried it with tag "v2.6.32" from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git and 2.6.33-rc2 from linux-acpi-2.6 (commit 1201b2a9bec0413188ada1443ece1a52da6dbff4) and the thinkpad froze after seconds (boot to console, emerge -1 some-ebuild) or hours (rsync to webdav). As I can't reproduce the freeze at will I assume that it was never fixed. Newer kernels don't reach C3 or generate a lot of wakeups, so for real work I'm staying at my hacked 2.6.27.
I was happy to catch the freeze both on linux-2.6.git and linux-acpi-2.6 with DEBUG enabled in ec.c and while watching the debug output. There is nothing special to notice :-( The last line may be "ACPI: EC: transaction end" or some message from cron or my sudo shell. Then all of a sudden the hard disk LED stays on, the keyboard doesn't react anymore, dead.
My next step would be to hack the EC interrupts away. But as I can't use those kernels for work anyway, is there something easier to try? Are you at all interested in the non-interrupt way?

Created attachment 24396 [details]
linux-acpi, 2.6.33-rc2, dmesg
output from 2.6.33-rc2. DEBUG enabled in drivers/acpi/ec.c
Does the problem still exist in the latest upstream kernel? Say 2.6.35 or 2.6.36-rc.

(In reply to comment #14)
> does the problem still exist in the latest upstream kernel? say 2.6.35 or
> 2.6.36-rc
I pretty much gave up on this thinkpad and stayed at my patched 2.6.27-wireless-testing kernel. There is an opportunity to test other kernels tomorrow at this time. Is there any easy hack to disable EC interrupts in 2.6.35? Just in case the hangs are still there, to validate that the EC interrupt is the culprit?

Testing tag 2.6.36 from kernel.org: no freezes in 3 hours of work (compile, browse, suspend to RAM). Seems fixed. This kernel only seldom reaches C3, so for real work I will continue on 2.6.27, but the freeze is gone :-)

Good news. As Alexey is no longer working on the ACPI EC driver, I'll close it as CODE_FIXED. Please feel free to re-open it if you can reproduce the problem in the latest git kernel.

OK, I reopen it. When using the machine the next time I had the usual freeze while hacking away in emacs. No power connected, no wlan, resumed and edited for around 1 hour: harddisk LED on, total freeze.
To try something I hacked GPEs away as in the simple patch attached (not much time, no internet). I still have no idea what triggers the freeze, so I compiled kernels for several hours: no freeze. Ported ec.c from v2.6.27 (Embedded Controller Driver v2.0) to 2.6.36 today and did some surfing, always with the simple patch: no freeze.
I'm now using Embedded Controller Driver v2.0 and will continue to do so unless you advise differently.

Created attachment 35512 [details]
hack GPEs away
This is just to remove GPE from 2.6.36. It triggers double suspends on the R51e so it's probably not a great idea :D
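For anyone trying to see which GPE is firing before patching it out, the kernel's ACPI sysfs interface keeps per-GPE counters under /sys/firmware/acpi/interrupts. The sketch below only reads those counters. It assumes each gpeXX file starts with a decimal count followed by status words; whether the kernels discussed in this report expose that directory in this form is not confirmed by the thread, and the function names are made up for illustration.

```python
import os


def read_gpe_counts(directory="/sys/firmware/acpi/interrupts"):
    """Return {file name: count} for the per-GPE counter files.

    Assumes every gpeXX file begins with a decimal counter followed by
    status words; the aggregate file gpe_all is skipped.
    """
    counts = {}
    for name in sorted(os.listdir(directory)):
        if not name.startswith("gpe") or name == "gpe_all":
            continue
        with open(os.path.join(directory, name)) as handle:
            fields = handle.read().split()
        if fields and fields[0].isdigit():
            counts[name] = int(fields[0])
    return counts


def busiest(counts, n=3):
    """The n GPEs with the highest counts, largest first."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

Calling busiest(read_gpe_counts()) twice, a few seconds apart, shows which counter is growing. This is only a way to observe GPE activity; it does not disable anything and is not a replacement for the attached patch.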
Created attachment 45222 [details]
A port of the Embedded Controller driver v2.0 to 2.6.36
This is the stable solution to the freeze problem as mentioned in comment #18: a port of the Embedded Controller Driver (ec.c) from 2.6.27 with an additional hack to disable the EC interrupt after its first occurrence, as ec_intr=0 wasn't available in that version (see the #if 0 part).
I've been running 2.6.36 with this patch (and CONFIG_HZ_100=y) since 30th October 2010. Battery rundowns, switching from AC to battery and vice versa, suspend to RAM etc. Very stable, not a single freeze or anything unusual :-)

Does this problem still exist in the latest upstream kernel?

Are you referring to the ACPICA changes merged into v2.6.38? I will try that version, either tomorrow or Saturday.

Updated the 2.6.36 configuration to 2.6.38, installed kernel and modules, rebooted. My machine automatically boots only into a text system, and instead of starting X11 I tried to re-install the thinkpad modules. But "emerge -1 tp_smapi" froze the machine while still scanning dependencies: harddisk LED on, no reaction to keyboard. As usual.
So yes, the problem still exists. Is there again a possibility for users to disable EC interrupts? Or some other thing I could try?

It's great that kernel bugzilla is back.
Can you please verify if the problem still exists in the latest upstream kernel? If the problem still exists, can you please re-describe the symptom in the latest kernel?

Bug closed as there is no response from the bug reporter.

The last kernel I used was 3.4 with the halt=mwait parameter. This combination did not require any change to the EC driver. I have now sold the laptop because of wlan problems, high power consumption and not really needing it. |