Bug 10510
Summary: | s3: interrupt storm upon resume - Asus M6700Me | ||
---|---|---|---|
Product: | ACPI | Reporter: | Matthias Bläsing (matthias.blaesing) |
Component: | Power-Battery | Assignee: | Zhang Rui (rui.zhang) |
Status: | REJECTED WILL_NOT_FIX | ||
Severity: | normal | CC: | acpi-bugzilla, akpm |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.25 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
lspci -vvxx output
gzipped output (syslog) of run without battery
output (syslog) of test-run with battery module loaded (beware: get expanded to 3MB)
ACPI Interrupts monitored while a suspend/resume cycle w/ and w/o the battery module
events while removing/reinserting battery
try the debug patch in which the query_pending bit is clear after processing EC notification event |
Description
Matthias Bläsing
2008-04-23 13:03:26 UTC
(In reply to comment #0)
> Latest working kernel version: 2.6.24 (sort of, with a patch in osl.c, that
> flushes the work queue (bug number unknown, don't know how to tame the
> bugzilla beast))

The patch in bug 9772/2884?

> root@prometheus:~# acpi -a
> Battery 1: charged, 100%, rate information unavailable.
> Battery 2: charged, 100%, rate information unavailable.
> AC Adapter 1: on-line
> root@prometheus:~# acpi -a
> Battery 1: charged, 100%, rate information unavailable.
> AC Adapter 1: on-line
> root@prometheus:~# acpi -a
> Battery 1: charged, 100%, rate information unavailable.
> AC Adapter 1: on-line
> root@prometheus:~# acpi -a
> Battery 1: charged, 100%, rate information unavailable.
> Battery 2: charged, 100%, rate information unavailable.
> AC Adapter 1: on-line
> root@prometheus:~#

Please attach the acpidump output.
Please attach the dmesg output after this test.
Please re-do the test with acpid killed.

Marked as regression.

Can you please send out that osl.c patch also? It might have got lost.

(In reply to comment #2)
> Marked as regression.
>
> Can you please send out that osl.c patch also? It might have got lost.

It's bug #10265.

Matthias,
Does battery 2 keep on popping up and disappearing all the time after S3? Or do things become normal after a few seconds?

This is a piece of BIOS code from your acpidump.

    _L13()
    {
        ...
        If (\_SB.BT1S)
        {
            Store (0x00, \_SB.BT1S)
            Notify (\_SB.BAT1, 0x01)
        }
        Else
        {
            Notify (\_SB.BAT1, 0x00)
            Store (0x01, \_SB.BT1S)
        }
        ...
    }

I don't know why the BIOS sets/clears the BAT1 present flag (BT1S) every time GPE 0x13 is fired. But this can well explain the symptom on your laptop: the Linux/ACPI battery driver receives several ACPI battery notifications when resuming, checks the _STA method, and finds that BAT1 is absent and then becomes present.

I don't think this is a regression.
Matthias,
You said that 2.6.24 with one extra patch doesn't have this problem; can you double-check?
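The _L13 behaviour described above can be illustrated with a small toy model. This is only a sketch, not BIOS or kernel code: the class and function names are hypothetical, and it assumes that the presence state the driver sees follows the BT1S flag directly, which the thread does not confirm.

```python
# Toy model of the quoted _L13 method. All names here are hypothetical and
# the model assumes battery presence mirrors the BT1S flag one-to-one.

class ToggleBios:
    def __init__(self, bt1s=1):
        self.bt1s = bt1s          # stands in for \_SB.BT1S
        self.notifications = []   # values passed to Notify (\_SB.BAT1, ...)

    def l13(self):
        """Mirror the If/Else branches of the quoted _L13 method."""
        if self.bt1s:
            self.bt1s = 0
            self.notifications.append(0x01)
        else:
            self.notifications.append(0x00)
            self.bt1s = 1


def presence_seen_by_driver(bios, fires):
    """Fire the GPE `fires` times and sample the flag after each firing."""
    seen = []
    for _ in range(fires):
        bios.l13()
        seen.append(bool(bios.bt1s))
    return seen


if __name__ == "__main__":
    bios = ToggleBios(bt1s=1)
    print(presence_seen_by_driver(bios, 4))
```

Under these assumptions every firing of the GPE flips the flag, so a driver that re-checks presence on each notification would alternately see the battery as absent and present, which would be consistent with the battery appearing and disappearing in the `acpi -a` output.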
Hey, please excuse my mistake: I booted into 2.6.24 (the one with the above-mentioned patch) and, yep, I got the same symptoms.

So I see two basic problems (maybe three):
1.) The battery reporting is flaky after resume from S3 (not sure whether this was the case with 2.6.23),
2.) something breaks after resume with respect to the Asus multimedia keys, and
3.) (this might be selective attention) the system feels sluggish (afaik the CPU stages are controlled via ACPI; acpi_cpufreq might indicate that).

Removing acpid didn't help. The multimedia keys stopped working completely after the resume until I removed the battery module. Then the reaction to the button presses was slow. Half an hour later the system feels "normal" again.

Could it be that the ACPI events pile up and the system is slow to work down the pile? Maybe I can try a whole suspend/resume cycle without the battery module; maybe this sheds some more light on it.

Ok - battery seems a bit fishy. With battery loaded it takes approx. 10s to get a multimedia key-press registered. With the battery module removed, approx. 2s. I even removed the hardware (second battery module) and still got battery 2 reported ... dbus-monitor on the system bus shows that hal reports removals and additions of the hardware nearly every second.

I had a look at /proc/interrupts and I'm not sure what to make of the fact that there are more ACPI interrupts counted than timer interrupts. Is this considered normal?

> please excuse my fault, I booted into 2.6.24
> and - yep I got the same symptoms

Clearing regression flag on this report.

> Debian SID

Does this distro try to remove the battery driver before suspend? (That is the only way one can explain the need for the patch in bug #10265.) What happens if the distro scripts are modified so that battery is not unloaded (or if you boot a kernel with ACPI_BATTERY=y so that the unload fails)?

Please attach the output of "lspci -vvxx".
I ran a few more tests and made sure that I didn't use the Debian scripts, but put the system to sleep by hand (echo mem > /sys/power/state). When I remove the battery module prior to putting the system to sleep, I get a few ACPI interrupts (approx. 10 per second) for something like 20s, then the system is back to normal. When the battery module is still loaded I get the "storm" of approx. 1000 interrupts per second, and I'm not sure whether or not the situation settles down after some time.

Created attachment 15987 [details]
lspci -vvxx output
Please make sure CONFIG_ACPI_DEBUG is set, then run
echo 0x044 > /sys/module/acpi/parameters/debug_layer
echo 0x8800001f > /sys/module/acpi/parameters/debug_level
and re-do the same test you did in comment #8. Please attach the dmesg output w/ and w/o the battery module.

Some statistics before I attach the result from the syslog:

w/ battery:
acpi-ints before suspend/resume: 2968
acpi-ints after suspend/resume: 4986
acpi-ints 30s later: 10574

w/o battery:
acpi-ints before suspend/resume: 3664
acpi-ints after suspend/resume: 4156
acpi-ints 30s later: 4394

Created attachment 16022 [details]
gzipped output (syslog) of run without battery
Created attachment 16023 [details]
output (syslog) of test-run with battery module loaded (beware: get expanded to 3MB)
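To put the statistics above into perspective, the counts can be turned into average rates for the 30 seconds after resume. The snippet below is just worked arithmetic on the numbers reported in this thread; the helper name is hypothetical.

```python
# Average ACPI interrupt rate in the 30 s after resume, computed from the
# counts reported above. rate_per_second is a hypothetical helper.

def rate_per_second(count_start, count_end, seconds):
    """Average number of interrupts per second over the interval."""
    return (count_end - count_start) / seconds


with_battery = rate_per_second(4986, 10574, 30)
without_battery = rate_per_second(4156, 4394, 30)

print(f"w/ battery:  {with_battery:.1f} interrupts/s")    # 186.3
print(f"w/o battery: {without_battery:.1f} interrupts/s")  # 7.9
```

So with the battery module loaded the counter grows by roughly 186 interrupts per second in that window, against roughly 8 per second without it.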
(In reply to comment #11)
> Some statistics before I attach the result from the syslog:
>
> w/ battery:
> acpi-ints before suspend/resume: 2968
> acpi-ints after suspend/resume: 4986
> acpi-ints 30s later: 10574
>
> w/o battery:
> acpi-ints before suspend/resume: 3664
> acpi-ints after suspend/resume: 4156
> acpi-ints 30s later: 4394

IMO, this is still too much. Could you do the same test after killing all the processes that are reading /proc/acpi/event?

Could you please do this test
cd /sys/firmware/acpi/interrupts/
grep . *
and attach the result after resume (w/ and w/o battery module)?

Created attachment 16027 [details]
ACPI Interrupts monitored while a suspend/resume cycle w/ and w/o the battery module
The result of the test is attached (acpi-interrups.tgz). This is how the data was created:
root@prometheus:~# cd /sys/firmware/acpi/interrupts/
root@prometheus:/sys/firmware/acpi/interrupts# /etc/init.d/acpid stop; modprobe -r battery; sleep 10; grep . * >> /tmp/acpi-interrups-battery-wo-prior; echo mem > /sys/power/state; grep . * >> /tmp/acpi-interrups-battery-wo-after; sleep 30; grep . * >> /tmp/acpi-interrups-battery-wo-30s; sleep 60; modprobe battery; sleep 60; grep . * >> /tmp/acpi-interrups-battery-w-prior; echo mem > /sys/power/state; grep . * >> /tmp/acpi-interrups-battery-w-after;sleep 30; grep . * >> /tmp/acpi-interrups-battery-w-30s
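The snapshot files written by the command above can be compared pairwise to see which counters moved. The following is a minimal sketch, assuming each line of `grep . *` output has the form `name: count`, optionally followed by status words; the function names and the sample values are made up for illustration and are not taken from the attachment.

```python
# Compare two snapshots of `grep . *` taken in /sys/firmware/acpi/interrupts.
# Assumption: each line looks like "<file>:<count> [status words]".
# parse_snapshot and diff_snapshots are hypothetical helper names.

def parse_snapshot(text):
    """Return {counter_name: count} for every line that carries a count."""
    counts = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            counts[name] = int(fields[0])
    return counts


def diff_snapshots(before, after):
    """Return only the counters whose value changed, with their increase."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }


if __name__ == "__main__":
    # Illustrative values only, not data from this bug.
    prior = "gpe13:      20\nsci:    3000\nerror:       0\n"
    later = "gpe13:      25\nsci:    5000\nerror:       0\n"
    print(diff_snapshots(parse_snapshot(prior), parse_snapshot(later)))
```

Because the command appends with `>>`, a file may hold more than one snapshot if the test is repeated; in that case the file would need to be split per run before parsing.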
Some more info: It's not tied to S3 resume. It also happens when I remove/plug in the second battery. The uptime is currently approx. 10 min and I'm about to reach 150000 ACPI interrupts. I think this is the root problem and the suspend/resume cycle just makes it visible.

I just had a thought: It seems that the ACPI subsystem is flooded with interrupts. Couldn't the interrupts be deactivated and the battery system switched to a polling model (maybe monitoring the interrupt frequency from time to time)? Not sure this is possible, but I would rather lose the multimedia buttons than my battery reporting.

It seems that some user space scripts are invoked to query the AC/battery status, and this brings a lot of EC interrupts. Could you please try the following test?
lsof /proc/acpi/event; kill all the processes polling this file.
cat /proc/acpi/event
remove/plug in the second battery
see if the interrupt storm occurs and attach the output of /proc/acpi/event.

Created attachment 16424 [details]
events while removing/reinserting battery
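The idea raised above, watching the interrupt frequency and falling back to polling when it gets too high, could look roughly like the sketch below. It only illustrates the decision logic in user-space Python; it is not how the kernel handles this, and the function names and the threshold of 100 interrupts per second are hypothetical.

```python
# Sketch of a "switch to polling during a storm" decision.
# Everything here is hypothetical: names, threshold, and sample values.

def interrupt_rates(samples):
    """samples: list of (time_in_seconds, cumulative_count), sorted by time."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if t1 > t0:
            rates.append((c1 - c0) / (t1 - t0))
    return rates


def choose_mode(samples, storm_threshold=100.0):
    """Return "polling" if the most recent rate exceeds the threshold."""
    rates = interrupt_rates(samples)
    if rates and rates[-1] > storm_threshold:
        return "polling"
    return "interrupt"


if __name__ == "__main__":
    # A counter rising by 21000 within one minute is about 350 per second.
    print(choose_mode([(0, 9000), (60, 30000)]))
    # About 100 interrupts in 90 seconds stays well below the threshold.
    print(choose_mode([(0, 0), (90, 100)]))
```

A real implementation would also need a rule for switching back to interrupt mode once the rate drops, which this sketch leaves out.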
Yep, even without anything listening on /proc/acpi/event (apart from a cat) I got the storm. I stopped acpid, started cat, removed the battery (same 100 interrupts every 1-2 minutes). When I reinserted the battery, the counter began to climb, and in about a minute it went from 9000 to approx. 30000 (and it's still going up).
This is kernel 2.6.26-rc5.
Hi, Matthias, sorry for the delay.
I'm afraid this is a BIOS/hardware issue that we can't fix in the Linux kernel. GPE _L13 keeps on firing after resume, about 20 times in 30 minutes. Each _L13 results in the loading/unloading of the battery driver, and loading/unloading the battery driver needs to access some EC address space, which may bring hundreds of interrupts.
Please check if upgrading the BIOS helps.

Rejecting this bug.

Created attachment 17585 [details]
try the debug patch in which the query_pending bit is clear after processing EC notification event
Hi, Matthias
Will you please try the debug patch on the latest kernel (2.6.27-rc4) and see whether the number of ACPI interrupts still increases as fast as before?
Thanks.
Excuse me, but I sold the notebook approx. two months ago, so I can't test anymore.

Hi, Matthias
Thanks for the notification. It doesn't matter. Maybe someone else can test the attached patch.
Thanks. |