Bug 13268
Summary: | ACPI interrupt storm when system is warm - Arima M620-DC (ICH4) | ||
---|---|---|---|
Product: | ACPI | Reporter: | Christopher Horler (cshorler) |
Component: | Config-Interrupts | Assignee: | ykzhao (yakui.zhao) |
Status: | CLOSED DOCUMENTED | ||
Severity: | normal | CC: | acpi-bugzilla, cshorler, lenb, rui.zhang, yakui.zhao |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.29.2 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
acpidump output
disassembly of acpi tables
dmesg output after storm
grep . /sys/firmware/acpi/interrupts/*
try the custom DSDT
kernel log with customized DSDT
interrupts with custom DSDT, still gpe00 |
Description
Christopher Horler
2009-05-07 22:15:20 UTC
Created attachment 21265 [details]
disassembly of acpi tables
Created attachment 21266 [details]
dmesg output after storm
Created attachment 21267 [details]
grep . /sys/firmware/acpi/interrupts/*
Hi, Christopher

From the info in comment #3 it seems that GPE00 is being triggered very frequently. And from the acpidump it seems that this is caused by a buggy BIOS.

> Method (_L00, 0, NotSerialized) { }

When GPE00 is triggered, there is nothing to do in the _L00 method, so the source of the event is never handled and GPE00 is triggered again. So IMO this is a BIOS issue, and it would be best fixed by upgrading the BIOS.

Thanks.

The problem happens in every kernel release that you have tried, right?

Please run "echo disable > /sys/firmware/acpi/interrupts/gpe00" before the interrupt storm and see if it helps.

Created attachment 21269 [details]
try the custom DSDT

Will you please try the custom DSDT and see whether the issue still exists? In the custom DSDT the polarity of THRM_POL is inverted. How to use a custom DSDT is described at: http://www.lesswatts.org/projects/acpi/faq.php

Note: As the DSDT.hex is already attached, the first four steps can be skipped.

Thanks.

(In reply to comment #4)
> Hi, Christopher
> So IMO this is a BIOS issue.And it had better be fixed by upgrading BIOS.

Quite possibly a BIOS issue! However, I've requested a newer BIOS a couple of times before and been told that there isn't one. So there's not much I can do unless you know another source.

Chris

(In reply to comment #5)
> the problem happens in every kernel release that you have tried, right?

Yes - for as long as I can remember (I can remember as far back as SuSE 10.1, but can't remember what kernel that was running - it's probable it was happening before that too).

> please run "echo disable > /sys/firmware/acpi/interrupts/gpe00" before the
> interrupt storm and see if it helps.

I tried this and I think it helped - at least I tried to provoke the problem and it didn't appear in about 45 mins of trying. Thanks!

What practical impact does disabling gpe00 have (other than solving my problem)?
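
For anyone trying to confirm which GPE is storming before disabling it, the per-event counters that "grep . /sys/firmware/acpi/interrupts/*" prints can also be compared programmatically. The following is only a rough sketch, not part of the original report. It assumes each counter file holds an integer count followed by one or more status words (the layout seen in the attached output may differ between kernel versions), and the function names are made up for illustration.

```python
from pathlib import Path


def parse_counter(text):
    """Split one counter file's contents into (count, status_words).

    Assumption: the first whitespace-separated field is the event count
    and anything after it is status text such as "enabled".
    """
    fields = text.split()
    return int(fields[0]), fields[1:]


def read_counters(directory="/sys/firmware/acpi/interrupts"):
    """Read every counter file in the directory into a dict keyed by file name."""
    root = Path(directory)
    if not root.is_dir():
        return {}
    counters = {}
    for path in root.iterdir():
        try:
            counters[path.name] = parse_counter(path.read_text())
        except (OSError, ValueError, IndexError):
            # Skip unreadable files and files that do not match the assumed layout.
            continue
    return counters


def busiest_gpe(counters):
    """Return the name of the individual GPE with the highest count, or None."""
    gpes = {
        name: count
        for name, (count, _status) in counters.items()
        if name.startswith("gpe") and name != "gpe_all"  # gpe_all is a total, not one GPE
    }
    if not gpes:
        return None
    return max(gpes, key=gpes.get)
```

On an affected machine, `busiest_gpe(read_counters())` would be expected to name gpe00 if the storm matches the one described here.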
(In reply to comment #6)
> Created an attachment (id=21269) [details]
> try the custom DSDT
>
> Will you please try the custom and see whether the issue still exists?
> In the custom DSDT the polarity of THRM_POL will be inverted.
> How to use the custom DSDT can be found in :
> http://www.lesswatts.org/projects/acpi/faq.php
>
> Note: As the DSDT.hex is already attached, the first four steps can be
> skipped.
> Thanks.

I recompiled the kernel, installed it, and then booted the system. I still get the interrupt storm with the patched DSDT - logs attached. The count is lower, but the system wasn't running as long as last time, so this is probably just proportional to the difference in time.

Chris

Created attachment 21279 [details]
kernel log with customized DSDT
Created attachment 21280 [details]
interrupts with custom DSDT, still gpe00
Method (_L00, 0, NotSerialized) { }

This is taken from the acpidump you attached. We can see that nothing is done in the GPE00 handler. So IMO, GPE00 is a no-op for the Linux kernel, i.e. disabling this GPE is harmless. And "echo disable > /sys/firmware/acpi/interrupts/gpe00" is the command to disable GPE00.

Then my questions are:
1. Does this problem exist in every kernel you've tried?
2. Does this happen from the beginning, or is it caused at runtime by some specific actions?

Hi, Rui

While the GPE storm on GPE00 is in progress, the GPE can't be disabled with "echo disable > /sys/firmware/acpi/interrupts/gpe00". And the problem is related to the bogus GPE _L00 method. According to the ICH4 chipset documentation, GPE00 is driven by the THRM signal, and whether GPE00_STS is set is controlled by the THRM_POL bit. In the custom DSDT the polarity of the THRM_POL bit is inverted. But from the log it seems that the problem still exists even when the custom DSDT is used.

Thanks.

(In reply to comment #12)
> then my question is that,
> 1. does this problem exist in every kernel you've tried?

Every 2.6 series kernel.

> 2. does this happen from the beginning, or it's caused at runtime by some
> specific actions?

The system normally starts in a stable state - unless rebooting after the interrupt storm, in which case the storm sometimes continues (I think turning off for a few minutes normally resets everything).

When the system is in a stable state

echo disable > /sys/firmware/acpi/interrupts/gpe00

is effective. I have now added this to the boot.local script, and so far I have had no more interrupt storms.

Normally I can cause it by running some graphically intensive websites (with lots of CSS and Flash on the pages). It seems to be independent of the graphics driver in use with X (I've tried ATI's and the open source radeon driver). I think it's in some way related to CPU load.
It's impossible to give an exact scenario which will initiate the interrupt storm; sometimes it won't happen.

Thanks for your help!

Chris

(In reply to comment #14)
> (In reply to comment #12)
> > then my question is that,
> > 1. does this problem exist in every kernel you've tried?
>
> Every 2.6 series kernel.
>
> > 2. does this happen from the beginning, or it's caused at runtime by some
> > specific actions?
>
> The system normally starts in a stable state - unless rebooting after the
> interrupt storm. In which case the storm sometimes continues (I think
> turning off for a few minutes normally resets everything).
>
> When the system is in a stable state
>
> echo disable > /sys/firmware/acpi/interrupts/gpe00

Right. When the system is in the stable state, GPE00 can be disabled by "echo disable > /sys/firmware/acpi/interrupts/gpe00".

> is effective. I have now added this to the boot.local script, and so far I
> have had no more interrupt storms.
>
> Normally I can cause it by running some graphically intensive websites (with
> lots of CSS and flash on the pages). It seems to be independent of the
> graphics driver in use with X (I've tried ATI's and the open source radeon
> driver). I think it's in some way related to CPU load.

From the acpidump and the ICH4 spec we know that GPE00 is thermal-related. When the CPU temperature rises, the GPE00 interrupt is triggered. But nothing is done in the _L00 method, so the interrupt storm happens.

> It's impossible to give an exact scenario which will initiate the interrupt
> storm, sometimes it won't happen.

According to the ICH4 spec, GPE00_STS can be controlled via the polarity of the THRM_POL bit. But even with the custom DSDT, in which the polarity of the THRM_POL bit is inverted in the _L00 method, the interrupt storm still exists.

In fact IMO this is a BIOS bug (the bogus GPE00 method), and it would be best fixed by upgrading the BIOS.
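
The boot-time workaround discussed in this thread (writing "disable" to the gpe00 file while the system is still in a stable state) could be scripted roughly as follows. This is a hedged sketch, not something posted by the participants. The helper names are hypothetical, the status-word check assumes the counter file reports "disabled" after the write, and writing to the real sysfs file needs root. The rate helper only illustrates how two samples of a counter taken some seconds apart could be turned into an events-per-second figure; what rate counts as a storm is left to the reader.

```python
from pathlib import Path

# Assumption: the same sysfs directory used by the commands in this report.
GPE_DIR = "/sys/firmware/acpi/interrupts"


def rate_per_second(count_before, count_after, seconds):
    """Events per second between two samples of the same counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (count_after - count_before) / seconds


def is_disabled(text):
    """True if the status words after the count include 'disabled' (assumed layout)."""
    return "disabled" in text.split()[1:]


def disable_gpe(name="gpe00", directory=GPE_DIR):
    """Equivalent of: echo disable > <directory>/<name>.

    The trailing newline mirrors what echo writes.
    """
    Path(directory, name).write_text("disable\n")
```

Calling `disable_gpe()` early in boot corresponds to the line Chris added to boot.local; reading the file back and passing its contents to `is_disabled` is one way to check that the write took effect.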
(In reply to comment #15)
> From the ICH4 spec the GPE00_STS can be controlled via the polarity of
> THRM_POL bit. But in the custom DSDT the polarity of THRM_POL bit is
> inverted in the _L00 method, there still exists the interrupt storm.
>
> In fact IMO this is a BIOS bug.(The bogus GPE00 method). And it had better be
> fixed by upgrading BIOS.

Right, but we still need to make sure that this happens on Windows as well. But I don't know how to verify an interrupt storm on Windows. Does anyone have any ideas?

For Windows, run perfmon and add a counter for interrupts/sec?

(In reply to comment #17)
> for windows, run perfmon
> and add a counter for interrupts/sec ?

I reinstalled Windows into Virtual Box and no longer have a non-Virtual Box installation. I've not really had a dual boot machine for about 5 years, so it's not very easy to test this.

Since disabling gpe00 in the boot scripts, I've not encountered this issue again (to date).

If I could get a BIOS update that would be great, but I have no idea where to look. I investigated once before without success. It's a bit difficult to find what you want when you don't know where to look (Arima may have sold part of their business, the OEM I bought through went bust, and the BIOS manufacturer doesn't seem to have an updates website). Anyone, correct me if I'm wrong - you might know other places to look, or understand the .tw website.

If we can prove that Windows figures out how to work properly in the face of this BIOS bug, then it justifies spending the effort to make Linux handle the same bug.

I don't know what "virtual box" is, but if Windows isn't talking to the real hardware, then that isn't interesting.

I'm closing this bug as "documented" at this point, as a workaround is documented that gets you going.

If you can show that Windows works on the hardware, or we run into other systems with the same issue, we can re-open and investigate further. |