Latest working kernel version: 2.6.24.x
Earliest failing kernel version:
Distribution: Ubuntu, but with my own kernel
Hardware Environment: x86, ALi IDE chipset
Software Environment: Ubuntu 8.04

Problem Description:
My Linux system works only with the "acpi=off" parameter on the kernel command line. I have been hitting this bug since 2.6.25-rcX, but did not have enough time to look into it. Below is a brief dump from the console:

kernel BUG at drivers/acpi/osl.c:460!
invalid opcode: 0000 [#1] DEBUG_PAGEALLOC
Modules linked in: af_packet arc4 ecb crypto_blkcipher cryptomgr crypt_algapi rt2500pci rt2x00pci.......
Pid: 0, comm: swapper Not tainted (2.6.35.3-dbg #4)
EIP: ..
EIP is at acpi_os_read_port+0x40/0x4b
ESI: DS:
Process swapper
Call Trace:
 acpi_hw_low_level_read
 acpi_hw_register_read
 acpi_hw_register_write
 acpi_set_register
 acpi_idle_enter_simple
 acpi_set_register
 acpi_idle_enter_bm
 cpuidle_idle_call
 cpuidle_idle_call
 rest_init
...
EIP: acpi_os_read_port+....
...
[end trace ...]

Steps to reproduce:
Unfortunately, this bug seems to occur randomly, but when it does the machine always locks up. (Maybe under higher load?)
Please attach the acpidump output.
I'm sorry, but I have to report that my laptop fell asleep yesterday and has not woken up since, so I can't get an acpidump any more. I would like to stop (or postpone) work on this bug, since I can't help with a dead laptop.
acpi_status acpi_os_read_port(acpi_io_address port, u32 * value, u32 width)
{
	u32 dummy;

	if (!value)
		value = &dummy;

	*value = 0;
	if (width <= 8) {
		*(u8 *) value = inb(port);
	} else if (width <= 16) {
		*(u16 *) value = inw(port);
	} else if (width <= 32) {
		*(u32 *) value = inl(port);
	} else {
		BUG();	/* drivers/acpi/osl.c:460 */
	}

	return AE_OK;
}

Strange, that means we are being called to read an I/O port with a width > 32 bits. acpi_os_read_port() is unchanged since the (working) 2.6.24, so something else above it must have changed.

If the machine comes back to life, in addition to acquiring the acpidump output, it would be good to try "processor.max_cstate=2". If that doesn't work, boot with "idle=poll" and make sure the system is at least sane in that basic configuration. Then try building with CONFIG_CPU_IDLE=n.

Closing as unreproducible for now. If the machine comes back to life, please re-open.

Note that removing both the AC power and the battery can sometimes wake a machine that has permanently fallen asleep.
My laptop has suddenly come back to life, so I can upload the acpidump files. (The cure: unplugging the battery from its slot and plugging it back in once more.)
Created attachment 16588 [details] AML version of acpidump
Created attachment 16589 [details] RAW version of acpidump
As you recommended, I ran the tests, with these results:
processor.max_cstate=2 - lockup occurred after a while
idle=poll - no lockups
without CONFIG_CPU_IDLE - no lockups
Looks like all callers of acpi_os_read_port() pass a static width of 8, 16 or 32, so there should be no call with a width > 32 in normal execution. Perhaps we have a corrupted stack or something like that?
Hi, venki, what's the status of this bug?
So it seems some unknown hardware problem is causing this. That is supported by the fact that I am the only one who has reported this behaviour. I suggest we close this bug if nobody else reports something similar in the near future.
Created attachment 17271 [details] debug patch
Could you perhaps run with this debug patch and see if you get any "ACPI: invalid read port width" messages in dmesg? If yes, please post the full dmesg. I also removed the BUG() so it won't kill your boot anymore, which means you have to check dmesg with grep after boot.
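For readers without access to the attachment, the idea of the debug patch can be sketched roughly as follows. This is not the attached patch itself: it is a user-space approximation in which the fake_in*() helpers, the sketch_read_port_warn() name, the invalid_width_count counter and the exact message format are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t acpi_status;
#define AE_OK 0u

/* Hypothetical stand-ins for the kernel's inb()/inw()/inl(). */
static uint8_t  fake_inb(uint16_t port) { (void)port; return 0x12; }
static uint16_t fake_inw(uint16_t port) { (void)port; return 0x1234; }
static uint32_t fake_inl(uint16_t port) { (void)port; return 0x12345678u; }

/* Hypothetical counter so the caller can tell a warning was logged. */
static unsigned int invalid_width_count;

/*
 * Same width dispatch as acpi_os_read_port(), but an unexpected width
 * is only logged instead of hitting BUG(), so the machine keeps running.
 */
acpi_status sketch_read_port_warn(uint16_t port, uint32_t *value,
				  uint32_t width)
{
	uint32_t dummy;

	if (!value)
		value = &dummy;

	*value = 0;
	if (width <= 8) {
		*value = fake_inb(port);
	} else if (width <= 16) {
		*value = fake_inw(port);
	} else if (width <= 32) {
		*value = fake_inl(port);
	} else {
		/* Message format is assumed, not copied from the patch. */
		fprintf(stderr,
			"ACPI: invalid read port width %u (port 0x%x)\n",
			(unsigned int)width, (unsigned int)port);
		invalid_width_count++;
	}

	return AE_OK;
}
```

With a change of this kind the bad width no longer stops the system, so the only trace of it is the log message, which is why dmesg has to be searched after boot.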
Rejecting this bug as there has been no response from the bug reporter. Michal, please reopen it if you can test the patch in comment #11, and post the results here.
Created attachment 18103 [details] Make acpi_os_{read/write}_port() return error rather than panic if BIOS reports invalid port width
Also make drivers/acpi/processor_throttling.c test the return value after calling these routines. (All the other callers already do this!)
Please could you reopen this bug. We have seen this cause crashes on some of our machines too. The patch I have just posted is based on the one in comment #11, but I have made it return an error code rather than just generate a warning, and ensured that all callers check this code. We have verified that this fixes the bug on our machines.
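The approach described above (return an error for an invalid width, and have the caller test it) might look roughly like the sketch below. It is a user-space illustration rather than the posted patch: the fake_in*() helpers and both sketch_*() functions are hypothetical, the choice of AE_BAD_PARAMETER as the error code is an assumption, and the status values are simplified local definitions.

```c
#include <stdint.h>

typedef uint32_t acpi_status;
#define AE_OK            0u
#define AE_BAD_PARAMETER 0x1001u	/* illustrative local value */
#define ACPI_FAILURE(s)  ((s) != AE_OK)

/* Hypothetical stand-ins for the kernel's inb()/inw()/inl(). */
static uint8_t  fake_inb(uint16_t port) { (void)port; return 0xA5; }
static uint16_t fake_inw(uint16_t port) { (void)port; return 0xA5A5; }
static uint32_t fake_inl(uint16_t port) { (void)port; return 0xA5A5A5A5u; }

/* Width dispatch that reports a bad width to the caller, no BUG(). */
acpi_status sketch_read_port(uint16_t port, uint32_t *value, uint32_t width)
{
	uint32_t dummy;

	if (!value)
		value = &dummy;

	*value = 0;
	if (width <= 8)
		*value = fake_inb(port);
	else if (width <= 16)
		*value = fake_inw(port);
	else if (width <= 32)
		*value = fake_inl(port);
	else
		return AE_BAD_PARAMETER;	/* assumed error code */

	return AE_OK;
}

/*
 * Hypothetical caller in the style of the throttling code: the width
 * comes from firmware-provided data, so the status must be checked
 * before the value is used.
 */
int sketch_read_throttle_status(uint16_t port, uint32_t bit_width,
				uint32_t *out)
{
	uint32_t raw = 0;
	acpi_status status = sketch_read_port(port, &raw, bit_width);

	if (ACPI_FAILURE(status))
		return -1;

	*out = raw;
	return 0;
}
```

Returning a status moves the decision to the caller, which can skip the affected feature instead of taking the whole machine down when the firmware tables carry a width the code does not expect.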