The motherboard is a K7S5A with an AMD Duron processor. As discussed elsewhere, during resume the system faults in acpi_os_wait_semaphore() at a time when interrupts are disabled. When I patch the routine to set timeout to 0 if irqs_disabled(), the same fault occurs in acpi_evaluate_integer() instead. One bright spot is that with 2.6.18-rc1 these faults no longer cause an interrupt storm, as they do in 2.6.17. I will attach dmesg logs of the bootup and the two suspend-to-disk attempts.
Created attachment 8524 [details] dmesg log of bootup
Created attachment 8525 [details] dmesg log of suspend-to-disk and resume
Note: This test was made using 2.6.18-rc1 plus the changes in Greg KH's gregkh-all-2.6.18-rc1 patch plus the patch attached to bug #3469.
Created attachment 8526 [details] dmesg log of suspend-to-disk and resume with timeout set to 0
This test was run with the same kernel as before, plus this patch:

--- usb-2.6.orig/drivers/acpi/osl.c
+++ usb-2.6/drivers/acpi/osl.c
@@ -758,6 +758,9 @@ acpi_status acpi_os_wait_semaphore(acpi_
 	ACPI_DEBUG_PRINT((ACPI_DB_MUTEX, "Waiting for semaphore[%p|%d|%d]\n",
 			  handle, units, timeout));
 
+	if (irqs_disabled())
+		timeout = 0;
+
 	switch (timeout) {
 		/*
 		 * No Wait:
Created attachment 8623 [details] patch vs 2.6.18-rc2
It seems devious to nuke the timeout if irqs are disabled. Perhaps this patch, which simply does a lock_try() up front, is more understandable?
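To make the try-first idea concrete, here is a minimal userspace sketch. It is not the attached patch. It assumes POSIX semaphores stand in for the kernel semaphore, with sem_trywait() playing the role of the lock_try() mentioned above. The names model_wait_semaphore(), MODEL_OK and MODEL_TIMEOUT are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical status codes, standing in for AE_OK / AE_TIME. */
enum model_status { MODEL_OK = 0, MODEL_TIMEOUT = 1 };

/*
 * Userspace model only: try the semaphore without blocking first, and
 * fall back to a (possibly sleeping) timed wait only if that fails.
 */
static enum model_status model_wait_semaphore(sem_t *sem, unsigned int timeout_ms)
{
	struct timespec deadline;

	/* Fast path: never sleeps, so it is safe for a caller that must not block. */
	if (sem_trywait(sem) == 0)
		return MODEL_OK;

	/* Zero timeout: the non-blocking attempt above was the only attempt. */
	if (timeout_ms == 0)
		return MODEL_TIMEOUT;

	/* Slow path: build an absolute deadline and sleep until then. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	while (sem_timedwait(sem, &deadline) != 0) {
		if (errno != EINTR)
			return MODEL_TIMEOUT;	/* ETIMEDOUT or another failure */
	}
	return MODEL_OK;
}
```

Because the non-blocking attempt comes first, a caller that cannot sleep still gets the semaphore whenever it is free, whatever timeout it passed, and the zero-timeout case reduces to reporting failure.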
I can't try your patch right now, but it looks like it should solve the problem involving acpi_os_wait_semaphore. But what about the oops shown in attachment 8526 [details] (comment #3)? Are acpi_pci_link_set() and its subordinates supposed to be able to run with interrupts disabled?
re: comment #3 and comment #5, yes, there is a different patch for acpi_evaluate_integer(). The direct answer to your question is NO, acpi_pci_link_set() is NOT designed to be called with interrupts off -- because it can call the AML interpreter, which needs to allocate. It magically works at boot time due to system_state. Although suspend/resume is analogous to boot time, it was deemed preferable to hack all the invocations that might provoke might_sleep() rather than to deal with it in one place. This route wasn't my 1st choice, can you tell? :-)
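For readers wondering why boot time is "magic", here is a rough userspace model. It assumes the might_sleep() debug check of that era only complains once system_state has reached SYSTEM_RUNNING. The struct and function names are hypothetical and simplified for illustration, and the real check looks at more than these two conditions.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's global state. */
enum model_system_state { MODEL_SYSTEM_BOOTING, MODEL_SYSTEM_RUNNING };

struct model_cpu {
	enum model_system_state system_state;
	bool irqs_disabled;
};

/*
 * Returns true when this model would flag a sleeping call as illegal.
 * Assumption: the check is skipped entirely until the system is RUNNING.
 */
static bool model_might_sleep_warns(const struct model_cpu *cpu)
{
	return cpu->irqs_disabled &&
	       cpu->system_state == MODEL_SYSTEM_RUNNING;
}
```

Under this model, a call made with interrupts off during boot passes silently, while the same call during resume trips the check, because system_state is already RUNNING by then.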
Created attachment 8805 [details] patch vs 2.6.18-rc4
This consolidated patch handles both the acpi_os_wait_semaphore() case and the acpi_evaluate_integer() case.
The attachment in comment #7 works okay. In fact, when I tried it without the hunk for acpi_evaluate_integer() it still worked! Apparently something else has changed in the interim, and as a result that routine is no longer called at a bad time. Leaving out the original change to acpi_os_wait_semaphore() did cause the error to re-appear, so that much _is_ necessary. By the way, I noticed that this patch could go on to simplify the timeout==0 case in the big "switch" statement.
shipped upstream post 2.6.18-rc4. closed.