According to kerneloops.org, the most frequent ACPI-related oops occurs only on Dell laptops: http://www.kerneloops.org/searchweek.php?search=acpi_idle_enter_bm
It seems that this kernel oops is not caused by the cpu idle driver. Instead, it happens because schedule() is called, explicitly or implicitly, while a hardware interrupt or softirq is being serviced. (schedule() may be reached implicitly, for example through msleep() or while trying to obtain a mutex.)

In the kernel every process has its own preempt_count, which indicates whether it is in hardware irq or software irq context.

> BUG: scheduling while atomic: swapper/0/0x00000100
(0x00000100 indicates that this happens while executing a software irq.)

> BUG: scheduling while atomic: swapper/0/0x10010000
(0x10010000 indicates that this happens while executing a hardware interrupt.)

The backtrace reports that the CPU is in the idle driver. The following describes how the oops ends up being reported there, even though the real cause is a call to schedule() from hardware/software irq context:

1. Before entering the deep C-state, the local irq is disabled.
2. The cpu is woken up from the C-state when a hardware interrupt is triggered.
3. The hardware ISR is serviced once the local irq is re-enabled (this is done in acpi_idle_enter_bm/acpi_idle_enter_simple).
4. If the hardware ISR has too much work to do, it may raise a software IRQ. After the hardware ISR finishes, the kernel checks whether a softirq was raised and then executes it (this is done in do_softirq).
5. The hardware ISR or softirq may try to obtain a mutex. If the mutex can't be obtained, schedule() is called implicitly. schedule() then calls schedule_debug() to check whether the task switch is happening in hardware ISR or software IRQ context.
6. If preempt_count does not meet the requirement, the kernel prints the backtrace that is collected at www.kerneloops.org.
Because this happens in interrupt context, the schedule_debug function only prints the backtrace of the stack as it was before the interrupt happened. I will attach a debug patch that prints the stack backtrace of the code that calls schedule() explicitly or implicitly. Thanks.
Created attachment 25171
Debug patch that prints the stack backtrace

Can someone try the debug patch, so that the stack backtrace is printed when schedule() is called explicitly or implicitly in hardware/software irq context? Once the oops is reported again, please attach the output of dmesg. Thanks.
Some of these sightings may be fixed by the patch to the dell-laptop driver, which went upstream in 2.6.34-rc1: https://bugzilla.redhat.com/show_bug.cgi?id=572827
It will also need to be back-ported to distro releases that shipped the dell-laptop input key code before upstream did. Please re-open this bug if the oops is seen with an upstream-based kernel newer than 2.6.34-rc1.