Most recent kernel where this bug did not occur: 2.6.17-rc5
Distribution: FC5
Hardware Environment: ThinkPad X60, 1.83 GHz Intel Core Duo
Software Environment:
Problem Description:
When suspending to RAM via ACPI with the NMI watchdog enabled, the machine crashes with a BUG at arch/i386/kernel/nmi.c:174. The stack backtrace is:

release_evntsel_nmi
stop_apic_nmi_watchdog
on_each_cpu
disable_lapic_nmi_watchdog
lapic_nmi_suspend
sysdev_suspend
device_power_down
suspend_enter
enter_state
state_store
subsys_attr_store
sysfs_write_file
vfs_write
sys_write
sysenter_past_esp

Steps to reproduce:
1. Boot with nmi_watchdog at its default setting.
2. Suspend to RAM, then resume.
3. Suspend to RAM again. The machine crashes during suspend.

Booting with nmi_watchdog=0 on the kernel command line works around the problem.
Created attachment 8251 [details] .config
Created attachment 8252 [details] dmesg output of boot
It appears that patch x86_64-mm-add-performance-counter-reservation-framework-for-up-kernels.patch (dzickus@redhat.com) introduces the BUG_ON in question. This patch is apparently intended to be UP only, though clearly its code gets into SMP kernels...
Er, obviously there are a couple of follow-on patches to make this work for SMP i386.
I see no BUG_ONs in that dmesg output?
No, that's just a normal bootup. The machine locks hard after the crash, so I can't get dmesg output (and netconsole doesn't work during suspend either). I'll attach the screenshots.
Created attachment 8253 [details] Picture of oops message
Can you attach a dmesg output of a normal bootup with the nmi watchdog enabled? The one you attached has it disabled (nmi_watchdog=0). Also could I see the output of 'cat /proc/interrupts |grep NMI' before you suspend and after you resume (the first time). Thanks.
Created attachment 8260 [details] dmesg output before suspend
Created attachment 8261 [details] grep NMI /proc/interrupts before suspend
Created attachment 8262 [details] dmesg output after resume
Created attachment 8263 [details] grep NMI /proc/interrupts after resume
The NMI count on CPU1 is increasing at about 10-20/sec (variable) after resume.
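For anyone trying to confirm the rate quoted above, here is a minimal sketch that samples the NMI row of /proc/interrupts twice and prints a per-CPU rate. It is not part of any patch on this bug; the helper names (parse_nmi_counts, nmi_rates, sample_nmi_rates) and the 5 second interval are made up for illustration, and it assumes the NMI row starts with "NMI:" followed by one count per CPU.

```python
import time


def parse_nmi_counts(interrupts_text):
    """Return the per-CPU NMI counts from /proc/interrupts text ([] if no NMI row)."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                # Stop at any trailing description text on the row.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []


def nmi_rates(before, after, seconds):
    """Return NMIs per second for each CPU between two samples."""
    return [(b - a) / seconds for a, b in zip(before, after)]


def sample_nmi_rates(interval=5.0, path="/proc/interrupts"):
    """Read the NMI row twice, `interval` seconds apart, and return per-CPU rates."""
    with open(path) as f:
        first = parse_nmi_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_nmi_counts(f.read())
    return nmi_rates(first, second, interval)


if __name__ == "__main__":
    for cpu, rate in enumerate(sample_nmi_rates()):
        print(f"CPU{cpu}: {rate:.1f} NMI/s")
```

Running it once before the first suspend and once after resume gives two sets of rates that can be compared directly, instead of eyeballing successive 'grep NMI' outputs.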
Is the problem still there with 2.6.18? We may need to touch_nmi_watchdog or something...
But why is this patch not included in 2.6.18?
I have now compiled 2.6.18, and I can suspend once. After that, the suspend LED keeps blinking and I cannot suspend again.
Sorry, I wrote to the wrong bug :-(
Could anyone please say what the current status of this bug is?
Please reopen this bug if it's still present in kernel 2.6.19.