Bug 24942 - Many NMI, and freeze at one month work.
Status: RESOLVED OBSOLETE
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: All Linux
Importance: P1 high
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-15 20:50 UTC by Nevenchannyy Alexander
Modified: 2012-08-14 14:17 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.36.2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg (49.68 KB, text/x-log)
2010-12-15 21:35 UTC, Nevenchannyy Alexander
ps -ef (15.65 KB, text/plain)
2010-12-15 21:38 UTC, Nevenchannyy Alexander
new today stall (49.72 KB, text/x-log)
2010-12-16 10:03 UTC, Nevenchannyy Alexander
today dmesg Linux node0 2.6.34-gentoo (53.42 KB, text/x-log)
2010-12-16 17:19 UTC, Nevenchannyy Alexander
dmesg from Linux virtualbc 2.6.34-gentoo-r1 (55.14 KB, text/x-log)
2010-12-16 17:33 UTC, Nevenchannyy Alexander
Diagnostic patch to dump out hrtimer functions in effect during an RCU CPU stall warning message. (2.65 KB, patch)
2010-12-16 19:05 UTC, Paul E. McKenney
Updated diagnostic patch for hrtimers. (2.90 KB, patch)
2010-12-16 19:48 UTC, Paul E. McKenney
new traces with Paul 's patch (438.36 KB, text/x-log)
2010-12-18 15:20 UTC, Nevenchannyy Alexander

Description Nevenchannyy Alexander 2010-12-15 20:50:46 UTC
Dec 15 18:06:20 node2 kernel: [  910.304038] INFO: rcu_bh_state detected stall on CPU 27 (t=0 jiffies)
Dec 15 18:06:20 node2 kernel: [  910.304038] sending NMI to all CPUs:
Dec 15 18:06:20 node2 kernel: [  910.304064] NMI backtrace for cpu 1
Dec 15 18:06:20 node2 kernel: [  910.304068] CPU 1
Dec 15 18:06:20 node2 kernel: [  910.304070] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.304101]
Dec 15 18:06:20 node2 kernel: [  910.304104] Pid: 0, comm: kworker/0:0 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.304107] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.304117] RSP: 0018:ffff8804264d9f10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.304119] RAX: 0000000000000000 RBX: ffff8804264d8010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.304121] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.304122] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304124] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304125] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304127] FS:  00007fdb9be3c700(0000) GS:ffff880001a80000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304129] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.304130] CR2: 00007f0b9674f1eb CR3: 0000000825a9b000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.304132] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304134] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.304136] Process kworker/0:0 (pid: 0, threadinfo ffff8804264d8000, task ffff8804264d78d0)
Dec 15 18:06:20 node2 kernel: [  910.304137] Stack:
Dec 15 18:06:20 node2 kernel: [  910.304216]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff8804264d8010
Dec 15 18:06:20 node2 kernel: [  910.304227] <0> ffffffff81001dc0 0000000000000286 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304232] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304238] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.304317]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.304323]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.304324] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.305025] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305025]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305025]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305025] Pid: 0, comm: kworker/0:0 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.305025] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305025]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305025]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.305025]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.305025]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.305025]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.305025]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305025]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305025]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305032] NMI backtrace for cpu 0
Dec 15 18:06:20 node2 kernel: [  910.305032] CPU 0
Dec 15 18:06:20 node2 kernel: [  910.305032] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.305032]
Dec 15 18:06:20 node2 kernel: [  910.305032] Pid: 0, comm: swapper Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.305032] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305032] RSP: 0018:ffffffff81601f50  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.305032] RAX: 0000000000000000 RBX: ffffffff81600010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.305032] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.305032] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305032] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305032] R13: ffffffffffffffff R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305032] FS:  00007f3898fbd700(0000) GS:ffff880001a00000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305032] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.305032] CR2: 00007f8b9ef72950 CR3: 000000082576e000 CR4: 00000000000006f0
Dec 15 18:06:20 node2 kernel: [  910.305032] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305032] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.305032] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8165b020)
Dec 15 18:06:20 node2 kernel: [  910.305032] Stack:
Dec 15 18:06:20 node2 kernel: [  910.305032]  ffffffff8100b62b 0000000000000000 ffffffffffffffff ffffffff81600010
Dec 15 18:06:20 node2 kernel: [  910.305032] <0> ffffffff81001dc0 0000000000000000 0000000000000000 6db6db6db6db6db7
Dec 15 18:06:20 node2 kernel: [  910.305032] <0> ffffffff816b5d0a 0000000000000000 ffffffff816e8560 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305032] Call Trace:
Dec 15 18:06:20 node2 000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.305036] Process kworker/0:1 (pid: 0, threadinfo ffff8804266f0000, task ffff8804266ef950)
Dec 15 18:06:20 node2 kernel: [  910.305036] Stack:
Dec 15 18:06:20 node2 kernel: [  910.305036]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff8804266f0010
Dec 15 18:06:20 node2 kernel: [  910.305036] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305036]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305039] NMI backtrace for cpu 24
Dec 15 18:06:20 node2 kernel: [  910.305039] CPU 24
Dec 15 18:06:20 node2 kernel: [  910.305039] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.305039]
Dec 15 18:06:20 node2 kernel: [  910.305039] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.305039] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305039] RSP: 0018:ffff8804266e1f10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.305039] RAX: 0000000000000000 RBX: ffff8804266e0010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.305039] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.305039] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] FS:  00007f0b97526720(0000) GS:ffff881836200000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.305039] CR2: 00007f0b9752d000 CR3: 0000000001653000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.305039] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.305039] Process kworker/0:1 (pid: 0, threadinfo ffff8804266e0000, task ffff8804266df910)
Dec 15 18:06:20 node2 kernel: [  910.305039] Stack:
Dec 15 18:06:20 node2 kernel: [  910.305039]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff8804266e0010
Dec 15 18:06:20 node2 kernel: [  910.305039] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305039] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.305039] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305039] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.305039] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305039]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305039]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036] NMI backtrace for cpu 21
Dec 15 18:06:20 node2 kernel: [  910.305036] CPU 21
Dec 15 18:06:20 node2 kernel: [  910.305036] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.305036]
Dec 15 18:06:20 node2 kernel: [  910.305036] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.305036] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305036] RSP: 0018:ffff880426655f10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.305036] RAX: 0000000000000000 RBX: ffff880426654010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.305036] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.305036] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] FS:  00007f0b97526720(0000) GS:ffff881436280000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.305036] CR2: 00000000006f22e8 CR3: 0000000001653000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.305036] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.305036] Process kworker/0:1 (pid: 0, threadinfo ffff880426654000, task ffff880426653950)
Dec 15 18:06:20 node2 kernel: [  910.305036] Stack:
Dec 15 18:06:20 node2 kernel: [  910.305036]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff880426654010
Dec 15 18:06:20 node2 kernel: [  910.305036] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305036]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.304038] NMI backtrace for cpu 27
Dec 15 18:06:20 node2 kernel: [  910.304038] CPU 27
Dec 15 18:06:20 node2 kernel: [  910.304038] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.304038]
Dec 15 18:06:20 node2 kernel: [  910.304038] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.304038] RIP: 0010:[<ffffffff8101aa0f>]  [<ffffffff8101aa0f>] default_send_IPI_mask_sequence_phys+0xaf/0xc0
Dec 15 18:06:20 node2 kernel: [  910.304038] RSP: 0018:ffff881836383e08  EFLAGS: 00000046
Dec 15 18:06:20 node2 kernel: [  910.304038] RAX: ffffffff8165f5a0 RBX: 0000000000000002 RCX: 0000000000000020
Dec 15 18:06:20 node2 kernel: [  910.304038] RDX: 0000000000000021 RSI: 0000000000000020 RDI: 0000000000000020
Dec 15 18:06:20 node2 kernel: [  910.304038] RBP: 000000000000d3c0 R08: ffffffff8169f560 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304038] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8169f560
Dec 15 18:06:20 node2 kernel: [  910.304038] R13: 0000000000000400 R14: 0000000000000092 R15: 000000000000001d
Dec 15 18:06:20 node2 kernel: [  910.304038] FS:  00007f7bdd1d9710(0000) GS:ffff881836380000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304038] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.304038] CR2: 00000000010e6e28 CR3: 00000018254b4000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.304038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304038] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.304038] Process kworker/0:1 (pid: 0, threadinfo ffff88042670e000, task ffff88042670d850)
Dec 15 18:06:20 node2 kernel: [  910.304038] Stack:
Dec 15 18:06:20 node2 kernel: [  910.304038]  ffff881836383e18 ffff881800000021 ffff881836383e28 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.304038] <0> ffffffff81667180 ffffffff81667180 0000000000000000 7fffffffffffffff
Dec 15 18:06:20 node2 kernel: [  910.304038] <0> 000000d3f2594319 ffffffff8101acd4 ffff88183638f0c0 ffffffff8108d264
Dec 15 18:06:20 node2 kernel: [  910.304038] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.304038]  <IRQ>
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8101acd4>] ? arch_trigger_all_cpu_backtrace+0x34/0x60
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8108d264>] ? __rcu_pending+0x1c4/0x380
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8108d513>] ? rcu_check_callbacks+0xf3/0x100
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8104b2ef>] ? update_process_times+0x3f/0x70
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81065f98>] ? tick_sched_timer+0x58/0x140
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8105bf18>] ? __run_hrtimer+0x48/0xe0
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8105c23f>] ? hrtimer_interrupt+0xdf/0x260
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81003c0c>] ? call_softirq+0x1c/0x30
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8101a4f5>] ? smp_apic_timer_interrupt+0x65/0xa0
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff810036d3>] ? apic_timer_interrupt+0x13/0x20
Dec 15 18:06:20 node2 kernel: [  910.304038]  <EOI>
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.304038] Code: 89 e8 0f 45 c3 89 04 25 00 c3 5f ff eb 9e 41 56 9d 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 05 15 40 68 00 89 54 24 08 <ff> 90 58 01 00 00 8b 54 24 08 eb bb 0f 1f 44 00 00 41 57 41 56
Dec 15 18:06:20 node2 kernel: [  910.304038] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.304038]  <IRQ>  [<ffffffff8101acd4>] ? arch_trigger_all_cpu_backtrace+0x34/0x60
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8108d264>] ? __rcu_pending+0x1c4/0x380
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8108d513>] ? rcu_check_callbacks+0xf3/0x100
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8104b2ef>] ? update_process_times+0x3f/0x70
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81065f98>] ? tick_sched_timer+0x58/0x140
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8105bf18>] ? __run_hrtimer+0x48/0xe0
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8105c23f>] ? hrtimer_interrupt+0xdf/0x260
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81003c0c>] ? call_softirq+0x1c/0x30
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8101a4f5>] ? smp_apic_timer_interrupt+0x65/0xa0
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff810036d3>] ? apic_timer_interrupt+0x13/0x20
Dec 15 18:06:20 node2 kernel: [  910.304038]  <EOI>  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.304038] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.304038] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.304038]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8101aa0f>] ? default_send_IPI_mask_sequence_phys+0xaf/0xc0
Dec 15 18:06:20 node2 kernel: [  910.304038]  <<EOE>>  <IRQ>  [<ffffffff8101acd4>] ? arch_trigger_all_cpu_backtrace+0x34/0x60
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8108d264>] ? __rcu_pending+0x1c4/0x380
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8108d513>] ? rcu_check_callbacks+0xf3/0x100
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8104b2ef>] ? update_process_times+0x3f/0x70
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81065f98>] ? tick_sched_timer+0x58/0x140
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8105bf18>] ? __run_hrtimer+0x48/0xe0
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8105c23f>] ? hrtimer_interrupt+0xdf/0x260
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81003c0c>] ? call_softirq+0x1c/0x30
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8101a4f5>] ? smp_apic_timer_interrupt+0x65/0xa0
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff810036d3>] ? apic_timer_interrupt+0x13/0x20
Dec 15 18:06:20 node2 kernel: [  910.304038]  <EOI>  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.304038]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305012] NMI backtrace for cpu 26
Dec 15 18:06:20 node2 kernel: [  910.305012] CPU 26
Dec 15 18:06:20 node2 kernel: [  910.305012] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.305012]
Dec 15 18:06:20 node2 kernel: [  910.305012] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.305012] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305012] RSP: 0018:ffff880426701f10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.305012] RAX: 0000000000000000 RBX: ffff880426700010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.305012] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.305012] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305012] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305012] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305012] FS:  00007f0b97526720(0000) GS:ffff881836300000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305012] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.305012] CR2: 000000000088e358 CR3: 0000000001653000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.305012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.305012] Process kworker/0:1 (pid: 0, threadinfo ffff880426700000, task ffff8804266ff990)
Dec 15 18:06:20 node2 kernel: [  910.305012] Stack:
Dec 15 18:06:20 node2 kernel: [  910.305012]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff880426700010
Dec 15 18:06:20 node2 kernel: [  910.305012] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305012] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305012] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305012] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.305012] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305012] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.305012] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305012]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305012]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305012]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305039] NMI backtrace for cpu 23
Dec 15 18:06:20 node2 kernel: [  910.305039] CPU 23
Dec 15 18:06:20 node2 kernel: [  910.305039] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.305039]
Dec 15 18:06:20 node2 kernel: [  910.305039] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.305039] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305039] RSP: 0018:ffff8804266d1f10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.305039] RAX: 0000000000000000 RBX: ffff8804266d0010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.305039] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.305039] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] FS:  00007ff923383700(0000) GS:ffff881436380000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.305039] CR2: 0000000000ec6368 CR3: 0000000001653000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.305039] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.305039] Process kworker/0:1 (pid: 0, threadinfo ffff8804266d0000, task ffff8804266cf8d0)
Dec 15 18:06:20 node2 kernel: [  910.305039] Stack:
Dec 15 18:06:20 node2 kernel: [  910.305039]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff8804266d0010
Dec 15 18:06:20 node2 kernel: [  910.305039] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305039] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305039] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.305039] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305039] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.305039] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305039]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305039]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305039]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036] NMI backtrace for cpu 28
Dec 15 18:06:20 node2 kernel: [  910.305036] CPU 28
Dec 15 18:06:20 node2 kernel: [  910.305036] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.305036]
Dec 15 18:06:20 node2 kernel: [  910.305036] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.305036] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305036] RSP: 0018:ffff88042671ff10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.305036] RAX: 0000000000000000 RBX: ffff88042671e010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.305036] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.305036] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] FS:  00007f6e738f1700(0000) GS:ffff881c36200000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.305036] CR2: 000000000086da80 CR3: 0000001c25d04000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.305036] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.305036] Process kworker/0:1 (pid: 0, threadinfo ffff88042671e000, task ffff88042671d890)
Dec 15 18:06:20 node2 kernel: [  910.305036] Stack:
Dec 15 18:06:20 node2 kernel: [  910.305036]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff88042671e010
Dec 15 18:06:20 node2 kernel: [  910.305036] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.305036] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.305036]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.305036]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.305036]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.310777] NMI backtrace for cpu 29
Dec 15 18:06:20 node2 kernel: [  910.310780] CPU 29
Dec 15 18:06:20 node2 kernel: [  910.310782] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.310810]
Dec 15 18:06:20 node2 kernel: [  910.310814] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.310817] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.310830] RSP: 0018:ffff88042672ff10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.310831] RAX: 0000000000000000 RBX: ffff88042672e010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.310833] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.310834] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.310836] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.310837] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.310839] FS:  00007f0b97526720(0000) GS:ffff881c36280000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.310841] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.310842] CR2: 00007ff0991f7860 CR3: 0000000001653000 CR4: 00000000000006f0
Dec 15 18:06:20 node2 kernel: [  910.310844] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.310846] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.310848] Process kworker/0:1 (pid: 0, threadinfo ffff88042672e000, task ffff88042672d8d0)
Dec 15 18:06:20 node2 kernel: [  910.310849] Stack:
Dec 15 18:06:20 node2 kernel: [  910.310851]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff88042672e010
Dec 15 18:06:20 node2 kernel: [  910.310858] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.310862] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.310868] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.310873]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.310879]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.310881] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.310893] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.310895]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.310898]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.310901] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.310902] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.310903]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.310914]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.310917]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.310920]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.310925]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.310928]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.310930]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.310935]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.310940] NMI backtrace for cpu 30
Dec 15 18:06:20 node2 kernel: [  910.310943] CPU 30
Dec 15 18:06:20 node2 kernel: [  910.310944] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.310975]
Dec 15 18:06:20 node2 kernel: [  910.310979] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.310981] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.310994] RSP: 0018:ffff88042675df10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.310996] RAX: 0000000000000000 RBX: ffff88042675c010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.310998] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.310999] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311001] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311005] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311007] FS:  00007f0b97526720(0000) GS:ffff881c36300000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311011] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.311013] CR2: 0000000000877000 CR3: 0000000001653000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.311024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311025] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.311028] Process kworker/0:1 (pid: 0, threadinfo ffff88042675c000, task ffff88042675b910)
Dec 15 18:06:20 node2 kernel: [  910.311028] Stack:
Dec 15 18:06:20 node2 kernel: [  910.311028]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff88042675c010
Dec 15 18:06:20 node2 kernel: [  910.311028] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311028] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311028] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.311028] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.311028] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.311028] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.311028] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.311028]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.311028]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.311028]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.311024] NMI backtrace for cpu 31
Dec 15 18:06:20 node2 kernel: [  910.311024] CPU 31
Dec 15 18:06:20 node2 kernel: [  910.311024] Modules linked in: dm_round_robin dm_multipath tun kvm_amd kvm rtc_cmos thermal processor i2c_nforce2 rtc_core rtc_lib thermal_sys i2c_core button e1000 fuse nfs auth_rpcgss nfs_acl fscache lockd sunrpc dm_snapshot dm_mod scsi_wait_scan sl811_hcd usbhid ohci_hcd uhci_hcd usb_storage ehci_hcd usbcore qla2xxx scsi_transport_fc scsi_tgt firmware_class mptsas scsi_transport_sas mptscsih mptbase sg ahci libahci sata_nv libata
Dec 15 18:06:20 node2 kernel: [  910.311024]
Dec 15 18:06:20 node2 kernel: [  910.311024] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1 Sun Fire X4600 M2/Sun Fire X4600 M2
Dec 15 18:06:20 node2 kernel: [  910.311024] RIP: 0010:[<ffffffff8100b4f0>]  [<ffffffff8100b4f0>] default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.311024] RSP: 0018:ffff88042676df10  EFLAGS: 00000246
Dec 15 18:06:20 node2 kernel: [  910.311024] RAX: 0000000000000000 RBX: ffff88042676c010 RCX: 00000000c0010055
Dec 15 18:06:20 node2 kernel: [  910.311024] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81784108
Dec 15 18:06:20 node2 kernel: [  910.311024] RBP: ffffffff8169f560 R08: 0000000000000000 R09: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311024] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311024] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311024] FS:  00002af9e4280ce0(0000) GS:ffff881c36380000(0000) knlGS:0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311024] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 15 18:06:20 node2 kernel: [  910.311024] CR2: 0000000000694000 CR3: 0000000001653000 CR4: 00000000000006e0
Dec 15 18:06:20 node2 kernel: [  910.311024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311024] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 15 18:06:20 node2 kernel: [  910.311024] Process kworker/0:1 (pid: 0, threadinfo ffff88042676c000, task ffff88042676b950)
Dec 15 18:06:20 node2 kernel: [  910.311024] Stack:
Dec 15 18:06:20 node2 kernel: [  910.311024]  ffffffff8100b62b 0000000000000000 0000000000000000 ffff88042676c010
Dec 15 18:06:20 node2 kernel: [  910.311024] <0> ffffffff81001dc0 0000000000000000 000000000000054f 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311024] <0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Dec 15 18:06:20 node2 kernel: [  910.311024] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.311024] Code: 44 00 00 0f ae 7a 10 eb c4 66 90 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 13 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 c3 fb eb ec 66
Dec 15 18:06:20 node2 kernel: [  910.311024] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0
Dec 15 18:06:20 node2 kernel: [  910.311024] Pid: 0, comm: kworker/0:1 Not tainted 2.6.36.2 #1
Dec 15 18:06:20 node2 kernel: [  910.311024] Call Trace:
Dec 15 18:06:20 node2 kernel: [  910.311024]  <NMI>  [<ffffffff8101ac81>] ? arch_trigger_all_cpu_backtrace_handler+0x81/0xa0
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff8105d5a6>] ? notifier_call_chain+0x46/0x70
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff8105d6dd>] ? notify_die+0x2d/0x40
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff81004d13>] ? do_nmi+0x163/0x290
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff813f42ba>] ? nmi+0x1a/0x20
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff8100b4f0>] ? default_idle+0x20/0x40
Dec 15 18:06:20 node2 kernel: [  910.311024]  <<EOE>>  [<ffffffff8100b62b>] ? c1e_idle+0x7b/0x100
Dec 15 18:06:20 node2 kernel: [  910.311024]  [<ffffffff81001dc0>] ? cpu_idle+0x50/0xa0

This problem occurs on 2.6.32 and 2.6.36 kernels, and maybe more, but I don't have the opportunity to test others. These are production servers with ~80 KVM VMs.
Comment 1 Nevenchannyy Alexander 2010-12-15 20:51:57 UTC
also 2.6.34
Comment 2 Nevenchannyy Alexander 2010-12-15 20:52:43 UTC
also 2.6.34 & 2.6.35
Comment 3 Andrew Morton 2010-12-15 20:59:08 UTC
Geeze that's a mess.  Can you please readd the trace as an attachment so it doesn't get all wrecked by wordwrapping?
Comment 4 Nevenchannyy Alexander 2010-12-15 21:35:26 UTC
Created attachment 40272 [details]
dmesg
Comment 5 Nevenchannyy Alexander 2010-12-15 21:38:03 UTC
Created attachment 40282 [details]
ps -ef
Comment 6 Andrew Morton 2010-12-15 21:58:30 UTC
Thanks.  I'm having trouble working out where CPU 27 got stuck.  Maybe in rcu_check_callbacks().
Comment 7 Nevenchannyy Alexander 2010-12-16 03:27:44 UTC
I also have some dmesg logs from two other servers running 2.6.34-gentoo and 2.6.32-gentoo-r20. If you are interested, I can send them.
Comment 8 Nevenchannyy Alexander 2010-12-16 10:03:30 UTC
Created attachment 40332 [details]
new today stall
Comment 9 Paul E. McKenney 2010-12-16 16:17:50 UTC
I took a quick look at attachment id=40332 from comment #8 above, and here is what I found:

CPU 0: ???
CPU 1: idle
CPU 2: idle
CPU 3-22: ???
CPU 23: idle
CPU 24: ???
CPU 25: idle
CPU 26: idle
CPU 27: idle
CPU 28: Was running hrtimers, took a scheduler-tick interrupt, detected CPU stall, initiated the stack traces.
CPU 29: rt_worker_func() in IPv4 routing
CPU 30: idle
CPU 31: idle

Are CPUs 0 and 3-22 offline or something?

CPU 28 is flagged as causing the stall.  Is there an extremely heavy timer load on this system?

Attachment id=40272 shows the same thing: CPU 27 was running hrtimers, took a scheduler-tick interrupt, and detected the CPU stall.  In both cases, we get three copies of the stack backtrace, not sure why.

If you are willing to try out a diagnostic patch, one thing to try would be to store the value of the "fn" local variable in __run_hrtimer() in kernel/hrtimer.c into a global per-CPU variable just after the "fn = timer->function;" line -- NULL it out before __run_hrtimer() returns.  Then in print_other_cpu_stall() in kernel/rcutree.c, just after the first printk(), print out the global per-CPU variables.

The value for "fn" for the CPU flagged as causing the stall might provide some clues.  (I can provide the patch if you would prefer, but the edit-debug-test cycle will be quite a bit faster if you do it.)

Also, does this stall happen but once?  If the system is semi-alive, and if the CPU stall persists, you should see similar messages every 30 seconds.  Or does the system hang?
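
A per-CPU summary like the one above could be produced mechanically from a saved dmesg. The following is only a minimal sketch, assuming the log format seen in this report ("NMI backtrace for cpu N" followed by a "RIP:" line); the function name `summarize` and the set of symbols treated as idle are illustrative assumptions, not part of any kernel tool.

```python
import re

# Patterns follow the log lines quoted in this report; other kernels may print differently.
CPU_RE = re.compile(r"NMI backtrace for cpu (\d+)")
RIP_RE = re.compile(r"RIP: \S+\s+\[<[0-9a-f]+>\]\s+(\w+)\+0x")

# Assumption: these idle-loop functions (seen in the traces here) mean "CPU was idle".
IDLE_SYMBOLS = {"default_idle", "c1e_idle", "cpu_idle"}

def summarize(lines, ncpus=32):
    """Map each CPU number to 'idle', its RIP symbol, or '???' if no RIP was logged."""
    rip_by_cpu = {}
    current = None
    for line in lines:
        m = CPU_RE.search(line)
        if m:
            current = int(m.group(1))
            continue
        m = RIP_RE.search(line)
        if m and current is not None:
            # Keep the first RIP seen for each CPU's backtrace section.
            rip_by_cpu.setdefault(current, m.group(1))
    summary = {}
    for cpu in range(ncpus):
        sym = rip_by_cpu.get(cpu)
        if sym is None:
            summary[cpu] = "???"
        elif sym in IDLE_SYMBOLS:
            summary[cpu] = "idle"
        else:
            summary[cpu] = sym
    return summary
```

Calling `summarize(open("dmesg.log"))` (the file name is hypothetical) would give one entry per CPU; anything reported as "???" still needs a manual look at the raw trace.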
Comment 10 Nevenchannyy Alexander 2010-12-16 16:52:34 UTC
All CPUs are online. As a test I compiled a kernel with MAKEOPTS="-j65" while monitoring with htop; all CPUs worked fine at ~100% load.

I have three production servers running Gentoo Linux with different kernels, hosting KVM VMs.
Two nodes with many VMs log 1-2 CPU stall messages per day. After the messages start, a node stays semi-alive for about a month and then freezes (with no information in /var/log/messages). The third server has worked fine for 49 days (with 8 WinXP guests), but it too has now logged one CPU stall message.

These messages are from a fourth server, which was installed yesterday. After 910 seconds of uptime with no load, we got the first CPU stall.

As for the patch: yes, of course, I will build a kernel with it on the third node, which has no business-critical VMs. This problem is very critical for me, because the hangs cause many problems.

I am also currently studying the RCU part of the kernel code, trying to understand what is happening with the servers. Any help is appreciated :)

P.S. Sorry for bad English.
Comment 11 Nevenchannyy Alexander 2010-12-16 17:19:11 UTC
Created attachment 40402 [details]
today dmesg  Linux node0 2.6.34-gentoo
Comment 12 Nevenchannyy Alexander 2010-12-16 17:29:48 UTC
On another server we also see that only 9 CPUs, instead of 32, received the NMI.

betelgeuse ~ # cat ./trace2.log | grep 'NMI '
Dec 16 20:01:34 node0 kernel: [1215640.206461] sending NMI to all CPUs:
Dec 16 20:01:34 node0 kernel: [1215640.206502] NMI backtrace for cpu 1
Dec 16 20:01:34 node0 kernel: [1215640.206985] NMI backtrace for cpu 2
Dec 16 20:01:34 node0 kernel: [1215640.207055] NMI backtrace for cpu 3
Dec 16 20:01:34 node0 kernel: [1215640.207408] NMI backtrace for cpu 26
Dec 16 20:01:34 node0 kernel: [1215640.207469] NMI backtrace for cpu 30
Dec 16 20:01:34 node0 kernel: [1215640.206461] NMI backtrace for cpu 29
Dec 16 20:01:34 node0 kernel: [1215640.207425] NMI backtrace for cpu 27
Dec 16 20:01:34 node0 kernel: [1215640.208927] NMI backtrace for cpu 28
Dec 16 20:01:34 node0 kernel: [1215640.213487] NMI backtrace for cpu 31
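
The same check can be scripted so that the CPUs with no backtrace are listed explicitly. This is a rough sketch, assuming the "NMI backtrace for cpu N" line format shown above and a 32-CPU machine; the helper names `responding_cpus` and `missing_cpus` are made up for illustration.

```python
import re

BACKTRACE_RE = re.compile(r"NMI backtrace for cpu (\d+)")

def responding_cpus(lines):
    """Return the sorted CPU numbers that logged an NMI backtrace."""
    seen = set()
    for line in lines:
        m = BACKTRACE_RE.search(line)
        if m:
            seen.add(int(m.group(1)))
    return sorted(seen)

def missing_cpus(lines, ncpus=32):
    """Return CPU numbers in range(ncpus) with no backtrace in the log."""
    seen = set(responding_cpus(lines))
    return [cpu for cpu in range(ncpus) if cpu not in seen]
```

For the trace above, `responding_cpus` would return the nine CPUs listed and `missing_cpus` the remaining 23.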
Comment 13 Nevenchannyy Alexander 2010-12-16 17:33:30 UTC
Created attachment 40412 [details]
dmesg from Linux virtualbc 2.6.34-gentoo-r1
Comment 14 Paul E. McKenney 2010-12-16 17:33:54 UTC
OK, so patch should be against vanilla 2.6.34, correct?
Comment 15 Nevenchannyy Alexander 2010-12-16 17:35:03 UTC
On another server, only 13 CPUs instead of 32 received the NMI.

betelgeuse ~ # cat ./trace3.log | grep 'NMI '
Dec 14 02:51:30 virtualbc kernel: [4214243.442828] sending NMI to all CPUs:
Dec 14 02:51:30 virtualbc kernel: [4214243.442916] NMI backtrace for cpu 0
Dec 14 02:51:30 virtualbc kernel: [4214243.443857] NMI backtrace for cpu 4
Dec 14 02:51:30 virtualbc kernel: [4214243.444884] NMI backtrace for cpu 1
Dec 14 02:51:30 virtualbc kernel: [4214243.452132] NMI backtrace for cpu 29
Dec 14 02:51:30 virtualbc kernel: [4214243.452344] NMI backtrace for cpu 24
Dec 14 02:51:30 virtualbc kernel: [4214243.452394] NMI backtrace for cpu 23
Dec 14 02:51:30 virtualbc kernel: [4214243.390012] NMI backtrace for cpu 22
Dec 14 02:51:30 virtualbc kernel: [4214243.452594] NMI backtrace for cpu 25
Dec 14 02:51:30 virtualbc kernel: [4214243.452703] NMI backtrace for cpu 28
Dec 14 02:51:30 virtualbc kernel: [4214243.452909] NMI backtrace for cpu 26
Dec 14 02:51:30 virtualbc kernel: [4214243.453003] NMI backtrace for cpu 27
Dec 14 02:51:30 virtualbc kernel: [4214243.453014] NMI backtrace for cpu 30
Dec 14 02:51:30 virtualbc kernel: [4214243.453105] NMI backtrace for cpu 31
Comment 16 Nevenchannyy Alexander 2010-12-16 17:36:52 UTC
(In reply to comment #14)
> OK, so patch should be against vanilla 2.6.34, correct?

No, those are production servers. At the moment I have two test servers with 2.6.36.2. But this is not critical: I have a good knowledge of C, so I can port the patch to any kernel.
Comment 17 Paul E. McKenney 2010-12-16 17:37:38 UTC
And I very much hope that the testing will not be on the machine that produced the dmesg in your comment #13 -- no symbol names for functions, just hexadecimal -- not very helpful...  :-(
Comment 18 Paul E. McKenney 2010-12-16 17:38:09 UTC
OK, 2.6.36 is more convenient for me anyway.
Comment 19 Nevenchannyy Alexander 2010-12-16 17:41:45 UTC
The virtualbc server's kernel currently has no debug info :-(( But I'm sure the symptoms are the same. These are identical servers from Sun/Oracle with Opteron CPUs, and the behavior is identical on all of the servers running Linux :(
Comment 20 Paul E. McKenney 2010-12-16 19:05:42 UTC
Created attachment 40442 [details]
Diagnostic patch to dump out hrtimer functions in effect during an RCU CPU stall warning message.

Diagnostic patch -- compiles against 2.6.36, but is otherwise untested.
Comment 21 Paul E. McKenney 2010-12-16 19:48:01 UTC
Created attachment 40452 [details]
Updated diagnostic patch for hrtimers.

This one should compile with stall-warning enabled.  :-/
Comment 22 Nevenchannyy Alexander 2010-12-18 15:20:34 UTC
Created attachment 40682 [details]
new traces with Paul 's patch
Comment 23 Paul E. McKenney 2010-12-18 23:56:54 UTC
So the offending hrtimer entry was the scheduler tick itself, which indicates that the CPU was idle.

CPU 15 misses some:
0, 1, 4, 5, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 28, 29, 30.

Ten-second pause, then: CPUs 0 and 15 get them all:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31.

95-second pause, then: 
CPU 31: 
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 20, 31.

In all cases, the system is mostly idle.  So I am wondering if the diagnostic is causing the long-term problem with all the NMIs?  I will look for race conditions that could cause spurious stall warnings, and in the meantime I suggest building a kernel with CONFIG_RCU_CPU_STALL_VERBOSE=n.  Though it will take some months to be sure of the hang.
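
The pauses between stall reports can also be read off the kernel timestamps rather than by eye. The sketch below assumes that each stall report contains a "sending NMI to all CPUs:" line carrying a `[seconds.micros]` timestamp, as in the logs quoted earlier in this report; whether the traces from the patched kernel print that line is an assumption, and `stall_gaps` is an illustrative name.

```python
import re

# Kernel timestamp followed by the line that opens each stall report's backtraces.
STALL_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+sending NMI to all CPUs")

def stall_gaps(lines):
    """Return the gaps in seconds between consecutive stall reports."""
    stamps = []
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            stamps.append(float(m.group(1)))
    # Pair each timestamp with its successor and take the difference.
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]
```

Gaps that are not close to a multiple of the stall-warning interval would be one hint that the reports are separate events rather than repeats of one persistent stall.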
Comment 24 Nevenchannyy Alexander 2010-12-19 09:23:20 UTC
I compiled a kernel with CONFIG_RCU_CPU_STALL_VERBOSE=n. So we are waiting for the system to hang? But, as I wrote before, the system hangs without any logs in /var/log/messages :(
Comment 25 Paul E. McKenney 2010-12-19 21:53:39 UTC
No, -you- are waiting for the system to hang.  -I- am looking for why this might be happening.

I might have another diagnostic patch or (even better) a fix, hopefully soon.  But either way, I am afraid that at some point we will need to let at least one of your systems run for at least a month to see if the problem is fixed.
