Hi,

We are experiencing random system crashes on our Ubuntu 11.10 server running an OpenStack / KVM cloud. Unfortunately, we have not observed any specific behavior that triggers the problem, but I have collected several stack traces. The host is a Dell PowerEdge R710 with dual Intel Xeon X5650 CPUs. Please let me know if I can provide more information.

Kind regards,
Bram

Jan 13 17:06:46 cmggcn01 kernel: [876590.142455] general protection fault: 0000 [#1] SMP Jan 13 17:06:46 cmggcn01 kernel: [876590.165966] CPU 4 Jan 13 17:06:46 cmggcn01 kernel: [876590.166135] Modules linked in: ebt_arp ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge stp kvm_intel kvm nbd vesafb ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dcdbas dm_multipath psmouse serio_raw ghes hed acpi_power_meter bonding i7core_edac lp joydev parport edac_core ses enclosure usbhid hid megaraid_sas bnx2 Jan 13 17:06:46 cmggcn01 kernel: [876590.313703] Jan 13 17:06:46 cmggcn01 kernel: [876590.337109] Pid: 93, comm: ksmd Not tainted 3.0.0-14-server #23-Ubuntu Dell Inc.
PowerEdge R710/0MD99X Jan 13 17:06:46 cmggcn01 kernel: [876590.362347] RIP: 0010:[<ffffffffa01b3411>] [<ffffffffa01b3411>] kvm_set_pte_rmapp+0x51/0x130 [kvm] Jan 13 17:06:46 cmggcn01 kernel: [876590.386424] RSP: 0018:ffff8817f4129bc0 EFLAGS: 00010202 Jan 13 17:06:46 cmggcn01 kernel: [876590.411497] RAX: 000088050d943ff8 RBX: 000088050d943ff8 RCX: ffffffffa01b33c0 Jan 13 17:06:46 cmggcn01 kernel: [876590.435831] RDX: ffff8817f4129c88 RSI: 0000000000000000 RDI: 000088050d943ff8 Jan 13 17:06:46 cmggcn01 kernel: [876590.460571] RBP: ffff8817f4129c00 R08: ffff880bf6012960 R09: 0000000000000100 Jan 13 17:06:46 cmggcn01 kernel: [876590.484258] R10: 00000000000000ab R11: 0000000000000002 R12: ffffc900288c3ff8 Jan 13 17:06:46 cmggcn01 kernel: [876590.507470] R13: ffff8817f4129c88 R14: ffff880b917d8000 R15: 00000000002f7279 Jan 13 17:06:46 cmggcn01 kernel: [876590.531508] FS: 0000000000000000(0000) GS:ffff88183fc40000(0000) knlGS:0000000000000000 Jan 13 17:06:46 cmggcn01 kernel: [876590.554747] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Jan 13 17:06:46 cmggcn01 kernel: [876590.577706] CR2: 00007f3ac168a000 CR3: 0000000001c03000 CR4: 00000000000026e0 Jan 13 17:06:46 cmggcn01 kernel: [876590.602479] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 13 17:06:46 cmggcn01 kernel: [876590.624973] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jan 13 17:06:46 cmggcn01 kernel: [876590.647214] Process ksmd (pid: 93, threadinfo ffff8817f4128000, task ffff8817f63fdc80) Jan 13 17:06:46 cmggcn01 kernel: [876590.669444] Stack: Jan 13 17:06:46 cmggcn01 kernel: [876590.690742] ffff8817f4129c10 ffffffffa01b3409 000000000000008f ffff880b6bea70b0 Jan 13 17:06:46 cmggcn01 kernel: [876590.713167] 0000000000000002 00007f54c4050000 ffff880b6bea7000 000000000010d9ff Jan 13 17:06:46 cmggcn01 kernel: [876590.735595] ffff8817f4129c70 ffffffffa01b0dd9 ffff8817f4129c80 ffffffffa01b33c0 Jan 13 17:06:46 cmggcn01 kernel: [876590.758191] Call Trace: Jan 13 
17:06:46 cmggcn01 kernel: [876590.780877] [<ffffffffa01b3409>] ? kvm_set_pte_rmapp+0x49/0x130 [kvm] Jan 13 17:06:46 cmggcn01 kernel: [876590.804088] [<ffffffffa01b0dd9>] kvm_handle_hva+0x99/0x180 [kvm] Jan 13 17:06:46 cmggcn01 kernel: [876590.829628] [<ffffffffa01b33c0>] ? rmap_write_protect+0x150/0x150 [kvm] Jan 13 17:06:46 cmggcn01 kernel: [876590.853510] [<ffffffffa01b7ad1>] kvm_set_spte_hva+0x21/0x30 [kvm] Jan 13 17:06:46 cmggcn01 kernel: [876590.876966] [<ffffffffa019591d>] kvm_mmu_notifier_change_pte+0x5d/0x90 [kvm] Jan 13 17:06:46 cmggcn01 kernel: [876590.901395] [<ffffffff8114d87e>] __mmu_notifier_change_pte+0x3e/0x80 Jan 13 17:06:46 cmggcn01 kernel: [876590.925191] [<ffffffff8114e07f>] write_protect_page+0x10f/0x170 Jan 13 17:06:46 cmggcn01 kernel: [876590.950156] [<ffffffff8114e2df>] ? replace_page+0x1ff/0x280 Jan 13 17:06:46 cmggcn01 kernel: [876590.974177] [<ffffffff8114e3e0>] try_to_merge_one_page+0x80/0x220 Jan 13 17:06:46 cmggcn01 kernel: [876590.999032] [<ffffffff8114e5f7>] try_to_merge_with_ksm_page+0x77/0xc0 Jan 13 17:06:46 cmggcn01 kernel: [876591.022944] [<ffffffff8114f616>] cmp_and_merge_page+0xe6/0x260 Jan 13 17:06:46 cmggcn01 kernel: [876591.046993] [<ffffffff8114f83f>] ksm_scan_thread+0xaf/0x2a0 Jan 13 17:06:46 cmggcn01 kernel: [876591.070807] [<ffffffff81081660>] ? add_wait_queue+0x60/0x60 Jan 13 17:06:46 cmggcn01 kernel: [876591.094345] [<ffffffff8114f790>] ? cmp_and_merge_page+0x260/0x260 Jan 13 17:06:46 cmggcn01 kernel: [876591.118363] [<ffffffff81080bbc>] kthread+0x8c/0xa0 Jan 13 17:06:46 cmggcn01 kernel: [876591.147851] [<ffffffff81609164>] kernel_thread_helper+0x4/0x10 Jan 13 17:06:46 cmggcn01 kernel: [876591.178092] [<ffffffff81080b30>] ? flush_kthread_worker+0xa0/0xa0 Jan 13 17:06:46 cmggcn01 kernel: [876591.203396] [<ffffffff81609160>] ? 
gs_change+0x13/0x13 Jan 13 17:06:46 cmggcn01 kernel: [876591.226913] Code: 0f 85 e8 00 00 00 48 89 f8 66 66 66 90 49 8b 3c 24 49 89 c7 31 f6 49 c1 e7 12 49 c1 ef 1e e8 f7 fd ff ff 48 85 c0 48 89 c3 74 77 Jan 13 17:06:46 cmggcn01 kernel: [876591.299573] RIP [<ffffffffa01b3411>] kvm_set_pte_rmapp+0x51/0x130 [kvm] Jan 13 17:06:46 cmggcn01 kernel: [876591.323386] RSP <ffff8817f4129bc0> Jan 13 17:06:46 cmggcn01 kernel: [876591.394334] ---[ end trace a6f88f15bc3d2aa0 ]--- Jan 13 17:07:46 cmggcn01 kernel: [876651.245759] INFO: rcu_sched_state detected stall on CPU 14 (t=15000 jiffies) Jan 13 17:07:46 cmggcn01 kernel: [876651.249812] INFO: rcu_sched_state detected stalls on CPUs/tasks: { 14} (detected by 22, t=15002 jiffies) Jan 16 12:03:42 cmggcn01 kernel: [232444.624348] general protection fault: 0000 [#1] SMP Jan 16 12:03:42 cmggcn01 kernel: [232444.624791] CPU 14 Jan 16 12:03:42 cmggcn01 kernel: [232444.624971] Modules linked in: ebt_arp ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge stp kvm_intel kvm nbd ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vesafb bonding psmouse ghes acpi_power_meter dcdbas dm_multipath joydev hed i7core_edac serio_raw edac_core lp parport ses enclosure usbhid hid megaraid_sas bnx2 Jan 16 12:03:42 cmggcn01 kernel: [232444.630415] Jan 16 12:03:42 cmggcn01 kernel: [232444.630541] Pid: 92, comm: ksmd Not tainted 3.0.0-14-server #23-Ubuntu Dell Inc. 
PowerEdge R710/0MD99X Jan 16 12:03:42 cmggcn01 kernel: [232444.631356] RIP: 0010:[<ffffffff8114ecee>] [<ffffffff8114ecee>] remove_rmap_item_from_tree+0x9e/0x150 Jan 16 12:03:42 cmggcn01 kernel: [232444.632148] RSP: 0018:ffff8817f3e1fe10 EFLAGS: 00010286 Jan 16 12:03:42 cmggcn01 kernel: [232444.632589] RAX: ffff8817bbee8c30 RBX: ffff880bf5847fc0 RCX: ffff880bf56f8463 Jan 16 12:03:42 cmggcn01 kernel: [232444.633188] RDX: 0000880ba0f4d030 RSI: 0000000000020072 RDI: ffffea0028d70730 Jan 16 12:03:42 cmggcn01 kernel: [232444.633785] RBP: ffff8817f3e1fe30 R08: ffffea0028d70738 R09: ffff880c3fff6928 Jan 16 12:03:42 cmggcn01 kernel: [232444.634383] R10: 00000000000000b7 R11: ffffea0028e44180 R12: ffff880bf56f8460 Jan 16 12:03:42 cmggcn01 kernel: [232444.634980] R13: ffffea0028d70730 R14: ffff8817f3e1fe98 R15: ffff8817f3e20000 Jan 16 12:03:42 cmggcn01 kernel: [232444.635406] FS: 0000000000000000(0000) GS:ffff88183fce0000(0000) knlGS:0000000000000000 Jan 16 12:03:42 cmggcn01 kernel: [232444.635858] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Jan 16 12:03:42 cmggcn01 kernel: [232444.636340] CR2: 0000000001c1be08 CR3: 0000000001c03000 CR4: 00000000000026e0 Jan 16 12:03:42 cmggcn01 kernel: [232444.636939] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 16 12:03:42 cmggcn01 kernel: [232444.637535] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jan 16 12:03:42 cmggcn01 kernel: [232444.638134] Process ksmd (pid: 92, threadinfo ffff8817f3e1e000, task ffff8817f3e20000) Jan 16 12:03:42 cmggcn01 kernel: [232444.638949] Stack: Jan 16 12:03:42 cmggcn01 kernel: [232444.639208] ffff880bf60e4980 ffff8817f3e20000 ffffea0022feb118 ffff880bf5847fc0 Jan 16 12:03:42 cmggcn01 kernel: [232444.640235] ffff8817f3e1fe70 ffffffff8114f55a ffff8817f3e1fe60 0000000000000000 Jan 16 12:03:42 cmggcn01 kernel: [232444.640953] ffff8817f3e20000 000000000000005b ffff8817f3e20000 ffff8817f3e1fe98 Jan 16 12:03:42 cmggcn01 kernel: [232444.641622] Call Trace: Jan 16 
12:03:42 cmggcn01 kernel: [232444.641828] [<ffffffff8114f55a>] cmp_and_merge_page+0x2a/0x260 Jan 16 12:03:42 cmggcn01 kernel: [232444.642325] [<ffffffff8114f83f>] ksm_scan_thread+0xaf/0x2a0 Jan 16 12:03:42 cmggcn01 kernel: [232444.642796] [<ffffffff81081660>] ? add_wait_queue+0x60/0x60 Jan 16 12:03:42 cmggcn01 kernel: [232444.643269] [<ffffffff8114f790>] ? cmp_and_merge_page+0x260/0x260 Jan 16 12:03:42 cmggcn01 kernel: [232444.643785] [<ffffffff81080bbc>] kthread+0x8c/0xa0 Jan 16 12:03:42 cmggcn01 kernel: [232444.644198] [<ffffffff81609164>] kernel_thread_helper+0x4/0x10 Jan 16 12:03:42 cmggcn01 kernel: [232444.644690] [<ffffffff81080b30>] ? flush_kthread_worker+0xa0/0xa0 Jan 16 12:03:42 cmggcn01 kernel: [232444.645207] [<ffffffff81609160>] ? gs_change+0x13/0x13 Jan 16 12:03:42 cmggcn01 kernel: [232444.645642] Code: 28 4c 89 e7 e8 84 fc ff ff 48 85 c0 49 89 c5 74 d2 f0 0f ba 28 00 19 c0 85 c0 0f 85 a1 00 00 00 48 8b 43 30 48 8b 53 38 48 85 c0 Jan 16 12:03:42 cmggcn01 kernel: [232444.648047] RIP [<ffffffff8114ecee>] remove_rmap_item_from_tree+0x9e/0x150 Jan 16 12:03:42 cmggcn01 kernel: [232444.648721] RSP <ffff8817f3e1fe10> Jan 16 12:03:42 cmggcn01 kernel: [232444.695308] ---[ end trace 1466b29b5c8949e3 ]--- Jan 18 11:52:42 cmggcn01 kernel: [89218.740228] general protection fault: 0000 [#1] SMP Jan 18 11:52:42 cmggcn01 kernel: [89218.740662] CPU 3 Jan 18 11:52:42 cmggcn01 kernel: [89218.740823] Modules linked in: ebt_arp ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge stp kvm_intel kvm nbd vesafb ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi psmouse dcdbas dm_multipath serio_raw joydev ghes hed acpi_power_meter bonding lp parport i7core_edac edac_core ses enclosure usbhid hid megaraid_sas bnx2 Jan 18 11:52:42 
cmggcn01 kernel: [89218.745746] Jan 18 11:52:42 cmggcn01 kernel: [89218.745869] Pid: 92, comm: ksmd Not tainted 3.0.0-14-server #23-Ubuntu Dell Inc. PowerEdge R710/0MD99X Jan 18 11:52:42 cmggcn01 kernel: [89218.746672] RIP: 0010:[<ffffffff812edf23>] [<ffffffff812edf23>] rb_insert_color+0x43/0x150 Jan 18 11:52:42 cmggcn01 kernel: [89218.747377] RSP: 0018:ffff8817f3e4ddb8 EFLAGS: 00010206 Jan 18 11:52:42 cmggcn01 kernel: [89218.748058] RAX: ffff88135bb7ffe8 RBX: ffff88135bb7ffe8 RCX: 0000000000000008 Jan 18 11:52:42 cmggcn01 kernel: [89218.748982] RDX: ffff8809af4934e8 RSI: ffffffff81ee9840 RDI: ffff88098e40d9e8 Jan 18 11:52:42 cmggcn01 kernel: [89218.749617] RBP: ffff8817f3e4dde0 R08: 0000000000000079 R09: 0000000000000029 Jan 18 11:52:42 cmggcn01 kernel: [89218.750208] R10: ffff880aac5be000 R11: 0000000000000001 R12: ffff8809af4934e8 Jan 18 11:52:42 cmggcn01 kernel: [89218.750794] R13: 00008809ae4a5ae8 R14: ffff88098e40d9e8 R15: ffffffff81ee9840 Jan 18 11:52:42 cmggcn01 kernel: [89218.751386] FS: 0000000000000000(0000) GS:ffff880c3fc20000(0000) knlGS:0000000000000000 Jan 18 11:52:42 cmggcn01 kernel: [89218.752054] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Jan 18 11:52:42 cmggcn01 kernel: [89218.752526] CR2: 00007f819ff52000 CR3: 0000000001c03000 CR4: 00000000000026e0 Jan 18 11:52:42 cmggcn01 kernel: [89218.753116] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 18 11:52:42 cmggcn01 kernel: [89218.753708] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jan 18 11:52:42 cmggcn01 kernel: [89218.754296] Process ksmd (pid: 92, threadinfo ffff8817f3e4c000, task ffff8817f63a9720) Jan 18 11:52:42 cmggcn01 kernel: [89218.754956] Stack: Jan 18 11:52:42 cmggcn01 kernel: [89218.755120] ffffea00255b4190 ffff88098e40d9c0 ffffea00264a9c80 ffff8809af4934f0 Jan 18 11:52:42 cmggcn01 kernel: [89218.755785] 0000000000000000 ffff8817f3e4de30 ffffffff8114e855 ffff8809af4934e8 Jan 18 11:52:42 cmggcn01 kernel: [89218.756276] ffff8817f3e4de48 
ffff8817f3e4de30 ffffea0052a42d28 ffffea00255b4190 Jan 18 11:52:42 cmggcn01 kernel: [89218.756711] Call Trace: Jan 18 11:52:42 cmggcn01 kernel: [89218.756892] [<ffffffff8114e855>] unstable_tree_search_insert+0xe5/0x150 Jan 18 11:52:42 cmggcn01 kernel: [89218.757499] [<ffffffff8114f690>] cmp_and_merge_page+0x160/0x260 Jan 18 11:52:42 cmggcn01 kernel: [89218.758284] [<ffffffff8114f83f>] ksm_scan_thread+0xaf/0x2a0 Jan 18 11:52:42 cmggcn01 kernel: [89218.759010] [<ffffffff81081660>] ? add_wait_queue+0x60/0x60 Jan 18 11:52:42 cmggcn01 kernel: [89218.759524] [<ffffffff8114f790>] ? cmp_and_merge_page+0x260/0x260 Jan 18 11:52:42 cmggcn01 kernel: [89218.760037] [<ffffffff81080bbc>] kthread+0x8c/0xa0 Jan 18 11:52:42 cmggcn01 kernel: [89218.760438] [<ffffffff81609164>] kernel_thread_helper+0x4/0x10 Jan 18 11:52:42 cmggcn01 kernel: [89218.760927] [<ffffffff81080b30>] ? flush_kthread_worker+0xa0/0xa0 Jan 18 11:52:42 cmggcn01 kernel: [89218.761439] [<ffffffff81609160>] ? gs_change+0x13/0x13 Jan 18 11:52:42 cmggcn01 kernel: [89218.761862] Code: 0f 1f 84 00 00 00 00 00 49 83 e4 fc 74 4a 49 8b 04 24 a8 01 75 42 48 89 c3 48 83 e3 fc 4c 8b 6b 10 4d 39 e5 74 7a 4d 85 ed 74 45 Jan 18 11:52:42 cmggcn01 kernel: [89218.764260] RIP [<ffffffff812edf23>] rb_insert_color+0x43/0x150 Jan 18 11:52:42 cmggcn01 kernel: [89218.764772] RSP <ffff8817f3e4ddb8> Jan 18 11:52:42 cmggcn01 kernel: [89218.811669] ---[ end trace e5a6d0f7fdfec15f ]--- Jan 27 13:41:28 cmggcn01 kernel: [871350.761867] general protection fault: 0000 [#2] SMP Jan 27 13:41:28 cmggcn01 kernel: [871350.790117] CPU 14 Jan 27 13:41:28 cmggcn01 kernel: [871350.790387] Modules linked in: btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs reiserfs ebt_arp ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge stp 
kvm_intel kvm nbd vesafb ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi psmouse dcdbas dm_multipath serio_raw joydev ghes hed acpi_power_meter bonding lp parport i7core_edac edac_core ses enclosure usbhid hid megaraid_sas bnx2 Jan 27 13:41:28 cmggcn01 kernel: [871351.005151] Jan 27 13:41:28 cmggcn01 kernel: [871351.036187] Pid: 90, comm: kswapd0 Tainted: G D 3.0.0-14-server #23-Ubuntu Dell Inc. PowerEdge R710/0MD99X Jan 27 13:41:28 cmggcn01 kernel: [871351.072809] RIP: 0010:[<ffffffffa01a5890>] [<ffffffffa01a5890>] kvm_unmap_rmapp+0x20/0x60 [kvm] Jan 27 13:41:28 cmggcn01 kernel: [871351.105190] RSP: 0018:ffff8817f3e27a60 EFLAGS: 00010202 Jan 27 13:41:28 cmggcn01 kernel: [871351.141329] RAX: 00008817f5d067f8 RBX: ffffc9001fd41ff8 RCX: ffffffffa01a58d0 Jan 27 13:41:28 cmggcn01 kernel: [871351.179076] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00008817f5d067f8 Jan 27 13:41:28 cmggcn01 kernel: [871351.212086] RBP: ffff8817f3e27a80 R08: ffff8817f315b3e0 R09: 0000000000000100 Jan 27 13:41:28 cmggcn01 kernel: [871351.245788] R10: 000000000000000e R11: 0000000000000002 R12: ffff8817f2f0c000 Jan 27 13:41:28 cmggcn01 kernel: [871351.277514] R13: 0000000000000000 R14: ffff880be235e000 R15: 00000000000d3cff Jan 27 13:41:28 cmggcn01 kernel: [871351.308421] FS: 0000000000000000(0000) GS:ffff88183fce0000(0000) knlGS:0000000000000000 Jan 27 13:41:28 cmggcn01 kernel: [871351.339685] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Jan 27 13:41:28 cmggcn01 kernel: [871351.370089] CR2: 00007f8836442000 CR3: 0000000001c03000 CR4: 00000000000026e0 Jan 27 13:41:28 cmggcn01 kernel: [871351.399771] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 27 13:41:28 cmggcn01 kernel: [871351.428208] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jan 27 13:41:28 cmggcn01 kernel: [871351.456153] Process kswapd0 (pid: 90, threadinfo ffff8817f3e26000, task ffff8817f63ac560) Jan 27 13:41:28 
cmggcn01 kernel: [871351.484943] Stack: Jan 27 13:41:28 cmggcn01 kernel: [871351.512984] 0000000000000000 ffffc9001fd41ff8 0000000000000001 00007f834a87e000 Jan 27 13:41:28 cmggcn01 kernel: [871351.542025] ffff8817f3e27aa0 ffffffffa01a5945 ffff880be235e060 0000000000000001 Jan 27 13:41:28 cmggcn01 kernel: [871351.571050] ffff8817f3e27b10 ffffffffa01a1dd9 ffff8817f3e27ae0 ffffffffa01a58d0 Jan 27 13:41:28 cmggcn01 kernel: [871351.600455] Call Trace: Jan 27 13:41:28 cmggcn01 kernel: [871351.628903] [<ffffffffa01a5945>] kvm_age_rmapp+0x75/0x90 [kvm] Jan 27 13:41:28 cmggcn01 kernel: [871351.659242] [<ffffffffa01a1dd9>] kvm_handle_hva+0x99/0x180 [kvm] Jan 27 13:41:28 cmggcn01 kernel: [871351.687386] [<ffffffffa01a58d0>] ? kvm_unmap_rmapp+0x60/0x60 [kvm] Jan 27 13:41:28 cmggcn01 kernel: [871351.716053] [<ffffffffa01a8af7>] kvm_age_hva+0x17/0x20 [kvm] Jan 27 13:41:28 cmggcn01 kernel: [871351.746652] [<ffffffffa018a4dd>] kvm_mmu_notifier_clear_flush_young+0x4d/0x90 [kvm] Jan 27 13:41:28 cmggcn01 kernel: [871351.774490] [<ffffffff8114d7b8>] __mmu_notifier_clear_flush_young+0x48/0x60 Jan 27 13:41:28 cmggcn01 kernel: [871351.801948] [<ffffffff81138f1b>] page_referenced_one+0x18b/0x1f0 Jan 27 13:41:28 cmggcn01 kernel: [871351.827654] [<ffffffff8113a8a5>] page_referenced_anon+0xd5/0x130 Jan 27 13:41:28 cmggcn01 kernel: [871351.852371] [<ffffffff8113a9c8>] page_referenced+0xc8/0xf0 Jan 27 13:41:28 cmggcn01 kernel: [871351.875896] [<ffffffff8111cbe9>] shrink_active_list.isra.50+0x1d9/0x370 Jan 27 13:41:28 cmggcn01 kernel: [871351.899681] [<ffffffff811146cd>] ? throttle_vm_writeout+0x3d/0xa0 Jan 27 13:41:28 cmggcn01 kernel: [871351.922135] [<ffffffff8111dd8b>] balance_pgdat+0x16b/0x6f0 Jan 27 13:41:28 cmggcn01 kernel: [871351.944598] [<ffffffff8111e3fa>] kswapd+0xea/0x1f0 Jan 27 13:41:28 cmggcn01 kernel: [871351.967441] [<ffffffff8111e310>] ? 
balance_pgdat+0x6f0/0x6f0 Jan 27 13:41:28 cmggcn01 kernel: [871351.989825] [<ffffffff81080bbc>] kthread+0x8c/0xa0 Jan 27 13:41:28 cmggcn01 kernel: [871352.012498] [<ffffffff81609164>] kernel_thread_helper+0x4/0x10 Jan 27 13:41:28 cmggcn01 kernel: [871352.035112] [<ffffffff81080b30>] ? flush_kthread_worker+0xa0/0xa0 Jan 27 13:41:28 cmggcn01 kernel: [871352.057481] [<ffffffff81609160>] ? gs_change+0x13/0x13 Jan 27 13:41:28 cmggcn01 kernel: [871352.080582] Code: e7 d0 e8 e0 66 90 e9 a2 fe ff ff 55 48 89 e5 41 55 41 54 53 48 83 ec 08 66 66 66 66 90 45 31 ed 49 89 fc 48 89 f3 eb 20 0f 1f 00 <f6> 00 01 74 35 48 8b 15 74 7a 02 00 48 89 c6 4c 89 e7 41 bd 01 Jan 27 13:41:28 cmggcn01 kernel: [871352.128618] RIP [<ffffffffa01a5890>] kvm_unmap_rmapp+0x20/0x60 [kvm] Jan 27 13:41:28 cmggcn01 kernel: [871352.152218] RSP <ffff8817f3e27a60> Jan 27 13:41:28 cmggcn01 kernel: [871352.221414] ---[ end trace e5a6d0f7fdfec160 ]--- Jan 27 13:42:28 cmggcn01 kernel: [871412.095485] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies) Jan 27 13:42:28 cmggcn01 kernel: [871412.095493] INFO: rcu_sched_state detected stall on CPU 17 (t=15000 jiffies) Jan 27 13:45:29 cmggcn01 kernel: [871591.757675] INFO: rcu_sched_state detected stall on CPU 17 (t=60030 jiffies) Jan 27 13:45:29 cmggcn01 kernel: [871591.757686] INFO: rcu_sched_state detected stall on CPU 0 (t=60030 jiffies) Jan 27 13:48:29 cmggcn01 kernel: [871771.419867] INFO: rcu_sched_state detected stall on CPU 0 (t=105060 jiffies) Jan 27 13:48:29 cmggcn01 kernel: [871771.419878] INFO: rcu_sched_state detected stall on CPU 17 (t=105060 jiffies)
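For anyone decoding these traces by hand: on the "Code:" line of an oops, the byte printed in angle brackets is the first byte of the instruction at RIP. Below is a minimal Python sketch that splits such a line into bytes and reports where the marker sits. The helper name `parse_code_line` is made up for illustration and is not part of any kernel tooling. The first three "Code:" lines above contain 43 bytes and no marker, so they appear to stop just before the faulting instruction; only the Jan 27 line carries the marker.

```python
import re


def parse_code_line(line):
    """Return (code_bytes, fault_offset) for an oops "Code:" line.

    fault_offset is the index of the byte printed as <xx> (the first byte
    of the instruction at RIP), or None when the marker is missing.
    """
    match = re.search(r"Code:\s*(.*)", line)
    if match is None:
        raise ValueError("no 'Code:' field found")
    code = bytearray()
    fault_offset = None
    for token in match.group(1).split():
        if token.startswith("<") and token.endswith(">"):
            fault_offset = len(code)  # marker byte sits at this index
            token = token[1:-1]
        code.append(int(token, 16))
    return bytes(code), fault_offset


# "Code:" fields copied from the Jan 27 and Jan 13 traces in this report.
JAN27_CODE = (
    "Code: e7 d0 e8 e0 66 90 e9 a2 fe ff ff 55 48 89 e5 41 55 41 54 53 "
    "48 83 ec 08 66 66 66 66 90 45 31 ed 49 89 fc 48 89 f3 eb 20 0f 1f 00 "
    "<f6> 00 01 74 35 48 8b 15 74 7a 02 00 48 89 c6 4c 89 e7 41 bd 01"
)
JAN13_CODE = (
    "Code: 0f 85 e8 00 00 00 48 89 f8 66 66 66 90 49 8b 3c 24 49 89 c7 "
    "31 f6 49 c1 e7 12 49 c1 ef 1e e8 f7 fd ff ff 48 85 c0 48 89 c3 74 77"
)

for name, line in (("Jan 27", JAN27_CODE), ("Jan 13", JAN13_CODE)):
    code, fault = parse_code_line(line)
    where = "no marker" if fault is None else f"marker at offset {fault:#x}"
    print(f"{name}: {len(code)} bytes, {where}")
```

The resulting bytes can then be written to a file and disassembled with objdump, or the whole oops can be fed to scripts/decodecode from the kernel source tree.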
> Jan 27 13:41:28 cmggcn01 kernel: [871350.761867] general protection fault: > 0000 [#2] SMP > Jan 27 13:41:28 cmggcn01 kernel: [871350.790117] CPU 14 > Jan 27 13:41:28 cmggcn01 kernel: [871350.790387] Modules linked in: btrfs > zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs > reiserfs ebt_arp ebt_ip 8021q garp ip6table_filter ip6_tables ebtable_nat > ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 > xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp > iptable_filter ip_tables x_tables bridge stp kvm_intel kvm nbd vesafb ib_iser > rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp > libiscsi scsi_transport_iscsi psmouse dcdbas dm_multipath serio_raw joydev > ghes hed acpi_power_meter bonding lp parport i7core_edac edac_core ses > enclosure usbhid hid megaraid_sas bnx2 > Jan 27 13:41:28 cmggcn01 kernel: [871351.005151] > Jan 27 13:41:28 cmggcn01 kernel: [871351.036187] Pid: 90, comm: kswapd0 > Tainted: G D 3.0.0-14-server #23-Ubuntu Dell Inc. 
PowerEdge > R710/0MD99X > Jan 27 13:41:28 cmggcn01 kernel: [871351.072809] RIP: > 0010:[<ffffffffa01a5890>] [<ffffffffa01a5890>] kvm_unmap_rmapp+0x20/0x60 > [kvm] > Jan 27 13:41:28 cmggcn01 kernel: [871351.105190] RSP: 0018:ffff8817f3e27a60 > EFLAGS: 00010202 > Jan 27 13:41:28 cmggcn01 kernel: [871351.141329] RAX: 00008817f5d067f8 RBX: > ffffc9001fd41ff8 RCX: ffffffffa01a58d0 > Jan 27 13:41:28 cmggcn01 kernel: [871351.179076] RDX: 0000000000000000 RSI: > 0000000000000000 RDI: 00008817f5d067f8 > Jan 27 13:41:28 cmggcn01 kernel: [871351.212086] RBP: ffff8817f3e27a80 R08: > ffff8817f315b3e0 R09: 0000000000000100 > Jan 27 13:41:28 cmggcn01 kernel: [871351.245788] R10: 000000000000000e R11: > 0000000000000002 R12: ffff8817f2f0c000 > Jan 27 13:41:28 cmggcn01 kernel: [871351.277514] R13: 0000000000000000 R14: > ffff880be235e000 R15: 00000000000d3cff > Jan 27 13:41:28 cmggcn01 kernel: [871351.308421] FS: 0000000000000000(0000) > GS:ffff88183fce0000(0000) knlGS:0000000000000000 > Jan 27 13:41:28 cmggcn01 kernel: [871351.339685] CS: 0010 DS: 0000 ES: 0000 > CR0: 000000008005003b > Jan 27 13:41:28 cmggcn01 kernel: [871351.370089] CR2: 00007f8836442000 CR3: > 0000000001c03000 CR4: 00000000000026e0 > Jan 27 13:41:28 cmggcn01 kernel: [871351.399771] DR0: 0000000000000000 DR1: > 0000000000000000 DR2: 0000000000000000 > Jan 27 13:41:28 cmggcn01 kernel: [871351.428208] DR3: 0000000000000000 DR6: > 00000000ffff0ff0 DR7: 0000000000000400 > Jan 27 13:41:28 cmggcn01 kernel: [871351.456153] Process kswapd0 (pid: 90, > threadinfo ffff8817f3e26000, task ffff8817f63ac560) > Jan 27 13:41:28 cmggcn01 kernel: [871351.484943] Stack: > Jan 27 13:41:28 cmggcn01 kernel: [871351.512984] 0000000000000000 > ffffc9001fd41ff8 0000000000000001 00007f834a87e000 > Jan 27 13:41:28 cmggcn01 kernel: [871351.542025] ffff8817f3e27aa0 > ffffffffa01a5945 ffff880be235e060 0000000000000001 > Jan 27 13:41:28 cmggcn01 kernel: [871351.571050] ffff8817f3e27b10 > ffffffffa01a1dd9 ffff8817f3e27ae0 ffffffffa01a58d0 
<snip>

> Jan 27 13:41:28 cmggcn01 kernel: [871352.080582] Code: e7 d0 e8 e0 66 90 e9
> a2 fe ff ff 55 48 89 e5 41 55 41 54 53 48 83 ec 08 66 66 66 66 90 45 31 ed 49
> 89 fc 48 89 f3 eb 20 0f 1f 00 <f6> 00 01 74 35 48 8b 15 74 7a 02 00 48 89 c6
> 4c 89 e7 41 bd 01

   0:   e8 e0 66 90 e9          callq  0xffffffffe99066e5
   5:   a2 fe ff ff 55 48 89    mov    %al,0x41e5894855fffffe
   c:   e5 41
   e:   55                      push   %rbp
   f:   41 54                   push   %r12
  11:   53                      push   %rbx
  12:   48 83 ec 08             sub    $0x8,%rsp
  16:   66 66 66 66 90          data32 data32 data32 xchg %ax,%ax
  1b:   45 31 ed                xor    %r13d,%r13d
  1e:   49 89 fc                mov    %rdi,%r12
  21:   48 89 f3                mov    %rsi,%rbx
  24:   eb 20                   jmp    0x46
  26:   0f 1f 00                nopl   (%rax)
  29:   f6 00 01                testb  $0x1,(%rax)

        ^ dies here, %rax is non-canonical.

  2c:   74 35                   je     0x63
  2e:   48 8b 15 74 7a 02 00    mov    0x27a74(%rip),%rdx        # 0x27aa9
  35:   48 89 c6                mov    %rax,%rsi
  38:   4c 89 e7                mov    %r12,%rdi

static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
                           unsigned long data)
{
        u64 *spte;
        int need_tlb_flush = 0;

        while ((spte = rmap_next(kvm, rmapp, NULL))) {
                BUG_ON(!(*spte & PT_PRESENT_MASK));

                ^ here, when fetching *spte.

                rmap_printk("kvm_rmap_unmap_hva: spte %p %llx\n", spte, *spte);
                drop_spte(kvm, spte);
                need_tlb_flush = 1;
        }
        return need_tlb_flush;
}

Looks like a use-after-free with the two bytes at offset 6 zeroed. If this is reproducible, please rerun with the host kernel parameter slub_debug=FZPU.
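To make the "non-canonical" and "two bytes at offset 6 zeroed" remarks concrete, here is a small Python sketch. It assumes 48-bit virtual addresses (4-level paging), where a canonical address has bits 63..47 all equal. The register values are copied from the four traces in this report; the helper names are illustrative only, and the script does not prove that each listed register is the one that actually faulted.

```python
CANONICAL_BITS = 48  # assumption: x86-64 with 48-bit virtual addresses


def is_canonical(addr):
    """True if bits 63..47 of a 64-bit address are all zeros or all ones."""
    top = addr >> (CANONICAL_BITS - 1)          # the upper 17 bits
    all_ones = (1 << (64 - CANONICAL_BITS + 1)) - 1
    return top == 0 or top == all_ones


def restore_top_bytes(addr):
    """Set bytes 6 and 7 (little-endian offsets) back to 0xff."""
    return addr | (0xFFFF << 48)


# Suspicious register values taken from the oopses in this report.
SUSPECTS = {
    "Jan 13 RAX": 0x000088050D943FF8,
    "Jan 16 RDX": 0x0000880BA0F4D030,
    "Jan 18 R13": 0x00008809AE4A5AE8,
    "Jan 27 RAX": 0x00008817F5D067F8,
}

for name, value in SUSPECTS.items():
    fixed = restore_top_bytes(value)
    print(f"{name}: {value:#018x} canonical={is_canonical(value)} "
          f"-> {fixed:#018x} canonical={is_canonical(fixed)}")
```

Each of these values becomes a canonical ffff88... kernel address once the top two bytes are set back to 0xff, which is consistent with a pointer whose upper 16 bits were overwritten with zeros.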
Are you using bridge and netfilter? Can you disable both? This looks similar to https://bugzilla.kernel.org/show_bug.cgi?id=27052
(In reply to comment #2)
> Are you using bridge and netfilter? Can you disable both? This looks similar
> to
> https://bugzilla.kernel.org/show_bug.cgi?id=27052

Indeed, I'm running both. Both are required to run VMs in the OpenStack configuration, so simply disabling them would break my cloud config, I guess. Are there any alternatives?

In the meantime I have upgraded to the 3.2.2 kernel to see whether the problem persists. I will reboot with "slub_debug=FZPU" on the next crash.

lsmod:

Module                     Size  Used by
des_generic               21415  0
md4                       12595  0
nls_utf8                  12557  1
cifs                     281484  2
ebt_arp                   12585  108
ebt_ip                    12538  36
8021q                     24151  0
garp                      14313  1 8021q
ip6table_filter           12815  0
ip6_tables                27617  1 ip6table_filter
ebtable_nat               12807  1
ebtables                  30966  1 ebtable_nat
ipt_MASQUERADE            12759  3
xt_state                  12578  25
ipt_REJECT                12576  2
xt_CHECKSUM               12549  1
iptable_mangle            12695  1
xt_tcpudp                 12603  61
iptable_nat               13182  1
nf_nat                    25545  2 ipt_MASQUERADE,iptable_nat
nf_conntrack_ipv4         19588  28 iptable_nat,nf_nat
nf_conntrack              81527  5 ipt_MASQUERADE,xt_state,iptable_nat,nf_nat,nf_conntrack_ipv4
nf_defrag_ipv4            12729  1 nf_conntrack_ipv4
iptable_filter            12810  1
kvm_intel                136560  61
ip_tables                 27227  3 iptable_mangle,iptable_nat,iptable_filter
x_tables                  29727  14 ebt_arp,ebt_ip,ip6table_filter,ip6_tables,ebtables,ipt_MASQUERADE,xt_state,ipt_REJECT,xt_CHECKSUM,iptable_mangle,xt_tcpudp,iptable_nat,iptable_filter,ip_tables
bridge                    90674  0
kvm                      404475  1 kvm_intel
stp                       12931  2 garp,bridge
nbd                       17712  0
ib_iser                   38366  0
rdma_cm                   43625  1 ib_iser
ib_cm                     47663  1 rdma_cm
iw_cm                     18705  1 rdma_cm
ib_sa                     28854  2 rdma_cm,ib_cm
ib_mad                    47570  2 ib_cm,ib_sa
ib_core                   82371  6 ib_iser,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad
ib_addr                   14109  1 rdma_cm
iscsi_tcp                 18447  0
libiscsi_tcp              20862  1 iscsi_tcp
libiscsi                  57321  3 ib_iser,iscsi_tcp,libiscsi_tcp
scsi_transport_iscsi      53383  4 ib_iser,iscsi_tcp,libiscsi
ext2                      73217  1
bonding                  108597  0
psmouse                   73859  0
dcdbas                    14438  0
serio_raw                 13211  0
joydev                    17597  0
i7core_edac               27864  0
edac_core                 53411  4 i7core_edac
lp                        17789  0
dm_multipath              23141  0
parport                   46360  1 lp
mac_hid                   13205  0
acpi_power_meter          18139  0
ses                       17385  0
enclosure                 15209  1 ses
usbhid                    46754  0
hid                       99171  1 usbhid
megaraid_sas              87049  2
bnx2                      85274  0
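Before waiting for the next crash, it may be worth confirming that the slub_debug parameter actually reached the kernel after the reboot. The following is a minimal Python sketch that pulls the flag letters out of a kernel command line string. The function name and the sample command line are hypothetical; on the real host the string would come from /proc/cmdline. The flag descriptions follow the kernel's SLUB documentation, and the bare "slub_debug" form without a value is deliberately not handled here.

```python
# Meaning of the slub_debug flag letters requested in this thread.
SLUB_DEBUG_FLAGS = {
    "F": "sanity checks",
    "Z": "red zoning",
    "P": "poisoning (object and padding)",
    "U": "user tracking (free and alloc)",
}


def slub_debug_flags(cmdline):
    """Return the flag letters of a slub_debug=... argument, or None.

    Only the "slub_debug=<flags>[,<slab>]" form is parsed.
    """
    for arg in cmdline.split():
        if arg.startswith("slub_debug="):
            value = arg.split("=", 1)[1]
            return value.split(",", 1)[0]  # drop an optional slab name
    return None


# Hypothetical command line, for illustration only.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro slub_debug=FZPU"
flags = slub_debug_flags(sample)
if flags is None:
    print("slub_debug is not set")
else:
    for letter in flags:
        print(letter, "=", SLUB_DEBUG_FLAGS.get(letter, "not described here"))
```

On Ubuntu the parameter is usually added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by update-grub and a reboot. Expect noticeably higher memory use and some slowdown while these checks are enabled.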
Given the silence since January, I assume this is fixed.