Created attachment 186851 [details] dmesg output after starting the VM

With kernel 4.2, starting one of my VMs instantly freezes the host system and triggers Machine Check Exceptions on the CPUs dedicated to that particular VM:

[12316.171917] mce: [Hardware Error]: CPU 3: Machine Check Exception: 5 Bank 17: be2000000003110a
[12316.171917] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff813217fd> {intel_idle+0xbd/0x120}
[12316.171917] mce: [Hardware Error]: TSC 76fd7352bf6 ADDR fa137140 MISC 30f0083884509086
[12316.171917] mce: [Hardware Error]: PROCESSOR 0:306f2 TIME 1441130705 SOCKET 0 APIC 6 microcode 2d
[12316.171917] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
...

A bisection revealed that commit fd717f11015f673487ffc826e59b2bad69d20fe5 introduced the problem:

KVM: x86: apply guest MTRR virtualization on host reserved pages

Currently guest MTRR is avoided if kvm_is_reserved_pfn returns true.
However, the guest could prefer a different page type than UC for such pages. A good example is that pass-throughed VGA frame buffer is not always UC as host expected.

This patch enables full use of virtual guest MTRRs.

One could argue that the following warning is an obvious hint

[12311.584431] pmd_set_huge: Cannot satisfy [mem 0x383fe0000000-0x383fe0200000] with a huge-page mapping due to MTRR override.

but I'm able to run another VM without problems despite that warning.

Please let me know if you need additional information.
Created attachment 186861 [details] VM configuration of the VM causing the freeze
Created attachment 186871 [details] VM configuration of the VM that is still working
mcelog: Family 6 Model 3f CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 17
MISC 4f0083884501086 ADDR fa000200
TIME 1441663568 Tue Sep 8 00:06:08 2015
MCG status:
MCi status:
Uncorrected error
Error enabled
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: corrected filtering (some unreported errors in same region)
Generic CACHE Level-2 Generic Error
STATUS be2000000003110a MCGSTATUS 0
MCGCAP 7000c16 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 63
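For anyone comparing STATUS values between reports, here is a rough sketch of how the architectural part of that value breaks down. It is not taken from mcelog: the function name `decode_mci_status` and the `FLAGS` table are made up for illustration, and the bit positions are the architectural IA32_MCi_STATUS flags as I understand them from the Intel SDM. Only the architectural bits are decoded; the model-specific code is left as a raw number.

```python
# Illustrative only: decode the architectural bits of an IA32_MCi_STATUS value.
# FLAGS and decode_mci_status are hypothetical names, not part of mcelog.
FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MCi_MISC register valid
    58: "ADDRV",  # MCi_ADDR register valid
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status):
    """Return the set flags plus the two 16-bit error code fields."""
    flags = [name for bit, name in sorted(FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_code": status & 0xFFFF,                    # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


decoded = decode_mci_status(0xBE2000000003110A)
print(decoded["flags"])
print(hex(decoded["mca_code"]), hex(decoded["model_specific_code"]))
```

The flags that come out (UC, EN, MISCV, ADDRV, PCC) line up with the "MCi status" lines printed by mcelog above, and the same STATUS value appears in the original dmesg output.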
This seems to be related to my problem. Please try limiting the freezing VM to 2 cores or fewer.