Booting a 2.6.39-rc7 kernel with KVM like this results in an Oops:

    kvm -kernel arch/x86/boot/bzImage -smp 2 -cpu phenom

Doesn't oops if either:
 * "-smp 2" is removed
 * "-cpu phenom" is removed (or -cpu qemu64 is used)
 * older guest kernel is used (2.6.38 worked)
 * I boot the kernel on bare metal on a Phenom CPU

Bisection tells me 2.6.39-rc1 worked too, and this regression got introduced by:

5bbc097d890409d8eff4e3f1d26f11a9d6b7c07e is the first bad commit
commit 5bbc097d890409d8eff4e3f1d26f11a9d6b7c07e
Author: Joerg Roedel <joerg.roedel@amd.com>
Date:   Fri Apr 15 14:47:40 2011 +0200

    x86, amd: Disable GartTlbWlkErr when BIOS forgets it

    This patch disables GartTlbWlk errors on AMD Fam10h CPUs if
    the BIOS forgets to do is (or is just too old). Letting
    these errors enabled can cause a sync-flood on the CPU
    causing a reboot.

    The AMD BKDG recommends disabling GART TLB Wlk Error completely.

    This patch is the fix for

        https://bugzilla.kernel.org/show_bug.cgi?id=33012

    on my machine.

    Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
    Link: http://lkml.kernel.org/r/20110415131152.GJ18463@8bytes.org
    Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
    Cc: <stable@kernel.org>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

:040000 040000 1d4588e5725523f455d5972189d4c934cc4c68dc 834f7cd8c7435c82240619df705f83030df844fa M	arch

Here is the oops:

[    0.015863] Initializing cgroup subsys blkio
[    0.016743] general protection fault: 0000 [#1] PREEMPT SMP
[    0.017451] last sysfs file:
[    0.017783] CPU 0
[    0.017982] Modules linked in:
[    0.018390]
[    0.018562] Pid: 0, comm: swapper Not tainted 2.6.39-rc7-phenom #108 Bochs Bochs
[    0.019377] RIP: 0010:[<ffffffff814f82b0>] [<ffffffff814f82b0>] init_amd+0x397/0x3b7
[    0.019998] RSP: 0000:ffffffff81a01ee8 EFLAGS: 00010246
[    0.019998] RAX: 0000000000000001 RBX: 0000000042004200 RCX: 00000000c0010048
[    0.019998] RDX: 0000000000000021 RSI: ffffffff81aafe00 RDI: ffffffff8160229c
[    0.019998] RBP: ffffffff81a01f18 R08: 0000000000000023 R09: 0000000000000000
[    0.019998] R10: 0000000000000001 R11: 0000000000000070 R12: ffffffff81aafe00
[    0.019998] R13: ffff880007ffb8c0 R14: ffffffffffffffff R15: 0000000000000000
[    0.019998] FS: 0000000000000000(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000
[    0.019998] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.019998] CR2: 0000000000000000 CR3: 0000000001a23000 CR4: 00000000000006b0
[    0.019998] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.019998] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    0.019998] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a2b020)
[    0.019998] Stack:
[    0.019998] 0000001000000001 0000000000000000 ffffffff81a01f18 ffffffff814f7411
[    0.019998] 0000000000726f73 ffffffff81aafe00 ffffffff81a01f38 ffffffff814f7617
[    0.019998] ffffffff81afd5a0 ffffffff81b01980 ffffffff81a01f48 ffffffff81ad0615
[    0.019998] Call Trace:
[    0.019998] [<ffffffff814f7411>] ? get_cpu_cap+0xb2/0xb6
[    0.019998] [<ffffffff814f7617>] identify_cpu+0x202/0x2f4
[    0.019998] [<ffffffff81ad0615>] identify_boot_cpu+0x10/0x30
[    0.019998] [<ffffffff81ad081b>] check_bugs+0x9/0x2d
[    0.019998] [<ffffffff81ac9b50>] start_kernel+0x353/0x36a
[    0.019998] [<ffffffff81ac9322>] x86_64_start_reservations+0x132/0x136
[    0.019998] [<ffffffff81ac9416>] x86_64_start_kernel+0xf0/0xf7
[    0.019998] Code: ff 41 80 3c 24 0e 76 17 48 c7 c7 90 22 60 81 e8 27 85 b1 ff 84 c0 75 07 f0 41 80 4c 24 30 02 41 80 3c 24 10 75 1c b9 48 00 01 c0 <0f> 32 89 c0 48 c1 e2 20 80 cc 04 48 09 d0 48 89 c2 48 c1 ea 20
[    0.019998] RIP [<ffffffff814f82b0>] init_amd+0x397/0x3b7
[    0.019998] RSP <ffffffff81a01ee8>
[    0.020004] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.020491] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.021211] Pid: 0, comm: swapper Tainted: G D 2.6.39-rc7-phenom #108
[    0.023334] Call Trace:
[    0.023648] [<ffffffff81500988>] panic+0x9b/0x1a2
[    0.024220] [<ffffffff8104bd8e>] do_exit+0x75e/0x890
[    0.024773] [<ffffffff810498d4>] ? kmsg_dump+0xc4/0x100
[    0.025348] [<ffffffff81005fdc>] oops_end+0x9c/0xe0
[    0.025856] [<ffffffff81006173>] die+0x53/0x80
[    0.026363] [<ffffffff81003384>] do_general_protection+0x154/0x160
[    0.026670] [<ffffffff8150a7df>] general_protection+0x1f/0x30
[    0.027342] [<ffffffff814f82b0>] ? init_amd+0x397/0x3b7
[    0.027863] [<ffffffff814f8299>] ? init_amd+0x380/0x3b7
[    0.028324] [<ffffffff814f7411>] ? get_cpu_cap+0xb2/0xb6
[    0.028792] [<ffffffff814f7617>] identify_cpu+0x202/0x2f4
[    0.029268] [<ffffffff81ad0615>] identify_boot_cpu+0x10/0x30
[    0.030002] [<ffffffff81ad081b>] check_bugs+0x9/0x2d
[    0.030513] [<ffffffff81ac9b50>] start_kernel+0x353/0x36a
[    0.031055] [<ffffffff81ac9322>] x86_64_start_reservations+0x132/0x136
[    0.031626] [<ffffffff81ac9416>] x86_64_start_kernel+0xf0/0xf7

KVM version is:

QEMU emulator version 0.14.0 (qemu-kvm-0.14.0 Debian 0.14.0+dfsg-1~tls), Copyright (c) 2003-2008 Fabrice Bellard

Host kernel is:

Linux debian 2.6.39-rc7-phenom #69 SMP PREEMPT Thu May 12 20:07:27 EEST 2011 x86_64 GNU/Linux

Host CPU is:

processor       : 5
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 10
model name      : AMD Phenom(tm) II X6 1090T Processor
stepping        : 0
cpu MHz         : 3200.000
cache size      : 512 KB
physical id     : 0
siblings        : 6
core id         : 5
cpu cores       : 6
apicid          : 5
initial apicid  : 5
fpu             : yes
fpu_exception   : yes
cpuid level     : 6
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb npt lbrv svm_lock nrip_save pausefilter
bogomips        : 6431.10
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate [9]
Host kernel says:

[23723.632616] kvm: 23803: cpu3 unhandled rdmsr: 0xc0010048
[23723.632620] kvm: 23803: cpu3 unhandled wrmsr: 0xc0010048 data fff00000401

It looks like KVM doesn't support that MSR, and KVM injects a GP fault into the guest if the guest tries to write to an unknown MSR:

        if (svm_set_msr(&svm->vcpu, ecx, data)) {
                trace_kvm_msr_write_ex(ecx, data);
                kvm_inject_gp(&svm->vcpu, 0);

I think that either KVM should be taught about that MSR, or the code writing to it should check whether it is running under KVM (what about Xen/VMware/etc.?) and skip the write if so.
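For anyone reading the host log above, here is a small user-space sketch that decodes the MSR address and the written value. It assumes, based on the kernel's msr-index.h, that the MCi_MASK MSRs start at 0xc0010044, and, based on the bisected commit, that GartTlbWlkErr is bit 10 of the bank 4 mask. The helper names are made up for illustration and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption (from the kernel's msr-index.h): address of MC0_MASK. */
#define MC0_MASK_MSR 0xc0010044u

/* Assumption (from the bisected commit): GartTlbWlkErr is bit 10. */
#define GART_TLB_WLK_ERR_BIT 10

/* Hypothetical helper: MCA bank number that a mask MSR address refers to. */
static unsigned int mc_mask_bank(uint32_t msr)
{
    return msr - MC0_MASK_MSR;
}

/* Hypothetical helper: true if the GartTlbWlkErr bit is set in the mask. */
static bool gart_tlb_wlk_masked(uint64_t mask)
{
    return (mask >> GART_TLB_WLK_ERR_BIT) & 1;
}
```

Under those assumptions, mc_mask_bank(0xc0010048) gives bank 4, and gart_tlb_wlk_masked(0xfff00000401) is true, which is consistent with the guest trying to set the GartTlbWlkErr mask bit in the write that KVM rejected.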
First-Bad-Commit : 5bbc097d890409d8eff4e3f1d26f11a9d6b7c07e
A patch referencing this bug report has been merged in v3.0-rc1:

commit d47cc0db8fd6011de2248df505fc34990b7451bf
Author: Roedel, Joerg <Joerg.Roedel@amd.com>
Date:   Thu May 19 11:13:39 2011 +0200

    x86, amd: Use _safe() msr access for GartTlbWlk disable code
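To illustrate the idea behind a "_safe()" access, here is a user-space model, not the actual patch. It assumes that a safe accessor reports a faulting MSR access as a nonzero error code instead of letting the fault kill the caller; -EIO is used here as an assumed value. struct fake_cpu and all function names below are hypothetical stand-ins, not kernel APIs.

```c
#include <errno.h>
#include <stdint.h>

#define MC4_MASK_MSR 0xc0010048u

/* Hypothetical model of a (virtual) CPU that may not implement the MSR. */
struct fake_cpu {
    int implements_mc4_mask;   /* 0 models a hypervisor rejecting the MSR */
    uint64_t mc4_mask;
};

/* Model of a safe read: returns an error instead of faulting. */
static int model_rdmsr_safe(struct fake_cpu *cpu, uint32_t msr, uint64_t *val)
{
    if (msr != MC4_MASK_MSR || !cpu->implements_mc4_mask)
        return -EIO;
    *val = cpu->mc4_mask;
    return 0;
}

/* Model of a safe write: returns an error instead of faulting. */
static int model_wrmsr_safe(struct fake_cpu *cpu, uint32_t msr, uint64_t val)
{
    if (msr != MC4_MASK_MSR || !cpu->implements_mc4_mask)
        return -EIO;
    cpu->mc4_mask = val;
    return 0;
}

/* Set bit 10 (GartTlbWlkErr) only if the MSR can actually be read. */
static int disable_gart_tlb_wlk_err(struct fake_cpu *cpu)
{
    uint64_t mask;
    int err = model_rdmsr_safe(cpu, MC4_MASK_MSR, &mask);

    if (err)
        return err;            /* MSR not available: skip quietly */
    mask |= 1ULL << 10;
    return model_wrmsr_safe(cpu, MC4_MASK_MSR, mask);
}
```

In this model a guest whose hypervisor does not implement the MSR gets an error back and carries on, while a CPU that does implement it ends up with bit 10 set. This avoids having to special-case KVM, Xen, VMware and so on in the CPU setup code.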
Confirmed the fix.