Bug 16961

Summary: kernel BUG at arch/x86/kvm/../../../virt/kvm/kvm_main.c:1978
Product: Virtualization
Reporter: Maciej Rutecki (maciej.rutecki)
Component: kvm
Assignee: Avi Kivity (avi)
Status: CLOSED CODE_FIX
Severity: normal
CC: avi, florian, maciej.rutecki, rjw, sergey.senozhatsky
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.36-rc1-git2
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 16444

Description Maciej Rutecki 2010-08-24 19:04:36 UTC
Subject    : kernel BUG at arch/x86/kvm/../../../virt/kvm/kvm_main.c:1978
Submitter  : Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date       : 2010-08-19 9:54
Message-ID : 20100819095429.GA5201@swordfish.minsk.epam.com
References : http://marc.info/?l=linux-kernel&m=128221169606214&w=2

This entry is being used for tracking a regression from 2.6.35. Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2010-08-29 22:20:23 UTC
Handled-By : Avi Kivity <avi@redhat.com>
Comment 2 Florian Mickler 2010-08-30 13:43:35 UTC
Reported to be still present in 2.6.36-rc3 with backtrace:

http://lkml.org/lkml/2010/8/30/80
Comment 3 Rafael J. Wysocki 2010-08-30 17:42:27 UTC
On Monday, August 30, 2010, Sergey Senozhatsky wrote:
> On (08/30/10 00:36), Rafael J. Wysocki wrote:
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=16961
> > Subject     : kernel BUG at arch/x86/kvm/../../../virt/kvm/kvm_main.c:1978
> > Submitter   : Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > Date                : 2010-08-19 9:54 (11 days old)
> > Message-ID  : <20100819095429.GA5201@swordfish.minsk.epam.com>
> > References  : http://marc.info/?l=linux-kernel&m=128221169606214&w=2
> > Handled-By  : Avi Kivity <avi@redhat.com>
> > 
> 
> Hello,
> .36-rc3
> 
> [ 2913.218767] kvm: disabling virtualization on CPU1
> [ 2913.219078] CPU 1 is now offline
> [ 2913.221758] lockdep: fixing up alternatives.
> [ 2913.221814] Booting Node 0 Processor 1 APIC 0x1
> [ 2913.363980] ------------[ cut here ]------------
> [ 2913.364042] kernel BUG at arch/x86/kvm/../../../virt/kvm/kvm_main.c:1978!
Comment 4 Avi Kivity 2010-08-31 08:19:37 UTC
[  313.487258] kvm: enabling virtualization on CPU3
[  313.487326] NMI watchdog enabled, takes one hw-pmu counter.
[  313.489627] coretemp coretemp.3: TjMax is 105 C.
[  315.344223] lockdep: fixing up alternatives.
[  315.344236] Booting Node 0 Processor 2 APIC 0x4
[  315.487292] ------------[ cut here ]------------
[  315.487322] kernel BUG at arch/x86/kvm/../../../virt/kvm/kvm_main.c:1978!
[  315.487352] invalid opcode: 0000 [#1] PREEMPT SMP 
[  315.487388] last sysfs file: /sys/devices/system/cpu/cpu2/online
[  315.487415] CPU 2 
[  315.487425] Modules linked in: kvm_intel kvm ipv6 snd_seq_dummy ac battery snd_seq_oss snd_seq_midi_event snd_hwdep snd_seq snd_seq_device wmi usbhid hid snd_hda_codec_atihdmi radeon button snd_hda_codec_realtek snd_pcm_oss snd_mixer_oss snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc broadcom tg3 libphy psmouse serio_raw evdev ttm drm_kms_helper ehci_hcd sr_mod usbcore cdrom sd_mod ahci libahci
[  315.487728] 
[  315.487739] Pid: 27687, comm: qemu-kvm Not tainted 2.6.36-rc1-dbg-git2-00264-gd5a1964-dirty #134 Aspire 5741G    /Aspire 5741G
[  315.487787] RIP: 0010:[<ffffffffa02f2446>]  [<ffffffffa02f2446>] kvm_handle_fault_on_reboot+0xf/0x11 [kvm]
[  315.487839] RSP: 0000:ffff88013c333b18  EFLAGS: 00010246
[  315.487863] RAX: ffff88013c333b40 RBX: ffff88012dcb0000 RCX: ffff88010c7e9000
[  315.487893] RDX: ffff880002280000 RSI: ffff8801563e8728 RDI: ffff88010c7e9000
[  315.487922] RBP: ffff88013c333b18 R08: ffff880002213cd0 R09: 00000000000003c7
[  315.487952] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000002
[  315.487982] R13: ffff88010c7e9000 R14: ffff8801563e8000 R15: 0000000000000000
[  315.488012] FS:  00007f15014be710(0000) GS:ffff880002280000(0000) knlGS:0000000000000000
[  315.488046] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  315.488070] CR2: 0000000000000000 CR3: 0000000154b42000 CR4: 00000000000006e0
[  315.488100] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  315.488130] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  315.488160] Process qemu-kvm (pid: 27687, threadinfo ffff88013c332000, task ffff8801563e8000)
[  315.488194] Stack:
[  315.488205]  ffff88013c333b68 ffffffffa026d0a2 ffff88013c333b58 ffffffff81062e91
[  315.488244] <0> ffff8801563e8000 000000010c7e9000 ffff880157d78000 ffff88012dcb0000
[  315.488290] <0> 0000000000000002 0000000000014240 ffff88013c333b98 ffffffffa02fb4cc
[  315.488337] Call Trace:
[  315.488353]  [<ffffffffa026d0a2>] vmx_vcpu_load+0x90/0x1a0 [kvm_intel]
[  315.488384]  [<ffffffff81062e91>] ? mark_held_locks+0x50/0x72
[  315.488415]  [<ffffffffa02fb4cc>] kvm_arch_vcpu_load+0x73/0xbb [kvm]
[  315.488446]  [<ffffffffa02f2cd8>] kvm_sched_in+0xd/0xf [kvm]
[  315.488474]  [<ffffffff8102de1f>] finish_task_switch+0x90/0xd7
[  315.488500]  [<ffffffff8102dd8f>] ? finish_task_switch+0x0/0xd7
[  315.488529]  [<ffffffff81373381>] schedule+0x81d/0x8f2
[  315.488553]  [<ffffffff81062e91>] ? mark_held_locks+0x50/0x72
[  315.488584]  [<ffffffffa030d82c>] ? kvm_cpu_has_interrupt+0x3a/0x56 [kvm]
[  315.488617]  [<ffffffffa02f5057>] kvm_vcpu_block+0x8e/0xa9 [kvm]
[  315.488645]  [<ffffffff81052dbd>] ? autoremove_wake_function+0x0/0x34
[  315.488678]  [<ffffffffa030024d>] kvm_arch_vcpu_ioctl_run+0x97d/0xca0 [kvm]
[  315.488712]  [<ffffffffa030016a>] ? kvm_arch_vcpu_ioctl_run+0x89a/0xca0 [kvm]
[  315.488743]  [<ffffffff813747d1>] ? mutex_lock_nested+0x2f3/0x31b
[  315.488771]  [<ffffffff8103441b>] ? sub_preempt_count+0x92/0xa5
[  315.488800]  [<ffffffffa02f4164>] kvm_vcpu_ioctl+0x113/0x4e9 [kvm]
[  315.488829]  [<ffffffff81376247>] ? _raw_spin_unlock_irq+0x3c/0x59
[  315.488859]  [<ffffffff810e8d5b>] do_vfs_ioctl+0x4c1/0x502
[  315.488885]  [<ffffffff810dc496>] ? fget_light+0xe0/0xf8
[  315.488909]  [<ffffffff810dc408>] ? fget_light+0x52/0xf8
[  315.490162]  [<ffffffff810e8ded>] sys_ioctl+0x51/0x74
[  315.491403]  [<ffffffff81002002>] system_call_fastpath+0x16/0x1b
[  315.492649] Code: 2f 02 00 85 c0 75 13 ba 01 00 00 00 31 f6 48 c7 c7 bb 27 2f a0 e8 6a db d4 e0 c9 c3 55 80 3d 59 2f 02 00 00 48 89 e5 74 02 eb fe <0f> 0b 55 48 89 e5 53 48 89 f3 48 83 ec 08 48 8b 87 90 00 00 00
[  315.495975] RIP  [<ffffffffa02f2446>] kvm_handle_fault_on_reboot+0xf/0x11 [kvm]
[  315.498317]  RSP <ffff88013c333b18>
[  315.510526] ---[ end trace ac38cfaaa84a0bdf ]---
[  315.510763] kvm: enabling virtualization on CPU2

Looks like a race between the scheduler and the kvm cpu online notifier.
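
The ordering in the log above is consistent with that reading: the BUG fires on CPU 2 at 315.487, while "kvm: enabling virtualization on CPU2" is only printed at 315.510. The following is a minimal illustrative sketch of such an ordering problem, not kernel code. All names in it (Cpu, online_notifier, sched_in, run) are hypothetical, and it assumes the fault comes from loading a vcpu on a CPU where virtualization has not yet been enabled.

```python
class FaultOutsideReboot(Exception):
    """Stands in for the BUG() reported in the trace (hypothetical model)."""


class Cpu:
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        # Assumption: a freshly onlined CPU starts with virtualization off.
        self.virt_enabled = False


def online_notifier(cpu):
    # Models the step logged as "kvm: enabling virtualization on CPUn".
    cpu.virt_enabled = True


def sched_in(cpu):
    # Models a vcpu thread being scheduled onto the CPU and loading its state.
    # Assumption: this faults if virtualization is not enabled yet.
    if not cpu.virt_enabled:
        raise FaultOutsideReboot(f"vcpu load on CPU{cpu.cpu_id} before enable")


def run(order):
    """Replay the two steps in the given order on a newly onlined CPU."""
    steps = {"notifier": online_notifier, "sched_in": sched_in}
    cpu = Cpu(2)
    try:
        for name in order:
            steps[name](cpu)
    except FaultOutsideReboot:
        return "BUG"
    return "ok"


if __name__ == "__main__":
    print(run(["notifier", "sched_in"]))  # notifier wins the race
    print(run(["sched_in", "notifier"]))  # scheduler wins the race
```

In this model the outcome depends only on which step runs first, which is why the failure would be intermittent on a real system: nothing forces the notifier to run before a vcpu thread is scheduled onto the new CPU.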
Comment 5 Rafael J. Wysocki 2010-09-13 18:00:10 UTC
On Monday, September 13, 2010, Sergey Senozhatsky wrote:
> On (09/12/10 20:14), Rafael J. Wysocki wrote:
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=16961
> > Subject     : kernel BUG at arch/x86/kvm/../../../virt/kvm/kvm_main.c:1978
> > Submitter   : Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > Date                : 2010-08-19 9:54 (25 days old)
> > Message-ID  : <<20100819095429.GA5201@swordfish.minsk.epam.com>>
> > References  : http://marc.info/?l=linux-kernel&m=128221169606214&w=2
> > Handled-By  : Avi Kivity <avi@redhat.com>
> > 
> > 
> 
> Hello,
> Fixed by
> commit da908f2fb4e783c2a4de751fb90f11a0dd041161
> Author: Zachary Amsden <zamsden@redhat.com>