Bug 55201
Summary: | host panic when "creating guest, doing scp and killing QEMU process" continuously | | |
---|---|---|---|
Product: | Virtualization | Reporter: | Jay Ren (yongjie.ren) |
Component: | kvm | Assignee: | virtualization_kvm |
Status: | CLOSED CODE_FIX | | |
Severity: | normal | CC: | gleb, mtosatti |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 3.7.0 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | host serial port log when bug happens; kvm-fix-inprogress-clock-update-deadlock.patch | | |
Description
Jay Ren
2013-03-14 02:54:10 UTC
Also, I got some info in the console when this bug happens.
-------------
Message from syslogd@vt-snb9 at Mar 4 15:37:26 ...
kernel:BUG: soft lockup - CPU#15 stuck for 23s! [python:57408]

Message from syslogd@vt-snb9 at Mar 4 15:37:30 ...
kernel:BUG: soft lockup - CPU#4 stuck for 22s! [qemu-system-x86:57338]

Message from syslogd@vt-snb9 at Mar 4 15:38:02 ...
kernel:BUG: soft lockup - CPU#4 stuck for 21s! [qemu-system-x86:57338]

Message from syslogd@vt-snb9 at Mar 4 15:38:02 ...
kernel:BUG: soft lockup - CPU#15 stuck for 23s! [python:57408]

Message from syslogd@vt-snb9 at Mar 4 15:38:06 ...
kernel:BUG: soft lockup - CPU#6 stuck for 23s! [qemu-system-x86:57320]

Created attachment 95371
host serial port log when bug happens
There is a deadlock in pvclock handling:

    cpu0:                                        cpu1:
    kvm_gen_update_masterclock()
                                                 kvm_guest_time_update()
     spin_lock(pvclock_gtod_sync_lock)
                                                  local_irq_save(flags)
                                                  spin_lock(pvclock_gtod_sync_lock)
     kvm_make_mclock_inprogress_request(kvm)
      make_all_cpus_request()
       smp_call_function_many()

Now if smp_call_function_many() called by cpu0 tries to call a function on cpu1, there will be a deadlock: cpu1 is spinning on the lock with interrupts disabled, so it cannot service the IPI, while cpu0 holds the lock and waits for cpu1 to respond.

It shouldn't do that, though, since make_all_cpus_request() is careful not to IPI CPUs that are not in guest mode, and cpu1 is not in guest mode. The only way I can see the deadlock happening is if make_all_cpus_request() fails to allocate the "cpus" cpu mask, in which case it IPIs all online CPUs.

Created attachment 95451
kvm-fix-inprogress-clock-update-deadlock.patch
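
The target selection described in the analysis above can be sketched as a small Python model. This is only an illustration of the behaviour as stated in the comment, not kernel code: the `Vcpu` class, the `ipi_targets` function, and the `mask_allocated` flag are hypothetical names introduced here, and details of the real make_all_cpus_request() are left out.

```python
from dataclasses import dataclass


@dataclass
class Vcpu:
    cpu: int             # physical CPU the vCPU is loaded on, -1 if none (assumed convention)
    in_guest_mode: bool  # True while the vCPU is executing guest code


def ipi_targets(vcpus, online_cpus, sender, mask_allocated=True):
    """Return the set of physical CPUs that would receive an IPI.

    Hypothetical model of the behaviour described in the comment:
    - normal case: only CPUs running a vCPU in guest mode are targeted;
    - fallback case (cpu mask allocation failed): every online CPU
      except the sender is targeted.
    """
    if not mask_allocated:
        return set(online_cpus) - {sender}
    return {
        v.cpu
        for v in vcpus
        if v.in_guest_mode and v.cpu != -1 and v.cpu != sender
    }


# cpu1 runs kvm_guest_time_update() in host context, so it is not in guest mode.
vcpus = [Vcpu(cpu=1, in_guest_mode=False), Vcpu(cpu=2, in_guest_mode=True)]
normal = ipi_targets(vcpus, range(4), sender=0)
fallback = ipi_targets(vcpus, range(4), sender=0, mask_allocated=False)
print("normal:", sorted(normal))
print("fallback:", sorted(fallback))
```

In this model cpu1 is skipped in the normal case and only becomes an IPI target in the fallback case, which matches the one scenario the analysis identifies as able to trigger the deadlock.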
Jay Ren, would you please confirm whether the attached patch fixes the problem?

(In reply to comment #5)
> Jay Ren, would you please confirm whether the attached patch fixes the
> problem.

Sure, I'll give feedback when I get the testing result.

(In reply to comment #5)
> Jay Ren, would you please confirm whether the attached patch fixes the
> problem.

Hi Marcelo, your patch fixes this bug.
With your patch, I ran the test loop about 3700 times and didn't see any host panic.
Without your patch, I can reproduce this bug (host panic) in fewer than 300 iterations.
Reported-and-Tested-by: Yongjie Ren <yongjie.ren@intel.com>

I verified this against the latest kvm.git tree (next branch). The following commit fixed this bug.

commit c09664bb44184b3846e8c5254db4eae4b932682a
Author: Marcelo Tosatti <mtosatti@redhat.com>
Date:   Mon Mar 18 13:54:32 2013 -0300

    KVM: x86: fix deadlock in clock-in-progress request handling

    There is a deadlock in pvclock handling:

    cpu0:                                        cpu1:
    kvm_gen_update_masterclock()
                                                 kvm_guest_time_update()
     spin_lock(pvclock_gtod_sync_lock)
                                                  local_irq_save(flags)
                                                  spin_lock(pvclock_gtod_sync_lock)
     kvm_make_mclock_inprogress_request(kvm)
      make_all_cpus_request()
       smp_call_function_many()

    Now if smp_call_function_many() called by cpu0 tries to call function on
    cpu1 there will be a deadlock.

    Fix by moving pvclock_gtod_sync_lock protected section outside irq
    disabled section.

    Analyzed by Gleb Natapov <gleb@redhat.com>

    Acked-by: Gleb Natapov <gleb@redhat.com>
    Reported-and-Tested-by: Yongjie Ren <yongjie.ren@intel.com>
    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
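
To illustrate why moving the locked section outside the IRQ-disabled section resolves the hang, here is a minimal Python toy model of the interleaving from the commit message. It is a sketch under stated assumptions rather than kernel code: it assumes cpu0 already holds pvclock_gtod_sync_lock and has sent a synchronous IPI to cpu1, and the function name `simulate` and its parameters are hypothetical.

```python
def simulate(irqs_off_while_spinning: bool, max_steps: int = 10) -> str:
    """Return "completed" or "deadlock" for the interleaving in the report.

    Assumed starting point (hypothetical model, not kernel code):
    cpu0 holds the lock and waits for cpu1 to acknowledge an IPI,
    while cpu1 is trying to take the same lock.
    """
    lock_owner = "cpu0"   # cpu0 already holds pvclock_gtod_sync_lock
    ipi_pending = True    # cpu0 sent an IPI to cpu1 and waits for the ack
    cpu0_done = False
    cpu1_done = False

    for _ in range(max_steps):
        # cpu1 can only service the IPI while its interrupts are enabled.
        if ipi_pending and not irqs_off_while_spinning:
            ipi_pending = False
        # cpu0 releases the lock once its IPI has been acknowledged.
        if not cpu0_done and not ipi_pending:
            lock_owner = None
            cpu0_done = True
        # cpu1 takes the lock as soon as it is free, then finishes.
        if not cpu1_done and lock_owner is None:
            cpu1_done = True
        if cpu0_done and cpu1_done:
            return "completed"
    # Neither CPU made progress within the step budget.
    return "deadlock"


before_fix = simulate(irqs_off_while_spinning=True)
after_fix = simulate(irqs_off_while_spinning=False)
print("before fix:", before_fix)
print("after fix:", after_fix)
```

With interrupts disabled while spinning (the ordering before the patch), neither CPU can make progress in this model. With the lock taken before interrupts are disabled (the ordering after the patch), cpu1 can still acknowledge the IPI, so cpu0 releases the lock and both finish.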