Bug 196149 - QEMU causes a host hang / reset on PPC64EL when used in KVM + HV mode (kvm_hv module)
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: PPC-64 Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-06-21 17:04 UTC by Timothy Pearson
Modified: 2017-08-26 22:33 UTC
CC List: 0 users

See Also:
Kernel Version: 4.9
Subsystem:
Regression: No
Bisected commit-id:


Description Timothy Pearson 2017-06-21 17:04:18 UTC
QEMU causes a host hang / reset on PPC64EL when used in KVM + HV mode (kvm_hv module) on host kernels with a page size of 4k.
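
Because the report is specific to 4k-page hosts running the kvm_hv module, it can help to confirm both on the machine under test. The following is a minimal Python sketch and is not part of the original report: the helper names are made up for illustration, and a kvm_hv that is built into the kernel rather than loaded as a module would not appear in /proc/modules.

```python
import os


def host_page_size() -> int:
    """Return the page size, in bytes, of the running kernel."""
    return os.sysconf("SC_PAGE_SIZE")


def kvm_hv_loaded(modules_path: str = "/proc/modules") -> bool:
    """Return True if a module named kvm_hv is listed in modules_path.

    Assumption: kvm_hv is built as a loadable module; a built-in
    kvm_hv is not listed there and would be reported as False.
    """
    try:
        with open(modules_path) as f:
            # The first whitespace-separated field of each line is the module name.
            return any(line.split()[0] == "kvm_hv" for line in f if line.strip())
    except OSError:
        return False


if __name__ == "__main__":
    size = host_page_size()
    print(f"page size: {size} bytes ({size // 1024}k)")
    print(f"kvm_hv loaded: {kvm_hv_loaded()}")
```

A host matching the configuration described here would be expected to report a page size of 4096 bytes with kvm_hv loaded.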

After a random amount of uptime, starting new QEMU virtual machines causes the host to experience a soft CPU lockup. Depending on configuration and other random factors, the host will either checkstop and reboot or hang indefinitely. The following stack trace was pulled from an instance where the host simply hung after starting a fourth virtual machine.

Command line:

qemu-system-ppc64 --enable-kvm -M pseries -cpu host -m 8G -realtime mlock=on -kernel vmlinux-4.7.0-1-powerpc64le -initrd initrd.img-4.7.0-1-powerpc64le

Host kernel stack trace:

[ 1067.451053] INFO: rcu_sched self-detected stall on CPU
[ 1067.452646]  32-...: (5249 ticks this GP) idle=b4d/140000000000001/0 softirq=2214/2214 fqs=2576
[ 1067.454256]   (t=5251 jiffies g=13030 c=13029 q=127812)
[ 1067.455057] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1067.455062]  32-...: (5249 ticks this GP) idle=b4d/140000000000001/0 softirq=2214/2214 fqs=2576
[ 1067.455076]  (detected by 16, t=5252 jiffies, g=13030, c=13029, q=127812)
[ 1067.455078] Task dump for CPU 32:
[ 1067.455081] qemu-system-ppc R  running task        0  4426   4324 0x00040004
[ 1067.455082] Call Trace:
[ 1067.455087] [c000001f6cd13550] [0000000000000003] 0x3 (unreliable)
[ 1067.466157] Task dump for CPU 32:
[ 1067.466160] qemu-system-ppc R  running task        0  4426   4324 0x00040004
[ 1067.466161] Call Trace:
[ 1067.466168] [c000001f6cd133d0] [c00000000010d484] sched_show_task+0xe4/0x150 (unreliable)
[ 1067.466172] [c000001f6cd13440] [c000000000836918] rcu_dump_cpu_stacks+0xf4/0x140
[ 1067.466175] [c000001f6cd13490] [c00000000015a064] rcu_check_callbacks+0x9f4/0xb40
[ 1067.466178] [c000001f6cd135c0] [c000000000162394] update_process_times+0x44/0x90
[ 1067.466180] [c000001f6cd135f0] [c000000000179bd8] tick_sched_handle.isra.4+0x48/0xe0
[ 1067.466183] [c000001f6cd13630] [c000000000179cd4] tick_sched_timer+0x64/0xd0
[ 1067.466185] [c000001f6cd13670] [c0000000001633d4] __hrtimer_run_queues+0x124/0x420
[ 1067.466187] [c000001f6cd13700] [c00000000016407c] hrtimer_interrupt+0xec/0x2b0
[ 1067.466191] [c000001f6cd137c0] [c000000000026bec] __timer_interrupt+0x8c/0x270
[ 1067.466197] [c000001f6cd13810] [c00000000002721c] timer_interrupt+0x9c/0xe0
[ 1067.466200] [c000001f6cd13840] [c000000000009550] decrementer_common+0x150/0x180
[ 1067.466209] --- interrupt: 901 at kvmppc_hv_get_dirty_log+0x1c8/0x510 [kvm_hv]
                   LR = kvmppc_hv_get_dirty_log+0x1f4/0x510 [kvm_hv]
[ 1067.466212] [c000001f6cd13be0] [d00000001a889620] kvm_vm_ioctl_get_dirty_log_hv+0xd8/0x180 [kvm_hv]
[ 1067.466218] [c000001f6cd13c30] [d00000001a832268] kvm_vm_ioctl_get_dirty_log+0x40/0x60 [kvm]
[ 1067.466223] [c000001f6cd13c60] [d00000001a826bcc] kvm_vm_ioctl+0x524/0x8f0 [kvm]
[ 1067.466227] [c000001f6cd13d40] [c0000000003234f8] do_vfs_ioctl+0xd8/0x8c0
[ 1067.466230] [c000001f6cd13de0] [c000000000323db4] SyS_ioctl+0xd4/0xf0
[ 1067.466233] [c000001f6cd13e30] [c00000000000bd60] system_call+0x38/0xfc
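
When comparing several such hangs, it may be useful to pull the stalled CPU and the interrupted function out of the log mechanically. The sketch below is a hedged illustration, not part of the original report: the regular expressions are written against the two line shapes visible in this trace only, and the function and field names are hypothetical.

```python
import re

# "<cpu>-<flags>: (<n> ticks this GP)" as it appears in the stall report above.
STALL_RE = re.compile(r"\]\s+(\d+)-\S+: \((\d+) ticks this GP\)")
# "--- interrupt: <trap> at <symbol>+0x<offset>/0x<size> [<module>]"
INTR_RE = re.compile(
    r"--- interrupt: ([0-9a-f]+) at ([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
    r"(?: \[(\w+)\])?"
)


def summarize_stall(log: str) -> dict:
    """Collect stalled CPU numbers and the first interrupted symbol from a log."""
    cpus = sorted({int(m.group(1)) for m in STALL_RE.finditer(log)})
    summary = {"stalled_cpus": cpus, "interrupted": None}
    intr = INTR_RE.search(log)
    if intr:
        summary["interrupted"] = {
            "trap": intr.group(1),
            "symbol": intr.group(2),
            "offset": int(intr.group(3), 16),
            "size": int(intr.group(4), 16),
            "module": intr.group(5),  # None when the symbol is not in a module
        }
    return summary


# Two lines copied from the trace in this report.
SAMPLE = """\
[ 1067.452646]  32-...: (5249 ticks this GP) idle=b4d/140000000000001/0 softirq=2214/2214 fqs=2576
[ 1067.466209] --- interrupt: 901 at kvmppc_hv_get_dirty_log+0x1c8/0x510 [kvm_hv]
"""

if __name__ == "__main__":
    print(summarize_stall(SAMPLE))
```

Applied to the trace above, this identifies CPU 32 as stalled while executing kvmppc_hv_get_dirty_log in kvm_hv, reached through the dirty-log ioctl path shown in the remaining frames.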

Comment 1 Timothy Pearson 2017-08-26 22:33:56 UTC
After upgrading the kernel to 4.11, the host no longer crashes; instead, the VM goes down with:

KVM: unknown exit, hardware reason 10b0000
NIP 0000000000000700   LR 00003fffa634812c CTR 00003fffa6367240 XER 0000000000000000 CPU#1
MSR 8000000000001000 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
TB 00000000 00000000 DECR 00000000
GPR00 fffffeffebe4f4a0 00003fffc7854530 00003fffa64a4700 00000100141b0b60
GPR04 00003fffa6730000 00000000000000b0 0000000000000010 0000000000000020
GPR08 0000000000000030 00003fffa6730000 0000000000000000 00000100141b0b60
GPR12 0000000000000001 00003fffa67b22c0 0000000029584808 00000000295847f8
GPR16 00003fffc7854a80 00003fffc7854a7c 00000000295858a8 0000000000000001
GPR20 000557afa0668167 00003fffa6491040 00000000295858b8 00000000295858c0
GPR24 0000000000000000 00003fffc7854940 00003fffa67aab90 00000000000000b0
GPR28 00000100141b0b60 00000000000000b0 00000000000000b0 00000100141b0920
CR 440008b2  [ G  G  -  -  -  L  LO E  ]             RES ffffffffffffffff
FPR00 074673502fd86f4c 0000000000000000 0000000000000000 0000000000000000
FPR04 0000000000000000 0000000000000000 00000100141b07d0 00000100141b03c0
FPR08 0000000000000000 000102020101ff00 0000000000000000 0000000000000000
FPR12 00003fffa64e00a0 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 0000000000000000
 SRR0 00003fffa63672c8  SRR1 800000010280f032    PVR 00000000004d0200 VRSAVE 00000000ffffffff
SPRG0 0000000000000000 SPRG1 c00000000fb80900  SPRG2 00003fffa67b22c0  SPRG3 0000000000000001
SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
HSRR0 0000000000000000 HSRR1 0000000000000000
 CFAR 0000000000000000
 LPCR 000000000184f001
 SDR1 0000000000000009   DAR 00003fffa6730000  DSISR 0000000042000000

Message from syslogd@host at Aug 26 17:31:46 ...
 kernel:[  909.718710] kvmppc_emulate_mmio: emulation failed (7cc02698)
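
The word in parentheses in the kvmppc_emulate_mmio message is the instruction that could not be emulated, so splitting it into fields may help identify it. The following is a minimal sketch rather than an authoritative decoder: it assumes the XX1-form field layout, and the small mnemonic table is an assumption recalled from the Power ISA that should be checked against the specification. All names in it are hypothetical.

```python
# Extended opcodes for a few VSX indexed loads/stores under primary opcode 31.
# Assumption: values recalled from the Power ISA; verify before relying on them.
VSX_XO = {780: "lxvw4x", 844: "lxvd2x", 908: "stxvw4x", 972: "stxvd2x"}


def decode_xx1(word: int) -> dict:
    """Split a 32-bit instruction word into XX1-form fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode
        "t": (word >> 21) & 0x1F,       # low 5 bits of the target VSR number
        "ra": (word >> 16) & 0x1F,      # base register (0 means literal zero)
        "rb": (word >> 11) & 0x1F,      # index register
        "xo": (word >> 1) & 0x3FF,      # extended opcode
        "tx": word & 0x1,               # high bit of the target VSR number
    }


def describe(word: int) -> str:
    """Return a best-effort disassembly string for the word."""
    f = decode_xx1(word)
    name = VSX_XO.get(f["xo"]) if f["opcode"] == 31 else None
    if name is None:
        return f"unknown (opcode {f['opcode']}, xo {f['xo']})"
    vsr = 32 * f["tx"] + f["t"]
    ra = "0" if f["ra"] == 0 else f"r{f['ra']}"
    return f"{name} vs{vsr}, {ra}, r{f['rb']}"


if __name__ == "__main__":
    print(decode_xx1(0x7CC02698))
    print(describe(0x7CC02698))
```

Under these assumptions, 7cc02698 decodes as primary opcode 31, extended opcode 844, with r4 as the index register. In the register dump above, GPR04 and DAR both hold 00003fffa6730000, which is consistent with a load through r4 being the faulting access.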

The fastest way to trigger this fault (nearly 100% reproduction rate) is to boot the CentOS ppc64 full ISO image [1] in QEMU 2.9 with the following command line:

qemu-system-ppc64 --enable-kvm -M pseries -cpu POWER8 -smp 2,cores=2,threads=1,sockets=1 -m 16G -realtime mlock=on -monitor pty disk.qcow2 -cdrom CentOS-7-AltArch-ppc64-Everything-1611.iso

[1] http://mirror.centos.org/altarch/7/isos/ppc64/CentOS-7-AltArch-ppc64-Everything-1611.iso
