Bug 211109 - Random Hard kernel freeze
Summary: Random Hard kernel freeze
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-10 09:45 UTC by Vytautas Mickus
Modified: 2021-01-10 17:18 UTC (History)
CC List: 0 users

See Also:
Kernel Version: 5.9.9-95-tkg-MuQSS-llvm
Subsystem:
Regression: No
Bisected commit-id:

Description Vytautas Mickus 2021-01-10 09:45:51 UTC
Allocating a large number of vCPUs (30 or 31) on a system with 32 threads (16-core Ryzen 3950X) causes a hard freeze of the whole system, while allocating 16-17 vCPUs is fine.

This was seen while using minikube:
```
minikube start --cni=calico --container-runtime=cri-o --cpus=$(($(nproc) - 1)) --driver=kvm2 -n 3
```
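
As a rough illustration of the `--cpus` arithmetic in the command above, the sketch below evaluates the same expression for a fixed thread count. The helper name `cpus_for` is hypothetical, and the value 32 is hard-coded as an assumption in place of calling `nproc`, based on the 32-thread 3950X described in this report.

```shell
# Hypothetical helper: mirrors the --cpus=$(($(nproc) - 1)) arithmetic
# from the minikube command, taking the thread count as an argument.
cpus_for() {
    threads="$1"
    echo $((threads - 1))
}

# 32 is assumed here (the 3950X thread count) instead of calling nproc.
cpus_for 32
```

With 32 host threads this yields 31 vCPUs, leaving a single host thread that is not assigned to the guest.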

The last journal entry is:
```
Jan 10 10:14:51 rig libvirtd[2269]: operation failed: domain 'minikube' already exists with uuid 4aacd29f-bd5f-418d-b267-42b79f75fbab
```
At that point minikube sees that the domain already exists and launches it. The hard freeze is immediate.

OS:
Linux rig 5.9.9-95-tkg-MuQSS-llvm #1 TKG SMP PREEMPT Sun, 22 Nov 2020 07:41:26 +0000 x86_64 GNU/Linux

os-release:
```
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
```
Comment 1 Vytautas Mickus 2021-01-10 09:48:43 UTC
VM configuration xml:
```
<domain type='kvm' id='1'>
  <name>minikube</name>
  <uuid>b2dc4e44-53b6-433f-850c-a638294a1cf5</uuid>
  <memory unit='KiB'>6144000</memory>
  <currentMemory unit='KiB'>6144000</currentMemory>
  <vcpu placement='static'>31</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-5.1'>hvm</type>
    <boot dev='cdrom'/>
    <boot dev='hd'/>
    <bootmenu enable='no'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'/>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/home/vytautas/.minikube/machines/minikube/boot2docker.iso' index='2'/>
      <backingStore/>
      <target dev='hdc' bus='scsi'/>
      <readonly/>
      <alias name='scsi0-0-2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' io='threads'/>
      <source file='/home/vytautas/.minikube/machines/minikube/minikube.rawdisk' index='1'/>
      <backingStore/>
      <target dev='hda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='piix3-uhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='scsi' index='0' model='lsilogic'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='e0:58:91:b9:3c:cd'/>
      <source network='default' portid='cd0c40d3-2faf-479e-990d-1a5956405a2e' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='8c:a0:a6:01:80:c5'/>
      <source network='minikube-net' portid='23d86a93-40be-42c0-85e6-be14a3991f14' bridge='virbr1'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/3'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/3'>
      <source path='/dev/pts/3'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='mouse' bus='ps2'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
    </input>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/random</backend>
      <alias name='rng0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </rng>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+65534:+992</label>
    <imagelabel>+65534:+992</imagelabel>
  </seclabel>
</domain>
```
Comment 2 Vytautas Mickus 2021-01-10 09:55:51 UTC
Might this be an issue with core 0 of the CPU being used?
Comment 3 Vytautas Mickus 2021-01-10 10:07:07 UTC
I guess it is not connected to the number of vCPUs; it's just random.
Comment 4 Vytautas Mickus 2021-01-10 17:18:54 UTC
Sorry, it was, in fact, the tkg-MuQSS-llvm patchset. Without it I have no trouble running VMs with KVM.