Bug 206579 - KVM with passthrough generates "BUG: kernel NULL pointer dereference" and crashes
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-02-17 18:17 UTC by Robert M. Muncrief
Modified: 2020-08-24 17:03 UTC

See Also:
Kernel Version: 5.6 rc2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
KVM crash at boot with two test patched (91.78 KB, text/plain)
2020-02-22 00:21 UTC, Robert M. Muncrief
Patch to fix the NULL pointer de-reference for when AVIC is enabled + VFIO pass-through (1.83 KB, patch)
2020-02-24 13:52 UTC, Suravee Suthikulpanit
Patched rc3 dmesg crash output (92.99 KB, text/plain)
2020-02-24 16:50 UTC, Robert M. Muncrief
dmesg crash output with Paolo's latest patch (92.03 KB, text/plain)
2020-02-24 20:36 UTC, Robert M. Muncrief
Exact qemu command being executed. (4.61 KB, text/plain)
2020-02-24 20:38 UTC, Robert M. Muncrief
Dmesg crash output with properly cast pointer (91.94 KB, text/plain)
2020-02-24 21:43 UTC, Robert M. Muncrief
Success! Working dmesg output. (90.52 KB, text/plain)
2020-02-24 22:09 UTC, Robert M. Muncrief
svm.c patch (763 bytes, patch)
2020-02-24 22:53 UTC, Robert M. Muncrief
svm.c patch option 2 (656 bytes, patch)
2020-02-25 20:34 UTC, Robert M. Muncrief
qemu-vkm setup info resulting in nonfunctional avic (3.52 KB, application/gzip)
2020-02-26 20:34 UTC, Robert M. Muncrief
avic_inhibit_reasons debug information (5.91 KB, application/gzip)
2020-02-27 20:50 UTC, Robert M. Muncrief
avic_inhibit_reasons-anthony (10.38 KB, application/x-xz)
2020-02-27 23:00 UTC, Kayant
Working avic setup information (3.33 KB, application/gzip)
2020-02-28 07:26 UTC, Robert M. Muncrief
Perf_stat_anthony_apicv (6.21 KB, text/plain)
2020-02-28 20:14 UTC, Kayant
Many "WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);" warnings (202.44 KB, text/plain)
2020-03-01 06:27 UTC, Robert M. Muncrief
dmesg output with latest patches from Comment 46 (142.81 KB, text/plain)
2020-03-22 18:58 UTC, Robert M. Muncrief
Latest KVM warnings information (32.62 KB, application/gzip)
2020-04-04 19:24 UTC, Robert M. Muncrief
Kvm-anthony-warnings (16.29 KB, application/x-xz)
2020-04-06 10:27 UTC, Kayant
Windows SVM IOMMU testing (82 bytes, text/plain)
2020-04-10 19:28 UTC, Kayant

Description Robert M. Muncrief 2020-02-17 18:17:12 UTC
###Summary
I'm running the latest Arch Linux with all updates, and I have a Windows 10 VM with GPU, USB, and SATA passthrough that runs fantastically on every kernel from 4.19.x up to, but not including, 5.6 rc1. With kernels 5.6 rc1 and rc2, however, the VM dies with a NULL pointer dereference, and the physical machine has to be rebooted because KVM will no longer run.

###System Specifications
Motherboard: ASUS TUF Gaming X570-Plus with the latest BIOS, v1405
CPU: Ryzen 7 3700X
DRAM: 16GB (8GB x 2) of Corsair CMK16GX4M2Z3200C16 DDR4
GPU1: Sapphire Nitro R9 390 in the primary PCIe x16 slot
GPU2: GT 710 in the secondary PCIe x4 slot

###VM Configuration
The VM runs Windows 10 and passes through the R9 390 GPU, a USB card, and a SATA port. I have run it successfully on kernels from 4.19.x to 5.5.4. Before filing this report I also ran a simpler test with GPU passthrough only, and the dmesg output was the same. Here is the full VM XML:

--------------- Start VM XML ---------------
<!--
WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
  virsh edit Win10-1_UEFI
or other application using the libvirt API.
-->

<domain type='kvm'>
  <name>Win10-1_UEFI</name>
  <uuid>b83c2a85-b248-4093-bded-dae2bc2ccf05</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <vcpu placement='static' current='8'>16</vcpu>
  <os>
    <type arch='x86_64' machine='pc-q35-4.1'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/ovmf/x64/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/Win10-1_UEFI_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
    <vmport state='off'/>
  </features>
  <cpu mode='host-passthrough' check='partial'>
    <topology sockets='1' cores='8' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/data1/VM/KVM/Win10-1_UEFI.img'/>
      <target dev='vda' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/data4/VM/KVM/Win10-1_Data.img'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/usr/share/virtio/virtio-win.iso'/>
      <target dev='sdc' bus='sata'/>
      <readonly/>
      <boot order='2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='usb' index='0' model='qemu-xhci' ports='15'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x15'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x16'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0x17'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
    </controller>
    <controller type='pci' index='9' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='9' port='0x18'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='10' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='10' port='0x19'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
    </controller>
    <controller type='pci' index='11' model='pcie-to-pci-bridge'>
      <model name='pcie-pci-bridge'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='12' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='12' port='0x1a'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x2'/>
    </controller>
    <controller type='pci' index='13' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='13' port='0x1b'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x3'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:6d:6f:68'/>
      <source bridge='vm_bridge0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <input type='tablet' bus='virtio'>
      <address type='pci' domain='0x0000' bus='0x0d' slot='0x00' function='0x0'/>
    </input>
    <graphics type='spice' autoport='yes'>
      <listen type='address'/>
    </graphics>
    <sound model='ich9'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1b' function='0x0'/>
    </sound>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x0a' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
    </hostdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='2'/>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='3'/>
    </redirdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </memballoon>
  </devices>
</domain>

--------------- End VM XML ---------------
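
When matching the passed-through devices in a domain XML like the one above against the vfio-pci lines in dmesg, it can help to print the host PCI addresses in the dddd:bb:ss.f form the kernel uses. The following is only a minimal sketch, assuming Python 3 and the standard library; the function name hostdev_addresses is made up for illustration and is not part of libvirt.

```python
import xml.etree.ElementTree as ET
from typing import List


def hostdev_addresses(domain_xml: str) -> List[str]:
    """Return the host PCI address of every <hostdev type='pci'> source."""
    root = ET.fromstring(domain_xml)
    addresses = []
    # Only the <source><address/> child names the host device; the sibling
    # <address type='pci'/> is the slot the device gets inside the guest.
    for addr in root.findall("./devices/hostdev[@type='pci']/source/address"):
        domain = int(addr.get("domain", "0x0"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        function = int(addr.get("function"), 16)
        addresses.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return addresses
```

Feeding it the text of the domain XML (for example the output of virsh dumpxml) gives a list that can be compared directly with the "vfio-pci 0000:0a:00.0" style prefixes in the kernel log.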

###Error Output
-------------- Start dmesg Output -------------

[  104.053227] vm_bridge0: port 2(vnet0) entered blocking state
[  104.053229] vm_bridge0: port 2(vnet0) entered disabled state
[  104.053284] device vnet0 entered promiscuous mode
[  104.053407] vm_bridge0: port 2(vnet0) entered blocking state
[  104.053409] vm_bridge0: port 2(vnet0) entered listening state
[  105.209759] vfio-pci 0000:0a:00.0: enabling device (0002 -> 0003)
[  105.210049] vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x19@0x270
[  105.210056] vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x1b@0x2d0
[  105.229765] vfio-pci 0000:0a:00.1: enabling device (0000 -> 0002)
[  106.549861] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  106.549865] #PF: supervisor read access in kernel mode
[  106.549867] #PF: error_code(0x0000) - not-present page
[  106.549869] PGD 0 P4D 0 
[  106.549872] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  106.549876] CPU: 12 PID: 5762 Comm: CPU 0/KVM Tainted: P           OE     5.6.0-rc2-1-mainline #1
[  106.549878] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1405 11/19/2019
[  106.549885] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xe4/0x110 [kvm_amd]
[  106.549888] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 18 70 cc d3 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 6e 6f cc d3 85 c0 74 e6 5b 4c 89 ee
[  106.549890] RSP: 0018:ffffaef000d5bd50 EFLAGS: 00010082
[  106.549892] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff93353937d000
[  106.549894] RDX: 0000000000000001 RSI: ffff9334aa6afc00 RDI: 0000000000000000
[  106.549896] RBP: ffff9334aa6c7408 R08: 0000000000000000 R09: ffff9334aa6afc00
[  106.549897] R10: 00000018d0a785db R11: 0000000000000000 R12: 0000000000000000
[  106.549898] R13: 0000000000000202 R14: ffff9334aa6c7418 R15: ffffaef00149a7a0
[  106.549900] FS:  00007f80d89ff700(0000) GS:ffff93359eb00000(0000) knlGS:0000000000000000
[  106.549901] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  106.549903] CR2: 0000000000000010 CR3: 000000032a5b2000 CR4: 0000000000340ee0
[  106.549904] Call Trace:
[  106.549929]  kvm_arch_vcpu_ioctl_run+0x33d/0x1b20 [kvm]
[  106.549949]  kvm_vcpu_ioctl+0x266/0x630 [kvm]
[  106.549954]  ksys_ioctl+0x87/0xc0
[  106.549957]  __x64_sys_ioctl+0x16/0x20
[  106.549961]  do_syscall_64+0x4e/0x150
[  106.549964]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  106.549967] RIP: 0033:0x7f80dbed42eb
[  106.549969] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  106.549970] RSP: 002b:00007f80d89fcea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  106.549972] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f80dbed42eb
[  106.549973] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001a
[  106.549974] RBP: 00007f80d97d8880 R08: 000055c551c01110 R09: 0000000000000000
[  106.549975] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  106.549976] R13: 00007ffeb948a03f R14: 00007f80d89fd140 R15: 00007f80d89ff700
[  106.549980] Modules linked in: vhost_net vhost tap cpufreq_ondemand ebtable_filter ebtables uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common tun videodev snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device mc bridge stp llc nvidia_drm(POE) nvidia_modeset(POE) nct6775 nvidia(POE) hwmon_vid amdgpu gpu_sched nls_iso8859_1 nls_cp437 vfat radeon fat fuse i2c_algo_bit ttm snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper ledtrig_audio eeepc_wmi snd_hda_codec_hdmi asus_wmi battery sparse_keymap rfkill cec snd_hda_intel wmi_bmof edac_mce_amd snd_intel_dspcfg drm snd_hda_codec crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm aesni_intel ipmi_devintf r8169 ipmi_msghandler agpgart snd_timer crypto_simd pcspkr sp5100_tco k10temp syscopyarea cryptd realtek sysfillrect glue_helper mousedev i2c_piix4 snd joydev input_leds sysimgblt libphy fb_sys_fops soundcore wmi pinctrl_amd evdev mac_hid acpi_cpufreq nf_log_ipv6 ip6t_REJECT
[  106.550017]  nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sd_mod sr_mod cdrom hid_generic usbhid hid ahci libahci libata crc32c_intel xhci_pci scsi_mod xhci_hcd vfio_pci vfio_virqfd vfio_iommu_type1 vfio kvm_amd ccp rng_core kvm irqbypass
[  106.550039] CR2: 0000000000000010
[  106.550041] ---[ end trace 393523eed3771272 ]---
[  106.550045] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xe4/0x110 [kvm_amd]
[  106.550047] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 18 70 cc d3 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 6e 6f cc d3 85 c0 74 e6 5b 4c 89 ee
[  106.550048] RSP: 0018:ffffaef000d5bd50 EFLAGS: 00010082
[  106.550050] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff93353937d000
[  106.550051] RDX: 0000000000000001 RSI: ffff9334aa6afc00 RDI: 0000000000000000
[  106.550052] RBP: ffff9334aa6c7408 R08: 0000000000000000 R09: ffff9334aa6afc00
[  106.550053] R10: 00000018d0a785db R11: 0000000000000000 R12: 0000000000000000
[  106.550054] R13: 0000000000000202 R14: ffff9334aa6c7418 R15: ffffaef00149a7a0
[  106.550055] FS:  00007f80d89ff700(0000) GS:ffff93359eb00000(0000) knlGS:0000000000000000
[  106.550057] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  106.550058] CR2: 0000000000000010 CR3: 000000032a5b2000 CR4: 0000000000340ee0
[  106.550060] note: CPU 0/KVM[5762] exited with preempt_count 1
[  120.675828] vm_bridge0: port 2(vnet0) entered learning state

-------------- End dmesg Output -------------
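
For anyone comparing several of these crash logs, the two most useful facts are the faulting address and the kernel function named on the RIP line. The sketch below pulls both out of a dmesg capture. It is an illustration only: the function name summarize_oops and the returned dictionary keys are hypothetical, and the regular expressions are written against the exact line formats shown in this report.

```python
import re
from typing import Dict, Optional

# Matches the "BUG:" line and captures the faulting address in hex.
BUG_RE = re.compile(
    r"BUG: kernel NULL pointer dereference, address: ([0-9a-f]+)"
)
# Matches a kernel-mode RIP line (code segment 0010) such as
# "RIP: 0010:some_function+0xe4/0x110 [module]".
RIP_RE = re.compile(
    r"RIP: 0010:([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?"
)


def summarize_oops(dmesg: str) -> Optional[Dict[str, object]]:
    """Return details of the first NULL-dereference oops, or None."""
    bug = BUG_RE.search(dmesg)
    rip = RIP_RE.search(dmesg)
    if bug is None or rip is None:
        return None
    return {
        "fault_address": int(bug.group(1), 16),
        "function": rip.group(1),
        "offset": int(rip.group(2), 16),
        "size": int(rip.group(3), 16),
        "module": rip.group(4),  # None when the symbol is built in
    }
```

In the log above the faulting address is 0x10 and RBX is zero in the register dump, which is consistent with a read at offset 0x10 through a NULL pointer inside svm_refresh_apicv_exec_ctrl.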
Comment 1 Alex Williamson 2020-02-18 06:45:06 UTC
Partially bisected, will continue tomorrow.  This seems to have been introduced by Paolo's kvm-5.6-2 merge, which seems ripe for potential breakage in the APICv on SVM arena.
Comment 2 Alex Williamson 2020-02-18 18:54:47 UTC
Bisected and replied to patch introducing regression:

https://lore.kernel.org/kvm/20200218115135.4e09ffca@w520.home/
Comment 3 Robert M. Muncrief 2020-02-18 19:55:43 UTC
(In reply to Alex Williamson from comment #2)
> Bisected and replied to patch introducing regression:
> 
> https://lore.kernel.org/kvm/20200218115135.4e09ffca@w520.home/

Wow, that was quick, Alex. Thank you, and all the KVM devs, for such a fantastic piece of software, and for all the hard work that makes it happen. I'm in my 60s and have been running Linux since a few years after Linus sprang it upon the world, and KVM is without a doubt one of the most incredible and useful kernel developments I've witnessed in all that time.
Comment 4 Suravee Suthikulpanit 2020-02-21 14:56:35 UTC
Alex / muncrief,

I have posted a patch here (https://lkml.org/lkml/2020/2/21/1523)

Would you please give it a try and see if the issue persists?

Thanks,
Suravee
Comment 5 Robert M. Muncrief 2020-02-21 19:15:07 UTC
(In reply to Suravee Suthikulpanit from comment #4)
> Alex / muncrief,
> 
> I have posted a patch here (https://lkml.org/lkml/2020/2/21/1523)
> 
> Would you please give it a try and see if the issue persists?
> 
> Thanks,
> Suravee

Thank you for diligently addressing this problem, Suravee. I have good news and bad news.

The good news is that the patch works with avic disabled.

Unfortunately, there's a different hard crash when I enable avic and use host-passthrough as the CPU mode. If I use EPYC-IBPB as the CPU model instead there's no crash, but from what I can see that's because avic is disabled: the qemu capabilities list the model with a "<blocker name='x2apic'/>" entry. I'll describe what I tried as succinctly as possible.

First I added "options kvm_amd avic=1" to kvm.conf, ran mkinitcpio, and rebooted. Then I did my best to check whether avic was enabled:

# dmesg | grep -i AMD-Vi
[    0.910741] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.912189] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.912190] pci 0000:00:00.2: AMD-Vi: Extended features (0x58f77ef22294ade):
[    0.912192] AMD-Vi: Interrupt remapping enabled
[    0.912192] AMD-Vi: Virtual APIC enabled
[    0.912192] AMD-Vi: X2APIC enabled
[    0.912620] AMD-Vi: Lazy IO/TLB flushing enabled
[    0.923298] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

# dmesg | grep -i svm
[    1.240049] SVM: AVIC enabled
[    1.240050] SVM: Virtual VMLOAD VMSAVE supported
[    1.240050] SVM: Virtual GIF supported

# cat /sys/module/kvm_amd/parameters/avic
1
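
The last of these checks is easy to script. Below is a minimal sketch, assuming Python 3; the function name avic_enabled is made up for illustration. The path is the one used in the command above. Accepting "Y" as well as "1" is an assumption made because boolean module parameters are printed as Y/N, whereas this report shows the value 1.

```python
from pathlib import Path

# sysfs path taken from the cat command above
AVIC_PARAM = "/sys/module/kvm_amd/parameters/avic"


def avic_enabled(param_path: str = AVIC_PARAM) -> bool:
    """Return True if the kvm_amd 'avic' parameter reads as enabled."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        # kvm_amd is not loaded, or the parameter does not exist
        return False
    return value in ("1", "Y", "y")
```

Note that this only reports what the module was asked to do. As the rest of this comment shows, the parameter can read 1 while a particular guest still ends up running without avic.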

The output of all three checks looked good. But when I looked in /var/cache/libvirt/qemu/capabilities/*.xml, every AMD CPU model had an entry like this:

  <cpu type='kvm' name='EPYC-IBPB' typename='EPYC-IBPB-x86_64-cpu' usable='no'>
    <blocker name='x2apic'/>
  </cpu>
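
To see at a glance which CPU models are marked unusable and why, the cached capabilities files can be scanned for entries shaped like the one above. This is a rough sketch, assuming Python 3 and assuming only the cpu and blocker elements and the usable and name attributes shown in that excerpt; the function name cpu_blockers is hypothetical, and nothing is assumed about the rest of the file's layout.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def cpu_blockers(capabilities_xml: str) -> Dict[str, List[str]]:
    """Map each CPU model marked usable='no' to its blocker names."""
    root = ET.fromstring(capabilities_xml)
    result: Dict[str, List[str]] = {}
    # iter() walks the whole tree, so the <cpu> entries are found
    # wherever they sit under the root element.
    for cpu in root.iter("cpu"):
        if cpu.get("usable") == "no":
            name = cpu.get("name", "")
            result[name] = [b.get("name", "") for b in cpu.findall("blocker")]
    return result
```

Running it over the text of each file under /var/cache/libvirt/qemu/capabilities/ would show whether x2apic is the only blocker listed for the AMD models or one of several.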

As I said, when I ran my VM with host-passthrough there was a hard crash, but with EPYC-IBPB it worked. I suspect that's because avic is actually disabled when I use EPYC-IBPB, but enabled when I use host-passthrough.

In any case, following is the relevant dmesg output from the crash. Please let me know if I'm doing something wrong, or if you need more info or testing.

---------- Begin dmesg output ----------

[  564.119840] device vnet0 entered promiscuous mode
[  564.119952] vm_bridge0: port 2(vnet0) entered blocking state
[  564.119953] vm_bridge0: port 2(vnet0) entered listening state
[  566.354740] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x19@0x270
[  566.354744] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x25@0x400
[  566.354746] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x26@0x410
[  566.354749] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x27@0x440
[  566.374852] vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x19@0x270
[  566.374861] vfio-pci 0000:0a:00.0: vfio_ecap_init: hiding ecap 0x1b@0x2d0
[  569.921527] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  569.921533] #PF: supervisor read access in kernel mode
[  569.921535] #PF: error_code(0x0000) - not-present page
[  569.921537] PGD 0 P4D 0 
[  569.921542] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  569.921546] CPU: 7 PID: 7335 Comm: CPU 2/KVM Tainted: P           OE     5.6.0-rc2-1-mainline #1
[  569.921548] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1405 11/19/2019
[  569.921558] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.921562] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.921565] RSP: 0018:ffffa7a74084fce8 EFLAGS: 00010082
[  569.921568] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f715ce000
[  569.921570] RDX: 0000000000000001 RSI: ffffa38f3c9e1800 RDI: 0000000000000000
[  569.921572] RBP: ffffa38f3ca57408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.921574] R10: 000000849ef65aa1 R11: 00000000000630c0 R12: 0000000000000000
[  569.921576] R13: 0000000000000202 R14: ffffa38f3ca57418 R15: ffffa38f3ca53a10
[  569.921578] FS:  00007f6b5b5ff700(0000) GS:ffffa3905e9c0000(0000) knlGS:0000000000000000
[  569.921580] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.921582] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.921583] Call Trace:
[  569.921592]  svm_vcpu_unblocking+0x31/0x50 [kvm_amd]
[  569.921613]  kvm_vcpu_block+0xd1/0x340 [kvm]
[  569.921639]  kvm_arch_vcpu_ioctl_run+0x1234/0x1b20 [kvm]
[  569.921645]  ? __seccomp_filter+0xd2/0x6c0
[  569.921665]  kvm_vcpu_ioctl+0x266/0x630 [kvm]
[  569.921671]  ksys_ioctl+0x87/0xc0
[  569.921675]  __x64_sys_ioctl+0x16/0x20
[  569.921678]  do_syscall_64+0x4e/0x150
[  569.921683]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  569.921686] RIP: 0033:0x7f6b607452eb
[  569.921688] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  569.921690] RSP: 002b:00007f6b5b5fcea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  569.921693] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6b607452eb
[  569.921695] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001b
[  569.921696] RBP: 00007f6b5e04ae00 R08: 00005558b88eb110 R09: 0000000000000000
[  569.921698] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  569.921699] R13: 00007ffd77fa257f R14: 00007f6b5b5fd140 R15: 00007f6b5b5ff700
[  569.921704] Modules linked in: vhost_net vhost tap cpufreq_ondemand ebtable_filter ebtables tun bridge stp llc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi nvidia_drm(POE) snd_seq_device mc nvidia_modeset(POE) nct6775 hwmon_vid nvidia(POE) amdgpu nls_iso8859_1 gpu_sched nls_cp437 vfat radeon fat fuse snd_hda_codec_realtek i2c_algo_bit snd_hda_codec_generic ttm ledtrig_audio eeepc_wmi snd_hda_codec_hdmi asus_wmi drm_kms_helper battery sparse_keymap rfkill wmi_bmof snd_hda_intel cec edac_mce_amd snd_intel_dspcfg snd_hda_codec crct10dif_pclmul drm crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep ipmi_devintf snd_pcm ipmi_msghandler aesni_intel r8169 agpgart snd_timer crypto_simd syscopyarea sp5100_tco cryptd sysfillrect realtek glue_helper joydev sysimgblt input_leds mousedev pcspkr k10temp snd i2c_piix4 fb_sys_fops libphy soundcore wmi pinctrl_amd evdev mac_hid acpi_cpufreq nf_log_ipv6 ip6t_REJECT
[  569.921749]  nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom sd_mod hid_generic usbhid hid ahci libahci crc32c_intel libata xhci_pci scsi_mod xhci_hcd vfio_pci vfio_virqfd vfio_iommu_type1 vfio kvm_amd ccp rng_core kvm irqbypass
[  569.921778] CR2: 0000000000000010
[  569.921780] ---[ end trace c5c7ecbc97cc5c9a ]---
[  569.921782] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  569.921784] #PF: supervisor read access in kernel mode
[  569.921786] #PF: error_code(0x0000) - not-present page
[  569.921787] PGD 0 P4D 0 
[  569.921793] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.921796] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.921797] RSP: 0018:ffffa7a74084fce8 EFLAGS: 00010082
[  569.921801] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f715ce000
[  569.921802] RDX: 0000000000000001 RSI: ffffa38f3c9e1800 RDI: 0000000000000000
[  569.921804] RBP: ffffa38f3ca57408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.921805] R10: 000000849ef65aa1 R11: 00000000000630c0 R12: 0000000000000000
[  569.921806] R13: 0000000000000202 R14: ffffa38f3ca57418 R15: ffffa38f3ca53a10
[  569.921809] FS:  00007f6b5b5ff700(0000) GS:ffffa3905e9c0000(0000) knlGS:0000000000000000
[  569.921810] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.921812] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.921815] note: CPU 2/KVM[7335] exited with preempt_count 1
[  569.921816] Oops: 0000 [#2] PREEMPT SMP NOPTI
[  569.921819] CPU: 6 PID: 7334 Comm: CPU 1/KVM Tainted: P      D    OE     5.6.0-rc2-1-mainline #1
[  569.921821] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1405 11/19/2019
[  569.921828] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.921830] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.921832] RSP: 0018:ffffa7a748ecfce8 EFLAGS: 00010082
[  569.921836] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa39037a67000
[  569.921837] RDX: 0000000000000001 RSI: ffffa38f3c9e1400 RDI: 0000000000000000
[  569.921839] RBP: ffffa38f3ca539f8 R08: 0000000000000000 R09: 0000000000000bb8
[  569.921841] R10: 00000000000000e1 R11: 0000000000000000 R12: 0000000000000000
[  569.921842] R13: 0000000000000202 R14: ffffa38f3ca53a08 R15: ffffa38f3ca50000
[  569.921845] FS:  00007f6b5c3ff700(0000) GS:ffffa3905e980000(0000) knlGS:0000000000000000
[  569.921847] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.921848] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.921850] Call Trace:
[  569.921858]  svm_vcpu_unblocking+0x31/0x50 [kvm_amd]
[  569.921884]  kvm_vcpu_block+0xd1/0x340 [kvm]
[  569.921911]  kvm_arch_vcpu_ioctl_run+0x1234/0x1b20 [kvm]
[  569.921916]  ? __seccomp_filter+0xd2/0x6c0
[  569.921936]  kvm_vcpu_ioctl+0x266/0x630 [kvm]
[  569.921942]  ksys_ioctl+0x87/0xc0
[  569.921945]  __x64_sys_ioctl+0x16/0x20
[  569.921948]  do_syscall_64+0x4e/0x150
[  569.921951]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  569.921954] RIP: 0033:0x7f6b607452eb
[  569.921956] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  569.921958] RSP: 002b:00007f6b5c3fcea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  569.921960] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6b607452eb
[  569.921962] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001a
[  569.921964] RBP: 00007f6b5e028800 R08: 00005558b88eb110 R09: 0000000000000000
[  569.921965] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  569.921967] R13: 00007ffd77fa257f R14: 00007f6b5c3fd140 R15: 00007f6b5c3ff700
[  569.921971] Modules linked in: vhost_net vhost tap cpufreq_ondemand ebtable_filter ebtables tun bridge stp llc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi nvidia_drm(POE) snd_seq_device mc nvidia_modeset(POE) nct6775 hwmon_vid nvidia(POE) amdgpu nls_iso8859_1 gpu_sched nls_cp437 vfat radeon fat fuse snd_hda_codec_realtek i2c_algo_bit snd_hda_codec_generic ttm ledtrig_audio eeepc_wmi snd_hda_codec_hdmi asus_wmi drm_kms_helper battery sparse_keymap rfkill wmi_bmof snd_hda_intel cec edac_mce_amd snd_intel_dspcfg snd_hda_codec crct10dif_pclmul drm crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep ipmi_devintf snd_pcm ipmi_msghandler aesni_intel r8169 agpgart snd_timer crypto_simd syscopyarea sp5100_tco cryptd sysfillrect realtek glue_helper joydev sysimgblt input_leds mousedev pcspkr k10temp snd i2c_piix4 fb_sys_fops libphy soundcore wmi pinctrl_amd evdev mac_hid acpi_cpufreq nf_log_ipv6 ip6t_REJECT
[  569.922006]  nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom sd_mod hid_generic usbhid hid ahci libahci crc32c_intel libata xhci_pci scsi_mod xhci_hcd vfio_pci vfio_virqfd vfio_iommu_type1 vfio kvm_amd ccp rng_core kvm irqbypass
[  569.922031] CR2: 0000000000000010
[  569.922033] ---[ end trace c5c7ecbc97cc5c9b ]---
[  569.922035] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  569.922037] #PF: supervisor read access in kernel mode
[  569.922039] #PF: error_code(0x0000) - not-present page
[  569.922040] PGD 0 P4D 0 
[  569.922045] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922047] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922048] RSP: 0018:ffffa7a74084fce8 EFLAGS: 00010082
[  569.922051] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f715ce000
[  569.922052] RDX: 0000000000000001 RSI: ffffa38f3c9e1800 RDI: 0000000000000000
[  569.922054] RBP: ffffa38f3ca57408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922055] R10: 000000849ef65aa1 R11: 00000000000630c0 R12: 0000000000000000
[  569.922057] R13: 0000000000000202 R14: ffffa38f3ca57418 R15: ffffa38f3ca53a10
[  569.922058] FS:  00007f6b5c3ff700(0000) GS:ffffa3905e980000(0000) knlGS:0000000000000000
[  569.922060] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922062] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922064] note: CPU 1/KVM[7334] exited with preempt_count 1
[  569.922066] Oops: 0000 [#3] PREEMPT SMP NOPTI
[  569.922068] CPU: 14 PID: 7340 Comm: CPU 7/KVM Tainted: P      D    OE     5.6.0-rc2-1-mainline #1
[  569.922071] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1405 11/19/2019
[  569.922079] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922082] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922084] RSP: 0018:ffffa7a74882fce8 EFLAGS: 00010082
[  569.922087] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f6de2f000
[  569.922089] RDX: 0000000000000001 RSI: ffffa38f6f9fc100 RDI: 0000000000000000
[  569.922091] RBP: ffffa38f6f3239f8 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922094] R10: 0000000000000078 R11: 0000000000000000 R12: 0000000000000000
[  569.922096] R13: 0000000000000202 R14: ffffa38f6f323a08 R15: ffffa38f6f320000
[  569.922098] FS:  00007f6b56fff700(0000) GS:ffffa3905eb80000(0000) knlGS:0000000000000000
[  569.922100] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922102] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922105] Call Trace:
[  569.922111]  svm_vcpu_unblocking+0x31/0x50 [kvm_amd]
[  569.922130]  kvm_vcpu_block+0xd1/0x340 [kvm]
[  569.922151]  kvm_arch_vcpu_ioctl_run+0x1234/0x1b20 [kvm]
[  569.922154]  ? __seccomp_filter+0xd2/0x6c0
[  569.922171]  kvm_vcpu_ioctl+0x266/0x630 [kvm]
[  569.922176]  ksys_ioctl+0x87/0xc0
[  569.922179]  __x64_sys_ioctl+0x16/0x20
[  569.922181]  do_syscall_64+0x4e/0x150
[  569.922184]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  569.922186] RIP: 0033:0x7f6b607452eb
[  569.922188] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  569.922189] RSP: 002b:00007f6b56ffcea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  569.922191] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6b607452eb
[  569.922193] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000020
[  569.922194] RBP: 00007f6b5e12ba00 R08: 00005558b88eb110 R09: 0000000000000000
[  569.922195] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  569.922196] R13: 00007ffd77fa257f R14: 00007f6b56ffd140 R15: 00007f6b56fff700
[  569.922200] Modules linked in: vhost_net vhost tap cpufreq_ondemand ebtable_filter ebtables tun bridge stp llc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi nvidia_drm(POE) snd_seq_device mc nvidia_modeset(POE) nct6775 hwmon_vid nvidia(POE) amdgpu nls_iso8859_1 gpu_sched nls_cp437 vfat radeon fat fuse snd_hda_codec_realtek i2c_algo_bit snd_hda_codec_generic ttm ledtrig_audio eeepc_wmi snd_hda_codec_hdmi asus_wmi drm_kms_helper battery sparse_keymap rfkill wmi_bmof snd_hda_intel cec edac_mce_amd snd_intel_dspcfg snd_hda_codec crct10dif_pclmul drm crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep ipmi_devintf snd_pcm ipmi_msghandler aesni_intel r8169 agpgart snd_timer crypto_simd syscopyarea sp5100_tco cryptd sysfillrect realtek glue_helper joydev sysimgblt input_leds mousedev pcspkr k10temp snd i2c_piix4 fb_sys_fops libphy soundcore wmi pinctrl_amd evdev mac_hid acpi_cpufreq nf_log_ipv6 ip6t_REJECT
[  569.922229]  nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom sd_mod hid_generic usbhid hid ahci libahci crc32c_intel libata xhci_pci scsi_mod xhci_hcd vfio_pci vfio_virqfd vfio_iommu_type1 vfio kvm_amd ccp rng_core kvm irqbypass
[  569.922250] CR2: 0000000000000010
[  569.922252] ---[ end trace c5c7ecbc97cc5c9c ]---
[  569.922254] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  569.922255] #PF: supervisor read access in kernel mode
[  569.922257] #PF: error_code(0x0000) - not-present page
[  569.922258] PGD 0 P4D 0 
[  569.922262] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922265] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922266] RSP: 0018:ffffa7a74084fce8 EFLAGS: 00010082
[  569.922269] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f715ce000
[  569.922270] RDX: 0000000000000001 RSI: ffffa38f3c9e1800 RDI: 0000000000000000
[  569.922272] RBP: ffffa38f3ca57408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922273] R10: 000000849ef65aa1 R11: 00000000000630c0 R12: 0000000000000000
[  569.922275] R13: 0000000000000202 R14: ffffa38f3ca57418 R15: ffffa38f3ca53a10
[  569.922276] FS:  00007f6b56fff700(0000) GS:ffffa3905eb80000(0000) knlGS:0000000000000000
[  569.922278] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922279] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922281] note: CPU 7/KVM[7340] exited with preempt_count 1
[  569.922283] Oops: 0000 [#4] PREEMPT SMP NOPTI
[  569.922285] CPU: 13 PID: 7337 Comm: CPU 4/KVM Tainted: P      D    OE     5.6.0-rc2-1-mainline #1
[  569.922287] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1405 11/19/2019
[  569.922292] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922295] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922296] RSP: 0018:ffffa7a741e9fce8 EFLAGS: 00010082
[  569.922300] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f3e68f000
[  569.922302] RDX: 0000000000000001 RSI: ffffa38d90ecb800 RDI: 0000000000000000
[  569.922304] RBP: ffffa38d90eb39f8 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922305] R10: 0000000000000264 R11: 0000000000000000 R12: 0000000000000000
[  569.922307] R13: 0000000000000202 R14: ffffa38d90eb3a08 R15: ffffa38d90eb0000
[  569.922309] FS:  00007f6b599ff700(0000) GS:ffffa3905eb40000(0000) knlGS:0000000000000000
[  569.922311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922312] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922314] Call Trace:
[  569.922321]  svm_vcpu_unblocking+0x31/0x50 [kvm_amd]
[  569.922338]  kvm_vcpu_block+0xd1/0x340 [kvm]
[  569.922360]  kvm_arch_vcpu_ioctl_run+0x1234/0x1b20 [kvm]
[  569.922364]  ? __seccomp_filter+0xd2/0x6c0
[  569.922382]  kvm_vcpu_ioctl+0x266/0x630 [kvm]
[  569.922387]  ksys_ioctl+0x87/0xc0
[  569.922390]  __x64_sys_ioctl+0x16/0x20
[  569.922393]  do_syscall_64+0x4e/0x150
[  569.922396]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  569.922398] RIP: 0033:0x7f6b607452eb
[  569.922400] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  569.922402] RSP: 002b:00007f6b599fcea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  569.922404] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6b607452eb
[  569.922405] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001d
[  569.922407] RBP: 00007f6b5e0ac480 R08: 00005558b88eb110 R09: 0000000000000000
[  569.922408] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  569.922409] R13: 00007ffd77fa257f R14: 00007f6b599fd140 R15: 00007f6b599ff700
[  569.922412] Modules linked in: vhost_net vhost tap cpufreq_ondemand ebtable_filter ebtables tun bridge stp llc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi nvidia_drm(POE) snd_seq_device mc nvidia_modeset(POE) nct6775 hwmon_vid nvidia(POE) amdgpu nls_iso8859_1 gpu_sched nls_cp437 vfat radeon fat fuse snd_hda_codec_realtek i2c_algo_bit snd_hda_codec_generic ttm ledtrig_audio eeepc_wmi snd_hda_codec_hdmi asus_wmi drm_kms_helper battery sparse_keymap rfkill wmi_bmof snd_hda_intel cec edac_mce_amd snd_intel_dspcfg snd_hda_codec crct10dif_pclmul drm crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep ipmi_devintf snd_pcm ipmi_msghandler aesni_intel r8169 agpgart snd_timer crypto_simd syscopyarea sp5100_tco cryptd sysfillrect realtek glue_helper joydev sysimgblt input_leds mousedev pcspkr k10temp snd i2c_piix4 fb_sys_fops libphy soundcore wmi pinctrl_amd evdev mac_hid acpi_cpufreq nf_log_ipv6 ip6t_REJECT
[  569.922442]  nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom sd_mod hid_generic usbhid hid ahci libahci crc32c_intel libata xhci_pci scsi_mod xhci_hcd vfio_pci vfio_virqfd vfio_iommu_type1 vfio kvm_amd ccp rng_core kvm irqbypass
[  569.922463] CR2: 0000000000000010
[  569.922465] ---[ end trace c5c7ecbc97cc5c9d ]---
[  569.922467] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  569.922469] #PF: supervisor read access in kernel mode
[  569.922470] #PF: error_code(0x0000) - not-present page
[  569.922472] PGD 0 P4D 0 
[  569.922479] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922481] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922483] RSP: 0018:ffffa7a74084fce8 EFLAGS: 00010082
[  569.922486] Oops: 0000 [#5] PREEMPT SMP NOPTI
[  569.922489] CPU: 8 PID: 7336 Comm: CPU 3/KVM Tainted: P      D    OE     5.6.0-rc2-1-mainline #1
[  569.922491] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1405 11/19/2019
[  569.922497] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922500] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922502] RSP: 0018:ffffa7a741d87ce8 EFLAGS: 00010082
[  569.922505] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f715ce000
[  569.922507] RDX: 0000000000000001 RSI: ffffa38f3c9e1800 RDI: 0000000000000000
[  569.922508] RBP: ffffa38f3ca57408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922510] R10: 000000849ef65aa1 R11: 00000000000630c0 R12: 0000000000000000
[  569.922511] R13: 0000000000000202 R14: ffffa38f3ca57418 R15: ffffa38f3ca53a10
[  569.922513] FS:  00007f6b599ff700(0000) GS:ffffa3905eb40000(0000) knlGS:0000000000000000
[  569.922515] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922517] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922519] note: CPU 4/KVM[7337] exited with preempt_count 1
[  569.922520] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f717f5000
[  569.922522] RDX: 0000000000000001 RSI: ffffa38f3c9e0000 RDI: 0000000000000000
[  569.922523] RBP: ffffa38d90ec7408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922525] R10: 000000000000022d R11: 000000000006a63c R12: 0000000000000000
[  569.922527] R13: 0000000000000202 R14: ffffa38d90ec7418 R15: ffffa38d90ec3a10
[  569.922529] FS:  00007f6b5a7ff700(0000) GS:ffffa3905ea00000(0000) knlGS:0000000000000000
[  569.922531] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922532] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922534] Call Trace:
[  569.922541]  svm_vcpu_unblocking+0x31/0x50 [kvm_amd]
[  569.922559]  kvm_vcpu_block+0xd1/0x340 [kvm]
[  569.922581]  kvm_arch_vcpu_ioctl_run+0x1234/0x1b20 [kvm]
[  569.922586]  ? __seccomp_filter+0xd2/0x6c0
[  569.922603]  kvm_vcpu_ioctl+0x266/0x630 [kvm]
[  569.922608]  ksys_ioctl+0x87/0xc0
[  569.922611]  __x64_sys_ioctl+0x16/0x20
[  569.922614]  do_syscall_64+0x4e/0x150
[  569.922618]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  569.922620] RIP: 0033:0x7f6b607452eb
[  569.922622] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  569.922624] RSP: 002b:00007f6b5a7fcea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  569.922626] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6b607452eb
[  569.922627] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001c
[  569.922628] RBP: 00007f6b5e07e040 R08: 00005558b88eb110 R09: 0000000000000000
[  569.922629] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  569.922630] R13: 00007ffd77fa257f R14: 00007f6b5a7fd140 R15: 00007f6b5a7ff700
[  569.922635] Modules linked in: vhost_net vhost tap cpufreq_ondemand ebtable_filter ebtables tun bridge stp llc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi nvidia_drm(POE) snd_seq_device mc nvidia_modeset(POE) nct6775 hwmon_vid nvidia(POE) amdgpu nls_iso8859_1 gpu_sched nls_cp437 vfat radeon fat fuse snd_hda_codec_realtek i2c_algo_bit snd_hda_codec_generic ttm ledtrig_audio eeepc_wmi snd_hda_codec_hdmi asus_wmi drm_kms_helper battery sparse_keymap rfkill wmi_bmof snd_hda_intel cec edac_mce_amd snd_intel_dspcfg snd_hda_codec crct10dif_pclmul drm crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep ipmi_devintf snd_pcm ipmi_msghandler aesni_intel r8169 agpgart snd_timer crypto_simd syscopyarea sp5100_tco cryptd sysfillrect realtek glue_helper joydev sysimgblt input_leds mousedev pcspkr k10temp snd i2c_piix4 fb_sys_fops libphy soundcore wmi pinctrl_amd evdev mac_hid acpi_cpufreq nf_log_ipv6 ip6t_REJECT
[  569.922671]  nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom sd_mod hid_generic usbhid hid ahci libahci crc32c_intel libata xhci_pci scsi_mod xhci_hcd vfio_pci vfio_virqfd vfio_iommu_type1 vfio kvm_amd ccp rng_core kvm irqbypass
[  569.922694] CR2: 0000000000000010
[  569.922696] ---[ end trace c5c7ecbc97cc5c9e ]---
[  569.922697] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  569.922699] #PF: supervisor read access in kernel mode
[  569.922700] #PF: error_code(0x0000) - not-present page
[  569.922701] PGD 0 P4D 0 
[  569.922707] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922709] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922710] RSP: 0018:ffffa7a74084fce8 EFLAGS: 00010082
[  569.922713] Oops: 0000 [#6] PREEMPT SMP NOPTI
[  569.922715] CPU: 9 PID: 7338 Comm: CPU 5/KVM Tainted: P      D    OE     5.6.0-rc2-1-mainline #1
[  569.922717] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f715ce000
[  569.922718] RDX: 0000000000000001 RSI: ffffa38f3c9e1800 RDI: 0000000000000000
[  569.922719] RBP: ffffa38f3ca57408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922720] R10: 000000849ef65aa1 R11: 00000000000630c0 R12: 0000000000000000
[  569.922721] R13: 0000000000000202 R14: ffffa38f3ca57418 R15: ffffa38f3ca53a10
[  569.922723] FS:  00007f6b5a7ff700(0000) GS:ffffa3905ea00000(0000) knlGS:0000000000000000
[  569.922724] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922726] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922727] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1405 11/19/2019
[  569.922733] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922735] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922737] RSP: 0018:ffffa7a74310fce8 EFLAGS: 00010082
[  569.922738] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f42a78000
[  569.922740] RDX: 0000000000000001 RSI: ffffa38f3dc17700 RDI: 0000000000000000
[  569.922741] note: CPU 3/KVM[7336] exited with preempt_count 1
[  569.922742] RBP: ffffa38f3df739f8 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922744] R10: 00000000000003fc R11: 0000000000001637 R12: 0000000000000000
[  569.922745] R13: 0000000000000202 R14: ffffa38f3df73a08 R15: ffffa38f3df70000
[  569.922747] FS:  00007f6b58bff700(0000) GS:ffffa3905ea40000(0000) knlGS:0000000000000000
[  569.922748] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922749] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922751] Call Trace:
[  569.922756]  svm_vcpu_unblocking+0x31/0x50 [kvm_amd]
[  569.922773]  kvm_vcpu_block+0xd1/0x340 [kvm]
[  569.922795]  kvm_arch_vcpu_ioctl_run+0x1234/0x1b20 [kvm]
[  569.922799]  ? __seccomp_filter+0xd2/0x6c0
[  569.922816]  kvm_vcpu_ioctl+0x266/0x630 [kvm]
[  569.922820]  ksys_ioctl+0x87/0xc0
[  569.922823]  __x64_sys_ioctl+0x16/0x20
[  569.922825]  do_syscall_64+0x4e/0x150
[  569.922828]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  569.922831] RIP: 0033:0x7f6b607452eb
[  569.922833] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  569.922834] RSP: 002b:00007f6b58bfcea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  569.922836] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6b607452eb
[  569.922837] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001e
[  569.922838] RBP: 00007f6b5e0da2c0 R08: 00005558b88eb110 R09: 0000000000000000
[  569.922839] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  569.922840] R13: 00007ffd77fa257f R14: 00007f6b58bfd140 R15: 00007f6b58bff700
[  569.922843] Modules linked in: vhost_net vhost tap cpufreq_ondemand ebtable_filter ebtables tun bridge stp llc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi nvidia_drm(POE) snd_seq_device mc nvidia_modeset(POE) nct6775 hwmon_vid nvidia(POE) amdgpu nls_iso8859_1 gpu_sched nls_cp437 vfat radeon fat fuse snd_hda_codec_realtek i2c_algo_bit snd_hda_codec_generic ttm ledtrig_audio eeepc_wmi snd_hda_codec_hdmi asus_wmi drm_kms_helper battery sparse_keymap rfkill wmi_bmof snd_hda_intel cec edac_mce_amd snd_intel_dspcfg snd_hda_codec crct10dif_pclmul drm crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep ipmi_devintf snd_pcm ipmi_msghandler aesni_intel r8169 agpgart snd_timer crypto_simd syscopyarea sp5100_tco cryptd sysfillrect realtek glue_helper joydev sysimgblt input_leds mousedev pcspkr k10temp snd i2c_piix4 fb_sys_fops libphy soundcore wmi pinctrl_amd evdev mac_hid acpi_cpufreq nf_log_ipv6 ip6t_REJECT
[  569.922867]  nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom sd_mod hid_generic usbhid hid ahci libahci crc32c_intel libata xhci_pci scsi_mod xhci_hcd vfio_pci vfio_virqfd vfio_iommu_type1 vfio kvm_amd ccp rng_core kvm irqbypass
[  569.922884] CR2: 0000000000000010
[  569.922886] ---[ end trace c5c7ecbc97cc5c9f ]---
[  569.922887] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  569.922889] #PF: supervisor read access in kernel mode
[  569.922890] #PF: error_code(0x0000) - not-present page
[  569.922891] PGD 0 P4D 0 
[  569.922897] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922900] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922901] RSP: 0018:ffffa7a74084fce8 EFLAGS: 00010082
[  569.922904] Oops: 0000 [#7] PREEMPT SMP NOPTI
[  569.922906] CPU: 10 PID: 7339 Comm: CPU 6/KVM Tainted: P      D    OE     5.6.0-rc2-1-mainline #1
[  569.922908] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f715ce000
[  569.922909] RDX: 0000000000000001 RSI: ffffa38f3c9e1800 RDI: 0000000000000000
[  569.922910] RBP: ffffa38f3ca57408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922911] R10: 000000849ef65aa1 R11: 00000000000630c0 R12: 0000000000000000
[  569.922912] R13: 0000000000000202 R14: ffffa38f3ca57418 R15: ffffa38f3ca53a10
[  569.922914] FS:  00007f6b58bff700(0000) GS:ffffa3905ea40000(0000) knlGS:0000000000000000
[  569.922915] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922916] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922918] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 1405 11/19/2019
[  569.922919] note: CPU 5/KVM[7338] exited with preempt_count 1
[  569.922924] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.922926] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.922928] RSP: 0018:ffffa7a74319fce8 EFLAGS: 00010082
[  569.922930] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa39054d68000
[  569.922931] RDX: 0000000000000001 RSI: ffffa38f3dc17300 RDI: 0000000000000000
[  569.922932] RBP: ffffa38f3df77408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.922934] R10: 000000000000012c R11: 0000000000d24bee R12: 0000000000000000
[  569.922935] R13: 0000000000000202 R14: ffffa38f3df77418 R15: ffffa38f3df73a10
[  569.922936] FS:  00007f6b57dff700(0000) GS:ffffa3905ea80000(0000) knlGS:0000000000000000
[  569.922938] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.922939] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.922940] Call Trace:
[  569.922946]  svm_vcpu_unblocking+0x31/0x50 [kvm_amd]
[  569.922963]  kvm_vcpu_block+0xd1/0x340 [kvm]
[  569.922985]  kvm_arch_vcpu_ioctl_run+0x1234/0x1b20 [kvm]
[  569.922988]  ? __seccomp_filter+0xd2/0x6c0
[  569.923005]  kvm_vcpu_ioctl+0x266/0x630 [kvm]
[  569.923010]  ksys_ioctl+0x87/0xc0
[  569.923012]  __x64_sys_ioctl+0x16/0x20
[  569.923015]  do_syscall_64+0x4e/0x150
[  569.923017]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  569.923019] RIP: 0033:0x7f6b607452eb
[  569.923021] Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
[  569.923023] RSP: 002b:00007f6b57dfcea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  569.923024] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6b607452eb
[  569.923026] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001f
[  569.923027] RBP: 00007f6b5e104600 R08: 00005558b88eb110 R09: 0000000000000000
[  569.923028] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  569.923029] R13: 00007ffd77fa257f R14: 00007f6b57dfd140 R15: 00007f6b57dff700
[  569.923032] Modules linked in: vhost_net vhost tap cpufreq_ondemand ebtable_filter ebtables tun bridge stp llc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi nvidia_drm(POE) snd_seq_device mc nvidia_modeset(POE) nct6775 hwmon_vid nvidia(POE) amdgpu nls_iso8859_1 gpu_sched nls_cp437 vfat radeon fat fuse snd_hda_codec_realtek i2c_algo_bit snd_hda_codec_generic ttm ledtrig_audio eeepc_wmi snd_hda_codec_hdmi asus_wmi drm_kms_helper battery sparse_keymap rfkill wmi_bmof snd_hda_intel cec edac_mce_amd snd_intel_dspcfg snd_hda_codec crct10dif_pclmul drm crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep ipmi_devintf snd_pcm ipmi_msghandler aesni_intel r8169 agpgart snd_timer crypto_simd syscopyarea sp5100_tco cryptd sysfillrect realtek glue_helper joydev sysimgblt input_leds mousedev pcspkr k10temp snd i2c_piix4 fb_sys_fops libphy soundcore wmi pinctrl_amd evdev mac_hid acpi_cpufreq nf_log_ipv6 ip6t_REJECT
[  569.923057]  nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_multiport xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom sd_mod hid_generic usbhid hid ahci libahci crc32c_intel libata xhci_pci scsi_mod xhci_hcd vfio_pci vfio_virqfd vfio_iommu_type1 vfio kvm_amd ccp rng_core kvm irqbypass
[  569.923073] CR2: 0000000000000010
[  569.923075] ---[ end trace c5c7ecbc97cc5ca0 ]---
[  569.923079] RIP: 0010:svm_refresh_apicv_exec_ctrl+0xf4/0x130 [kvm_amd]
[  569.923081] Code: 8b 83 f8 39 00 00 48 39 c5 74 31 48 8b 9b f8 39 00 00 48 39 dd 75 13 eb 23 e8 08 a0 90 f0 85 c0 75 1a 48 8b 1b 48 39 dd 74 12 <48> 8b 7b 10 45 85 e4 75 e6 e8 5e 9f 90 f0 85 c0 74 e6 5b 4c 89 ee
[  569.923082] RSP: 0018:ffffa7a74084fce8 EFLAGS: 00010082
[  569.923084] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa38f715ce000
[  569.923085] RDX: 0000000000000001 RSI: ffffa38f3c9e1800 RDI: 0000000000000000
[  569.923086] RBP: ffffa38f3ca57408 R08: 0000000000000000 R09: 0000000000000bb8
[  569.923087] R10: 000000849ef65aa1 R11: 00000000000630c0 R12: 0000000000000000
[  569.923089] R13: 0000000000000202 R14: ffffa38f3ca57418 R15: ffffa38f3ca53a10
[  569.923090] FS:  00007f6b57dff700(0000) GS:ffffa3905ea80000(0000) knlGS:0000000000000000
[  569.923092] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  569.923093] CR2: 0000000000000010 CR3: 000000032f9ec000 CR4: 0000000000340ee0
[  569.923095] note: CPU 6/KVM[7339] exited with preempt_count 1

_---------- End dmesg output ----------_
Comment 6 Paolo Bonzini 2020-02-21 21:27:00 UTC
This is untested, but based on the crash dump it seems like the ir_list is uninitialized.  Can you try this:

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 4b19188faaae..92afca7c252a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2206,7 +2206,7 @@ static int avic_init_vcpu(struct vcpu_svm *svm)
 {
 	int ret;
 
-	if (!kvm_vcpu_apicv_active(&svm->vcpu))
+	if (!avic)
 		return 0;
 
 	ret = avic_init_backing_page(&svm->vcpu);
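
One way to read the crash dump against the "uninitialized ir_list" theory, as a hedged userspace sketch rather than kernel code: the faulting bytes `<48> 8b 7b 10` decode to `mov 0x10(%rbx),%rdi`, RBX is 0, and CR2 is 0x10. That is what a list walk would produce if the list head were still zero-filled, so that its `next` pointer is NULL instead of pointing back at the head. The struct and function names below (`ir_entry`, `init_list_head`, `first_data_read`) are illustrative stand-ins, and the layout (list node first, payload pointer after it) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the kernel's doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical list entry: node first, payload pointer right after it. */
struct ir_entry {
    struct list_head node;
    void *data;
};

/* Kernel-style initialisation: an empty list points at itself. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Models the first iteration of a list_for_each_entry()-style walk.
 * Returns false if the loop body would never run (initialised, empty list).
 * Otherwise stores the address the walk would load entry->data from.
 */
static bool first_data_read(const struct list_head *head, uintptr_t *addr)
{
    assert(head != NULL && addr != NULL);

    const struct list_head *pos = head->next;
    if (pos == head)
        return false;

    /* container_of(pos, struct ir_entry, node), then &entry->data */
    *addr = (uintptr_t)pos - offsetof(struct ir_entry, node)
            + offsetof(struct ir_entry, data);
    return true;
}
```

With a zero-filled head the computed address is `offsetof(struct ir_entry, data)`, which is 0x10 on a 64-bit build and matches the CR2 value in every oops above. With an initialised head the walk ends before touching any entry.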
Comment 7 Robert M. Muncrief 2020-02-22 00:21:32 UTC
Created attachment 287549 [details]
KVM crash at boot with two test patched

Unfortunately with the two test patches KVM crashes at boot now. And though the libvirtd status was good, virt-manager couldn't connect and nothing would run.

I noticed an odd APIC warning in dmesg about conflicting address space surrounding some nvidia stuff though, and wondered if that had anything to do with it. So I uninstalled rc6 and the Tk-Glitch nvidia-dkms drivers I had to use for rc6, and reinstalled the normal nvidia-dkms drivers. But the dmesg warning is still present with my working 5.5.4 kernel, and my VM runs great, so I don't think that has anything to do with it.

Nevertheless I'm attaching the full dmesg output I got at boot.

I'll be up for another few hours so if you need any more help today I'll get right on it. My son is getting married tomorrow though so I won't be available, but from Sunday on I'll be back online.

Thanks for all your help gentlemen!
Comment 8 Suravee Suthikulpanit 2020-02-24 13:52:19 UTC
Created attachment 287577 [details]
Patch to fix the NULL pointer de-reference for when AVIC is enabled + VFIO pass-through

Muncrief,

Would you please give this patch a try (on top of the previous patch)?

Thanks,
Suravee
Comment 9 Paolo Bonzini 2020-02-24 14:24:13 UTC
Suravee, isn't that the same patch I asked about in comment 6?
Comment 10 Suravee Suthikulpanit 2020-02-24 14:44:15 UTC
(In reply to Paolo Bonzini from comment #9)
> Suravee, isn't that the same patch I asked about in comment 6?

Hm.. You are right. It is the same. Sorry for the duplication.

However, I actually tested this patch and it fixed the issue reported by Muncrief with the same call trace signature.

Let me look more into this then.

Suravee
Comment 11 Robert M. Muncrief 2020-02-24 16:50:29 UTC
Created attachment 287579 [details]
Patched rc3 dmesg crash output

Since rc3 just came out and Suravee successfully tested the patches, I went ahead and did a clean build of rc3 with the two patches again. Unfortunately my results were the same: kvm crashes at boot. I've attached the dmesg output.

By the way, I'm a retired hardware/firmware/software designer from the olden days (1980s through early 2000s), specializing in embedded systems. So I've done everything from designing simple CPUs and microcontrollers, to firmware and simple RTOSs, to assemblers and high level software, including Linux. Heck I was working before things like Linux and GDB even existed! :) So if you need me to do something more complex I'd be happy to try. I'm pretty rusty though, so I don't know if trying to instruct me would help or hurt :)
Comment 12 Paolo Bonzini 2020-02-24 17:57:15 UTC
Comment on attachment 287579 [details]
Patched rc3 dmesg crash output

Based on the crashdump the failure seems to be at:

        if (!svm->vcpu.arch.apic->regs)
                return -EINVAL;

in avic_init_backing_page.  This suggests refining the patch to look like this:

-----
        if (!avic && !irqchip_in_kernel(vcpu->kvm))
                return 0;

        ret = avic_init_backing_page(&svm->vcpu);
-----

muncrief, in addition to testing the patch, can you please include the qemu command line (from "ps aux")?  I see nothing in your libvirt XML that suggests disabling KVM's in-kernel emulation of the local APIC (and the fact that EPYC-IBPB disables x2APIC also suggests that it's disabled), so we might have two bugs here.

What is your QEMU version?
Comment 13 Robert M. Muncrief 2020-02-24 20:36:43 UTC
Created attachment 287585 [details]
dmesg crash output with Paolo's latest patch

Hi Paolo. I changed the patch, however it doesn't compile with exactly what you suggested. You said "!irqchip_in_kernel(vcpu->kvm)" but svm is what's passed to the function so I had to change it to "!irqchip_in_kernel(&svm->vcpu.kvm)" or compilation failed. So to be clear the new patch I applied was:

-	if (!kvm_vcpu_apicv_active(&svm->vcpu))
+	if (!avic && !irqchip_in_kernel(svm->vcpu.kvm))
 		return 0;
 
 	ret = avic_init_backing_page(&svm->vcpu);

However, kvm still crashes at boot with the patch from Comment 4 and the new patch applied. I've attached the dmesg output to this comment. The qemu command will follow in a separate comment because I don't see how to attach two files to one comment.
Comment 14 Robert M. Muncrief 2020-02-24 20:38:54 UTC
Created attachment 287587 [details]
Exact qemu command being executed.

Hi again Paolo. I've attached the qemu command to this comment.
Comment 15 Robert M. Muncrief 2020-02-24 21:43:33 UTC
Created attachment 287595 [details]
Dmesg crash output with properly cast pointer

I looked at the modified patch I made and realized the pointer indirection was probably wrong so I looked at the other code and cast it the same way. There's still a kvm crash at boot and I attached the dmesg to this comment. I just didn't want to add any confusion by patching things incorrectly. Here's what the final code looks like after (what I hope is) the correct patch:

	int ret;
	struct kvm_vcpu *vcpu = &svm->vcpu;

	if (!avic && !irqchip_in_kernel(vcpu->kvm))
		return 0;

	ret = avic_init_backing_page(&svm->vcpu);
	if (ret)
		return ret;
Comment 16 Robert M. Muncrief 2020-02-24 21:47:38 UTC
(In reply to Paolo Bonzini from comment #12)
> What is your QEMU version?

Sorry I missed that Paolo. My QEMU version is 4.2.0.
Comment 17 Robert M. Muncrief 2020-02-24 22:09:24 UTC
Created attachment 287597 [details]
Success! Working dmesg output.

Okay gentlemen, I looked at the other code and it seemed to me that the condition should be an "or" instead of an "and" so I changed the patch again and my host system and VM both booted and are running perfectly as far as I can tell. However I have no idea if avic is actually working or whether what I've done is correct or not.

In any case, I attached the working dmesg output to this comment, and here's the final code I ended up with for avic_init_vcpu:

```
static int avic_init_vcpu(struct vcpu_svm *svm)
{
	int ret;
	struct kvm_vcpu *vcpu = &svm->vcpu;

	if (!avic || !irqchip_in_kernel(vcpu->kvm))
		return 0;

	ret = avic_init_backing_page(&svm->vcpu);
	if (ret)
		return ret;

	INIT_LIST_HEAD(&svm->ir_list);
	spin_lock_init(&svm->ir_list_lock);
	svm->dfr_reg = APIC_DFR_FLAT;

	return ret;
}
```
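
To make the difference between the "and" and the "or" concrete, here is a minimal stand-alone sketch. It is not kernel code: the `model_*` types and helpers are hypothetical stand-ins, and it assumes the crash comes from dereferencing a local APIC pointer that is NULL when the APIC is not emulated in the kernel. It only shows which guard lets execution reach that dereference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct model_lapic {
	void *regs;
};

struct model_vcpu {
	struct model_lapic *apic; /* assumed NULL when the local APIC lives in userspace */
};

/* Models an "is the local APIC emulated in the kernel?" check. */
static bool model_lapic_in_kernel(const struct model_vcpu *vcpu)
{
	return vcpu->apic != NULL;
}

/* Guard written with "and": returns true if execution would reach apic->regs. */
static bool reaches_deref_with_and(bool avic, const struct model_vcpu *vcpu)
{
	bool early_return = !avic && !model_lapic_in_kernel(vcpu);

	return !early_return;
}

/* Guard written with "or": returns true if execution would reach apic->regs. */
static bool reaches_deref_with_or(bool avic, const struct model_vcpu *vcpu)
{
	bool early_return = !avic || !model_lapic_in_kernel(vcpu);

	return !early_return;
}
```

With avic enabled and no in-kernel APIC, the "and" form evaluates to false and falls through to the dereference, while the "or" form returns early. The "or" form only proceeds when avic is enabled and the APIC object exists.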
Comment 18 Robert M. Muncrief 2020-02-24 22:53:32 UTC
Created attachment 287599 [details]
svm.c patch

Ha! Of course I forgot to add the actual patch I ended up using, so here it is.

But hey, I told you I'm old :)

However I'm still assuming what I did was wrong. Though my VM is working fine, the CPU-Z benchmark is significantly lower. For example the single threaded performance is now around 480, and with kernel 5.5.5 it's a little over 500, around 503 to 507.

Of course this is all anecdotal, and I haven't had time to do any real testing. But I'm going to wait to hear from you gentlemen before I proceed any further. I'd especially like to know how to tell if avic is even actually working.
Comment 19 Robert M. Muncrief 2020-02-24 23:08:07 UTC
Oh sheesh, never mind about the benchmarks, they're exactly the same as before. I forgot to shut down CrashPlan and it was doing its thing in the background. Once I shut it down and rebooted there was no difference in VM performance.
Comment 20 Paolo Bonzini 2020-02-25 07:50:24 UTC
> I looked at the other code and it seemed to me that the condition should be
> an "or" instead of an "and"

Uff, my bad, I used an OR here for testing and wrote an AND in the comment.  Age seems not to matter at all. :)  Regarding the performance, let's compare your current score with these two cases:

1) adding to the libvirt XML

  <ioapic driver='qemu'/>

right below

  <apic/>

2) loading kvm_amd with avic=0 which will surely disable the AVIC.
Comment 21 Suravee Suthikulpanit 2020-02-25 08:53:25 UTC
Paolo/Muncrief,

I have also finally reproduced the issue (w/ -machine kernel_irqchip=off). The recommended change (w/ if (!avic || !irqchip_in_kernel(svm->vcpu.kvm))) fixes the issue. Thanks for catching this.

Paolo, if the NULL pointer is due to:

    if (!svm->vcpu.arch.apic->regs)
        return -EINVAL;

Shouldn't we be checking the following instead:

    if (!avic || !lapic_in_kernel(&svm->vcpu))
        return 0;

This also works in my test.

Muncrief,

Besides enabling AVIC (modprobe kvm_amd avic=1), you can check to see if AVIC is activated for the VM by running "perf kvm stat live" while running the VM and see if there are any AVIC-related #vmexits (instead of vintr).

Thanks,
Suravee
Comment 22 Robert M. Muncrief 2020-02-25 20:34:29 UTC
Created attachment 287607 [details]
svm.c patch option 2

Okay gentlemen, per Suravee's inquiry I created a patch with the "if (!avic || !lapic_in_kernel(&svm->vcpu))" option, did a clean build, and tested the VM. Everything worked great, and there was no perceptible difference in operation or performance between the two patch variations. I've attached the alternate patch to this comment. I know nothing about the details of what's going on here of course, so it's up to you to choose which one you prefer.

The problem now is that it appears that avic is not actually working. I executed "perf kvm stat live" as Suravee suggested and all I saw was vintr, there were no vmexit events. I also disabled avic per Paolo's instructions and there was no perceptible difference in VM performance. I've done everything I could discover to assure avic is enabled as follows:

# dmesg | grep -i AMD-Vi
[    0.918716] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.920160] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.920161] pci 0000:00:00.2: AMD-Vi: Extended features (0x58f77ef22294ade):
[    0.920163] AMD-Vi: Interrupt remapping enabled
[    0.920163] AMD-Vi: Virtual APIC enabled
[    0.920163] AMD-Vi: X2APIC enabled
[    0.920272] AMD-Vi: Lazy IO/TLB flushing enabled
[    0.927736] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

# cat /sys/module/kvm_amd/parameters/avic
1

# systool -m kvm_amd -v
...
```
  Parameters:
    avic                = "1"
    dump_invalid_vmcb   = "N"
    nested              = "1"
    npt                 = "1"
    nrips               = "1"
    pause_filter_count_grow= "2"
    pause_filter_count_max= "65535"
    pause_filter_count_shrink= "0"
    pause_filter_count  = "3000"
    pause_filter_thresh = "128"
    sev                 = "0"
    vgif                = "1"
    vls                 = "1"
...
```

So at this point I'm perplexed as to why avic isn't working. Any suggestions or further instructions would be greatly appreciated.
Comment 23 Robert M. Muncrief 2020-02-25 20:42:59 UTC
(In reply to Paolo Bonzini from comment #20)
> 
> Uff, my bad, I used an OR here for testing and wrote an AND in the comment. 
> Age seems not to matter at all. :) ...

Ha! No problem Paolo. I greatly appreciate your help, and actually wondered if you were just testing to see if I was really an engineer. In any case, I've always been the classic "absent minded professor" type myself, and even had a manager once tell me I did a fantastic job, but that it was "terrifying" to watch me work :)
Comment 24 Robert M. Muncrief 2020-02-26 02:25:10 UTC
(In reply to muncrief from comment #22)

> ... I executed "perf kvm stat live" as Suravee suggested and all I saw was vintr,
> there were no vmexit events. ...

That was a typo, so I want to make clear that I expect to see "avic_incomplete_ipi" or "avic_unaccelerated_access" vmexits in the VM-EXIT column, as those are the only avic related exits I could find in "tools/arch/x86/include/uapi/asm/svm.h" and "arch/x86/include/uapi/asm/svm.h." If there are others please let me know. Right now the only interrupt related things I see are "interrupt" and "vintr."
Comment 25 Robert M. Muncrief 2020-02-26 20:34:03 UTC
Created attachment 287639 [details]
qemu-vkm setup info resulting in nonfunctional avic

It occurred to me that the original bug may be solved, and that my problems getting avic to function may need to be addressed by creating a separate bug report. I'm not sure though, so I've attached my current grub command line, qemu-kvm command line, and VM XML to this comment. And you can look at Comment 22 to see my resulting system configuration.

And by the way, I've tried a myriad of other options in the command lines and XML as well, but they all resulted in avic not functioning so there's no point in detailing them all.

So if someone could review this information to verify that everything is set up correctly, and let me know if it's time to file a different bug report, I'd appreciate it. I know you're all very busy and don't want to bother you with inquiries outside the scope of the initial bug.
Comment 26 Suravee Suthikulpanit 2020-02-27 14:49:47 UTC
There are several reasons that could inhibit the AVIC from being activated even though it is enabled during module load (i.e. modprobe kvm_amd avic=1).

Could you please try the following patch:

---- BEGIN PATCH ----
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 40a0c0f..fb7e5a6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -995,6 +995,7 @@ struct kvm_vm_stat {
        ulong lpages;
        ulong nx_lpage_splits;
        ulong max_mmu_page_hash_collisions;
+       ulong apicv_inhibit_reasons;
 };

 struct kvm_vcpu_stat {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fb5d64e..2c968a7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -222,6 +222,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
        { "nx_largepages_splitted", VM_STAT(nx_lpage_splits, .mode = 0444) },
        { "max_mmu_page_hash_collisions",
                VM_STAT(max_mmu_page_hash_collisions) },
+       { "apicv_inhibit_reasons", VM_STAT(apicv_inhibit_reasons, .mode = 0444) },
        { NULL }
 };

@@ -8051,6 +8052,7 @@ void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
                        return;
        }

+       kvm->stat.apicv_inhibit_reasons = kvm->arch.apicv_inhibit_reasons;
        trace_kvm_apicv_update_request(activate, bit);
        if (kvm_x86_ops->pre_update_apicv_exec_ctrl)
                kvm_x86_ops->pre_update_apicv_exec_ctrl(kvm, activate);
---- END PATCH ----

Then, while running the VM, please run "cat /sys/kernel/debug/kvm/*/apicv_inhibit_reasons". This should allow us to see why KVM deactivates AVIC.

Trying your XML file in the description, I also noticed that AVIC is deactivated for the VM. However, when I tried specifying the EPYC-IBPB model in the XML, it created the VM w/ AVIC activated. Could you please give it a try?

Thanks,
Suravee
Comment 27 Robert M. Muncrief 2020-02-27 20:50:32 UTC
Created attachment 287667 [details]
avic_inhibit_reasons debug information

Oh, that was so cool Suravee!

I applied the initial svm.c and debug patch and tested both host-passthrough and EPYC-IBPB configurations, and avic failed with both.

With host-passthrough the avic_inhibit_reasons value immediately changed from 0 to 20, and then a few seconds later changed to 28.

With the EPYC-IBPB the avic_inhibit_reasons value immediately changed from 0 to 16, and then a few seconds later changed to 24.

I wanted to give you as much accurate information as possible so I attached the avic_inhibit_reason values, my physical machine and domain capabilities, and the command lines and XML files used in the tests for each configuration to this comment.

By the way, I ran multiple tests with both configurations. I used the defaults, then explicit 1-8-2 topology, and with host-passthrough I removed the cache passthrough and VM hiding. I also ran all tests again using the second variation of the svm.c patch. Unfortunately nothing changed the avic failure though.

I hope this information helps, and am looking forward to assisting you gentlemen further in any way possible.
Comment 28 Kayant 2020-02-27 23:00:46 UTC
Created attachment 287685 [details]
avic_inhibit_reasons-anthony

Hi, I also just wanted to share the observations I made while testing the patches.

I can confirm I also don't have crashes relating to the original report.

I have been trying out the SVM AVIC patches since around the first patch that was submitted, but never got round to documenting my testing until recently.

I can't remember the specific patch set/kernel version I tried, but I remember having avic apparently working when synic + stimer were enabled but not without them. If my understanding is correct this shouldn't be the case, as synic is meant to be a case where avic is permanently disabled.

This is still the case with the current patchset.

In summary, I can get avic reporting that it's working according to perf stat and trace logs when synic is on, but not when synic is off. Using EPYC-IBPB or passthrough doesn't change the avic_inhibit_reasons.

With Synic I get avic_inhibit_reasons - 10
With Synic+Stimer off I get - 0


To note, I am using arch linux + qemu 4.2 + linux-mainline-5.6.0-rc2.

Please see the attachment for a small trace log of synic on vs off, domain capabilities, perf stat and patches used.

These were recorded once the VM was launched and sitting at the login screen.

Please let me know if there is any other info I can provide to help.
Comment 29 Robert M. Muncrief 2020-02-28 00:12:47 UTC
I have to knock off for today gentlemen, but just wanted to let you know that if I disable nested virtualization in the host-passthrough configuration I get the same avic_inhibit_reasons value as the EPYC-IBPB configuration. So instead of "20 28" for host-passthrough I get the same "16 24" as EPYC.

I also saw Anthony's comment about synic so I turned it off in both configurations with "<synic state='off'/>", but it didn't affect anything on my system. I think it's actually off by default though because when I tried to turn it on with "<synic state='on'/>" I got an error when trying to save the XML.
Comment 30 Kayant 2020-02-28 00:20:59 UTC
(In reply to muncrief from comment #29)
> I have to knock off for today gentlemen, but just wanted to let you know
> that if I disable nested virtualization in the host-passthrough
> configuration I get the same avic_inhibit_reasons value as the EPYC-IBPB
> configuration. So instead of "20 28" for host-passthrough I get the same "16
> 24" as EPYC.
> 
> I also saw Anthony's comment about synic so I turned it off in both
> configurations with "<synic state='off'/>", but it didn't affect anything on
> my system. I think it's actually off by default though because when I tried
> to turn it on with "<synic state='on'/>" I got an error when trying to save
> the XML.

I'm guessing it's complaining about stimer not being enabled. To use synic you need
<stimer state="on"/>
Comment 31 Suravee Suthikulpanit 2020-02-28 03:38:14 UTC
(In reply to muncrief from comment #27)
> Created attachment 287667 [details]
> avic_inhibit_reasons debug information

Thanks for the info
> With host-passthrough the avic_inhibit_reasons value immediately changed
> from 0 to 20, and then a few seconds later changed to 28.
> 
> With the EPYC-IBPB the avic_inhibit_reasons value immediately changed from 0
> to 16, and then a few seconds later changed to 24.

24 means: 
APICV_INHIBIT_REASON_IRQWIN
APICV_INHIBIT_REASON_PIT_REINJ

Looking at the QEMU command line, you might need to either:
  1. Specify "-machine kernel_irqchip=split" (defaults to "on")
Or 
  2. Specify "-global kvm-pit.lost_tick_policy=discard" (default to "delay")

Please give this a try.
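
For anyone decoding other values reported in this thread (16, 20, 28, ...), here is a small stand-alone sketch that turns the decimal mask into reason names. The bit numbering is an assumption recalled from the APICV_INHIBIT_REASON_* defines in arch/x86/include/asm/kvm_host.h around v5.6, so please verify it against your own tree. The helper name `decode_inhibit_reasons` is made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Assumed bit positions (index == bit number) of the
 * APICV_INHIBIT_REASON_* defines around v5.6; check your kernel source.
 */
static const char *const reason_names[] = {
	"DISABLE", "HYPERV", "NESTED", "IRQWIN", "PIT_REINJ", "X2APIC",
};

/* Writes a comma-separated list of the set reasons into buf and returns buf. */
static char *decode_inhibit_reasons(unsigned long mask, char *buf, size_t len)
{
	size_t count = sizeof(reason_names) / sizeof(reason_names[0]);
	size_t used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (size_t bit = 0; bit < count; bit++) {
		int written;

		if (!(mask & (1UL << bit)))
			continue;
		written = snprintf(buf + used, len - used, "%s%s",
				   used ? "," : "", reason_names[bit]);
		if (written < 0 || (size_t)written >= len - used)
			break; /* output truncated; stop here */
		used += (size_t)written;
	}
	return buf;
}
```

Under that assumed numbering, 24 decodes to IRQWIN,PIT_REINJ (matching the two reasons listed above) and 20 decodes to NESTED,PIT_REINJ.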

Suravee
Comment 32 Suravee Suthikulpanit 2020-02-28 03:44:32 UTC
(In reply to Anthony from comment #28)
> Created attachment 287685 [details]
> avic_inhibit_reasons-anthony
> 
> Hi I also just wanted to give my observations I have found when testing the
> patches.
> 
> I confirm I also don't have don't have crashes relating to the original
> report. 
> 
> I have been trying out the SVM AVIC patches since around the first patch
> that was submitted but never got round to documentation my testing until
> recently.
> 
> I can't remember the specific patch set/kernel version I tried but I
> remember having avic apparently working with when synic + stimer where
> enabled but not without. If my understanding is correctly this shouldn't be
> the case as synic is meant to be a case when avic is permanently disabled.
> 
> This is still the case with current patchset. 
> 
> In summary I can get avic reporting it's working according to perf stat and
> trace logs when synic is on but not working when synic is off. Using
> EPYC-IBPB or passthrough doesn't change the avic_inhibit_reasons.
> 
> With Synic I get avic_inhibit_reasons - 10
> With Synic+Stimer off I get - 0
> 
> 
> To note I am using arch linux + qemu 4.2 + linux-mainline-5.6.0-rc2.
> 
> Please see a small trace log of synic on vs off, domain capabilities, perf
> stat and patches used.
> 
> These were recording once the VM was launched and sitting at the login
> screen.
> 
> Please let me know if there is any other info I get provide to help.

Thanks for the observation info, and your observation makes sense. AVIC is also deactivated w/ synic enabled. (https://elixir.bootlin.com/linux/v5.6-rc3/source/arch/x86/kvm/hyperv.c#L773)

Thanks,
Suravee
Comment 33 Robert M. Muncrief 2020-02-28 07:25:21 UTC
(In reply to Anthony from comment #30)
> (In reply to muncrief from comment #29)
> ...
> Am guessing it's likely complaining about stimer not being enabled. To use
> synic you need 
> <stimer state="on"/>

Thank you Anthony, you were correct.
Comment 34 Robert M. Muncrief 2020-02-28 07:26:51 UTC
Created attachment 287697 [details]
Working avic setup information

Final success Suravee!

The last steps were to disable nested virtualization and set the pit tickpolicy to discard. Setting kernel_irqchip to split was not necessary, and actually caused the GPU HDMI audio to crackle and eventually disappear. There was also no need to pass kernel options via grub as they can be specified in an option file. I've attached the final working grub and qemu command lines, and VM XML, to this comment.

So to wrap things up, here are the steps I took to get avic fully functioning:

1. Download the Arch linux-mainline AUR package, which currently downloads kernel 5.6 rc3 from https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git.

2. Patch the kernel source with "svm.c patch option 2" from the Attachments and compile the kernel.

3. Create file "/etc/modprobe.d/kvm.conf" and add "options kvm_amd avic=1 nested=0" to it.

4. Edit the VM XML file and change the pit tickpolicy with "<timer name='pit' tickpolicy='discard'/>"

And that was it. I didn't see any performance change with the quick CPUID benchmark I ran, but that may be because the VM runs at 98% of bare metal already. It's really amazing.

So if you could give me the final word on which svm patch should be used that's it for this bug (I used patch 2 because it seemed more direct). And if anyone knows how to set the avic and nested options in the XML file, instead of an options file, that would be great. I couldn't find any way to do it, and at this point am assuming it can only be done in a file.

Thank you and Paolo and Alex for all your help, it's been both fun and educational hammering out this bug with all of you.
Comment 35 Paolo Bonzini 2020-02-28 07:55:50 UTC
> how to set the avic and nested options in the XML file, instead of an options
> file, that would be great

For nested you can use

    <feature policy='disable' name='svm'/>

inside the <cpu> element.  For avic=1 it will probably be the default in 5.7.
Comment 36 Robert M. Muncrief 2020-02-28 16:06:47 UTC
(In reply to Paolo Bonzini from comment #35)
> > how to set the avic and nested options in the XML file, instead of an options
> > file, that would be great
> 
> For nested you can use
> 
>     <feature policy='disable' name='svm'/>
> 
> inside the <cpu> element.  For avic=1 it will probably be the default in 5.7.

I took the nested option out of the file and added the disable svm statement to the XML and that worked perfectly Paolo. Thank you.
Comment 37 Kayant 2020-02-28 20:14:56 UTC
Created attachment 287701 [details]
Perf_stat_anthony_apicv

(In reply to Suravee Suthikulpanit from comment #32)
> (In reply to Anthony from comment #28)
> > Created attachment 287685 [details]
> > avic_inhibit_reasons-anthony
> > 
> > Hi I also just wanted to give my observations I have found when testing the
> > patches.
> > 
> > I confirm I also don't have don't have crashes relating to the original
> > report. 
> > 
> > I have been trying out the SVM AVIC patches since around the first patch
> > that was submitted but never got round to documentation my testing until
> > recently.
> > 
> > I can't remember the specific patch set/kernel version I tried but I
> > remember having avic apparently working with when synic + stimer where
> > enabled but not without. If my understanding is correctly this shouldn't be
> > the case as synic is meant to be a case when avic is permanently disabled.
> > 
> > This is still the case with current patchset. 
> > 
> > In summary I can get avic reporting it's working according to perf stat and
> > trace logs when synic is on but not working when synic is off. Using
> > EPYC-IBPB or passthrough doesn't change the avic_inhibit_reasons.
> > 
> > With Synic I get avic_inhibit_reasons - 10
> > With Synic+Stimer off I get - 0
> > 
> > 
> > To note I am using arch linux + qemu 4.2 + linux-mainline-5.6.0-rc2.
> > 
> > Please see a small trace log of synic on vs off, domain capabilities, perf
> > stat and patches used.
> > 
> > These were recording once the VM was launched and sitting at the login
> > screen.
> > 
> > Please let me know if there is any other info I get provide to help.
> 
> Thanks for the observation info, and your observation makes sense. AVIC is
> also deactivated w/ synic enabled.
> (https://elixir.bootlin.com/linux/v5.6-rc3/source/arch/x86/kvm/hyperv.c#L773)
> 
> Thanks,
> Suravee

I see. I remember reading through your patchset and trying to work things out, so it's good to see my basic understanding wasn't too far off :)

If it's not too much of a bother, I was wondering if you could list the requirements needed to enable avic from a kvm/qemu setup. I just wanted to check if I am missing anything on my side, as I am not 100% sure how to tell avic is working by looking at a trace or perf stat/live.

I ran perf stat from when the VM was launched via libvirt until about when it got to the login screen, and I saw about 36 "kvm_apicv_update_request" events. The attachment has the full output if that is of use. I also got a trace of this, but as I redirected all the output to a file I ended up with quite a large file (462MB). So I wanted to see if there is anything I should be looking out for in the trace to see if avic is working as intended. Also, if you don't mind, could you show a sample of a trace where avic is working?
Comment 38 Robert M. Muncrief 2020-02-28 21:49:34 UTC
(In reply to Anthony from comment #37)
> ... I just wanted to check if I am missing anything on my side as I am not 100% sure on how to tell avic is working looking a trace, perf stat/live. ... 

You can tell if avic is working by executing "perf kvm stat live" and then looking for "avic_incomplete_ipi" and "avic_unaccelerated_" in the VM-EXIT column. If it's not working you'll see "vintr" instead.

By the way, "avic_unaccelerated_" is actually "avic_unaccelerated_access" in the code but evidently it gets truncated in the perf command output.
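
If you want to script that check, a prefix match copes with the truncated name. This is only a sketch: `is_avic_exit` is a made-up helper, and it assumes you have already pulled the exit name out of the VM-EXIT column yourself.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if an exit name from the VM-EXIT column looks AVIC related.
 * Prefix matching is used so that both "avic_unaccelerated_" (as perf
 * prints it) and "avic_unaccelerated_access" (as in svm.h) are accepted.
 */
static bool is_avic_exit(const char *name)
{
	static const char incomplete_ipi[] = "avic_incomplete_ipi";
	static const char unaccelerated[] = "avic_unaccelerated_";

	return strncmp(name, incomplete_ipi, sizeof(incomplete_ipi) - 1) == 0 ||
	       strncmp(name, unaccelerated, sizeof(unaccelerated) - 1) == 0;
}
```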
Comment 39 Kayant 2020-02-29 07:02:52 UTC
(In reply to muncrief from comment #38)
> (In reply to Anthony from comment #37)
> > ... I just wanted to check if I am missing anything on my side as I am not
> 100% sure on how to tell avic is working looking a trace, perf stat/live. ... 
> 
> You can tell if avic is working by executing "perf kvm stat live" and then
> looking for "avic_incomplete_ipi" and "avic_unaccelerated_" in the VM-EXIT
> column. If it's not working you'll see "vintr" instead.
> 
> By the way the way "avic_unaccelerated_" is actually
> "avic_unaccelerated_access" in the code but evidently it gets truncated in
> the perf command output.

Oh if that's the case then my understanding has just been poor as I assumed the kvm_apicv_update_request counter should be higher to show the times where apicv has been activated and deactivated which should also be reflected in a trace. At least that is what it reads like to me reading this patch - https://lore.kernel.org/patchwork/patch/1153605/
Comment 40 Robert M. Muncrief 2020-02-29 17:40:13 UTC
(In reply to Anthony from comment #39)
> (In reply to muncrief from comment #38)
> > (In reply to Anthony from comment #37)
> ... Oh if that's the case then my understanding has just been poor as I assumed
> the kvm_apicv_update_request counter should be higher to show the times
> where apicv has been activated and deactivated which should also be
> reflected in a trace. At least that is what it reads like to me reading this
> patch - https://lore.kernel.org/patchwork/patch/1153605/

I took a look at that just out of curiosity Anthony but unfortunately I don't know anything about the Linux kernel code, I've just been following along with the devs as best I can. I'm simply a retired hardware/firmware/software designer from the olden days. And by "olden" I mean before Linux, and even things like CGA graphics, existed :)

I was just passing along what I learned from Suravee, and some cursory observation of tiny related code segments. All I was ever able to accomplish was a partial understanding of the first 5 bits of the avic_inhibit_reasons output :)

Your question is an interesting one though, I wasn't even aware that a request counter existed!
Comment 41 Kayant 2020-02-29 19:43:47 UTC
(In reply to muncrief from comment #40)
> (In reply to Anthony from comment #39)
> > (In reply to muncrief from comment #38)
> > > (In reply to Anthony from comment #37)
> > ... Oh if that's the case then my understanding has just been poor as I
> assumed
> > the kvm_apicv_update_request counter should be higher to show the times
> > where apicv has been activated and deactivated which should also be
> > reflected in a trace. At least that is what it reads like to me reading
> this
> > patch - https://lore.kernel.org/patchwork/patch/1153605/
> 
> I took a look at that just out of curiosity Anthony but unfortunately I
> don't know anything about the Linux kernel code, I've just been following
> along with the devs as best I can. I'm simply a retired
> hardware/firmware/software designer from the olden days. And by "olden" I
> mean before Linux, and even things like CGA graphics, existed :)
> 
> I was just passing along what I learned from Suravee, and some cursory
> observation of tiny related code segments. All I was ever able to accomplish
> was a partial understanding of the first 5 bits of the avic_inhibit_reasons
> output :)
> 
> Your question is an interesting one though, I wasn't even aware that a
> request counter existed!

Woah that's quite the background you got there :)
For me I have just done some basic programming in my life and just try to piece things together as I go along.
The only reason I asked was because, if I recall correctly, in earlier patchsets the debugging interface for apicv was different and the counter (probably not the correct term) that showed apicv activity had more updates when it was working. Although given it's changed since then, I'm not sure how it's meant to work now.

Below is a sample of the output when you enable the trace -
 
"echo 1 >/sys/kernel/debug/tracing/events/kvm/kvm_apicv_update_request/enable"

Then to see the output

"cat /sys/kernel/debug/tracing/trace_pipe"

           <...>-211863 [000] .... 22493.097745: kvm_apicv_update_request: deactivate bit=4
           <...>-211863 [000] .... 22493.097818: kvm_apicv_update_request: activate bit=4
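
For a large trace file, lines like the two above can be picked apart with a few lines of C. This is only a sketch that assumes the line format shown in the sample ("kvm_apicv_update_request: <activate|deactivate> bit=<n>"); `struct apicv_update` and `parse_apicv_update` are names invented for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One decoded kvm_apicv_update_request event. */
struct apicv_update {
	bool activate;    /* true for "activate", false for "deactivate" */
	unsigned int bit; /* inhibit-reason bit number from "bit=N" */
};

/* Returns true and fills *out if the line carries an update-request event. */
static bool parse_apicv_update(const char *line, struct apicv_update *out)
{
	static const char tag[] = "kvm_apicv_update_request: ";
	const char *event = strstr(line, tag);
	char action[16];
	unsigned int bit;

	if (!event)
		return false;
	event += sizeof(tag) - 1;

	/* Expect "<action> bit=<number>" right after the event name. */
	if (sscanf(event, "%15s bit=%u", action, &bit) != 2)
		return false;

	if (strcmp(action, "activate") == 0)
		out->activate = true;
	else if (strcmp(action, "deactivate") == 0)
		out->activate = false;
	else
		return false;

	out->bit = bit;
	return true;
}
```

Feeding every line of trace_pipe through this and counting events per bit would show which inhibit reason is being toggled, here bit 4.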
Comment 42 Robert M. Muncrief 2020-03-01 06:27:00 UTC
Created attachment 287723 [details]
Many "WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);" warnings

Oh sheesh gentlemen. I'd grepped dmesg for "error" after getting avic working, but when I went to check out the latest git to see if automount was fixed and manually browsed through dmesg I discovered a bunch of KVM warnings. So I went back and checked rc3 and they're there as well. I've attached the dmesg output from the latest git since I figure that's what you're working with.

They're coming from the svm.c "avic_vcpu_load" function. Here's the statement:
```
	WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
```
The function continues after the warning, and I've tested the VM pretty thoroughly now and haven't discovered any problems, so of course as usual I have no idea what the heck is going on :)

Anyway, I just thought I'd let you know and if you want me to help debug it I'd be happy to. It would be nice to get rid of them though because it makes the dmesg output look really scary! :) Seriously though, many people would assume they're errors because of the way they display.

By the way, avic works without any patches with today's git. I saw part of the patch created here was there, and part wasn't, so I just compiled git without patches and it seems to work fine. And automount is fixed! Yay!
Comment 43 Paolo Bonzini 2020-03-01 18:21:48 UTC
Hey, this should fix the warning (not sure because it's untested and I'd wait for Suravee to confirm it's the intended behavior):

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b51b362a9736..81c2cfa96b69 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2162,6 +2162,9 @@ static void avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
+	if (svm->avic_is_running == is_run)
+		return;
+
 	svm->avic_is_running = is_run;
 	if (is_run)
 		avic_vcpu_load(vcpu, vcpu->cpu);
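
To illustrate what the early return is meant to do, here is a stand-alone model (not kernel code, untested against real hardware like the patch itself). It assumes one possible trigger only: avic_set_running() being asked for a state it is already in, so that the load path runs while the "is running" bit is still set. All `model_*` names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of the per-vCPU state touched by the patch. */
struct model_svm {
	bool avic_is_running;  /* cached state, as in svm->avic_is_running */
	bool entry_running;    /* stands in for the IS_RUNNING bit of the ID entry */
	unsigned int warnings; /* counts would-be WARN_ON hits */
};

/* Models the load path: warn if the running bit is unexpectedly set. */
static void model_vcpu_load(struct model_svm *svm)
{
	if (svm->entry_running)
		svm->warnings++;
	svm->entry_running = true;
}

/* Models the put path: clear the running bit. */
static void model_vcpu_put(struct model_svm *svm)
{
	svm->entry_running = false;
}

/* with_guard selects between the unpatched and the patched behaviour. */
static void model_set_running(struct model_svm *svm, bool is_run, bool with_guard)
{
	if (with_guard && svm->avic_is_running == is_run)
		return; /* nothing changed, skip the load/put */

	svm->avic_is_running = is_run;
	if (is_run)
		model_vcpu_load(svm);
	else
		model_vcpu_put(svm);
}
```

In this model two back-to-back calls with is_run set to true produce one warning without the guard and none with it. Whether that is the sequence actually happening on the affected machines is not established here.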
Comment 44 Robert M. Muncrief 2020-03-02 07:01:03 UTC
(In reply to Paolo Bonzini from comment #43)
> Hey, this should fix the warning (not sure because it's untested and I'd
> wait for Suravee to confirm it's the intended behavior): ...
> 

Thanks Paolo. I assumed you meant "svm->avic_is_running == 1" because "is_run" isn't defined, but along the way I could see that functions like "avic_set_running" actually called "avic_vcpu_load" with "is_true" set to true.

So, being confused about the intended logic, I spent an interesting day trying to figure out why the stack trace seemed to show "avic_vcpu_load" being called by "kvm_vcpu_block", which didn't have any obvious calls to "avic_vcpu_load".

I don't know how to set up gdb to debug the kernel, and after doing a quick search it looked pretty difficult, so I just used an old-fashioned technique of defining a global unsigned integer and setting/clearing tracking bits throughout "kvm_vcpu_block" to trace the real time flow of the code. I then output the bits from "avic_vcpu_load" when the error condition occurred so I could see where "kvm_vcpu_block" was when the warning condition was triggered.

And what I found was that "avic_vcpu_load" is branched to after the "schedule()" call in "kvm_vcpu_block". There's a for loop that executes "prepare_to_swait_exclusive" and then "schedule()", and that's when "avic_vcpu_load" is executed.

When I saw that I realized that tracking bits wouldn't do, as it appears to be some kind of preemption issue. So I'm seriously thinking about setting up my system for gdb kernel debugging because it really pissed me off that I couldn't figure it out! :)

Anyway, yes, I'm crazy like that :) I spent the whole day sprinkling tracking bits throughout the code and then recompiling the kernel over and over so I could decipher real time code flow. Hey! Don't laugh! That's the way we used to do it in the olden days ... :)
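
The tracking-bit technique described above can be sketched in user-space C roughly as follows. The checkpoint names and helper functions are hypothetical (the actual bits used are not listed in this comment), and `printf` stands in for whatever logging call the instrumented code would use.

```c
#include <stdio.h>

/* Hypothetical checkpoint names; the real ones are not given in the thread. */
enum trace_bit {
	TRACE_ENTERED_BLOCK   = 1u << 0,
	TRACE_BEFORE_SCHEDULE = 1u << 1,
	TRACE_AFTER_SCHEDULE  = 1u << 2,
};

/* One global word records which checkpoints are currently "active". */
unsigned int trace_bits;

/* Mark a checkpoint as reached. */
void trace_set(unsigned int bit)
{
	trace_bits |= bit;
}

/* Mark a checkpoint as left. */
void trace_clear(unsigned int bit)
{
	trace_bits &= ~bit;
}

/* Called from the place where the unexpected condition is detected. */
void trace_dump(const char *where)
{
	printf("%s: trace_bits=0x%x\n", where, trace_bits);
}
```

A plain global like this is not protected against concurrent updates from several CPUs, so with preemption in play the dumped value may not reflect a single consistent code path.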
Comment 45 Robert M. Muncrief 2020-03-03 05:04:22 UTC
Well, I spent some time today trying to figure out why the AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK is sometimes set when executing "avic_vcpu_load". I thought it might be a concurrency issue with "avic_physical_id_cache" so I put spinlocks around all access code but that didn't change anything, and tried some other things but I regret to say I failed.

First of all, I don't even know how "avic_vcpu_load" ends up being executed without being called by any of the functions in "svm_x86_ops" (svm_vcpu_blocking, svm_vcpu_unblocking, svm_vcpu_load). I did my best to figure out how swait worked, but I must really be missing something, and not understanding the call stack trace correctly.

In any case, the basic issue is that occasionally "avic_vcpu_load" is executed when the AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK bit is 1, and it always expects it to be 0. And this always seems to happen after a cpu is placed on the swait wait queue in the "kvm_vcpu_block" function. It seems like somehow when it resumes it's executing "avic_vcpu_load", and the AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK bit is 1 instead of 0.

I'm sure this is just mass misunderstanding on my part. And by the way I'm going through all of this because it's a good way for me to learn how real time concurrent kernel code operates. I know it seems like a backwards way to learn, but that's the only way I can.

However, despite my ignorance, the header in "swait.h" is concerning, and makes me wonder if there's something wrong with swait itself. It even titles itself "BROKEN wait-queues." Here's an excerpt from the header:

```
/*
 * BROKEN wait-queues.
 *
 * These "simple" wait-queues are broken garbage, and should never be
 * used. The comments below claim that they are "similar" to regular
 * wait-queues, but the semantics are actually completely different, and
 * every single user we have ever had has been buggy (or pointless).
 * ...
```
Comment 46 Kayant 2020-03-22 13:43:02 UTC
Suravee Suthikulpanit 

Just wanted to say thanks a lot for the latest patches - 

iommu/amd: Fix IOMMU AVIC not properly update the is_run bit in IRTE - https://lore.kernel.org/patchwork/patch/1208762/

This patch fixes the performance differences I saw when I was testing AVIC with your earlier patchsets. I believe this was what caused my confusion: I found that with AVIC enabled I had worse performance at times, so I thought it was due to me misunderstanding how things should be set up.

kvm: svm: Introduce GA Log tracepoint for AVIC - https://lore.kernel.org/patchwork/patch/1208775/

This is perfect for people like me with a limited understanding of things, as it more easily tells you whether or not AVIC is working.

On that point, with the IOMMU AVIC fix I have had great results with my setup, which now has a total of 7 PCIe devices that are passed through to my Windows guest.

Thanks again to everyone for all the hard work.
Comment 47 Robert M. Muncrief 2020-03-22 18:58:40 UTC
Created attachment 288009 [details]
dmesg output with latest patches from Comment 46

I'm glad to see people are still working on this bug, but unfortunately the patches from Comment 46 don't change anything on my system. I still get the same warning messages numerous times about the "is running" bit. And it still comes from this same instruction in svm.c : 

```
WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
```

I'm running 5.6-rc6, and the first patch was already in the code, so I just added the trace patch for completeness. Then I rebooted and immediately ran my VM and have attached the full dmesg output just in case there's something unusual going on in my system that you might recognize.

Let me know if there's anything else I can do to help. I'd given up on this bug because the several fixes I tried didn't work, and then I didn't see any activity here so I began to wonder if the warnings were unimportant. But as I said before they sure make the dmesg output look alarming, and it would be nice to get rid of them even if they aren't significant. I guess I could just comment out the line if worst comes to worst though.
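
To make the condition behind that warning concrete, here is a small user-space model of an entry with an "is running" bit. It is a sketch only: `MODEL_IS_RUNNING_MASK`, `model_load` and `model_put` are hypothetical names, and the bit position is an assumption for illustration rather than something stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration; check your kernel's headers. */
#define MODEL_IS_RUNNING_MASK (1ULL << 62)

/*
 * Model of the "load" path: returns true if the entry already had the
 * is-running bit set, i.e. the case in which the WARN_ON would fire.
 */
bool model_load(uint64_t *entry)
{
	bool would_warn = (*entry & MODEL_IS_RUNNING_MASK) != 0;

	*entry |= MODEL_IS_RUNNING_MASK;
	return would_warn;
}

/* Model of the "put" path: clears the is-running bit. */
void model_put(uint64_t *entry)
{
	*entry &= ~MODEL_IS_RUNNING_MASK;
}
```

In this model a load that follows a put is quiet, while two loads in a row without a put in between report the warning condition, which matches the symptom seen in dmesg.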
Comment 48 Suravee Suthikulpanit 2020-04-03 13:55:10 UTC
muncrief,

Thanks for providing the info and attempting to investigate the issue.

I have been trying to reproduce the issue but could not trigger the warning on my system. I might need your current configuration on how to reproduce the warning.

Also, the patch from Paolo in comment 43 does not fix the warning?
Comment 49 Kayant 2020-04-04 12:57:34 UTC
Just wanted to share my experience also with trying to replicate muncrief's issue. So far I haven't had any luck reproducing the issue with my config either using a Windows 10/Linux Guest. 

Suravee Suthikulpanit, just a quick question. Since my previous comment I have gained a better understanding and found ways to tell for sure that AVIC is working. On that note, in Windows I can't seem to get IOMMU AVIC working after testing different configurations, but SVM AVIC works great.

Using perf kvm --host top -p `pidof qemu-system-x86_64`

On Linux -

   0.12%  [kvm_amd]  [k] avic_vcpu_put.part.0
   0.10%  [kvm_amd]  [k] avic_vcpu_load
   0.02%  [kvm_amd]  [k] avic_incomplete_ipi_interception
   0.01%  [kvm_amd]  [k] svm_deliver_avic_intr
   
   2.83%  [kernel]  [k] iommu_completion_wait
   0.87%  [kernel]  [k] __iommu_queue_command_sync
   0.16%  [kernel]  [k] amd_iommu_update_ga
   0.03%  [kernel]  [k] iommu_flush_irt
Comment 50 Kayant 2020-04-04 13:02:53 UTC
On Windows -

   0.61%  [kvm_amd]  [k] svm_deliver_avic_intr
   0.05%  [kvm_amd]  [k] avic_vcpu_put.part.0
   0.02%  [kvm_amd]  [k] avic_vcpu_load
   0.14%  [kvm]      [k] kvm_emulate_wrmsr     

Looking around, that showed that IOMMU AVIC was working in Linux as intended, when combined with looking at /proc/interrupts and seeing interrupts being handled by AMD-Vi rather than the IRQ counters for the PCI devices I passed through.

I was wondering: is there a difference in how Windows/Hyper-V handles IOMMU AVIC, or am I missing something that could be preventing it from being activated?

Sorry for the double comment, I mistakenly submitted my first comment midway through typing my full reply.
Comment 51 Robert M. Muncrief 2020-04-04 19:24:04 UTC
Created attachment 288207 [details]
Latest KVM warnings information

Okay gentlemen, here is an archive with all the latest information I have for the continued KVM warnings. To make things as clean as possible I did a clean compile of kernel 5.6.0 from the torvalds git. As before the first KVM patch was already in the kernel, and I manually checked the source file just to be sure. And the second KVM patch applied cleanly. However I included both patches so you can check to make sure they're correct. Here's a quick description of the four files in the archive:

dmesg_kvm_warnings.txt - Full dmesg output after a reboot, and then starting and logging into the VM.

Win10-1_UEFI.xml - The VM XML.

kvm_1.patch - The first KVM patch, which is now already in the kernel


kvm_2.patch - The second KVM patch that applied cleanly to the kernel.
Comment 52 Kayant 2020-04-05 16:52:47 UTC
(In reply to muncrief from comment #51)
> Created attachment 288207 [details]
> Latest KVM warnings information
> 
> Okay gentlemen, here is an archive with all the latest information I have
> for the continued KVM warnings. To make things as clean as possible I did a
> clean compile of kernel 5.6.0 from the torvalds git. As before the first KVM
> patch was already in the kernel, and I manually checked the source file just
> to be sure. And the second KVM patch applied cleanly. However I included
> both patches so you can check to make sure they're correct. Here's a quick
> description of the four files in the archive:
> 
> dmesg_kvm_warnings.txt - Full dmesg output after a reboot, and then starting
> and logging into the VM.
> 
> Win10-1_UEFI.xml - The VM XML.
> 
> kvm_1.patch - The first KVM patch, which is now already in the kernel
> 
> 
> kvm_2.patch - The second KVM patch that applied cleanly to the kernel.

Yeah, the second patch is the GA Log one, which has been queued for Linux 5.7.

Good news: I finally managed to reproduce the same errors both in a test VM and in my current config.

To start my investigation I first made a test VM closely matching your config, minus the PCI devices that are passed through. After some trial and error, I managed to find out why I didn't see the errors before in my config.

I found when installing Windows the errors would show up 100% of the time just after it finishes installing and during the reboots to finish the setup. It also produces the warning around the time a user account has been logged in. Outside of that, I wasn't able to find any case to produce the warning 100% of the time. In general, it just happens from time to time when the guest is running.

The reason I wasn't getting the warnings was that I had both CPU pinning and -overcommit cpu-pm enabled, so when I last tested I never tried it with my CPU pinning disabled.

In general, I have found that using either CPU pinning or -overcommit cpu-pm results in no warnings in both my test VM and my current config.

Hopefully that helps with debugging the cause.
Comment 53 Robert M. Muncrief 2020-04-06 02:50:57 UTC
(In reply to Anthony from comment #52)
> ...
> Good news I finally managed to reproduce the same errors both in a test VM
> and my current config.
> ...

Whew! I'm glad someone else is able to see it as well. I was beginning to wonder if I was just doing something wrong.

Just to be clear though, this error occurs every time I start my VM. It doesn't even matter if I login to the VM, the warnings have already appeared by the time the login screen appears.

Anyway, as you said, hopefully your discovery that CPU pinning and -overcommit cpu-pm get rid of the warnings will help guide the devs to a solution.

Also, I'm feeling a bit under the weather so if I don't respond to requests for more info for a few days it's because I need to get some rest and get well. But if anyone needs anything else just let me know and I'll get on it as soon as possible.
Comment 54 Kayant 2020-04-06 10:27:37 UTC
Created attachment 288227 [details]
Kvm-anthony-warnings

(In reply to muncrief from comment #53)
> (In reply to Anthony from comment #52)
> > ...
> > Good news I finally managed to reproduce the same errors both in a test VM
> > and my current config.
> > ...
> 
> Whew! I'm glad someone else is able to see it as well. I was beginning to
> wonder if I was just doing something wrong.
> 
> Just to be clear though, this error occurs every time I start my VM. It
> doesn't even matter if I login to the VM, the warnings have already appeared
> by the time the login screen appears.
> 
> Anyway, as you said, hopefully your discovery that CPU pinning and
> -overcommit cpu-pm get rid of the warnings will help guide the devs to a
> solution.
> 

On getting the warnings early in the boot phase: I did some more testing today and found a few more things. With the warnings happening early in boot as you described, I got warnings in 8 out of 10 cases when booting and shutting down my test VM.

I also found that changing things like the core count/topology and kernel_irqchip=on/off can delay when the first warning appears.

I attached my test config's qemu launch args/libvirt config, as well as dmesg, although the errors are basically the same as muncrief's.

> Also, I'm feeling a bit under the weather so if I don't respond to requests
> for more info for a few days it's because I need to get some rest and get
> well. But if anyone needs anything else just let me know and I'll get on it
> as soon as possible.

Take care and hope for a speedy recovery :)
Comment 55 Kayant 2020-04-10 19:28:16 UTC
Created attachment 288333 [details]
Windows SVM IOMMU testing

Hi Suravee,

Just wanted to report your latest patch:
KVM: x86: Fixes posted interrupt check for IRQs delivery modes - https://lore.kernel.org/patchwork/patch/1221065/

It fixes SVM IOMMU with Windows. Confirmed by looking at /proc/interrupts, perf top, etc.

Some oddities I found -

With no cpu-pm there can be a large number of kvm:kvm_avic_incomplete_ipi exits. With cpu-pm the number drops significantly, sometimes to zero, depending on how long I watch perf stat.

Also, with no cpu-pm there is a big performance degradation. This can be felt in general on the Windows desktop, where the UI feels more sluggish. Another example is that just trying to start a latencymon report freezes the VM (not sure if it is a hard unrecoverable freeze, but the UI is completely frozen). With cpu-pm it is greatly improved, but there are still some inconsistencies, and compared to just SVM AVIC working (using a kernel without the patch) the performance is worse.
By performance, I am mainly referring to Windows system call performance as monitored with latencymon, but this can also be seen in applications (gaming/UI in the testing I did).

For whatever reason, however, in my test Windows config SVM IOMMU doesn't seem to want to activate even with the same settings as my regular config. This is with the same kernel and testing back to back.

I have attached a short perf record session + perf stat and configs which I hope will aid better. Please let me know if there is anything else I can provide/test to help with debugging.

These are the commands I used to record:

sudo perf kvm --host record -agvsT -z=22 --sample-cpu --max-size 2M --call-graph fp -e 'kvm:*' -o perf-events.data

sudo perf kvm --host record -agvsT -z=22 --sample-cpu --max-size 2M --call-graph fp -p `pidof qemu-system-x86_64` -o perf-stack.data
Comment 56 Robert M. Muncrief 2020-04-11 00:20:03 UTC
None of this changes anything on my end gentlemen. I applied the patch referenced in Comment 55 and nothing changed at all.

The warnings did not go away, but I didn't experience any performance degradation either.

If you want me to do further tests please let me know. I'm not completely well but I'm feeling better.
Comment 57 Robert M. Muncrief 2020-04-13 16:51:38 UTC
I just compiled and installed 5.7-rc1 and according to "perf kvm stat live" AVIC is not working at all again. I wasn't able to apply the normal patches because the code directory structure has changed, so this is just a raw compile from git.

My VM hasn't changed from Comment 51. So are there new patches I'm supposed to apply, or changes that need to be made to my XML? Or is AVIC in an interim dysfunctional state in 5.7-rc1?
Comment 58 Robert M. Muncrief 2020-04-13 17:20:38 UTC
I apologize gentlemen but I made a mistake in applying the patches during the first compile, and the one from Comment 55 did apply successfully.

However AVIC still doesn't work, though multi-threaded performance did improve around 5%. There was no change in single threaded performance. Remember I'm just using the simple benchmark in CPU-Z, so it's a very rudimentary test. However a 5% jump is significant enough for me to believe it's real.
Comment 59 Kayant 2020-04-18 22:28:00 UTC
(In reply to muncrief from comment #57)
> I just compiled and installed 5.7-rc1 and according to "perf kvm stat live"
> AVIC is not working at all again. I wasn't able to apply the normal patches
> because the code directory structure has changed, so this is just a raw
> compile from git.
> 
> My VM hasn't changed from Comment 51. So are there new patches I'm supposed
> to apply, or changes that need to be made to my XML? Or is AVIC in an
> interim dysfunctional state in 5.7-rc1?

Have you tried using perf top? I have found this to be the most reliable way to know AVIC is working, as it shows the kernel functions being used. perf stat live only shows vmexits by default, which doesn't always make it easy to know for sure that AVIC is active. sudo perf stat -e 'kvm:*' -a -- sleep 1 helps to check whether it's working optimally, as it gives you a count of all kvm related vmexits that happen within 1 second.

You can do so with the below command -

sudo perf kvm --host top --kallsyms=/proc/kallsyms -gp `pidof qemu-system-x86_64`

It might not resolve the symbols the first time. An easy way to check is by searching for "svm" using "\". If you get no results, exit with "Esc" or "Ctrl + C" and try again. The other reason might be that your kernel doesn't have CONFIG_KALLSYMS enabled.

You can use "h" to bring up the help menu for other commands.

To see if SVM AVIC is working, search for it; it should return something like what I posted in comment 49/50.

   0.61%  [kvm_amd]  [k] svm_deliver_avic_intr
   0.05%  [kvm_amd]  [k] avic_vcpu_put.part.0
   0.02%  [kvm_amd]  [k] avic_vcpu_load

And for IOMMU AVIC -
   2.83%  [kernel]  [k] iommu_completion_wait
   0.87%  [kernel]  [k] __iommu_queue_command_sync
   0.16%  [kernel]  [k] amd_iommu_update_ga
   0.03%  [kernel]  [k] iommu_flush_irt

As far as Linux 5.7-rc1 goes, AVIC is working as described in comment 55, including when tested with the patch that fixes IOMMU AVIC on Windows.
Comment 60 Kayant 2020-04-18 23:19:59 UTC
To add, hiding the hypervisor CPU id also seems to cause it to not work well. Windows also generally doesn't perform as well when hiding the hypervisor CPU id on Ryzen.

Using top we see high activity in the avic_vcpu_put.part.0 and avic_vcpu_load functions and near-zero to zero activity in the svm_deliver_avic_intr function, which suggests AVIC isn't working correctly, if my slight understanding of the svm_deliver_avic_intr function is correct.

I haven't looked too in-depth, but I would guess SVM/AVIC uses the hypervisor cpuid as a check to apply its optimizations, or something along those lines.
Comment 61 Robert M. Muncrief 2020-04-22 21:49:28 UTC
(In reply to Anthony from comment #60)
> To add hiding cpu id hypervisor seems to also cause it to not work well.
> Windows also generally don't perform as well when hiding the hypervisor CPU
> id on Ryzen.
> 
> Using top we see high activity in avic_vcpu_put.part.0 & avic_vcpu_load
> functions and near-zero to zero activity with svm_deliver_avic_intr
> function. Which suggests AVIC isn't working correctly if my sight
> understanding of svm_deliver_avic_intr function is correct.
> 
> Haven't looked to in-depth but I would guess SVM/AVIC uses hypervisor cpuid
> as check? to apply its optimizations or something along those lines.

I can't unhide the hypervisor because I have a cheap Nvidia GT 710 for the VM and it won't work unless the hypervisor is hidden.

However I did use the other techniques to verify whether AVIC was working or not, and it definitely is not. There were 0 AVIC events no matter what I tried.

By the way, I tested with both 5.7 rc1 and rc2 and neither works. Thank you for the suggestions though. So unless there's something else someone wants me to try I'll test things again Sunday when rc3 is released.
Comment 62 Robert M. Muncrief 2020-04-27 19:11:05 UTC
I just compiled 5.7 rc3 and ran my VM, and unfortunately AVIC is still not working at all.

I used the patch from Comment 55, and tried all the techniques to check for AVIC in Comment 59.
Comment 63 Maxim Levitsky 2020-05-03 19:58:54 UTC
I tried the patch from comment 55 and it works for me (3970X).

Maybe you have something disabled in the config? You need the synic and 'vapic' Hyper-V enlightenments disabled, and on top of that you need x2apic and svm (nested virtualization) to be disabled in the guest CPU config.

And you need to have the PIT reinject policy set to 'discard'.
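
The checklist above can be pictured as a bitmask of reasons that keep AVIC switched off, similar in spirit to the avic_inhibit_reasons debug output attached earlier. The following is a user-space sketch only: the enum, the names and especially the bit numbers are hypothetical and chosen for illustration, so they should not be used to decode real kernel output.

```c
#include <stddef.h>
#include <stdio.h>

/* Illustrative bit numbers only; real values live in the kernel headers. */
enum model_inhibit_bit {
	MODEL_INHIBIT_HYPERV   = 0, /* synic/vapic enlightenments enabled */
	MODEL_INHIBIT_NESTED   = 1, /* svm exposed to the guest */
	MODEL_INHIBIT_REINJECT = 2, /* reinject policy not set to discard */
	MODEL_INHIBIT_X2APIC   = 3, /* x2apic exposed to the guest */
};

/* Names indexed by the bit numbers above. */
static const char *const model_inhibit_names[] = {
	"hyperv", "nested", "reinject", "x2apic",
};

/* Prints one line per set bit and returns how many reasons are set. */
int model_print_inhibit_reasons(unsigned long mask)
{
	int count = 0;
	size_t n = sizeof(model_inhibit_names) / sizeof(model_inhibit_names[0]);

	for (size_t i = 0; i < n; i++) {
		if (mask & (1UL << i)) {
			printf("inhibited by: %s\n", model_inhibit_names[i]);
			count++;
		}
	}
	return count;
}
```

In this model AVIC would only be usable when the function returns 0, that is, when none of the listed features is enabled in the guest config.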
Comment 64 Robert M. Muncrief 2020-08-24 17:03:00 UTC
Well, I've tried everything under the sun but could never get AVIC to work reliably.

However the original problem of "BUG: kernel NULL pointer dereference" is solved so this bug can be closed.

I hope this feature will continue to be worked on though, and that one day it will work without extensive manual changes to the VM XML. For now it appears that it's just too complex, and sensitive to a myriad of conditions, to be useful to me.
