Host: 5.8.0-1.fc33.x86_64 qemu-kvm-5.0.0-5.fc33.x86_64 Backing file on host # truncate -s 100g f33default.raw # lsattr ---------------C---- ./f33default.raw Guest: 5.8.0-1.fc33.x86_64 Normally this installs just fine, but if I use io='io_uring' for the virtioblk device used as the installation target. Inside the guest VM: late in the installation the performance falls off a cliff, because super slow, and then btrfs spews checksum errors. Because virtioblk reports rotational, by default mkfs.btrfs does dup metadata. Curiously all of the checksum errors are only the mirror 1 copy. That's curious. And also there are no fixups, assuming it is OK with the mirror 2 copies.
Created attachment 290833 [details] dmesg
# ps aux | grep qemu qemu 16619 2.4 40.8 4276924 3293748 ? Sl 15:05 4:39 /usr/bin/qemu-system-x86_64 -name guest=uefivm,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-uefivm/master-key.aes -blockdev {"driver":"file","filename":"/usr/share/edk2/ovmf/OVMF_CODE.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"} -blockdev {"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/uefivm_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"} -machine pc-q35-5.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format -cpu SandyBridge-IBRS,vme=on,vmx=on,pdcm=on,pcid=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaveopt=on,ibpb=on,amd-stibp=on,amd-ssbd=on,pschange-mc-no=on -m 3072 -overcommit mem-lock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 76270619-0043-4c0b-bac6-29805be87ddf -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=35,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device pcie-root-port,port=0x17,chassis=8,id=pci.8,bus=pcie.0,addr=0x2.0x7 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/f33default.raw","aio":"io_uring","node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-2-format","read-only":false,"discard":"unmap","detect-zeroes":"unmap","cache":{"direct":false,"no-flush":true},"driver":"raw","file":"libvirt-2-storage"} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=2,write-cache=on -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/Fedora-Workstation-Live-x86_64-Rawhide-20200809.n.0.iso","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":true,"driver":"raw","file":"libvirt-1-storage"} -device ide-cd,bus=ide.0,drive=libvirt-1-format,id=sata0-0-0,bootindex=1 -netdev tap,fd=37,id=hostnet0,vhost=on,vhostfd=38 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:9d:96:34,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=39,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
Created attachment 290835 [details] virsh dumpxml
https://bugzilla.kernel.org/show_bug.cgi?id=208827
I ran bcc-tool biolatency during an installation. The /dev/vda device that has io=io_uring enabled seems to have lower latency out of the gate, but also i'm not seeing the bimodal latencies that I do with io=threads. But immediately following the rsync phase, the installed system is put in a chroot and post install stuff is happening, including dracut running, and the latency become absolutely stratospheric...33 second latencies. Example: disk = b'vda' usecs : count distribution 0 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 28 |**** | 64 -> 127 : 239 |****************************************| 128 -> 255 : 45 |******* | 256 -> 511 : 43 |******* | 512 -> 1023 : 98 |**************** | 1024 -> 2047 : 57 |********* | 2048 -> 4095 : 12 |** | 4096 -> 8191 : 9 |* | 8192 -> 16383 : 0 | | 16384 -> 32767 : 0 | | 32768 -> 65535 : 0 | | 65536 -> 131071 : 0 | | 131072 -> 262143 : 0 | | 262144 -> 524287 : 0 | | 524288 -> 1048575 : 2 | | 1048576 -> 2097151 : 0 | | 2097152 -> 4194303 : 0 | | 4194304 -> 8388607 : 0 | | 8388608 -> 16777215 : 0 | | 16777216 -> 33554431 : 1 | | Another odd duck? Aug 11 17:05:32 localhost-live chronyd[1285]: System clock was stepped by -14405.762210 seconds ... Aug 11 22:08:35 localhost-live chronyd[1285]: System clock wrong by -6.721561 seconds, adjustment started Ignore the five hour gap, I was just lazy having started the VM but only started the install five hours later. The odd duck is that somehow the system lock is wrong by 6 seconds as if the actual VM in its entirety, stalled, and its own system clock got stuck in the delay. Attaching journal and matching biolatency output, the times match. So you can see that the skyrocketing latency coincides with chroot and dracut starting. The rsync (and bulk of the installation) was ok. Weird. The installation never recovers.
Created attachment 290837 [details] bcc biolatency -DT 15
Created attachment 290839 [details] journal, full, wall clock time
$ cat /sys/block/vda/queue/scheduler [mq-deadline] kyber bfq none Same for the device the raw backing file is on. rawfile+nodatacow>btrfs>SSD. No other layers.
Using bcc-tools btrfsdist 15 on the host, I'm not seeing any latency changes during the installation.
Reproduced the problems, my scripts are here https://github.com/josefbacik/virt-scripts.git Just need to make an image file dd if=/dev/zero of=file.img bs=1M count=1 seek=50000 then run ./fedora-install file.img You need to have virt-install installed, with all the random libvirt stuff, virt-viewer as well. This will do the fedora install over the internet and do all the things you need. Once it reboots into a prompt you'll need to shut down the guest, and then do virsh edit fedora-rawhide Way down you'll see something like <disk type='file' device='disk'> <driver name='qemu' type='raw'/> <source file='/path/to/file.img'/> You'll need to add io='io_uring' to the driver section, so it looks like <driver name='qemu' type='raw' io='io_uring'/> then you can re-launch the vm, log in as root, password is redhat. From there you can just do sh jens-broke-it.sh and wait for it to fail, it doesn't take very long, it'll break out and you'll see a series of checksum errors.
I'm curious why the problem doesn't happen until the chroot. 100% find during rsync and then blows up at the chroot+dracut stage. But I just read this on Fedora devel@ (not io_uring or btrfs related as far as I know) in mock with nspawn chroots. "It does! Running mock with --isolation=simple works around the issue. Looks like the glibc 2.32.9000 snapshot broke systemd-nspawn based chroots with this change: - Linux: Use faccessat2 to implement faccessat (bug 18683)" https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/YS6MPDLXP5BENHHHA43EW4YLI5UG435D/ https://sourceware.org/bugzilla/show_bug.cgi?id=18683#c3
According to https://bugzilla.kernel.org/show_bug.cgi?id=208827 it got fixed, closing.