When a device is assigned to or "deassigned" from a Virtual Machine, the KVM infrastructure issues a pci_reset_function() on the PCI device. This successfully emulates the standard power on/power down device RESET concept and ensures that the device is in a canonical RESET state when the VM comes up. (See kvm_free_assigned_device() and kvm_vm_ioctl_assign_device() in virt/kvm/assigned-dev.c.) However, KVM does _not_ issue a pci_reset_function() when the Virtual Machine is rebooted. This means that the PCI device is in an indeterminate state (whatever the previous kernel instance left behind) when the new kernel instance comes up. Moreover, any persistent device state is maintained across the reboot, for instance the NIC Link Up state seen by an attached peer. Unfortunately I can't supply a patch to fix this. I've looked through a big chunk of the KVM infrastructure but I can't find where the Virtual Machine "reboot" operation is handled.
Is this for a VT-d case? Thanks, CJ
This is in reference to the ability to map a PCI Device into a Virtual Machine, sometimes referred to as "PCI Pass Through." For Intel architectures, VT-d provides some of the critical underlying architectural support for remapping device DMA and interrupts. Other architectures supply similar technology. At issue here is that the KVM support in the hypervisor kernel does not issue a pci_reset_function() call against all such devices when a Virtual Machine reboots. This means that the Virtual Machine finds the device in a non-clean condition on reboot. It also means that any peer devices/services that the device hardware may be attached to will continue to see the non-RESET state on the other side of the device until a driver is loaded in the Virtual Machine to manage the device. Thus, for instance, if a network adapter had brought up a link and then a reboot was done, the peer would continue to see a Link Up condition until the Virtual Machine had finished rebooting and a driver for the device was loaded. This behavior doesn't match the way that a reboot would affect a real machine. For a real machine, part of the reboot cycle would involve all devices and buses getting RESET. We need the KVM Hypervisor support to do the same thing.
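To make the request concrete, here is a rough host-side sketch of the reset in question. It is not the in-kernel KVM fix asked for above, only an illustration that uses the sysfs "reset" attribute the host kernel exposes for PCI functions that have a usable reset method; writing "1" to it asks the kernel to reset that function. The helper names pci_reset_path() and pci_request_reset() and the sysfs_root parameter are made up for this example, and the device address shown below is a placeholder.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<sysfs_root>/bus/pci/devices/<bdf>/reset".
 * Returns 0 on success, -1 if the result does not fit in buf. */
int pci_reset_path(char *buf, size_t len, const char *sysfs_root,
                   const char *bdf)
{
    int n = snprintf(buf, len, "%s/bus/pci/devices/%s/reset",
                     sysfs_root, bdf);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: ask the host kernel to reset one PCI function by
 * writing "1" to its sysfs reset attribute.
 * Returns 0 on success or a negative errno value on failure. */
int pci_request_reset(const char *sysfs_root, const char *bdf)
{
    char path[256];
    FILE *f;
    int rc = 0;

    if (pci_reset_path(path, sizeof(path), sysfs_root, bdf) != 0)
        return -ENAMETOOLONG;

    /* The attribute is absent when the device has no reset method, and
     * opening it for writing normally requires root. */
    f = fopen(path, "w");
    if (!f)
        return -errno;

    if (fputs("1", f) == EOF)
        rc = -EIO;
    /* The write may only reach the kernel when the stream is flushed,
     * so a failure reported by fclose() counts as a failed reset. */
    if (fclose(f) != 0 && rc == 0)
        rc = -errno;

    return rc;
}
```

A management script on the host could call pci_request_reset("/sys", "0000:03:00.0") for each passed-through device while the guest is between kernel instances. That is only a stopgap, and it assumes the caller knows when the guest reboots; the point of this thread is that KVM itself should issue the reset on VM reboot.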