Bug 88671 - Radeon driver fails to reset hardware properly after kvm guest reboot
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-21 18:33 UTC by Tom Stellard
Modified: 2016-03-23 18:53 UTC
CC List: 2 users

See Also:
Kernel Version: 3.17.3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
lspci (2.72 KB, text/plain)
2014-11-21 18:33 UTC, Tom Stellard
Backtrace from BUG_ON (3.07 KB, text/plain)
2014-11-21 18:33 UTC, Tom Stellard
Virtual machine definition (4.28 KB, text/xml)
2014-11-21 18:34 UTC, Tom Stellard

Description Tom Stellard 2014-11-21 18:33:14 UTC
Created attachment 158381
lspci

I'm running into this bug while trying to use PCI passthrough of an AMD BONAIRE XT (Radeon HD 7790).

Steps to reproduce:

1. virsh start vm
2. virsh destroy vm
3. virsh start vm

This bug only appears after starting the vm for the second time.  The first time the vm boots normally and passthrough works as expected.
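
The three steps above can be scripted as a minimal sketch. The function name `reproduce` and the `VIRSH` override are assumptions made for illustration; the guest name `vm` is taken from the steps. The sketch does not wait for the guest to finish booting between steps, which the steps above do not specify either.

```shell
#!/bin/sh
# Sketch of the reproduction steps: start, destroy, start again.
# VIRSH may be overridden (for example VIRSH=echo) to print the
# commands instead of running them.
reproduce() {
    guest=$1
    "${VIRSH:-virsh}" start "$guest" &&
    "${VIRSH:-virsh}" destroy "$guest" &&
    "${VIRSH:-virsh}" start "$guest"
}

# Example (not run here):
#   reproduce vm
```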
Comment 1 Tom Stellard 2014-11-21 18:33:42 UTC
Created attachment 158391
Backtrace from BUG_ON
Comment 2 Tom Stellard 2014-11-21 18:34:11 UTC
Created attachment 158401
Virtual machine definition
Comment 3 Tom Stellard 2014-11-21 18:36:22 UTC
I should also mention that I have this hook executing when the machine starts up:

if [ "$2" = "prepare" ]; then
        virsh nodedev-detach pci_0000_01_00_1
fi
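
For context, libvirt calls its qemu hook script with the guest name as the first argument and the operation as the second, which is why the snippet above tests "$2". A slightly fuller sketch of such a hook might look like the following. The helper names `pci_to_nodedev` and `detach_on_prepare` and the `VIRSH` override are hypothetical; only the `virsh nodedev-detach` call and the device name come from the hook above.

```shell
#!/bin/sh
# Hypothetical helper: turn a PCI address such as 0000:01:00.1 into the
# node-device name that virsh uses (pci_0000_01_00_1).
pci_to_nodedev() {
    printf 'pci_%s\n' "$1" | tr ':.' '__'
}

# Detach each listed PCI address from the host, but only for the
# "prepare" operation. $1 is the operation; the rest are PCI addresses.
# VIRSH may be overridden (for example VIRSH=echo) for a dry run.
detach_on_prepare() {
    op=$1
    shift
    [ "$op" = "prepare" ] || return 0
    for addr in "$@"; do
        "${VIRSH:-virsh}" nodedev-detach "$(pci_to_nodedev "$addr")"
    done
}

# Example hook body (not run here); libvirt passes the operation as $2:
#   detach_on_prepare "$2" 0000:01:00.1
```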
Comment 4 Alex Williamson 2014-11-21 19:14:40 UTC
I'm not sure how you're getting to this BUG_ON, but (a) legacy KVM device assignment is deprecated and (b) the card you've chosen has known reset issues.  You might want to try vfio-pci, which I know can make this card work at least once per host boot, but you're likely to get a BSOD and IOMMU faults on subsequent guest [re]boots.  The reset problem with this card has been reported to AMD, but there is no solution at this time.
Comment 5 Tom Stellard 2014-11-29 02:36:51 UTC
Thanks for the tip about vfio-pci.  I can now sometimes get two working guest boots per host boot.
Comment 6 Tom Stellard 2015-03-02 16:42:11 UTC
I've been playing with this a little more and it seems to be working correctly, but radeon dynamic power management (dpm) always fails to initialize on the second guest boot.  My questions are:

1. What methods are being used by kvm/qemu/libvirt to reset the GPU on guest shutdown?

2. Is the problem only caused by the fact that GPU reset is not implemented correctly in the radeon driver, or are there improvements that are needed in kvm/qemu/libvirt in order to get this working?
Comment 7 Alex Williamson 2015-03-02 17:32:49 UTC
(In reply to Tom Stellard from comment #6)
> I've been playing with this a little more and it seems to be working
> correctly,
> but radeon dynamic power management (dpm) always fails to initialize on the
> second guest boot.  My questions are:
> 
> 1. What methods are being used by kvm/qemu/libvirt to reset the GPU on guest
> shutdown?

Secondary PCI bus reset from the parent bridge.

> 2. Is the problem only caused by the fact that GPU reset is not implemented
> correctly in the radeon driver, or are there improvements that are needed in
> kvm/qemu/libvirt in order to get this working?

Without a device-specific reset, we're doing the most thorough standard reset available to us.  I've also tried to use some of the reset mechanisms implemented in the radeon FOSS driver, but they offered no improvement over the bus reset.  There seems to be some state retained in the device that is not cleared via bus reset.
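
As a rough illustration of which bridge the "secondary PCI bus reset from the parent bridge" refers to, the sketch below looks up a device's parent in sysfs. The function names are hypothetical and the address 0000:01:00.0 is an assumption for the GPU. The sketch only locates the bridge and reports whether the kernel exposes a `reset` attribute for the device; it does not perform a reset and is not a description of what vfio-pci does internally.

```shell
#!/bin/sh
# Hypothetical helpers; 0000:01:00.0 below is an assumed address for the GPU.

# Given the resolved sysfs path of a PCI device, print the name of the
# directory above it. For a device behind a bridge this is the parent
# bridge; for a device directly on a root bus it is the root bus name
# (for example pci0000:00).
parent_of_path() {
    basename "$(dirname "$1")"
}

# Resolve /sys/bus/pci/devices/<address> (a symlink) and print its parent.
parent_bridge() {
    parent_of_path "$(readlink -f "/sys/bus/pci/devices/$1")"
}

# Succeeds if the kernel exposes a reset attribute for the device.
has_sysfs_reset() {
    [ -e "/sys/bus/pci/devices/$1/reset" ]
}

# Example (not run here):
#   parent_bridge 0000:01:00.0
#   has_sysfs_reset 0000:01:00.0 && echo "reset attribute present"
```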
