Bug 42636 - PCI passthrough does not work with AMD iommu for PCI device
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_kvm
Reported: 2012-01-23 09:55 UTC by Klaus Mueller
Modified: 2012-04-05 08:08 UTC

See Also:
Kernel Version: 3.1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
log of qemu-kvm crash (8.51 KB, text/plain)
2012-03-30 11:07 UTC, Klaus Mueller

Description Klaus Mueller 2012-01-23 09:55:18 UTC
I want to pass this PCI device through to a KVM guest:

05:07.0 Network controller: RaLink RT2800 802.11n PCI
        Subsystem: Linksys Device 0067
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 21
        Region 0: Memory at fdae0000 (32-bit, non-prefetchable) [size=64K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

Unfortunately, virsh start guest always fails with this error:

Failed to assign device "hostdev0" : Device or resource busy.
qemu-kvm: -device pci-assign,host=05:06.0,id=hostdev0,configfd=20,bus=pci.0,addr=0x4: Device 'pci-assign' could not be initialized.

If I also add this device (05:06.0) to the guest, I get exactly the same error. Of course, I unloaded the module for this additional device before trying to pass it through to the guest.

lspci -vvs 05:06.0
05:06.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 0c)
        Subsystem: Intel Corporation EtherExpress PRO/100 S Desktop Adapter
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64 (2000ns min, 14000ns max), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 20
        Region 0: Memory at fdaff000 (32-bit, non-prefetchable) [size=4K]
        Region 1: I/O ports at af00 [size=64]
        Region 2: Memory at fdaa0000 (32-bit, non-prefetchable) [size=128K]
        [virtual] Expansion ROM at fd900000 [disabled] [size=64K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=2 PME-


lspci -vt
-[0000:00]-+-00.0  ATI Technologies Inc RD890 PCI to PCI bridge (external gfx0 port B)
           +-00.2  ATI Technologies Inc Device 5a23
           +-02.0-[01]--+-00.0  ATI Technologies Inc NI Turks [AMD Radeon HD 6500]
           |            \-00.1  ATI Technologies Inc Device aa90
           +-04.0-[02]----00.0  Device 1b6f:7023
           +-09.0-[03]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller
           +-0a.0-[04]----00.0  Device 1b6f:7023
           +-11.0  ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode]
           +-12.0  ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
           +-12.2  ATI Technologies Inc SB700/SB800 USB EHCI Controller
           +-13.0  ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
           +-13.2  ATI Technologies Inc SB700/SB800 USB EHCI Controller
           +-14.0  ATI Technologies Inc SBx00 SMBus Controller
           +-14.1  ATI Technologies Inc SB700/SB800 IDE Controller
           +-14.2  ATI Technologies Inc SBx00 Azalia (Intel HDA)
           +-14.3  ATI Technologies Inc SB700/SB800 LPC host controller
           +-14.4-[05]--+-06.0  Intel Corporation 82557/8/9/0/1 Ethernet Pro 100
           |            \-07.0  RaLink RT2800 802.11n PCI
           +-14.5  ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
           +-15.0-[06]--
           +-16.0  ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
           +-16.2  ATI Technologies Inc SB700/SB800 USB EHCI Controller
           +-18.0  Advanced Micro Devices [AMD] Device 1600
           +-18.1  Advanced Micro Devices [AMD] Device 1601
           +-18.2  Advanced Micro Devices [AMD] Device 1602
           +-18.3  Advanced Micro Devices [AMD] Device 1603
           +-18.4  Advanced Micro Devices [AMD] Device 1604
           \-18.5  Advanced Micro Devices [AMD] Device 1605

dmesg | grep AMD-Vi
[    0.610182] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[    0.610184] AMD-Vi:        mmio-addr: 00000000fec30000
[    0.610359] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
[    0.610360] AMD-Vi:   DEV_RANGE_END           devid: 00:00.2
[    0.610362] AMD-Vi:   DEV_SELECT                      devid: 00:02.0 flags: 00
[    0.610363] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
[    0.610364] AMD-Vi:   DEV_RANGE_END           devid: 01:00.1
[    0.610366] AMD-Vi:   DEV_SELECT                      devid: 00:04.0 flags: 00
[    0.610367] AMD-Vi:   DEV_SELECT                      devid: 02:00.0 flags: 00
[    0.610368] AMD-Vi:   DEV_SELECT                      devid: 00:09.0 flags: 00
[    0.610369] AMD-Vi:   DEV_SELECT                      devid: 03:00.0 flags: 00
[    0.610370] AMD-Vi:   DEV_SELECT                      devid: 00:0a.0 flags: 00
[    0.610371] AMD-Vi:   DEV_SELECT                      devid: 04:00.0 flags: 00
[    0.610372] AMD-Vi:   DEV_SELECT                      devid: 00:11.0 flags: 00
[    0.610373] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
[    0.610374] AMD-Vi:   DEV_RANGE_END           devid: 00:12.2
[    0.610376] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
[    0.610377] AMD-Vi:   DEV_RANGE_END           devid: 00:13.2
[    0.610378] AMD-Vi:   DEV_SELECT                      devid: 00:14.0 flags: d7
[    0.610379] AMD-Vi:   DEV_SELECT                      devid: 00:14.1 flags: 00
[    0.610380] AMD-Vi:   DEV_SELECT                      devid: 00:14.2 flags: 00
[    0.610381] AMD-Vi:   DEV_SELECT                      devid: 00:14.3 flags: 00
[    0.610382] AMD-Vi:   DEV_SELECT                      devid: 00:14.4 flags: 00
[    0.610384] AMD-Vi:   DEV_ALIAS_RANGE                 devid: 05:00.0 flags: 00 devid_to: 00:14.4
[    0.610385] AMD-Vi:   DEV_RANGE_END           devid: 05:1f.7
[    0.610391] AMD-Vi:   DEV_SELECT                      devid: 00:14.5 flags: 00
[    0.610392] AMD-Vi:   DEV_SELECT                      devid: 00:15.0 flags: 00
[    0.610393] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 06:00.0 flags: 00
[    0.610394] AMD-Vi:   DEV_RANGE_END           devid: 06:1f.7
[    0.610398] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
[    0.610399] AMD-Vi:   DEV_RANGE_END           devid: 00:16.2
[    0.610531] AMD-Vi: Enabling IOMMU at 0000:00:00.2 cap 0x40
[    0.669479] AMD-Vi: Lazy IO/TLB flushing enabled

kernel is 3.1
kvm is 1.0
board is GA-990XA-UD3
Comment 1 Alex Williamson 2012-01-23 14:56:55 UTC
           +-14.4-[05]--+-06.0  Intel Corporation 82557/8/9/0/1 Ethernet Pro 100
           |            \-07.0  RaLink RT2800 802.11n PCI

[    0.610382] AMD-Vi:   DEV_SELECT                      devid: 00:14.4 flags: 00
[    0.610384] AMD-Vi:   DEV_ALIAS_RANGE                 devid: 05:00.0 flags: 00 devid_to: 00:14.4
[    0.610385] AMD-Vi:   DEV_RANGE_END           devid: 05:1f.7

The devices are behind a PCIe-to-PCI bridge (00:14.4), so both devices are aliased to the same device ID. You'll need to either add both devices to the guest or sequester the other device by binding it to pci-stub.
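
The aliasing can be read straight out of the AMD-Vi lines in the boot log. The following is a minimal Python sketch, not kernel code: it assumes the log format shown in this report, and the helper names (bdf_to_devid, alias_ranges, alias_of) are made up for illustration. It maps a bus:device.function address to the device ID it is aliased to.

```python
import re

# The three relevant lines from the dmesg output in this report.
LOG = [
    "[    0.610382] AMD-Vi:   DEV_SELECT                      devid: 00:14.4 flags: 00",
    "[    0.610384] AMD-Vi:   DEV_ALIAS_RANGE                 devid: 05:00.0 flags: 00 devid_to: 00:14.4",
    "[    0.610385] AMD-Vi:   DEV_RANGE_END           devid: 05:1f.7",
]

ALIAS_START = re.compile(
    r"DEV_ALIAS_RANGE\s+devid:\s*(\S+)\s+flags:\s*\S+\s+devid_to:\s*(\S+)")
RANGE_END = re.compile(r"DEV_RANGE_END\s+devid:\s*(\S+)")


def bdf_to_devid(bdf):
    """Convert 'bb:dd.f' (hex) into a 16-bit ID: bus << 8 | device << 3 | function."""
    bus, rest = bdf.split(":")
    dev, fn = rest.split(".")
    return (int(bus, 16) << 8) | (int(dev, 16) << 3) | int(fn, 16)


def alias_ranges(lines):
    """Collect (first_id, last_id, alias_bdf) for every DEV_ALIAS_RANGE block."""
    ranges = []
    pending = None  # (start_bdf, alias_bdf) waiting for its DEV_RANGE_END
    for line in lines:
        start = ALIAS_START.search(line)
        if start:
            pending = (start.group(1), start.group(2))
            continue
        end = RANGE_END.search(line)
        if end and pending:
            ranges.append((bdf_to_devid(pending[0]),
                           bdf_to_devid(end.group(1)),
                           pending[1]))
            pending = None
    return ranges


def alias_of(bdf, ranges):
    """Return the alias target for bdf, or None if no alias range covers it."""
    devid = bdf_to_devid(bdf)
    for first, last, alias in ranges:
        if first <= devid <= last:
            return alias
    return None


ranges = alias_ranges(LOG)
for dev in ("05:06.0", "05:07.0"):
    print(dev, "->", alias_of(dev, ranges))
```

Both 05:06.0 and 05:07.0 resolve to 00:14.4 here, which is why the two cards cannot be assigned independently of each other.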
Comment 2 Klaus Mueller 2012-01-23 16:26:49 UTC
Well, I did exactly what you proposed, but I got the same error again when I tried to assign both devices. This is the relevant part of the XML file:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x05' slot='0x06' function='0x0'/>
      </source>
    </hostdev>

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x05' slot='0x07' function='0x0'/>
      </source>
    </hostdev>

Did I make a mistake?
Comment 3 Daniel Mayer 2012-03-30 07:48:26 UTC
Hi,
I feel for you. I had many, many headaches with PCI passthrough...

You said you unloaded the driver module for both devices.
a) Do you have the pci-stub module available (kernel config)? For some PCI devices I had to enable both DMA and IRQ remapping in the kernel; the latter is disabled by default. On the other hand, sometimes, especially with older PCI cards, IRQ remapping had to be disabled. These options should be enabled, either as modules or compiled in:

Bus options (PCI etc.) --> Support for DMA Remapping Devices
Bus options (PCI etc.) --> Enable DMA Remapping Devices
Bus options (PCI etc.) --> PCI Stub driver

[quote]
Of course, I unloaded the module of this additional device
before trying to passthrough it to the guest.
[/quote]

b) Did you try to bind the device to the pci-stub driver as described here:
http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM
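
For reference, the pci-stub steps on that page boil down to a few sysfs writes. Below is a rough Python sketch of them. The function names are hypothetical, the vendor and device IDs must be read from the device itself (for example from /sys/bus/pci/devices/<address>/vendor and .../device), and nothing here is specific to this board. It has to run as root to have any effect.

```python
STUB_DIR = "/sys/bus/pci/drivers/pci-stub"


def pci_stub_writes(address, vendor, device, bound_to_driver):
    """Return the (path, value) sysfs writes that hand `address` to pci-stub.

    address         -- full PCI address, e.g. "0000:05:06.0"
    vendor, device  -- IDs as sysfs reports them, e.g. "0x1234" (placeholder)
    bound_to_driver -- True if another driver currently owns the device
    """
    # new_id expects bare hex without the "0x" prefix: "vvvv dddd".
    ids = "%04x %04x" % (int(vendor, 16), int(device, 16))
    writes = [(STUB_DIR + "/new_id", ids)]
    if bound_to_driver:
        # Release the device from its current driver, then bind it explicitly.
        dev_dir = "/sys/bus/pci/devices/" + address
        writes.append((dev_dir + "/driver/unbind", address))
        writes.append((STUB_DIR + "/bind", address))
    # If no driver owns the device, registering the ID is normally enough:
    # pci-stub probes matching unbound devices when new_id is written.
    return writes


def apply_writes(writes):
    """Perform the writes; requires root and a loaded pci-stub module."""
    for path, value in writes:
        with open(path, "w") as handle:
            handle.write(value)
```

For example, apply_writes(pci_stub_writes("0000:05:06.0", vendor, device, True)) would sequester the Intel NIC from the description, with vendor and device taken from sysfs.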
Comment 4 Klaus Mueller 2012-03-30 11:07:21 UTC
Created attachment 72753
log of qemu-kvm crash

I tried again with only one PCI card inserted (Network controller: RaLink RT2800 802.11n PCI). The tree now looks like this:

           +-14.4-[06]----07.0  RaLink RT2800 802.11n PCI

I booted with intel_iommu=on (does this apply to AMD, too?).

If I start the VM now, I'm getting the same error again:
Failed to assign device "hostdev0" : Device or resource busy
qemu-kvm: -device pci-assign,host=06:07.0,id=hostdev0,configfd=19,bus=pci.0,addr=0x5: Device 'pci-assign' could not be initialized

If I additionally assign the bridge to the VM, I get an error about a missing reset function and something else that is missing, too.

If I remove the bridge from the VM xml-file again and start it now, the VM suddenly starts up fine :-))).

But when the VM is stopped again (with halt in the VM or with virsh shutdown VM), qemu-kvm crashes and remains as a zombie. The output from messages is attached as iommu-trace.log.

kvm is 0.15.0 and kernel is 3.1.10 (kernel-desktop from openSUSE)
Comment 5 Klaus Mueller 2012-03-30 11:29:50 UTC
More specific error message when the VM is started with the PCI bridge attached:

Unable to reset PCI device 0000:00:14.4: no FLR, PM reset or bus reset available

I tried with the kvm module from Linux 3.3 and got the same crash when stopping the VM.
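
The message names the three reset methods that were tried. As a rough aid, the lspci -vv output of a device gives hints about two of them. The sketch below is a heuristic only: reset_hints is a made-up helper, it merely looks for the FLReset+ and NoSoftRst- flags as printed by lspci, and it says nothing about bus reset, which depends on what else sits on the same bus.

```python
# Capability lines for 05:07.0, copied from the lspci output in the description.
LSPCI_RALINK = """\
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
"""


def reset_hints(lspci_vv_text):
    """Guess which per-device reset methods a device advertises."""
    return {
        # PCIe devices that support Function Level Reset show FLReset+ in DevCap.
        "flr": "FLReset+" in lspci_vv_text,
        # NoSoftRst- means a D3hot -> D0 transition resets the device's
        # internal state, which is what a PM reset relies on.
        "pm_reset": "NoSoftRst-" in lspci_vv_text,
    }


print(reset_hints(LSPCI_RALINK))
```

For the RaLink card this reports no FLR but a possible PM reset. The same check would need the lspci -vv output of the bridge 00:14.4, which is not included in this report.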
Comment 6 Alexandre DERUMIER 2012-03-30 11:35:49 UTC
Hi, 
for AMD, pass

"iommu=pt iommu=1" in the kernel options.

You can also try loading the kvm module with allow_unsafe_assigned_interrupts=1:

echo "options kvm allow_unsafe_assigned_interrupts=1" > /etc/modprobe.d/kvm_iommu_map_guest.conf
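
To confirm that both settings actually took effect after a reboot, the kernel command line and the module parameter can be read back. A small Python sketch follows; the helper names are invented, and the paths are the usual /proc/cmdline and /sys/module/<module>/parameters/<name> locations. Note that intel_iommu= is only read by the Intel VT-d driver, while amd_iommu= and the generic iommu= are the ones relevant on AMD hardware.

```python
def iommu_options(cmdline):
    """Return the IOMMU-related tokens found on a kernel command line."""
    prefixes = ("iommu=", "amd_iommu=", "intel_iommu=")
    return [tok for tok in cmdline.split() if tok.startswith(prefixes)]


def module_param_enabled(text):
    """Interpret the contents of a boolean module parameter file ('Y' or 'N')."""
    return text.strip() in ("Y", "1")


def read_file(path):
    """Return the full contents of a text file."""
    with open(path) as handle:
        return handle.read()


def report():
    """Print the current settings; only meaningful when run on the host."""
    print(iommu_options(read_file("/proc/cmdline")))
    param = "/sys/module/kvm/parameters/allow_unsafe_assigned_interrupts"
    print(module_param_enabled(read_file(param)))
```

Calling report() on the host prints the IOMMU tokens the kernel was booted with and whether the kvm parameter is set.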
Comment 7 Klaus Mueller 2012-03-30 12:16:39 UTC
Thanks,
with the "iommu=pt iommu=1" options applied it doesn't work at all. Even the "workaround" of first trying to start with the bridge assigned and then starting again without it no longer works.

BTW: I can pass through PCIe devices, as long as they have different PCI ids and different modules, without any problem.

I'm using the option allow_unsafe_assigned_interrupts=1 already!

I think there is a problem with the handling of devices behind a PCI bridge, which should be fixed (the Ralink device to be passed through is a PCI device, not a PCIe device)! Take a look at the qemu-kvm crash log https://bugzilla.kernel.org/attachment.cgi?id=72753
Comment 8 Alexandrov Stanislav 2012-04-04 08:22:51 UTC
I have the same problem with PCI passthrough. My PCIe video card (HD 7750) with HDA (01:00.0/1) and the integrated network card (05:00.0) are assigned to the guest successfully, but when I try to add a USB controller, I get the same error about FLR:

Unable to reset PCI device 0000:00:12.0: no FLR, PM reset or bus reset available

With Xen it works fine.

lspci -tv
-[0000:00]-+-00.0  Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B)
           +-00.2  Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory Management Unit (IOMMU)
           +-02.0-[01]--+-00.0  Advanced Micro Devices [AMD] nee ATI Device 683f
           |            \-00.1  Advanced Micro Devices [AMD] nee ATI Device aab0
           +-09.0-[02]----00.0  Etron Technology, Inc. EJ168 USB 3.0 Host Controller
           +-0b.0-[03]--+-00.0  nVidia Corporation GF110 [GeForce GTX 560 Ti]
           |            \-00.1  nVidia Corporation GF110 High Definition Audio Controller

           +-11.0  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
           +-12.0  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
           +-12.2  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
           +-13.0  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
           +-13.2  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
           +-14.0  Advanced Micro Devices [AMD] nee ATI SBx00 SMBus Controller
           +-14.2  Advanced Micro Devices [AMD] nee ATI SBx00 Azalia (Intel HDA)
           +-14.3  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 LPC host controller
           +-14.4-[04]--
           +-14.5  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
           +-15.0-[05]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller
           +-15.1-[06]--
           +-15.2-[07]--
           +-15.3-[08]--
           +-16.0  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
           +-16.2  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
           +-18.0  Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
           +-18.1  Advanced Micro Devices [AMD] Family 10h Processor Address Map
           +-18.2  Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
           +-18.3  Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
           \-18.4  Advanced Micro Devices [AMD] Family 10h Processor Link Control


qemu-kvm-1.0, linux-3.3
motherboard is GA-990FXA-D3
Comment 9 Klaus Mueller 2012-04-05 08:08:58 UTC
I tested the same device here and got the same error as you reported.

Good to know that Xen doesn't have any problem. This really means that it is most probably a kvm, qemu or AMD IOMMU bug.

kvm 0.15, kernel: linux 3.1.10 (openSUSE)
