Bug 218876 - PCIE device crash when trying to pass through USB PCIe Card to virtual machine
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: Intel Linux
Importance: P3 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-22 19:31 UTC by Dan Alderman
Modified: 2024-05-29 11:35 UTC

See Also:
Kernel Version: 6.9.1-x64v2-xanmod1 #0~20240517.gc240cba SMP PREEMPT_DYNAMIC
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Boot log and what happens when I try to start vm with VFIO PCIe passthrough (155.39 KB, text/plain)
2024-05-22 19:32 UTC, Dan Alderman
sudo lspci -vvnnk (96.21 KB, text/plain)
2024-05-23 15:28 UTC, Dan Alderman
power reset methods of PCI devices (1.03 KB, text/plain)
2024-05-23 15:28 UTC, Dan Alderman

Description Dan Alderman 2024-05-22 19:31:45 UTC
Hi.

I'm running a Debian Bookworm host with the Xanmod 6.9.1 kernel.

This motherboard:

https://www.supermicro.com/en/products/motherboard/a2sdi-16c-tp8f

With this USB controller in the 4x PCIe slot.

https://www.startech.com/en-gb/cards-adapters/pexusb3s2ei

The USB card is based on the Renesas uPD720201 USB 3.0 Host Controller and reports the latest firmware.

I have a Debian Bookworm VM running on this host, which I intend to pass the entire PCIe card through to (Gnome+Plexamp->USB-SPDIF). When I configure the VM to do this, it fails to start and the kernel logs the errors below. The card then appears to be unrecoverable without at least a warm reboot.

I have tried many kernel and BIOS options regarding PCIe but nothing has helped so far.  I'll attach a boot log. This is the error when I start the VM:

May 19 09:24:46 kryten kernel: VFIO - User Level meta-driver version: 0.3
May 19 09:24:46 kryten kernel: xhci_hcd 0000:02:00.0: remove, state 1
May 19 09:24:46 kryten kernel: usb usb4: USB disconnect, device number 1
May 19 09:24:46 kryten kernel: xhci_hcd 0000:02:00.0: USB bus 4 deregistered
May 19 09:24:46 kryten kernel: xhci_hcd 0000:02:00.0: remove, state 1
May 19 09:24:46 kryten kernel: usb usb3: USB disconnect, device number 1
May 19 09:24:46 kryten kernel: usb 3-4: USB disconnect, device number 2
May 19 09:24:46 kryten kernel: xhci_hcd 0000:02:00.0: USB bus 3 deregistered
May 19 09:24:47 kryten kernel: usb 1-1.2: USB disconnect, device number 4
May 19 09:24:53 kryten kernel: pcieport 0000:00:09.0: broken device, retraining non-functional downstream link at 2.5GT/s
May 19 09:24:54 kryten kernel: pcieport 0000:00:09.0: retraining failed
May 19 09:24:55 kryten kernel: pcieport 0000:00:09.0: broken device, retraining non-functional downstream link at 2.5GT/s
May 19 09:24:56 kryten kernel: pcieport 0000:00:09.0: retraining failed
May 19 09:24:56 kryten kernel: vfio-pci 0000:02:00.0: not ready 1023ms after bus reset; waiting
May 19 09:24:57 kryten kernel: vfio-pci 0000:02:00.0: not ready 2047ms after bus reset; waiting
May 19 09:24:59 kryten kernel: vfio-pci 0000:02:00.0: not ready 4095ms after bus reset; waiting
May 19 09:25:04 kryten kernel: vfio-pci 0000:02:00.0: not ready 8191ms after bus reset; waiting
May 19 09:25:12 kryten kernel: vfio-pci 0000:02:00.0: not ready 16383ms after bus reset; waiting
May 19 09:25:29 kryten kernel: vfio-pci 0000:02:00.0: not ready 32767ms after bus reset; waiting
May 19 09:26:05 kryten kernel: pcieport 0000:00:09.0: broken device, retraining non-functional downstream link at 2.5GT/s
May 19 09:26:06 kryten kernel: pcieport 0000:00:09.0: retraining failed
May 19 09:26:08 kryten kernel: pcieport 0000:00:09.0: broken device, retraining non-functional downstream link at 2.5GT/s
May 19 09:26:09 kryten kernel: pcieport 0000:00:09.0: retraining failed
May 19 09:26:09 kryten kernel: vfio-pci 0000:02:00.0: not ready 1023ms after bus reset; waiting
May 19 09:26:10 kryten kernel: vfio-pci 0000:02:00.0: not ready 2047ms after bus reset; waiting
May 19 09:26:12 kryten kernel: vfio-pci 0000:02:00.0: not ready 4095ms after bus reset; waiting
May 19 09:26:16 kryten kernel: vfio-pci 0000:02:00.0: not ready 8191ms after bus reset; waiting
May 19 09:26:25 kryten kernel: vfio-pci 0000:02:00.0: not ready 16383ms after bus reset; waiting
May 19 09:26:43 kryten kernel: vfio-pci 0000:02:00.0: not ready 32767ms after bus reset; waiting
May 19 09:27:18 kryten kernel: vfio-pci 0000:02:00.0: Unable to change power state from D0 to D3hot, device inaccessible
May 19 09:27:19 kryten kernel: vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 19 09:27:19 kryten kernel: vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 19 09:27:19 kryten kernel: vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 19 09:27:19 kryten kernel: vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 19 09:27:19 kryten kernel: vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 19 09:27:19 kryten kernel: xhci_hcd 0000:02:00.0: Invalid ROM..
May 19 09:27:19 kryten kernel: xhci_hcd 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
May 19 09:27:19 kryten kernel: xhci_hcd 0000:02:00.0: xHCI Host Controller
May 19 09:27:19 kryten kernel: xhci_hcd 0000:02:00.0: new USB bus registered, assigned bus number 3
May 19 09:27:19 kryten kernel: xhci_hcd 0000:02:00.0: Host halt failed, -19
May 19 09:27:19 kryten kernel: xhci_hcd 0000:02:00.0: can't setup: -19
May 19 09:27:19 kryten kernel: xhci_hcd 0000:02:00.0: USB bus 3 deregistered
May 19 09:27:19 kryten kernel: xhci_hcd 0000:02:00.0: init 0000:02:00.0 fail, -19

Thanks for your time and help.
Comment 1 Dan Alderman 2024-05-22 19:32:44 UTC
Created attachment 306324 [details]
Boot log and what happens when I try to start vm with VFIO PCIe passthrough
Comment 2 TJ 2024-05-23 14:35:57 UTC
This could be a power management issue:

  vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible

Could you add the outputs of these commands?

# detailed PCI device info
sudo lspci -vvnnk

# power reset methods of PCI devices
find /sys/devices/pci0000\:00/ -name reset_method | while read -r f; do echo -e "$f = $(cat $f)"; done
Comment 3 Dan Alderman 2024-05-23 15:28:00 UTC
Created attachment 306326 [details]
sudo lspci -vvnnk

sudo lspci -vvnnk
Comment 4 Dan Alderman 2024-05-23 15:28:51 UTC
Created attachment 306327 [details]
power reset methods of PCI devices

power reset methods of PCI devices

find /sys/devices/pci0000\:00/ -name reset_method | while read -r f; do echo -e "$f = $(cat $f)"; done
Comment 5 Dan Alderman 2024-05-24 11:19:08 UTC
In case it's useful, I have tried with pcie_aspm=off but it didn't seem to fix it, same behaviour.
Comment 6 TJ 2024-05-25 13:49:01 UTC
Thanks for the logs. Here is an overview of how the devices are connected: the PCIe Root Port at address 00:09.0 (Bus:Device.Function) is the 'parent' of the USB host controller at 02:00.0.

The issue here appears to be that when the USB host controller is removed it may actually go into the D3cold state. This removes power from the device and, currently, the Linux kernel has no mechanism to control power on the PCI bus [0].

There are three possible work-arounds I can think of worth testing:

  1. remove 00:09.0 and then rescan its parent root complex since that *may* trigger power to be restored to the port (use the script I shared with you on IRC via termbin)

  2. unbind [1] the xhci_hcd driver from the device *before* trying to start the VM or loading vfio-pci (this could be scripted) so the device remains powered:

  # echo 0000:02:00.0 > /sys/bus/pci/drivers/xhci_hcd/unbind

  3. avoid this altogether, if the USB host controller is only ever wanted in the guest virtual machine, by reserving the device on the kernel command line so the host's xHCI driver never claims it; something like: vfio-pci.ids=1912:0014 - but this would require vfio-pci to be loaded *very* early in the initrd processing so that it takes control of the USB host controller before xhci_hcd gets to it! I don't think there is an easy way to enforce that module load order, aside from a custom udev rule that loads vfio-pci for this device ID.



[0] https://www.kernel.org/doc/html/latest/power/pci.html#native-pci-power-management

[1] https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci
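
A minimal sketch of work-around 1, for anyone without the IRC script. This is not that script; it only uses the `remove` and `rescan` files described in the sysfs-bus-pci ABI document [1]. The function name, the `SYSFS_ROOT` override (for dry runs against a fake tree) and the `RESCAN_DELAY` pause are assumptions made for illustration. It rescans all PCI buses through the global `rescan` file rather than only the parent root complex.

```shell
#!/bin/sh
# Sketch only: remove a PCI device from the kernel's view, then rescan.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# pci_remove_and_rescan ADDR
#   ADDR is a full PCI address, e.g. 0000:00:09.0 (the root port in this report).
pci_remove_and_rescan() {
    dev="$SYSFS_ROOT/bus/pci/devices/$1"
    if [ ! -e "$dev/remove" ]; then
        echo "no such PCI device: $1" >&2
        return 1
    fi
    # Writing 1 to 'remove' detaches the device and everything below it.
    echo 1 > "$dev/remove" || return 1
    # Pause briefly before asking the kernel to re-enumerate (arbitrary value).
    sleep "${RESCAN_DELAY:-1}"
    # Writing 1 to the bus-wide 'rescan' file re-enumerates all PCI buses.
    echo 1 > "$SYSFS_ROOT/bus/pci/rescan"
}
```

Run as root, for example `pci_remove_and_rescan 0000:00:09.0`, and then check `lspci` to see whether 02:00.0 has reappeared. Whether this restores power to the slot on this board is untested.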
Comment 7 TJ 2024-05-25 14:16:40 UTC
I've done some basic analysis and testing here to develop a udev rule. This looks like it ought to do the job.

# this is /etc/udev/rules.d/00-vfiio-pci.rules
SUBSYSTEM=="pci", ATTR{endor}=="1912", ATTR{device}=="0014", RUN+="modprobe vfio-pci ids=1912:0014"

It needs testing so what I'd suggest is:

1. use the unbind method in (2) in my previous comment to detach the xhci_hcd driver and check there is no "Kernel driver in use" with:

  $ lspci -nnk -d 1912:0014

2. Tell the kernel to replay events to test if the rule reacts as expected:

  # udevadm trigger --type=subsystems --subsystem-match=pci

3. Check whether vfio-pci is bound to the device ("Kernel driver in use") with:

  $ lspci -nnk -d 1912:0014

If that works then the module and rule need adding to the initrd.img with:

  # echo "vfio-pci" >> /etc/initramfs-tools/modules
  # update-initramfs -u

and then do a reboot test when convenient.
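
If it is easier to script the "Kernel driver in use" checks in steps 1 and 3, the same information is available from the `driver` symlink in sysfs. A rough sketch follows; the function name and the `SYSFS_ROOT` override (for testing against a fake tree) are made up for illustration.

```shell
#!/bin/sh
# Sketch only: report which kernel driver, if any, is bound to a PCI device.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# pci_driver_in_use ADDR
#   Prints the bound driver name (e.g. xhci_hcd or vfio-pci), or "none".
pci_driver_in_use() {
    link="$SYSFS_ROOT/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver's name.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

For example, `pci_driver_in_use 0000:02:00.0` should print `none` after the unbind in step 1, and `vfio-pci` after step 2 if the rule fired.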
Comment 8 TJ 2024-05-25 14:20:40 UTC
Argh! I noticed typos in the rule file name and the rule!

# this is /etc/udev/rules.d/00-vfio-pci.rules
SUBSYSTEM=="pci", ATTR{vendor}=="1912", ATTR{device}=="0014", RUN+="modprobe vfio-pci ids=1912:0014"
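
A different technique that is sometimes used instead of a udev rule is a modprobe.d soft dependency, sketched below. It is not something tested in this report. It assumes the xHCI PCI glue is built as the `xhci_pci` module rather than built into the kernel (worth checking on a Xanmod kernel), and the function name and target file name are hypothetical.

```shell
#!/bin/sh
# Sketch only: write a modprobe.d fragment that asks modprobe to load
# vfio-pci before xhci_pci, and hands the given IDs to vfio-pci.

# write_vfio_modprobe_conf FILE IDS
#   FILE is the fragment to write, IDS is a vendor:device list such as 1912:0014.
write_vfio_modprobe_conf() {
    cat > "$1" <<EOF
# Claim these vendor:device IDs for vfio-pci as soon as it loads.
options vfio-pci ids=$2
# Load vfio-pci before xhci_pci (only effective if xhci_pci is a module).
softdep xhci_pci pre: vfio-pci
EOF
}
```

Something like `write_vfio_modprobe_conf /etc/modprobe.d/vfio-usb.conf 1912:0014` followed by `update-initramfs -u` would be the way to try it. Note that this would keep every controller with that ID away from the host.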
Comment 9 Alex Williamson 2024-05-28 21:16:12 UTC
(In reply to TJ from comment #2)
> This could be a power management issue:
> 
>   vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible

I wouldn't rule out a power management issue, but I think the device has already disappeared by the time we're seeing these error logs.  We're hitting the timeouts waiting for the device after bus reset and we trigger quirks that are trying to retrain the link at a reduced rate.  Unfortunately this device only supports bus reset.
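
To compare runs or cards, the "not ready ... after bus reset" waits can be pulled out of a saved log with a small filter. This is only a convenience sketch: the function name is made up, and it relies solely on the message format visible in the log above.

```shell
#!/bin/sh
# Sketch only: summarise "not ready <N>ms after bus reset" kernel messages.

# reset_wait_summary
#   Reads kernel log lines on stdin and prints, per PCI address, how many
#   such messages were seen and the longest wait reported.
reset_wait_summary() {
    awk '
        /not ready [0-9]+ms after bus reset/ {
            dev = ""; ms = 0
            for (i = 2; i <= NF; i++) {
                # The PCI address is the field just before "not ready".
                if ($i == "not" && $(i + 1) == "ready") {
                    dev = $(i - 1); sub(/:$/, "", dev)
                    ms = $(i + 2); sub(/ms$/, "", ms)
                    break
                }
            }
            count[dev]++
            if (ms + 0 > max[dev]) max[dev] = ms + 0
        }
        END {
            for (d in count)
                printf "%s messages=%d longest_wait_ms=%d\n", d, count[d], max[d]
        }
    '
}
```

For example, `journalctl -k | reset_wait_summary` on the affected host.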

Is it possible to test assigning another single-function device installed in this slot?  We'd want to make sure that the reset_method is still "bus", which can be selected via the same sysfs file if the other device supports more reset mechanisms.

If it is a power management issue, we can also restrict vfio-pci use of power management by passing the disable_idle_d3=1 module option when loading the vfio-pci driver.  If that works, we might want to quirk the ID for this old NEC controller to avoid D3 states.
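
For the reset_method check, a rough sketch of reading the per-device sysfs file and restricting it to "bus" when the device offers more than one method. The function names and the `SYSFS_ROOT` override (for testing against a fake tree) are hypothetical; the `reset_method` file itself is the one queried in comment 2.

```shell
#!/bin/sh
# Sketch only: inspect and restrict the reset methods of one PCI device.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# pci_reset_methods ADDR: print the reset methods currently enabled.
pci_reset_methods() {
    cat "$SYSFS_ROOT/bus/pci/devices/$1/reset_method"
}

# pci_force_bus_reset ADDR: limit the device to "bus" reset, if it offers it.
pci_force_bus_reset() {
    f="$SYSFS_ROOT/bus/pci/devices/$1/reset_method"
    # The file holds a space-separated list, e.g. "flr bus".
    case " $(cat "$f" 2>/dev/null) " in
        *" bus "*) echo bus > "$f" ;;
        *) echo "$1 does not list bus reset" >&2; return 1 ;;
    esac
}
```

Run as root, e.g. `pci_force_bus_reset 0000:02:00.0` before starting the VM, then confirm with `pci_reset_methods 0000:02:00.0`.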
Comment 10 Dan Alderman 2024-05-29 10:42:56 UTC
I tried adding the module option but I still get the same behaviour.

cat /etc/modprobe.d/vfio.conf
options vfio-pci disable_idle_d3=1

(reboot)

cat /proc/modules | cut -f 1 -d " " | while read module; do echo "Module: $module"; if [ -d "/sys/module/$module/parameters" ]; then ls /sys/module/$module/parameters/ | while read parameter; do echo -n "Parameter: $parameter --> "; cat /sys/module/$module/parameters/$parameter; done; fi; echo; done | grep -A 5 vfio

[snip]

Parameter: disable_idle_d3 --> Y

[snip]
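
A shorter way to read a single parameter than the loop above, as a sketch. The function name and the `SYSFS_ROOT` override (for testing against a fake tree) are made up; the path layout is the same /sys/module/<module>/parameters/ directory the loop walks.

```shell
#!/bin/sh
# Sketch only: print one parameter of a loaded kernel module.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# module_param MODULE PARAM
module_param() {
    # sysfs uses underscores in module directory names (vfio-pci -> vfio_pci).
    mod=$(printf '%s' "$1" | tr '-' '_')
    file="$SYSFS_ROOT/module/$mod/parameters/$2"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "parameter not available: $1 $2" >&2
        return 1
    fi
}
```

For example, `module_param vfio-pci disable_idle_d3` should print `Y` when the option has taken effect.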

When I try launching the VM with the PCI device passed through, I get the same retrain failure.

I have this card to try next.

https://www.amazon.co.uk/dp/B087G7T234

Sorry for being a little slow - I have disabilities that can put a stop to my activities regardless of what my brain thinks about it.

Thanks for all the help.
Comment 11 Dan Alderman 2024-05-29 11:35:36 UTC
Just tried with the new card:

02:00.0 USB controller: ASMedia Technology Inc. ASM2142/ASM3142 USB 3.1 Host Controller

I added it to the VM in virt-manager and I get the same error when I launch it.

[snip]

kernel: pcieport 0000:00:09.0: broken device, retraining non-functional downstream link at 2.5GT/s

[snip]

vfio-pci 0000:02:00.0: not ready 4095ms after bus reset; waiting

Thanks.
