Bug 219010

Summary: [REGRESSION][VFIO] kernel 6.9.7 causing qemu crash because of "Collect hot-reset devices to local buffer"
Product: Virtualization
Reporter: Žilvinas Žaltiena (zaltys)
Component: kvm
Assignee: virtualization_kvm
Status: RESOLVED CODE_FIX
Severity: normal
CC: beldzhang, holland, regressions, zaltys
Priority: P3
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Žilvinas Žaltiena 2024-07-06 16:30:57 UTC
One of my virtual machines using PCI device passthrough (vfio) has stopped working on openSUSE Tumbleweed since kernel 6.9.7. QEMU 9.0.1 complains:

qemu-system-x86_64: vfio: hot reset info failed: No space left on device
qemu-system-x86_64: GLib: ../glib/gmem.c:177: failed to allocate 18446744068411217972 bytes

and then dumps core. The QEMU backtrace shows vfio_pci_get_pci_hot_reset_info() as the last QEMU function called.
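
The size in the GLib message is consistent with a negative 32-bit device count being widened to a 64-bit allocation size. The sketch below is only an illustration under assumptions: it supposes the buffer size is computed as header + count * entry, with a 12-byte header (argsz, flags, count), 8-byte entries, and count held in a signed int. The helper name `hot_reset_alloc_size` is hypothetical and is not QEMU code.

```c
#include <stdint.h>

/* Assumed sizes (not taken from this report): a 12-byte header
 * (argsz, flags, count) followed by 8-byte per-device entries. */
#define HDR_SIZE   12u
#define ENTRY_SIZE 8u

/* Hypothetical helper: the size userspace would request for `count`
 * entries when a signed 32-bit count is sign-extended to 64 bits and
 * the arithmetic wraps modulo 2^64. */
static uint64_t hot_reset_alloc_size(int32_t count)
{
    return (uint64_t)HDR_SIZE + (uint64_t)(int64_t)count * ENTRY_SIZE;
}
```

Under these assumptions, a count of -662291707 yields exactly 18446744068411217972 bytes, the value in the error message, which would point at a garbage count coming back from the kernel rather than at a real device list.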

Reverting kernel 6.9.7 commit 9313244c26f3792daa86f3a18cc3bd5ad60310e0 (upstream f6944d4a0b87c16bc34ae589169e1ded3d4db08e) - "vfio/pci: Collect hot-reset devices to local buffer" fixes the problem. As I understand it, that commit was backported to 6.9.7 from the 6.10 tree.

Upon more thorough analysis, I pinpointed that the crash happens because of one specific passed-through device: the sound card of the Asus B650 Creator motherboard. The VM starts on 6.9.7 if I remove this sound card from it. I think the important bit is that this card is a VF of a device which does not report support for FLR:

15:00.0 | iommu group 28 | Phoenix PCIe Dummy Function <-- not passed to VM, no driver, reset method: pm bus 
15:00.2 | iommu group 29 | Encryption controller (PSP/CCP) <-- ccp driver
15:00.3 | iommu group 30 | USB controller <-- xhci_hcd driver
15:00.4 | iommu group 31 | USB controller <-- xhci_hcd driver
15:00.6 | iommu group 32 | HD Audio Controller <-- sound card passed to VM

After reverting the above-mentioned commit, QEMU complains:

vfio: Cannot reset device 0000:15:00.6, depends on group 28 which is not owned

exactly the same as before 6.9.7, and the VM starts with that sound card passed through.
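
The "depends on group 28 which is not owned" message can be read as an ownership check over the devices affected by a hot reset. The following is a minimal sketch of such a check, not QEMU's actual code: `struct dep_dev`, `group_owned`, and `first_unowned` are hypothetical names, and the assumption is that the reset is refused as soon as one affected device sits in an IOMMU group the VM does not own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified entry of a hot-reset dependency list. */
struct dep_dev {
    unsigned int group_id;  /* IOMMU group of the affected device */
    const char *bdf;        /* bus:device.function, for messages only */
};

/* True if `group` appears in the list of groups owned by the VM. */
static bool group_owned(unsigned int group,
                        const unsigned int *owned, size_t n_owned)
{
    for (size_t i = 0; i < n_owned; i++)
        if (owned[i] == group)
            return true;
    return false;
}

/* Return the first affected device whose group is not owned,
 * or NULL if every affected group is owned. */
static const struct dep_dev *first_unowned(const struct dep_dev *deps,
                                           size_t n_deps,
                                           const unsigned int *owned,
                                           size_t n_owned)
{
    for (size_t i = 0; i < n_deps; i++)
        if (!group_owned(deps[i].group_id, owned, n_owned))
            return &deps[i];
    return NULL;
}
```

With the five functions listed above as the dependency list and only group 32 owned, this sketch stops at 15:00.0 in group 28, matching the group named in the message.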

This might be an unsupported configuration, but QEMU crashing with 6.9.7 also suggests that the kernel might be breaking userspace by handling this case differently, especially in a minor version update.
Comment 1 Žilvinas Žaltiena 2024-07-06 17:19:19 UTC
Additional information: passing through the NVIDIA GPU and Samsung NVMe drives works, but passing through the Fresco FL1100 based USB card does not. The Fresco card is a single VF device, but like the sound card it does not report FLR. Reverting "vfio/pci: Collect hot-reset devices to local buffer" allows every mentioned device to be passed through.
Comment 2 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-07-09 08:44:17 UTC
Does the problem happen with 6.10-rc6 or newer as well?
Comment 3 Liu, Yi L 2024-07-09 13:44:42 UTC
On 2024/7/7 01:19, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=219010
> 
> --- Comment #1 from Žilvinas Žaltiena (zaltys@natrix.lt) ---
> Additional information: passing NVIDIA GPU, Samsung NVMEs works, passing
> Fresco FL1100 based USB card does not work. Fresco card is single VF
> device, but like that sound card it does not report FLR. Reverting
> "vfio/pci: Collect hot-reset devices to local buffer" allows to pass every
> mentioned device.
> 

It appears that the count is used without being initialized. It does not
happen with other devices because they have FLR and hence do not trigger
the hot reset info path. Please try the patch below to see if it works.


From 93618efe933c4fa5ec453bddacdf1ca2ccbf3751 Mon Sep 17 00:00:00 2001
From: Yi Liu <yi.l.liu@intel.com>
Date: Tue, 9 Jul 2024 06:41:02 -0700
Subject: [PATCH] vfio/pci: Fix a regression

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/pci/vfio_pci_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 59af22f6f826..0a7bfdd08bc7 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1260,7 +1260,7 @@ static int vfio_pci_ioctl_get_pci_hot_reset_info(
 	struct vfio_pci_hot_reset_info hdr;
 	struct vfio_pci_fill_info fill = {};
 	bool slot = false;
-	int ret, count;
+	int ret, count = 0;
 
 	if (copy_from_user(&hdr, arg, minsz))
 		return -EFAULT;
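
To illustrate why the one-line change matters, here is a small sketch of a counter that is only ever incremented through a pointer. It is not the kernel code: `count_dev` and `count_devices` are hypothetical names, and the `start` parameter stands in for whatever value happened to be on the stack, since actually reading an uninitialized int would be undefined behavior in C.

```c
/* Hypothetical per-device callback: bumps the counter it is handed. */
static int count_dev(void *data)
{
    (*(int *)data)++;
    return 0;
}

/* Visit `ndevs` devices with the counter starting at `start`.
 * start == 0 models the patched code; any other value models stale
 * stack contents left in an uninitialized variable. */
static int count_devices(int start, int ndevs)
{
    int count = start;

    for (int i = 0; i < ndevs; i++)
        count_dev(&count);
    return count;
}
```

Starting from 0, five devices give a count of 5. Starting from stale data, the result is stale data plus 5, which userspace may then treat as the number of entries to allocate.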
Comment 4 Beld Zhang 2024-07-09 14:24:21 UTC
After manually applying the change to the source code, testing passes and the crash no longer occurs.

NVIDIA 3060 Ti on a Dell Precision T7920
kernel 6.6.38
qemu 8.2.4
Comment 5 Žilvinas Žaltiena 2024-07-09 20:49:14 UTC
(In reply to Liu, Yi L from comment #3)
> It appears that the count is used without init.. And it does not happen
> with other devices as they have FLR, hence does not trigger the hotreset
> info path. Please try below patch to see if it works.
> 

Patch fixes the problem on my system.
Comment 6 Liu, Yi L 2024-07-10 00:44:54 UTC
On 2024/7/10 04:49, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=219010
> 
> --- Comment #5 from Žilvinas Žaltiena (zaltys@natrix.lt) ---
> (In reply to Liu, Yi L from comment #3)
>> It appears that the count is used without init.. And it does not happen
>> with other devices as they have FLR, hence does not trigger the hotreset
>> info path. Please try below patch to see if it works.
>>
> 
> Patch fixes the problem on my system.
> 

Patch submitted to the mailing list. Thanks, and please let me know
whether it is OK to add your Reported-by and Tested-by tags.
Comment 7 Liu, Yi L 2024-07-10 00:46:03 UTC
forgot the link. :)

https://lore.kernel.org/kvm/20240710004150.319105-1-yi.l.liu@intel.com/T/#u
Comment 8 Žilvinas Žaltiena 2024-07-10 15:47:27 UTC
(In reply to Liu, Yi L from comment #6)

> patch submitted to mailing list. Thanks, and feel free to let me know if
> it is proper to add your reported-by, and add your tested-by.

It is ok to add me.
Comment 9 Žilvinas Žaltiena 2024-07-20 18:30:10 UTC
Fixed in 6.9.10. Closing this.