Bug 220010 - Regression: 6.14.2 breaks PCI passthrough, qemu-system-x86_64: no available reset mechanism
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: AMD Linux
Importance: P3 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-04-14 06:29 UTC by Athul Krishna K R
Modified: 2025-04-16 16:04 UTC
CC List: 2 users

See Also:
Kernel Version: 6.14.2
Subsystem:
Regression: Yes
Bisected commit-id: 792bd08a41bdc309e08ff4d429f670cbb30f2b63


Attachments
lspci (97.16 KB, text/plain)
2025-04-14 06:29 UTC, Athul Krishna K R
dmesg_1.log (116.63 KB, text/plain)
2025-04-14 06:43 UTC, Athul Krishna K R
dmesg_2.log (126.97 KB, text/plain)
2025-04-14 06:44 UTC, Athul Krishna K R
dmesg_3.log (213.26 KB, text/plain)
2025-04-14 06:49 UTC, Athul Krishna K R

Description Athul Krishna K R 2025-04-14 06:29:54 UTC
Created attachment 307958
lspci

Device: Asus Zephyrus GA402RJ
dGPU: RX 6700S
Kernel: 6.14.2

03:00.0: dGPU (IOMMU Group 16)
03:00.1: GPU Audio Controller (IOMMU Group 17)

qemu-system-x86_64: vfio: Cannot reset device 0000:03:00.1, no available reset mechanism.
qemu-system-x86_64: vfio: Cannot reset device 0000:03:00.0, no available reset mechanism.

BISECTED: 792bd08a41bdc309e08ff4d429f670cbb30f2b63

`03:00.0` does list `bus` in its `03:00.0/reset_method` file, whereas `03:00.1` has no `03:00.1/reset` file at all.

Removing `03:00.1` from the PCI passthrough configuration still produces the error for `03:00.0`:
qemu-system-x86_64: vfio: Cannot reset device 0000:03:00.0, no available reset mechanism.
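
For anyone who wants to repeat the `reset_method` check above programmatically, here is a minimal sketch in C. It assumes the attribute holds a space-separated list of method names on one line (as in the `bus` entry observed above) and that a missing file means the attribute is not exposed for that device. The helper names `read_pci_attr` and `has_reset_method` are hypothetical and not part of any kernel or library API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of <dev_dir>/<attr> into buf, without the newline.
 * Returns 0 on success, -1 if the attribute is missing or unreadable.
 * An empty file yields an empty string (no reset methods enabled). */
static int read_pci_attr(const char *dev_dir, const char *attr,
                         char *buf, size_t len)
{
    char path[512];
    FILE *f;
    int n;

    assert(len > 0);
    buf[0] = '\0';

    n = snprintf(path, sizeof(path), "%s/%s", dev_dir, attr);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f))
        buf[0] = '\0';
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Return 1 if `method` appears as a whole word in the space-separated
 * list `methods`, 0 otherwise. */
static int has_reset_method(const char *methods, const char *method)
{
    size_t n = strlen(method);
    const char *p = methods;

    if (n == 0)
        return 0;
    while ((p = strstr(p, method)) != NULL) {
        int starts = (p == methods) || (p[-1] == ' ');
        int ends = (p[n] == '\0') || (p[n] == ' ');
        if (starts && ends)
            return 1;
        p += n;
    }
    return 0;
}
```

On the reporter's machine the helpers would be called with `/sys/bus/pci/devices/0000:03:00.0` as `dev_dir` and `reset_method` as `attr`; a return value of -1 for `0000:03:00.1` would correspond to the missing file described above.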

The `lspci -vv` output is attached.
Comment 1 Athul Krishna K R 2025-04-14 06:32:37 UTC
792bd08a41bdc309e08ff4d429f670cbb30f2b63 is the stable-tree commit ID:

commit 792bd08a41bdc309e08ff4d429f670cbb30f2b63
Author: Nishanth Aravamudan <naravamudan@nvidia.com>
Date:   Fri Feb 7 14:56:00 2025 -0600

    PCI: Avoid reset when disabled via sysfs
    
    [ Upstream commit 479380efe1625e251008d24b2810283db60d6fcd ]
    
    After d88f521da3ef ("PCI: Allow userspace to query and set device reset
    mechanism"), userspace can disable reset of specific PCI devices by writing
    an empty string to the sysfs reset_method file.
    
    However, pci_slot_resettable() does not check pci_reset_supported(), which
    means that pci_reset_function() will still reset the device even if
    userspace has disabled all the reset methods.
    
    I was able to reproduce this issue with a vfio device passed to a qemu
    guest, where I had disabled PCI reset via sysfs.
    
    Add an explicit check of pci_reset_supported() in both
    pci_slot_resettable() and pci_bus_resettable() to ensure both the reset
    status and reset execution are bypassed if an administrator disables it for
    a device.
    
    Link: https://lore.kernel.org/r/20250207205600.1846178-1-naravamudan@nvidia.com
    Fixes: d88f521da3ef ("PCI: Allow userspace to query and set device reset mechanism")
    Signed-off-by: Nishanth Aravamudan <naravamudan@nvidia.com>
    [bhelgaas: commit log]
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Cc: Alex Williamson <alex.williamson@redhat.com>
    Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
    Cc: Amey Narkhede <ameynarkhede03@gmail.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Yishai Hadas <yishaih@nvidia.com>
    Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
    Cc: Kevin Tian <kevin.tian@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 23609dd123f9..3e78cf86ef03 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5413,6 +5413,8 @@ static bool pci_bus_resettable(struct pci_bus *bus)
                return false;
 
        list_for_each_entry(dev, &bus->devices, bus_list) {
+               if (!pci_reset_supported(dev))
+                       return false;
                if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
                    (dev->subordinate && !pci_bus_resettable(dev->subordinate)))
                        return false;
@@ -5489,6 +5491,8 @@ static bool pci_slot_resettable(struct pci_slot *slot)
        list_for_each_entry(dev, &slot->bus->devices, bus_list) {
                if (!dev->slot || dev->slot != slot)
                        continue;
+               if (!pci_reset_supported(dev))
+                       return false;
                if (dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
                    (dev->subordinate && !pci_bus_resettable(dev->subordinate)))
                        return false;
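
One plausible reading of why this check produces the symptom in the description can be sketched with a toy userspace model in C. This is an interpretation, not something stated in the thread: it assumes that a function with no reset method of its own (such as `03:00.1`, which has no `reset` file) makes `pci_reset_supported()` return false. The `struct toy_dev` type and `toy_bus_resettable()` function are hypothetical stand-ins, and the model omits the subordinate-bus recursion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel's definition. */
struct toy_dev {
    const char *name;
    bool has_reset_method;  /* models pci_reset_supported(dev) */
    bool no_bus_reset;      /* models PCI_DEV_FLAGS_NO_BUS_RESET */
};

/* Models the per-device loop in pci_bus_resettable(). With strict == true,
 * the pci_reset_supported() check added by the bisected commit is applied;
 * with strict == false, the loop behaves as it did before that commit. */
static bool toy_bus_resettable(const struct toy_dev *devs, size_t n,
                               bool strict)
{
    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (strict && !devs[i].has_reset_method)
            return false;
        if (devs[i].no_bus_reset)
            return false;
    }
    return true;
}
```

With a two-entry array modelling `03:00.0` (has a reset method) and `03:00.1` (has none), the model reports the bus as resettable when `strict` is false and as not resettable when `strict` is true. Because the loop walks every device on the bus rather than only those handed to VFIO, this would also be consistent with the error persisting after `03:00.1` is removed from passthrough.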
Comment 2 Athul Krishna K R 2025-04-14 06:43:56 UTC
Created attachment 307959
dmesg_1.log

dmesg_1.log: 03:00.0 and 03:00.1
Comment 3 Athul Krishna K R 2025-04-14 06:44:36 UTC
Created attachment 307960
dmesg_2.log

dmesg_2.log: 03:00.0 only
Comment 4 Athul Krishna K R 2025-04-14 06:49:55 UTC
Created attachment 307961
dmesg_3.log

dmesg_3.log: 792bd08a41bdc309e08ff4d429f670cbb30f2b63 reverted
Comment 5 Bjorn Helgaas 2025-04-15 22:30:25 UTC
Thanks for the report and sorry for the trouble.  Fix queued for v6.16: https://git.kernel.org/cgit/linux/kernel/git/pci/pci.git/commit/?id=bc0b828ef6e5

I added you as another reporter and included a link to this bugzilla.
Comment 6 Athul Krishna K R 2025-04-16 16:04:04 UTC
Thanks for the support. I shall mark this as resolved.
