Bug 217641
Summary: | A regression of vfio-pci driver with Intel DG2 (A770) discrete graphics card from Linux 6.2 | ||
---|---|---|---|
Product: | Drivers | Reporter: | Gwan-gyeong Mun (elongbug) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | NEW --- | ||
Severity: | normal | CC: | bjorn, elongbug |
Priority: | P3 | ||
Hardware: | Intel | ||
OS: | Linux | ||
URL: | https://lore.kernel.org/all/c836bf88-d961-040d-b15e-52feb8e11f8d@intel.com/ | ||
Kernel Version: | 6.2 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | d8d2b65a940bb497749d66bdab59b530901d3854 |
Attachments: |
complete dmesg log with v6.4.1 with d8d2b65a940b reverted
dmesg log with v6.4.1 with d8d2b65a940b reverted
lspci with v6.4.1 with d8d2b65a940b reverted
complete dmesg log with v6.4.1
lspci with v6.4.1
dmesg(journalctl log) of Linux 6.4.1 with vfio_dc3disable_test |
Description
Gwan-gyeong Mun
2023-07-07 13:17:17 UTC
Created attachment 304559 [details]
complete dmesg log with v6.4.1 with d8d2b65a940b reverted
complete dmesg log with v6.4.1 with d8d2b65a940b reverted
Created attachment 304560 [details]
dmesg log with v6.4.1 with d8d2b65a940b reverted
complete dmesg log with v6.4.1 with d8d2b65a940b reverted.
Created attachment 304561 [details]
lspci with v6.4.1 with d8d2b65a940b reverted
lspci -vv with v6.4.1 with d8d2b65a940b reverted
Created attachment 304562 [details]
complete dmesg log with v6.4.1
complete dmesg log with v6.4.1 (which does not work).
Created attachment 304563 [details]
lspci with v6.4.1
lspci -vv with v6.4.1
The problem only occurs when I bind DG2 to vfio-pci with the settings shown below [1]. (The reason for binding DG2 to vfio-pci is to use DG2 as a QEMU PCI passthrough device.) If DG2 is not bound to vfio-pci, none of the problem logs appear.

[1]
$ cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=8086:56a0,8086:4f90
softdep drm pre: vfio-pci

Comment on attachment 304559 [details]
complete dmesg log with v6.4.1 with d8d2b65a940b reverted

Duplicate of https://bugzilla.kernel.org/attachment.cgi?id=304560

Created attachment 304614 [details]
dmesg(journalctl log) of Linux 6.4.1 with vfio_dc3disable_test
dmesg(journalctl log) of Linux 6.4.1 with vfio_dc3disable_test
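For anyone reproducing this, it may help to confirm that the modprobe.d settings above really left the DG2 functions bound to vfio-pci rather than i915. The following is a minimal sketch, not part of the original report: it reads the standard `vendor`, `device` and `driver` entries under /sys/bus/pci/devices. The function names and the `SYSFS_PCI` override are hypothetical, added only so the helper can be pointed at a different directory.

```shell
#!/bin/sh
# Sketch: list PCI devices matching a vendor:device ID and the driver
# currently bound to each one.
# SYSFS_PCI is a hypothetical override (not a kernel or distro convention);
# it defaults to the real sysfs location.

bound_driver() {
    # $1 = sysfs device directory; prints the driver name, or "none"
    # when no driver is bound (the "driver" symlink is absent).
    if [ -L "$1/driver" ]; then
        basename "$(readlink "$1/driver")"
    else
        echo "none"
    fi
}

list_by_id() {
    # $1 = vendor ID as sysfs prints it (e.g. 0x8086)
    # $2 = device ID as sysfs prints it (e.g. 0x56a0)
    root="${SYSFS_PCI:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Skip entries without readable ID files (also covers an empty dir).
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        if [ "$(cat "$dev/vendor")" = "$1" ] && [ "$(cat "$dev/device")" = "$2" ]; then
            echo "$(basename "$dev") $(bound_driver "$dev")"
        fi
    done
}
```

With the IDs from the vfio.conf above, `list_by_id 0x8086 0x56a0` and `list_by_id 0x8086 0x4f90` would each print one line per matching device, with `vfio-pci` as the second field if the binding took effect.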
Hi, when the DG2 HW is not used as vfio-pci, the i915 driver is used (loaded) and i915 initializes the DG2 HW.
i915 has code [1] to handle pci_d3cold_disable() for the root device of DG2 PCIe, added by the commit below.
I referenced this code and applied it to vfio-pci for testing [2], but still encountered the same issue [3].

[1]
commit 7d23a80dc9720a378707edc03a7275d5a372355f
Author: Anshuman Gupta <anshuman.gupta@intel.com>
Date: Thu Jun 16 17:52:49 2022 +0530

    drm/i915/dgfx: Disable d3cold at gfx root port

    Currently i915 disables d3cold for i915 pci dev.
    This blocks D3 for i915 gfx pci upstream bridge (VSP).
    Let's disable d3cold at gfx root port to make sure that
    i915 gfx VSP can transition to D3 to save some power.

    We don't need to disable/enable d3cold in rpm, s2idle
    suspend/resume handlers.
    Disabling/Enabling d3cold at gfx root port in probe/remove phase
    is sufficient.

    Fixes: 1a085e23411d ("drm/i915: Disable D3Cold in s2idle and runtime pm")
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
    Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
    Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20220616122249.5007-1-anshuman.gupta@intel.com
    (cherry picked from commit 138c2fca6f408f397ea8fbbbf33203f244d96e01)
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>

[2]
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 29091ee2e984..ea93cd1e030c 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -148,6 +148,9 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	struct vfio_pci_core_device *vdev;
 	int ret;

+	pr_info("probe pci_dev [%04x:%04x], pci_device_id [%04x:%04x]\n",
+		pdev->vendor, pdev->device, id->vendor, id->device);
+
 	if (vfio_pci_is_denylisted(pdev))
 		return -EINVAL;

@@ -157,6 +160,18 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return PTR_ERR(vdev);

 	dev_set_drvdata(&pdev->dev, vdev);
+
+	if (pdev->vendor == 0x8086 && pdev->device == 0x56a0) {
+		struct pci_dev *root_pdev = pcie_find_root_port(pdev);
+
+		pr_info("probe pci_dev [%04x:%04x]\n",pdev->vendor, pdev->device);
+		if (root_pdev) {
+			pci_d3cold_disable(root_pdev);
+			pr_info("root device [%04x:%04x] set pci_d3cold_disable()\n",
+				root_pdev->vendor, root_pdev->device);
+		}
+	}
+
 	ret = vfio_pci_core_register_device(vdev);
 	if (ret)
 		goto out_put_vdev;
@@ -171,7 +186,21 @@ static void vfio_pci_remove(struct pci_dev *pdev)
 {
 	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);

+	pr_info("remove pci_dev [%04x:%04x]\n", pdev->vendor, pdev->device);
+
 	vfio_pci_core_unregister_device(vdev);
+
+	if (pdev->vendor == 0x8086 && pdev->device == 0x56a0) {
+		struct pci_dev *root_pdev = pcie_find_root_port(pdev);
+
+		pr_info("remove [%04x:%04x]\n",pdev->vendor, pdev->device);
+		if (root_pdev) {
+			pci_d3cold_enable(root_pdev);
+			pr_info("root device [%04x:%04x] set pci_d3cold_enable()\n",
+				root_pdev->vendor, root_pdev->device);
+		}
+	}
+
 	vfio_put_device(&vdev->vdev);
 }

[3] https://bugzilla.kernel.org/attachment.cgi?id=304614
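The test patch calls pcie_find_root_port() to pick the device on which d3cold is disabled. From userspace, a rough way to see which device that is likely to be is to resolve the endpoint's sysfs path and take the first component below the host bridge. This is only a sketch under stated assumptions: the sysfs directory nesting mirrors the PCI topology, `readlink -f` is available, and the function names, the `SYSFS_PCI` override and the addresses in the comments are illustrative, not taken from the report. It does not show whether pci_d3cold_disable() was applied.

```shell
#!/bin/sh
# Sketch: find the device directly below the host bridge for a PCI device,
# which for a PCIe endpoint behind a root port is normally that root port.

root_port_of_path() {
    # $1 = resolved sysfs path, for example (illustrative addresses):
    #   /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:03:00.0
    # Prints the path component right after "pciDDDD:BB". If the device
    # sits directly on the root bus, this is the device itself.
    printf '%s\n' "$1" |
        sed -n 's|^.*/pci[0-9a-fA-F]\{4\}:[0-9a-fA-F]\{2\}/\([^/]*\).*$|\1|p'
}

root_port_of() {
    # $1 = PCI address of the endpoint (domain:bus:device.function).
    # SYSFS_PCI is a hypothetical override; the default is the real sysfs.
    root_port_of_path "$(readlink -f "${SYSFS_PCI:-/sys/bus/pci/devices}/$1")"
}
```

Called as `root_port_of <address of the DG2 function>`, it prints one PCI address; the root device ID printed by the patch's pr_info() lines can then be compared against that device's `vendor` and `device` files.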