Bug 220057

Summary: Kernel regression. Linux VMs crashing (I did not test Windows guest VMs)
Product: Virtualization Reporter: Adolfo (adolfotregosa)
Component: kvm Assignee: virtualization_kvm
Status: NEW ---    
Severity: blocking CC: adolfotregosa, alex.williamson, clg, f.gruenbichler
Priority: P3    
Hardware: All   
OS: Linux   
Kernel Version: Subsystem:
Regression: Yes Bisected commit-id: f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101
Attachments: journalctl
revert patch
Proxmox forum screenshots
dmesg output
lspci -vvv
proxmox_VM
vfio_map_dma failed
phys-bits=39
lscpu output
log_vm_start_up_to_crash
phys-bits=host
qm showcmd 200 --pretty
vm startup
vm startup 2

Description Adolfo 2025-04-27 00:46:58 UTC
Created attachment 308028 [details]
journalctl

I found a kernel regression. I'm using Proxmox, and any kernel with the following commit:

https://github.com/torvalds/linux/commit/f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101

causes an instant VM crash in certain situations involving GPU acceleration. I'm using NVIDIA GPU passthrough, but another person experienced the same crashes with an AMD 9070 XT. In my case, this occurs when playing a simple YouTube video in a Chromium-based browser or when running some games.

I have confirmed that reverting this commit prevents my Linux VMs from crashing.

I've attached the host's journalctl output. The error is always exactly the same.
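
For reference, this is roughly how I built the reverted kernel (illustrative steps, assuming a clone of torvalds/linux and a standard build setup):

# check out the version under test and revert the suspect commit
git checkout v6.14.4
git revert f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101
# rebuild and install (adjust to your distro's workflow)
make olddefconfig
make -j"$(nproc)"
sudo make modules_install install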
Comment 1 Adolfo 2025-04-27 00:48:35 UTC
Created attachment 308029 [details]
revert patch
Comment 2 Artem S. Tashkinov 2025-04-27 23:34:27 UTC
Alex, please take a look.
Comment 4 Adolfo 2025-04-28 07:18:53 UTC
(In reply to Alex Williamson from comment #3)
> https://github.com/torvalds/linux/commit/09dfc8a5f2ce897005a94bf66cca4f91e4e03700

I should have specified that I'm running kernel 6.14.4, if that helps.
Comment 5 Adolfo 2025-04-28 08:25:45 UTC
(In reply to Alex Williamson from comment #3)
> https://github.com/torvalds/linux/commit/09dfc8a5f2ce897005a94bf66cca4f91e4e03700

I checked. That commit isn't a fix for the crashes in my case, since I tested vanilla 6.14.4 and that commit is already present. How can I help if needed?
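
For the record, I checked roughly like this (assuming a clone of the upstream tree; the command prints "present" when the commit is contained in the tag):

git merge-base --is-ancestor 09dfc8a5f2ce897005a94bf66cca4f91e4e03700 v6.14.4 && echo present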
Comment 6 Alex Williamson 2025-04-28 15:10:06 UTC
What's the VM configuration? The GPU assigned?  The host CPU?  The QEMU version?  Is the guest using nouveau or the nvidia driver?  Please link the other report of this issue.
Comment 7 Adolfo 2025-04-28 19:41:32 UTC
(In reply to Alex Williamson from comment #6)
> What's the VM configuration? The GPU assigned?  The host CPU?  The QEMU
> version?  Is the guest using nouveau or the nvidia driver?  Please link the
> other report of this issue.

13900, Z790 chipset, 128 GB RAM.
Guest set to Q35. CPU set to host. QEMU 9.2. Guest is using the nvidia driver. Crashes happen on both a 4060 Ti and a 5060 Ti.

Other report but with AMD 9070 XT.

https://forum.proxmox.com/threads/opt-in-linux-6-14-kernel-for-proxmox-ve-8-available-on-test-no-subscription.164497/page-5#post-763760
Comment 8 Alex Williamson 2025-04-28 21:11:46 UTC
(In reply to Adolfo from comment #7)

> 13900, Z790 chipset, 128 GB RAM.
> Guest set to Q35. CPU set to host. QEMU 9.2. Guest is using the nvidia driver.
> Crashes happen on both a 4060 Ti and a 5060 Ti.
> 
> Other report but with AMD 9070 XT.
> 
> https://forum.proxmox.com/threads/opt-in-linux-6-14-kernel-for-proxmox-ve-8-
> available-on-test-no-subscription.164497/page-5#post-763760

I'm not able to access the attachments of this report without a Proxmox subscription key, so I can't draw any conclusions about whether it's related.  I do note the post is originally dated April 14th, so it's not based on v6.14.4; it might be based on a kernel with the broken bus reset support that was reverted in v6.14.4.

I don't see any similar issues running a stock 6.14.4 kernel, qemu 9.2, Linux guest (6.14.3) running nvidia 570.144, youtube playback in chromium.

Please provide full VM XML or libvirt log, host 'sudo dmesg', host 'sudo lspci -vvv', guest nvidia driver version.
Comment 9 Adolfo 2025-04-28 21:22:11 UTC
Created attachment 308041 [details]
Proxmox forum screenshots

You don't need a subscription, just an account. Either way, I've attached the screenshots.
Comment 10 Adolfo 2025-04-28 21:24:42 UTC
Created attachment 308042 [details]
dmesg output

dmesg output.
Comment 11 Adolfo 2025-04-28 21:26:26 UTC
Created attachment 308043 [details]
lspci -vvv

lspci -vvv
Comment 12 Adolfo 2025-04-28 21:39:30 UTC
I tested both NVIDIA driver versions: 570.144 and the beta 575.51.02.
Assuming my machine is fine (reverting that commit resolved the issue), I believe the reason there are so few reports is that Proxmox still ships with kernel 6.8. We had an opt-in 6.11.11 kernel, and it also works fine.
Recently, 6.14 was offered as an opt-in, and that's what led me down the rabbit hole. I started compiling vanilla kernels: 6.12.25 LTS crashed, and 6.12-rc1 crashed as well. This indicated that the problem was introduced between 6.11.11 and 6.12-rc1.
I performed a bisect, which led me to the problematic commit. After reverting it, I can now run 6.14.4 without issues, just like 6.11.11.
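
Roughly, the bisect looked like this (illustrative; the build/boot/test cycle at each step is omitted):

git bisect start
git bisect bad v6.12-rc1
git bisect good v6.11
# build, boot, try to reproduce the crash, then mark the result:
git bisect good    # or: git bisect bad
# ...repeat until git names the first bad commit, then:
git bisect reset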

Regarding the Proxmox VM guest configuration: Proxmox does not use libvirt, but I've included the VM configuration file below.

/etc/pve/qemu-server/200.conf
-------------
affinity: 0-7
agent: 0
args: -machine hpet=off
balloon: 0
bios: ovmf
boot: order=hostpci2
cores: 8
cpu: host,flags=+pdpe1gb
cpuunits: 200
efidisk0: local-lvm:vm-200-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hookscript: local:snippets/200.pl
hostpci0: 0000:01:00,pcie=1,rombar=0
hostpci1: 0000:00:14.0,rombar=0
hostpci2: 0000:06:00.0,rombar=0
hostpci4: 0000:03:0a.2,rombar=0
hotplug: usb
hugepages: 1024
kvm: 1
machine: q35
memory: 32768
meta: creation-qemu=9.0.2,ctime=1737473987
name: cachyOS
numa: 1
onboot: 0
ostype: l26
scsihw: virtio-scsi-single
smbios1: uuid=f75921fc-d45e-4463-8590-8a4aab19e6e8
sockets: 1
startup: order=3,up=300,down=20
tablet: 0
vga: none
vmgenid: 2cd9b643-58a2-4ac4-a01c-f6a131e65c6d
Comment 13 Adolfo 2025-04-28 21:42:02 UTC
Created attachment 308044 [details]
proxmox_VM

Proxmox VM Hardware config and qemu version screenshot

Hopefully, I provided all the information you asked for.
Comment 14 Adolfo 2025-04-28 21:49:09 UTC
Created attachment 308045 [details]
vfio_map_dma failed

One thing I forgot: I have no idea whether this is even remotely related, but I'll leave it here just in case.

host journalctl: vfio_map_dma failed.
Comment 15 Alex Williamson 2025-04-28 22:15:55 UTC
(In reply to Adolfo from comment #14)
> Created attachment 308045 [details]
> vfio_map_dma failed
> 
> I forgot. I have no idea if this is even remotely linked but I'll leave it
> here just in case.
> 
> host journalctl: vfio_map_dma failed.

Does adding ",phys-bits=39" to the cpu: line in the config file resolve these errors?  Please include output of lscpu.
Comment 16 Adolfo 2025-04-28 22:27:55 UTC
(In reply to Alex Williamson from comment #15)
> (In reply to Adolfo from comment #14)
> > Created attachment 308045 [details]
> > vfio_map_dma failed
> > 
> > I forgot. I have no idea if this is even remotely linked but I'll leave it
> > here just in case.
> > 
> > host journalctl: vfio_map_dma failed.
> 
> Does adding ",phys-bits=39" to the cpu: line in the config file resolve
> these errors?  Please include output of lscpu.

Doesn't seem to do anything, no.

----
cores: 8
cpu: host,flags=+pdpe1gb,phys-bits=39
cpuunits: 200
efidisk0: local-lvm:vm-200-disk-0,efitype=4m,pre-enrolled-keys=1,size=4
...
Comment 17 Adolfo 2025-04-28 22:29:35 UTC
Created attachment 308046 [details]
phys-bits=39

log with phys-bits=39 on cpu line
Comment 18 Adolfo 2025-04-28 22:31:26 UTC
Created attachment 308047 [details]
lscpu output

lscpu output, as requested.
Comment 19 Alex Williamson 2025-04-28 22:46:10 UTC
(In reply to Adolfo from comment #16)
>
> Doesn't seem to do anything, no.
> 
> ----
> cores: 8
> cpu: host,flags=+pdpe1gb,phys-bits=39
> cpuunits: 200
> efidisk0: local-lvm:vm-200-disk-0,efitype=4m,pre-enrolled-keys=1,size=4
> ...

I'm getting that option from here:
https://pve.proxmox.com/wiki/Manual:_qm.conf

Can you find the QEMU command line in ps while the VM is running? e.g. `ps aux | grep qemu`. There should be a difference in the QEMU command line Proxmox is using with the option, and it should at least change the addresses based at 0x380000000000 in the logs.

I think the issue with the failed mappings is that your CPU physical address width is 46 bits:

Address sizes: 46 bits physical, 48 bits virtual

But the host IOMMU width is 39-bits:

[    0.341856] DMAR: Host address width 39

Therefore the VM is giving the devices an IOVA that cannot be mapped by the host IOMMU.  I don't know if ultimately that contributes to the issue you're reporting, but it might.
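
As a quick sanity check of the mismatch (plain shell arithmetic; the IOVA is taken from the failed-mapping messages):

printf 'IOMMU limit: 0x%x\n' $(( (1 << 39) - 1 ))   # 0x7fffffffff
printf 'guest IOVA:  0x%x\n' $(( 0x380000000000 ))  # needs 46 bits

The IOVA at 0x380000000000 is well above what a 39-bit IOMMU can map.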
Comment 20 Adolfo 2025-04-28 23:05:49 UTC
Created attachment 308048 [details]
log_vm_start_up_to_crash

As far as I can tell, it changes nothing. I booted the unpatched kernel and attached the complete VM log from startup up to the crash.

The output of ps aux | grep qemu is at the start of the file.
Comment 21 Fabian Grünbichler 2025-04-29 06:54:22 UTC
FWIW, you can get the full QEMU command line for a given VM on PVE with "qm showcmd XXX --pretty". You can also use this to verify whether config changes have the desired effect ;)
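
For example, to check whether a cpu option actually made it onto the command line (VM ID 200 here; the grep pattern is just illustrative):

qm showcmd 200 --pretty | grep phys-bits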

Our kernels are based on Ubuntu's, but since it seems you can also reproduce the issue with a plain upstream kernel, I won't go into too much detail about that unless you want me to.
Comment 22 Adolfo 2025-04-29 08:08:30 UTC
Created attachment 308049 [details]
phys-bits=host

It seems phys-bits=host actually changes something, although the "VFIO_MAP_DMA failed" errors still show up.
Comment 23 Adolfo 2025-04-29 08:09:47 UTC
Created attachment 308050 [details]
qm showcmd 200 --pretty

qm showcmd 200 --pretty output with phys-bits=host
Comment 24 Adolfo 2025-04-29 08:14:19 UTC
Or maybe not. The address probably changed because I played with the ReBAR setting in the BIOS? I don't have the knowledge to answer this. This VM does not have phys-bits=host in the conf file.

Apr 29 09:12:09 pve QEMU[11923]: kvm: VFIO_MAP_DMA failed: Invalid argument
Apr 29 09:12:09 pve QEMU[11923]: kvm: vfio_container_dma_map(0x5c9222494280, 0x380000000000, 0x10000, 0x78075ee70000) = -22 (Invalid argument)
Apr 29 09:12:09 pve QEMU[11923]: kvm: VFIO_MAP_DMA failed: Invalid argument
Apr 29 09:12:09 pve QEMU[11923]: kvm: vfio_container_dma_map(0x5c9222494280, 0x380000011000, 0x3000, 0x78075ee69000) = -22 (Invalid argument)
Apr 29 09:12:09 pve QEMU[11923]: kvm: VFIO_MAP_DMA failed: Invalid argument
Apr 29 09:12:09 pve QEMU[11923]: kvm: vfio_container_dma_map(0x5c9222494280, 0x380000000000, 0x10000, 0x78075ee70000) = -22 (Invalid argument)
Apr 29 09:12:09 pve QEMU[11923]: kvm: VFIO_MAP_DMA failed: Invalid argument
Apr 29 09:12:09 pve QEMU[11923]: kvm: vfio_container_dma_map(0x5c9222494280, 0x380000011000, 0x3000, 0x78075ee69000) = -22 (Invalid argument)
Comment 25 Cédric Le Goater 2025-04-29 14:58:56 UTC
"-cpu host,guest-phys-bits=39" should help to define compatible address spaces.
Could you try please ?
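
On a bare QEMU invocation it would look something like this (sketch only, the rest of the command line elided):

qemu-system-x86_64 -machine q35 -cpu host,guest-phys-bits=39 ...

Proxmox generates the command line itself, so the setting has to come through its configuration.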
Comment 26 Adolfo 2025-04-29 15:02:02 UTC
(In reply to Cédric Le Goater from comment #25)
> "-cpu host,guest-phys-bits=39" should help to define compatible address
> spaces.
> Could you try please ?

Per https://pve.proxmox.com/wiki/Manual:_qm.conf , guest-phys-bits does not exist in Proxmox, but yes, I can give it a try.
Comment 27 Adolfo 2025-04-29 15:09:52 UTC
(In reply to Cédric Le Goater from comment #25)
> "-cpu host,guest-phys-bits=39" should help to define compatible address
> spaces.
> Could you try please ?

vm 200 - unable to parse value of 'cpu' - format error
guest-phys-bits: property is not defined in schema and the schema does not allow additional properties
Comment 28 Alex Williamson 2025-04-29 15:15:10 UTC
Another option may be to set the cpu to "cpu: kvm64", which is the default.  I noted somewhere this should present a 40-bit physical address space, which might be close enough.
Comment 29 Adolfo 2025-04-29 15:18:47 UTC
(In reply to Alex Williamson from comment #28)
> Another option may be to set the cpu as "cpu: kvm64" which is the default. 
> I noted somewhere this should present a 40-bit physical address space, which
> might be close enough.

If I recall correctly, cpu must be set to 'host' for NVIDIA GPU passthrough to work. Either way, I would prefer to keep the CPU set to 'host'.
Comment 30 Alex Williamson 2025-04-29 15:22:32 UTC
(In reply to Adolfo from comment #29)
> (In reply to Alex Williamson from comment #28)
> > Another option may be to set the cpu to "cpu: kvm64", which is the default.
> > I noted somewhere this should present a 40-bit physical address space,
> > which might be close enough.
> 
> If I recall correctly, cpu must be set to 'host' for NVIDIA GPU passthrough
> to work. Either way, I would prefer to keep the CPU set to 'host'.

kvm64 works just fine with NVIDIA GPU assignment to a Linux guest for me.
Comment 31 Adolfo 2025-04-29 15:25:59 UTC
(In reply to Alex Williamson from comment #30)
> (In reply to Adolfo from comment #29)
> > (In reply to Alex Williamson from comment #28)
> > > Another option may be to set the cpu to "cpu: kvm64", which is the default.
> > > I noted somewhere this should present a 40-bit physical address space,
> > > which might be close enough.
> > 
> > If I recall correctly, cpu must be set to 'host' for NVIDIA GPU passthrough
> > to work. Either way, I would prefer to keep the CPU set to 'host'.
> 
> kvm64 works just fine with NVIDIA GPU assignment to a Linux guest for me.

I just tested it remotely. The VM will not even start.
Comment 32 Adolfo 2025-04-29 15:35:11 UTC
I retested using qemu64 and kvm64. The VM will start but does not boot unless cpu is set to host, at least with this 5060 Ti GPU. IIRC the same happened with the 4060 Ti I had previously.
Comment 33 Adolfo 2025-04-29 18:36:09 UTC
These VFIO_MAP_DMA failures are not that uncommon, judging by a quick Google search. My gut tells me they are not related to what I'm reporting here.

https://forum.proxmox.com/threads/vga-pass-issues-with-radeon-rx-7900-xtx-kvm-vfio_map_dma-failed-invalid-argument.156795/

https://forum.proxmox.com/threads/vfio_map_dma-failed-invalid-argument.125888/

https://bbs.archlinux.org/viewtopic.php?id=299106
Comment 34 Alex Williamson 2025-04-29 20:09:43 UTC
Please run the following on the host before starting the guest, and attach the resulting host dmesg logs after running the guest:

# echo "func vfio_pci_mmap_huge_fault +p" > /proc/dynamic_debug/control
Comment 35 Adolfo 2025-04-30 00:20:48 UTC
Created attachment 308055 [details]
vm startup

Here is the full VM log from startup to shutdown after running:

echo "func vfio_pci_mmap_huge_fault +p" > /proc/dynamic_debug/control

on the host, but I did not spot anything different from usual.
Comment 36 Alex Williamson 2025-04-30 00:41:05 UTC
Please make sure the vfio-pci module is already loaded before issuing the dynamic debug command (i.e. modprobe vfio-pci first); I fought with this some myself.  There should be vfio_pci_mmap_huge_fault lines in the log.
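
In other words, something like this on the host, as root (the grep just confirms the flag was applied):

modprobe vfio-pci
echo "func vfio_pci_mmap_huge_fault +p" > /proc/dynamic_debug/control
grep vfio_pci_mmap_huge_fault /proc/dynamic_debug/control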
Comment 37 Adolfo 2025-04-30 07:32:01 UTC
Created attachment 308056 [details]
vm startup 2

I'm sure it was loaded. I have it set to load on host boot, and it binds all of the VFs from the X710 network card. One of my other VMs is running OPNsense with two VFs from the X710 passed through, so I'm 100% certain it was loaded; otherwise, I wouldn't have internet access.

The issue turned out to be that I was running my patched kernel. I booted the host using the Proxmox kernel 6.14.0 instead.

It now shows the information you requested.