After kernel upgrade from 6.10.2 to 6.10.3 network performance has become very low between Windows and Linux guest.

Steps to reproduce:
MTU 9000 for all adapters and bridge (Jumbo frame 9014 in Windows)
KVM host, qemu 9.0.2, bridge (e.g. br0)
Linux guest, virtio net adapter bridged to host br0, address e.g. 192.168.0.1
Windows 11 guest, virtio net adapter, bridged to host br0, address e.g. 192.168.0.2

When accessing Linux guest from Windows 11 and if kvm host OR Linux guest has kernel version 6.10.3 then network performance is poor. iperf3 for example:

$ iperf3.exe --client 192.168.0.1
Connecting to host 192.168.0.1, port 5201
[  5] local 192.168.0.2 port 49849 connected to 192.168.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec   256 KBytes  2.08 Mbits/sec
[  5]   1.01-2.01   sec   128 KBytes  1.04 Mbits/sec
[  5]   2.01-3.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   3.01-4.01   sec   128 KBytes  1.05 Mbits/sec
[  5]   4.01-5.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   5.00-6.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   6.01-7.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   7.01-8.01   sec   128 KBytes  1.05 Mbits/sec
[  5]   8.01-9.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   9.01-10.01  sec  0.00 Bytes  0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.01  sec   640 KBytes   524 Kbits/sec  sender
[  5]   0.00-10.02  sec   384 KBytes   314 Kbits/sec  receiver

iperf Done.
If kvm host AND Linux guest has kernel version 6.10.2 then performance seems ok:

$ iperf3.exe --client 192.168.0.1
Connecting to host 192.168.0.1, port 5201
[  5] local 192.168.0.2 port 50092 connected to 192.168.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec  3.86 GBytes  33.0 Gbits/sec
[  5]   1.01-2.01   sec  3.78 GBytes  32.4 Gbits/sec
[  5]   2.01-3.02   sec  3.72 GBytes  31.6 Gbits/sec
[  5]   3.02-4.01   sec  3.75 GBytes  32.2 Gbits/sec
[  5]   4.01-5.01   sec  3.80 GBytes  32.8 Gbits/sec
[  5]   5.01-6.01   sec  3.64 GBytes  31.3 Gbits/sec
[  5]   6.01-7.01   sec  3.90 GBytes  33.6 Gbits/sec
[  5]   7.01-8.00   sec  3.94 GBytes  33.9 Gbits/sec
[  5]   8.00-9.00   sec  3.83 GBytes  32.9 Gbits/sec
[  5]   9.00-10.00  sec  3.87 GBytes  33.3 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  38.1 GBytes  32.7 Gbits/sec  sender
[  5]   0.00-10.01  sec  38.1 GBytes  32.7 Gbits/sec  receiver

iperf Done.

The following entries appear in the kernel logs on Linux guest:

[  157.294081] enp3s0: bad gso: type: 1, size: 8960
[  157.294298] enp3s0: bad gso: type: 1, size: 8960
[  157.623938] enp3s0: bad gso: type: 1, size: 8960
[  157.938094] enp3s0: bad gso: type: 1, size: 8960
[  158.249957] enp3s0: bad gso: type: 1, size: 8960
[  158.593349] enp3s0: bad gso: type: 1, size: 8960
[  158.909346] enp3s0: bad gso: type: 1, size: 8960
[  159.236646] enp3s0: bad gso: type: 1, size: 8960
[  159.236721] enp3s0: bad gso: type: 1, size: 8960
[  159.236745] enp3s0: bad gso: type: 1, size: 8960

This is also reproduced for bare metal Windows 11 PC with bridged physical network adapter to br0 on kvm host.
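For anyone checking whether their own guest is hitting the same thing, the "bad gso" messages in the kernel log can be tallied per interface. The following is only a rough sketch: it assumes log lines in exactly the format quoted above, and the function name count_bad_gso is made up for illustration.

```python
import re
from collections import Counter

# Matches kernel log lines of the form quoted in this report, e.g.
#   [  157.294081] enp3s0: bad gso: type: 1, size: 8960
# Other log formats (journald prefixes, no timestamps) are NOT handled.
BAD_GSO = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<dev>\S+): bad gso: "
    r"type: (?P<type>\d+), size: (?P<size>\d+)"
)

def count_bad_gso(lines):
    """Count 'bad gso' messages, keyed by (device, gso type, gso size)."""
    counts = Counter()
    for line in lines:
        m = BAD_GSO.match(line.strip())
        if m:
            key = (m["dev"], int(m["type"]), int(m["size"]))
            counts[key] += 1
    return counts
```

It can be fed any iterable of lines, for example the saved output of dmesg on the Linux guest. A steadily growing count while iperf3 is running would be consistent with the behaviour described in this report.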
Likely caused by e269d79c7d35aa3808b1f3c1737d63dab504ddc8, fixed by 89add40066f9ed9abe5f7f886fe5789ff7e0c50e
Any idea when this fix will show up in a released kernel?
No, that is not yet clear, but I proposed its inclusion in the stable kernels a few days ago: https://lore.kernel.org/all/60bc20c5-7512-44f7-88cb-abc540437ae1@heusel.eu
This doesn't seem to be fixed in 6.10.4 just yet, at least on Arch's default kernel variant.
Yes this is expected as the patch has not yet been included in the stable series 😅 I'll wait for a bit and poke the thread again.
https://github.com/jwhited/tun-einval-repro/blob/main/main.go contains a simplified reproduction of the issue. This writes a GSO_TCPv4 packet to a TUN device w/GSO=1240 and 2 equal length segments. The write returns EINVAL with e269d79c7d35aa3808b1f3c1737d63dab504ddc8 absent the fix in 89add40066f9ed9abe5f7f886fe5789ff7e0c50e.
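To make the shape of such a write more concrete, here is a rough sketch of the virtio_net_hdr that precedes a GSO_TCPv4 packet on a TUN device opened with IFF_VNET_HDR. It is not taken from the linked Go program. The field layout and constant values follow the Linux UAPI header linux/virtio_net.h, while the function name, the 20-byte IPv4 and TCP header lengths, the native byte order and the choice of offsets relative to the start of the IP packet are assumptions for illustration.

```python
import struct

# Constant values as defined in the Linux UAPI header <linux/virtio_net.h>.
VIRTIO_NET_HDR_F_NEEDS_CSUM = 1
VIRTIO_NET_HDR_GSO_TCPV4 = 1

# struct virtio_net_hdr: u8 flags, u8 gso_type, then four 16-bit fields
# (hdr_len, gso_size, csum_start, csum_offset). Native byte order is an
# assumption here; it depends on how the TUN device was configured.
VIRTIO_NET_HDR = struct.Struct("=BBHHHH")

TCP_CSUM_OFFSET = 16  # offset of the checksum field inside the TCP header

def build_tcp4_gso_hdr(gso_size, ip_hdr_len=20, tcp_hdr_len=20):
    """Pack a virtio_net_hdr for a TCPv4 GSO packet (hypothetical helper).

    Assumes a TUN (layer 3) device, so offsets count from the start of
    the IPv4 header, and assumes no IP or TCP options by default.
    """
    return VIRTIO_NET_HDR.pack(
        VIRTIO_NET_HDR_F_NEEDS_CSUM,   # flags
        VIRTIO_NET_HDR_GSO_TCPV4,      # gso_type
        ip_hdr_len + tcp_hdr_len,      # hdr_len: all headers before payload
        gso_size,                      # gso_size: payload bytes per segment
        ip_hdr_len,                    # csum_start: where the TCP header begins
        TCP_CSUM_OFFSET,               # csum_offset: checksum field within it
    )
```

With gso_size=1240 as in the description above, the resulting 10 bytes would be written in front of the IP packet carrying both segments. Which exact field combination the stricter check rejects is not spelled out in this thread, so this sketch only shows the header fields involved.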
The patch made it to the 6.10 and 6.6 stable queues :)
Thank you!
I can confirm that performance has been restored with 6.10.5 on my Linode instance. I was wondering why it was suddenly failing to connect to update servers due to extremely slow downloads or timeouts and now everything is OK.
Report on the regressions list: https://lore.kernel.org/regressions/ZsyMzW-4ee_U8NoX@eldamar.lan/T/#m390d6ef7b733149949fb329ae1abffec5cefb99b
And a downstream report in Debian: https://bugs.debian.org/1079684
Fix is queued up for 6.1 now as well, thanks @carnil!
https://lore.kernel.org/all/2024082741-crease-mug-f658@gregkh
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-6.1/net-drop-bad-gso-csum_start-and-offset-in-virtio_net_hdr.patch
The degraded network performance also seems to be a problem in kernel 5.15.165. Should I file a new bug report?