Bug 219129 - virtio net performance degradation between Windows and Linux guest in kernel 6.10.3
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P3 normal
Assignee: virtualization_kvm
Reported: 2024-08-06 09:16 UTC by Anton Kuleshov
Modified: 2024-08-28 16:26 UTC

Regression: No

Description Anton Kuleshov 2024-08-06 09:16:52 UTC
After upgrading the kernel from 6.10.2 to 6.10.3, network performance between the Windows and Linux guests has become very low.

Steps to reproduce:
MTU 9000 for all adapters and bridge (Jumbo frame 9014 in Windows)
KVM host, qemu 9.0.2, bridge (e.g. br0)
Linux guest, virtio net adapter bridged to host br0, address e.g. 192.168.0.1
Windows 11 guest, virtio net adapter, bridged to host br0, address e.g. 192.168.0.2

When the Linux guest is accessed from Windows 11 and either the KVM host OR the Linux guest runs kernel 6.10.3, network performance is poor.

iperf3 for example:

$ iperf3.exe --client 192.168.0.1
Connecting to host 192.168.0.1, port 5201
[  5] local 192.168.0.2 port 49849 connected to 192.168.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec   256 KBytes  2.08 Mbits/sec
[  5]   1.01-2.01   sec   128 KBytes  1.04 Mbits/sec
[  5]   2.01-3.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   3.01-4.01   sec   128 KBytes  1.05 Mbits/sec
[  5]   4.01-5.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   5.00-6.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   6.01-7.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   7.01-8.01   sec   128 KBytes  1.05 Mbits/sec
[  5]   8.01-9.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   9.01-10.01  sec  0.00 Bytes  0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.01  sec   640 KBytes   524 Kbits/sec                  sender
[  5]   0.00-10.02  sec   384 KBytes   314 Kbits/sec                  receiver

iperf Done.


If both the KVM host AND the Linux guest run kernel 6.10.2, performance seems OK:

$ iperf3.exe --client 192.168.0.1
Connecting to host 192.168.0.1, port 5201
[  5] local 192.168.0.2 port 50092 connected to 192.168.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec  3.86 GBytes  33.0 Gbits/sec
[  5]   1.01-2.01   sec  3.78 GBytes  32.4 Gbits/sec
[  5]   2.01-3.02   sec  3.72 GBytes  31.6 Gbits/sec
[  5]   3.02-4.01   sec  3.75 GBytes  32.2 Gbits/sec
[  5]   4.01-5.01   sec  3.80 GBytes  32.8 Gbits/sec
[  5]   5.01-6.01   sec  3.64 GBytes  31.3 Gbits/sec
[  5]   6.01-7.01   sec  3.90 GBytes  33.6 Gbits/sec
[  5]   7.01-8.00   sec  3.94 GBytes  33.9 Gbits/sec
[  5]   8.00-9.00   sec  3.83 GBytes  32.9 Gbits/sec
[  5]   9.00-10.00  sec  3.87 GBytes  33.3 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  38.1 GBytes  32.7 Gbits/sec                  sender
[  5]   0.00-10.01  sec  38.1 GBytes  32.7 Gbits/sec                  receiver

iperf Done.
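
To put the two runs side by side, the sender summary lines can be parsed and compared. This is only a minimal sketch: the helper name `bitrate_bps` is made up for illustration, and the regular expression assumes the iperf3 text layout shown in this report.

```python
import re

# Unit multipliers for iperf3's bitrate column (decimal, bits per second).
UNITS = {"bits": 1, "Kbits": 1e3, "Mbits": 1e6, "Gbits": 1e9}

# Matches e.g. "32.7 Gbits/sec" in an iperf3 output line.
BITRATE_RE = re.compile(r"([\d.]+)\s+([KMG]?bits)/sec")


def bitrate_bps(line: str) -> float:
    """Return the bitrate found in one iperf3 output line, in bits/sec."""
    match = BITRATE_RE.search(line)
    if match is None:
        raise ValueError(f"no bitrate in line: {line!r}")
    value, unit = match.groups()
    return float(value) * UNITS[unit]


# Sender summary lines copied from the two runs in this report.
slow = "[  5]   0.00-10.01  sec   640 KBytes   524 Kbits/sec                  sender"
fast = "[  5]   0.00-10.00  sec  38.1 GBytes  32.7 Gbits/sec                  sender"

ratio = bitrate_bps(fast) / bitrate_bps(slow)
print(f"slowdown factor: {ratio:,.0f}x")
```

With the figures above, the sender-side throughput differs by a factor on the order of 60,000.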

The following entries appear in the kernel logs on Linux guest:
[  157.294081] enp3s0: bad gso: type: 1, size: 8960
[  157.294298] enp3s0: bad gso: type: 1, size: 8960
[  157.623938] enp3s0: bad gso: type: 1, size: 8960
[  157.938094] enp3s0: bad gso: type: 1, size: 8960
[  158.249957] enp3s0: bad gso: type: 1, size: 8960
[  158.593349] enp3s0: bad gso: type: 1, size: 8960
[  158.909346] enp3s0: bad gso: type: 1, size: 8960
[  159.236646] enp3s0: bad gso: type: 1, size: 8960
[  159.236721] enp3s0: bad gso: type: 1, size: 8960
[  159.236745] enp3s0: bad gso: type: 1, size: 8960
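
For anyone checking whether their own guest is affected, a small script can tally these messages from saved kernel log output. This is a rough sketch: `count_bad_gso` is a hypothetical helper, and the pattern assumes the exact message format shown above. As far as I can tell, the printed type is the `gso_type` field of the virtio-net header, where 1 corresponds to `VIRTIO_NET_HDR_GSO_TCPV4`.

```python
import re
from collections import Counter

# Matches guest kernel lines such as:
#   [  157.294081] enp3s0: bad gso: type: 1, size: 8960
BAD_GSO_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<dev>\S+): bad gso: "
    r"type: (?P<type>\d+), size: (?P<size>\d+)"
)


def count_bad_gso(log_text: str) -> Counter:
    """Count 'bad gso' messages per (device, gso type, gso size)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = BAD_GSO_RE.search(line)
        if m:
            counts[(m["dev"], int(m["type"]), int(m["size"]))] += 1
    return counts


# Three of the lines from the report, used as sample input.
sample = """\
[  157.294081] enp3s0: bad gso: type: 1, size: 8960
[  157.294298] enp3s0: bad gso: type: 1, size: 8960
[  157.623938] enp3s0: bad gso: type: 1, size: 8960
"""

counts = count_bad_gso(sample)
for (dev, gso_type, size), n in counts.items():
    print(f"{dev}: type={gso_type} size={size} count={n}")

# 8960 equals a 9000-byte MTU minus a 20-byte IPv4 header and a 20-byte
# TCP header (no options), i.e. the MSS expected with the MTU in this report.
assert 9000 - 20 - 20 == 8960
```

The reported size of 8960 is consistent with the MTU of 9000 from the reproduction steps, which suggests the dropped packets are ordinary full-sized TCP segments offloaded by the Windows guest.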

This also reproduces with a bare-metal Windows 11 PC whose physical network adapter is bridged to br0 on the KVM host.
Comment 1 James Tucker 2024-08-06 21:16:45 UTC
Likely caused by e269d79c7d35aa3808b1f3c1737d63dab504ddc8, fixed by 89add40066f9ed9abe5f7f886fe5789ff7e0c50e
Comment 2 Thomas Clark 2024-08-11 13:34:10 UTC
Any idea when this fix will show up in a released kernel?
Comment 3 Christian Heusel 2024-08-12 07:18:39 UTC
No, that is not yet clear, but I proposed its inclusion in the stable kernels a few days ago: https://lore.kernel.org/all/60bc20c5-7512-44f7-88cb-abc540437ae1@heusel.eu
Comment 4 alexucu 2024-08-13 15:46:00 UTC
This doesn't seem to be fixed in 6.10.4 just yet, at least on Arch's default kernel variant.
Comment 5 Christian Heusel 2024-08-13 16:22:00 UTC
Yes, this is expected, as the patch has not yet been included in the stable series 😅
I'll wait for a bit and poke the thread again.
Comment 6 Jordan Whited 2024-08-13 16:39:45 UTC
https://github.com/jwhited/tun-einval-repro/blob/main/main.go contains a simplified reproduction of the issue. It writes a GSO_TCPv4 packet, with a GSO size of 1240 and 2 equal-length segments, to a TUN device. With e269d79c7d35aa3808b1f3c1737d63dab504ddc8 applied and without the fix in 89add40066f9ed9abe5f7f886fe5789ff7e0c50e, the write returns EINVAL.
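
To illustrate the kind of header such a write carries, here is a sketch that packs a legacy 10-byte `virtio_net_hdr` for a GSO_TCPv4 packet with a GSO size of 1240. It is not taken from the linked Go program. The helper `build_vnet_hdr`, the option-free IPv4/TCP header lengths, and the checksum fields are assumptions made for illustration. Only the struct layout and the two constants come from `linux/virtio_net.h`.

```python
import struct

# Constants from the virtio-net header definition (linux/virtio_net.h).
VIRTIO_NET_HDR_F_NEEDS_CSUM = 1
VIRTIO_NET_HDR_GSO_TCPV4 = 1

# struct virtio_net_hdr: u8 flags, u8 gso_type, then four u16 fields
# (hdr_len, gso_size, csum_start, csum_offset). "=" selects native byte
# order without padding, which gives the 10-byte legacy layout.
VNET_HDR = struct.Struct("=BBHHHH")

IPV4_HDR_LEN = 20     # assumption: no IP options
TCP_HDR_LEN = 20      # assumption: no TCP options
TCP_CSUM_OFFSET = 16  # offset of the checksum field inside the TCP header


def build_vnet_hdr(gso_size: int) -> bytes:
    """Pack a virtio_net_hdr for a TCPv4 GSO packet on a TUN (L3) device."""
    return VNET_HDR.pack(
        VIRTIO_NET_HDR_F_NEEDS_CSUM,  # flags
        VIRTIO_NET_HDR_GSO_TCPV4,     # gso_type
        IPV4_HDR_LEN + TCP_HDR_LEN,   # hdr_len: headers preceding the payload
        gso_size,                     # gso_size: payload bytes per segment
        IPV4_HDR_LEN,                 # csum_start: where the TCP header begins
        TCP_CSUM_OFFSET,              # csum_offset: relative to csum_start
    )


GSO_SIZE = 1240
hdr = build_vnet_hdr(GSO_SIZE)

# Two equal-length segments, as described in the comment above.
payload_len = 2 * GSO_SIZE
print(len(hdr), hdr.hex(), payload_len)
```

In a real reproduction this header would be followed by the IPv4 header, TCP header, and payload, and written to a TUN file descriptor opened with `IFF_VNET_HDR`. That part is omitted here because it needs root privileges and a configured device.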
Comment 7 Christian Heusel 2024-08-14 10:24:33 UTC
The patch made it to the 6.10 and 6.6 stable queues :)
Comment 8 Thomas Clark 2024-08-14 15:27:47 UTC
Thank you!
Comment 9 Raymond Jay Golo 2024-08-15 01:11:02 UTC
I can confirm that performance has been restored with 6.10.5 on my Linode instance. I was wondering why it was suddenly failing to connect to update servers due to extremely slow downloads or timeouts and now everything is OK.
Comment 10 Salvatore Bonaccorso 2024-08-26 18:52:35 UTC
Report on the regressions list: https://lore.kernel.org/regressions/ZsyMzW-4ee_U8NoX@eldamar.lan/T/#m390d6ef7b733149949fb329ae1abffec5cefb99b
Comment 11 Salvatore Bonaccorso 2024-08-26 18:52:55 UTC
And a downstream report in Debian: https://bugs.debian.org/1079684
Comment 13 Karl Tischler 2024-08-28 16:26:35 UTC
The degraded network performance also seems to be a problem in kernel 5.15.165.
Should I file a new bug report?
