Hello, On a server, I receive files (parallel cp over NFSv4) through a 10G NIC (Intel X710-T2L). I can usually achieve a total speed close to 1 GB/s (generally 4 cp processes at the same time, each targeting a different HDD). After switching to kernel 5.13 (Arch Linux) with the exact same configuration, my speed is now limited to 250 MB/s. Rolling back to a previous kernel fixes the speed. The problem may be in the latest Intel i40e driver, but I don't really know how to investigate further. The problem still appears in 5.13.7 (the latest at the moment).
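For reference, the test looks roughly like this (the mount point and disk paths here are illustrative, not my real ones):

```
# 4 parallel copies from the NFSv4 mount, each to a different HDD
for i in 1 2 3 4; do
    cp /mnt/nfs/bigfile$i /mnt/hdd$i/ &
done
wait

# or with dd, to rule out cp-specific behaviour
dd if=/mnt/nfs/bigfile of=/mnt/hdd1/bigfile bs=1M status=progress
```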
Thanks for your bug report. I don't know of anything immediately pending that might cause this, so this is the first time I've heard of this issue. There are a few troubleshooting items that you can provide that will help us reproduce and hopefully fix the bug:
- full dmesg from boot
- output from ethtool -i ethX
- ethtool -S ethX, before and after the test
- output from netstat -s, before and after the test
- any reproduction instructions
- mount options for NFSv4 (output of the mount command would do)
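Something like the following would capture all of it (ethX is a placeholder for your actual interface name):

```
IF=ethX                                # replace with your interface name
dmesg > dmesg-boot.txt                 # full kernel log from boot
ethtool -i $IF > ethtool-i.txt         # driver name/version/firmware
ethtool -S $IF > ethtool-S-before.txt  # NIC stats before the test
netstat -s > netstat-before.txt        # protocol stats before the test
# ... run the transfer test here ...
ethtool -S $IF > ethtool-S-after.txt
netstat -s > netstat-after.txt
mount | grep nfs > mount-options.txt   # NFSv4 mount options in effect
```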
Thanks for your answer.

> full dmesg from boot
> output from ethtool -i ethX
> ethtool -S ethX, before and after the test
> output from netstat -s, before and after the test

This server is currently processing data for the next few days. I'll provide this info as soon as I can.

> any reproduction instructions
> mount options for NFSv4 (output of the mount command would do)

Not really; the throughput was low no matter whether I used `cp` with normal files or `dd` for the test. The NFS options (in fstab) are:

nfsvers=4.2,noatime,nodiratime,_netdev,noauto,x-systemd.automount,x-systemd.mount-timeout=10

With `nfsstat -m`:

rw,noatime,nodiratime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.2.2,local_lock=none,addr=10.0.2.1

I also increased the rx ring buffer a bit (to 1024, see below) to reduce packet loss, but it didn't help with the throughput.

> I don't know of anything immediately pending
> that might cause this, so this is the first time I've heard of this issue.

I don't know how specific this bug is, but just for the record, another user on the Phoronix forum has a similar problem: https://www.phoronix.com/forums/forum/software/general-linux-open-source/1270526-vmware-hits-a-nasty-performance-regression-with-linux-5-13?p=1270550#post1270550
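For reference, the ring buffer change was along these lines (again, ethX is a placeholder):

```
# show current and maximum supported ring sizes
ethtool -g ethX
# raise the rx ring to 1024 descriptors
ethtool -G ethX rx 1024
```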
I switched back again to the 5.13 branch (currently 5.13.12), but I cannot reproduce the problem anymore. I'm not sure what happened, but everything's good now. Thank you for your time.