I have a Linux router with the following specs: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz 01:00.0 Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06) 01:00.1 Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06) 02:00.0 Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06) 02:00.1 Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06) 03:00.0 Intel Corporation Ethernet Controller 10G X550T (rev 01) 03:00.1 Intel Corporation Ethernet Controller 10G X550T (rev 01) 04:00.0 Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01) 04:00.1 Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01) Up to Linux 5.7 kernel everything was ok. After upgrading to 5.8 I've noticed that latency and jitter to one of the e1000e NICs (82571EB) increased. Further tests revealed the following: # 4.19 rtt min/avg/max/mdev = 0.183/0.275/0.382/0.032 ms # 5.0 rtt min/avg/max/mdev = 0.180/0.249/0.310/0.034 ms # 5.4 rtt min/avg/max/mdev = 0.146/0.226/0.306/0.046 ms # 5.6 rtt min/avg/max/mdev = 0.128/0.172/0.210/0.022 ms # 5.7 rtt min/avg/max/mdev = 0.147/0.181/0.233/0.024 ms # 5.8 rtt min/avg/max/mdev = 0.152/2.182/3.944/1.524 ms # 5.9 rtt min/avg/max/mdev = 0.345/2.095/4.192/1.160 ms # 5.10 rtt min/avg/max/mdev = 0.345/2.095/4.192/1.160 ms Single ping session looks like this: PING 172.31.252.132 (172.31.252.132) 56(84) bytes of data. 64 bytes from 172.31.252.132: icmp_seq=1 ttl=64 time=2.42 ms 64 bytes from 172.31.252.132: icmp_seq=2 ttl=64 time=0.872 ms 64 bytes from 172.31.252.132: icmp_seq=3 ttl=64 time=0.171 ms 64 bytes from 172.31.252.132: icmp_seq=4 ttl=64 time=1.89 ms 64 bytes from 172.31.252.132: icmp_seq=5 ttl=64 time=0.163 ms 64 bytes from 172.31.252.132: icmp_seq=6 ttl=64 time=0.196 ms 64 bytes from 172.31.252.132: icmp_seq=7 ttl=64 time=0.191 ms 64 bytes from 172.31.252.132: icmp_seq=8 ttl=64 time=1.90 ms 64 bytes from 172.31.252.132: icmp_seq=9 ttl=64 time=4.00 ms No such problem on another NIC (10G X550T). Using e1000e-3.8.7.tar.gz driver (dkms) didn't help at all. I'm trying to bisect it right now...
bisect session completed earlier than expected and the suspected commit is: commit 126c2092e5c8b28623cb890cd2930aa292410676 Author: Peter Zijlstra <peterz@infradead.org> Date: Tue May 26 18:11:03 2020 +0200 sched: Add rq::ttwu_pending In preparation of removing rq->wake_list, replace the !list_empty(rq->wake_list) with rq->ttwu_pending. This is not fully equivalent as this new variable is racy. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200526161908.070399698@infradead.org And just in case here's the full bisect log: git bisect start # bad: [bcf876870b95592b52519ed4aafcf9d95999bc9c] Linux 5.8 git bisect bad bcf876870b95592b52519ed4aafcf9d95999bc9c # good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7 git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162 # bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714 # bad: [2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63] Merge branch 'uaccess.comedi' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs git bisect bad 2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63 # good: [cfa3b8068b09f25037146bfd5eed041b78878bee] Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma git bisect good cfa3b8068b09f25037146bfd5eed041b78878bee # good: [c41219fda6e04255c44d37fd2c0d898c1c46abf1] Merge tag 'drm-intel-next-fixes-2020-05-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-next git bisect good c41219fda6e04255c44d37fd2c0d898c1c46abf1 # good: [f3cdc8ae116e27d84e1f33c7a2995960cebb73ac] Merge tag 'for-5.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux git bisect good f3cdc8ae116e27d84e1f33c7a2995960cebb73ac # bad: [f1e455352b6f503532eb3637d0a6d991895e7856] Merge tag 'kgdb-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux git bisect bad f1e455352b6f503532eb3637d0a6d991895e7856 # good: [9bd0bd264578fe191bf5d2ff23f9887b91862536] MIPS: ralink: drop ralink_clk_init for mt7621 git bisect good 9bd0bd264578fe191bf5d2ff23f9887b91862536 # good: [f6aee505c71bbb035dde146caf5a6abbf3ccbe47] Merge tag 'x86-timers-2020-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect good f6aee505c71bbb035dde146caf5a6abbf3ccbe47 # bad: [d479c5a1919b4e569dcd3ae9c84ed74a675d0b94] Merge tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad d479c5a1919b4e569dcd3ae9c84ed74a675d0b94 # good: [9013196a467e770e1470cccee6c0fe435ef37c66] Merge branch 'sched/urgent' git bisect good 9013196a467e770e1470cccee6c0fe435ef37c66 # good: [58ef57b16d9e91cce1c640a6fe8a21d53a85181d] Merge branch 'core/rcu' into sched/core, to pick up dependency git bisect good 58ef57b16d9e91cce1c640a6fe8a21d53a85181d # good: [4b44a21dd640b692d4e9b12d3e37c24825f90baa] irq_work, smp: Allow irq_work on call_single_queue git bisect good 4b44a21dd640b692d4e9b12d3e37c24825f90baa # bad: [a148866489fbe243c936fe43e4525d8dbfa0318f] sched: Replace rq::wake_list git bisect bad a148866489fbe243c936fe43e4525d8dbfa0318f # bad: [126c2092e5c8b28623cb890cd2930aa292410676] sched: Add rq::ttwu_pending git bisect bad 126c2092e5c8b28623cb890cd2930aa292410676 # first bad commit: [126c2092e5c8b28623cb890cd2930aa292410676] sched: Add rq::ttwu_pending
Thanks for doing the bisect! That is a completely unexpected result (I figured we broke something) please attach output from dmesg lspci -vvv ethtool -i ethtool -S ethtool -c cpupower idle-info I wonder if your system is now doing more power management (staying in deeper sleep states) and that is causing interrupt delays.
I'm attaching the requested information. I just re-tested with e1000e module parameter InterruptThrottleRate=0 and it is working much better this way! I thought I tested that before, but somehow I messed up (forgot to update initrd?) and missed it. If I set InterruptThrottleRate to 3 (default) or 1 -- there's jitter. Setting it to 0 or high enough fixed interrupt rate (e.g. 4000) makes it smooth again.
Created attachment 294755 [details] dmesg
Created attachment 294757 [details] lspci -vvv
Created attachment 294759 [details] ethtool -i
Created attachment 294761 [details] ethtool -S
Created attachment 294763 [details] ethtool -c
Created attachment 294765 [details] cpupower idle-info
you might want to use ethtool -C ethX rx-usecs 0 instead of module parameter, as the instructions you gave for InterruptThrottleRate=0 means only the first e1000e instance will get the =0. The correct module parameter usage for your setup would be =0,0,0,0 Your system is definitely sleeping a lot in a sleep state that has a maximum wake latency of 133us. Leaving the driver's default dynamic interrupt moderation means that your CPUs might be going back to sleep after/between an interrupt. If you think the configuration you've settled upon fixes the issue do you want to just close this?
I have my fix, so you can close this. Still, it works differently before/after kernel 5.8.
Unfortunately the InterruptThrottleRate workaround is not perfect. Occasionally latency is spiking quite a bit above what it used to be pre 5.8: # 5.7 rtt min/avg/max/mdev = 0.147/0.181/0.233/0.024 ms # 5.9 rtt min/avg/max/mdev = 0.101/0.516/3.536/0.441 ms Still better than default settings. I've just tested 5.11-rc6 but no improvement. I'm just going back to a previous LTS kernel for the time being - either 5.4 or 4.19.
I'm no longer experiencing this problem with Debian's 5.10.0-23-amd64 #1 SMP Debian 5.10.179-1 (2023-05-12) kernel.