Bug 210855 - Increased latency & jitter with e1000e on Linux 5.8
Summary: Increased latency & jitter with e1000e on Linux 5.8
Status: RESOLVED OBSOLETE
Alias: None
Product: Networking
Classification: Unclassified
Component: Other (show other bugs)
Hardware: Intel Linux
: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-12-22 16:59 UTC by Theodor Milkov
Modified: 2023-07-12 11:15 UTC (History)
2 users (show)

See Also:
Kernel Version: 5.8
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg (84.61 KB, text/plain)
2021-01-19 20:13 UTC, Theodor Milkov
Details
lspci -vvv (96.52 KB, text/plain)
2021-01-19 20:14 UTC, Theodor Milkov
Details
ethtool -i (239 bytes, text/plain)
2021-01-19 20:15 UTC, Theodor Milkov
Details
ethtool -S (1.28 KB, text/plain)
2021-01-19 20:15 UTC, Theodor Milkov
Details
ethtool -c (381 bytes, text/plain)
2021-01-19 20:16 UTC, Theodor Milkov
Details
cpupower idle-info (523 bytes, text/plain)
2021-01-19 20:16 UTC, Theodor Milkov
Details

Description Theodor Milkov 2020-12-22 16:59:27 UTC
I have a Linux router with the following specs:

Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

01:00.0  Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
01:00.1  Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
02:00.0  Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
02:00.1  Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
03:00.0  Intel Corporation Ethernet Controller 10G X550T (rev 01)
03:00.1  Intel Corporation Ethernet Controller 10G X550T (rev 01)
04:00.0  Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
04:00.1  Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)

Up to Linux 5.7 kernel everything was ok. After upgrading to 5.8 I've noticed that latency and jitter to one of the e1000e NICs (82571EB) increased.

Further tests revealed the following:

# 4.19  rtt min/avg/max/mdev = 0.183/0.275/0.382/0.032 ms
# 5.0   rtt min/avg/max/mdev = 0.180/0.249/0.310/0.034 ms
# 5.4   rtt min/avg/max/mdev = 0.146/0.226/0.306/0.046 ms
# 5.6   rtt min/avg/max/mdev = 0.128/0.172/0.210/0.022 ms
# 5.7   rtt min/avg/max/mdev = 0.147/0.181/0.233/0.024 ms

# 5.8   rtt min/avg/max/mdev = 0.152/2.182/3.944/1.524 ms
# 5.9   rtt min/avg/max/mdev = 0.345/2.095/4.192/1.160 ms
# 5.10  rtt min/avg/max/mdev = 0.345/2.095/4.192/1.160 ms

Single ping session looks like this:

PING 172.31.252.132 (172.31.252.132) 56(84) bytes of data.
64 bytes from 172.31.252.132: icmp_seq=1 ttl=64 time=2.42 ms
64 bytes from 172.31.252.132: icmp_seq=2 ttl=64 time=0.872 ms
64 bytes from 172.31.252.132: icmp_seq=3 ttl=64 time=0.171 ms
64 bytes from 172.31.252.132: icmp_seq=4 ttl=64 time=1.89 ms
64 bytes from 172.31.252.132: icmp_seq=5 ttl=64 time=0.163 ms
64 bytes from 172.31.252.132: icmp_seq=6 ttl=64 time=0.196 ms
64 bytes from 172.31.252.132: icmp_seq=7 ttl=64 time=0.191 ms
64 bytes from 172.31.252.132: icmp_seq=8 ttl=64 time=1.90 ms
64 bytes from 172.31.252.132: icmp_seq=9 ttl=64 time=4.00 ms

No such problem on another NIC (10G X550T).

Using e1000e-3.8.7.tar.gz driver (dkms) didn't help at all.

I'm trying to bisect it right now...
Comment 1 Theodor Milkov 2020-12-22 17:14:25 UTC
bisect session completed earlier than expected and the suspected commit is:

commit 126c2092e5c8b28623cb890cd2930aa292410676
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Tue May 26 18:11:03 2020 +0200

    sched: Add rq::ttwu_pending

    In preparation of removing rq->wake_list, replace the
    !list_empty(rq->wake_list) with rq->ttwu_pending. This is not fully
    equivalent as this new variable is racy.

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20200526161908.070399698@infradead.org

And just in case here's the full bisect log:

git bisect start
# bad: [bcf876870b95592b52519ed4aafcf9d95999bc9c] Linux 5.8
git bisect bad bcf876870b95592b52519ed4aafcf9d95999bc9c
# good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7
git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
# bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714
# bad: [2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63] Merge branch 'uaccess.comedi' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect bad 2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63
# good: [cfa3b8068b09f25037146bfd5eed041b78878bee] Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
git bisect good cfa3b8068b09f25037146bfd5eed041b78878bee
# good: [c41219fda6e04255c44d37fd2c0d898c1c46abf1] Merge tag 'drm-intel-next-fixes-2020-05-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
git bisect good c41219fda6e04255c44d37fd2c0d898c1c46abf1
# good: [f3cdc8ae116e27d84e1f33c7a2995960cebb73ac] Merge tag 'for-5.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect good f3cdc8ae116e27d84e1f33c7a2995960cebb73ac
# bad: [f1e455352b6f503532eb3637d0a6d991895e7856] Merge tag 'kgdb-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
git bisect bad f1e455352b6f503532eb3637d0a6d991895e7856
# good: [9bd0bd264578fe191bf5d2ff23f9887b91862536] MIPS: ralink: drop ralink_clk_init for mt7621
git bisect good 9bd0bd264578fe191bf5d2ff23f9887b91862536
# good: [f6aee505c71bbb035dde146caf5a6abbf3ccbe47] Merge tag 'x86-timers-2020-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good f6aee505c71bbb035dde146caf5a6abbf3ccbe47
# bad: [d479c5a1919b4e569dcd3ae9c84ed74a675d0b94] Merge tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad d479c5a1919b4e569dcd3ae9c84ed74a675d0b94
# good: [9013196a467e770e1470cccee6c0fe435ef37c66] Merge branch 'sched/urgent'
git bisect good 9013196a467e770e1470cccee6c0fe435ef37c66
# good: [58ef57b16d9e91cce1c640a6fe8a21d53a85181d] Merge branch 'core/rcu' into sched/core, to pick up dependency
git bisect good 58ef57b16d9e91cce1c640a6fe8a21d53a85181d
# good: [4b44a21dd640b692d4e9b12d3e37c24825f90baa] irq_work, smp: Allow irq_work on call_single_queue
git bisect good 4b44a21dd640b692d4e9b12d3e37c24825f90baa
# bad: [a148866489fbe243c936fe43e4525d8dbfa0318f] sched: Replace rq::wake_list
git bisect bad a148866489fbe243c936fe43e4525d8dbfa0318f
# bad: [126c2092e5c8b28623cb890cd2930aa292410676] sched: Add rq::ttwu_pending
git bisect bad 126c2092e5c8b28623cb890cd2930aa292410676
# first bad commit: [126c2092e5c8b28623cb890cd2930aa292410676] sched: Add rq::ttwu_pending
Comment 2 Jesse Brandeburg 2021-01-15 17:43:17 UTC
Thanks for doing the bisect! That is a completely unexpected result (I figured we broke something)

please attach output from
dmesg
lspci -vvv
ethtool -i
ethtool -S
ethtool -c
cpupower idle-info

I wonder if your system is now doing more power management (staying in deeper sleep states) and that is causing interrupt delays.
Comment 3 Theodor Milkov 2021-01-19 20:12:24 UTC
I'm attaching the requested information.

I just re-tested with e1000e module parameter InterruptThrottleRate=0 and it is working much better this way! I thought I tested that before, but somehow I messed up (forgot to update initrd?) and missed it. If I set InterruptThrottleRate to 3 (default) or 1 -- there's jitter. Setting it to 0 or high enough fixed interrupt rate (e.g. 4000) makes it smooth again.
Comment 4 Theodor Milkov 2021-01-19 20:13:49 UTC
Created attachment 294755 [details]
dmesg
Comment 5 Theodor Milkov 2021-01-19 20:14:44 UTC
Created attachment 294757 [details]
lspci -vvv
Comment 6 Theodor Milkov 2021-01-19 20:15:18 UTC
Created attachment 294759 [details]
ethtool -i
Comment 7 Theodor Milkov 2021-01-19 20:15:42 UTC
Created attachment 294761 [details]
ethtool -S
Comment 8 Theodor Milkov 2021-01-19 20:16:22 UTC
Created attachment 294763 [details]
ethtool -c
Comment 9 Theodor Milkov 2021-01-19 20:16:50 UTC
Created attachment 294765 [details]
cpupower idle-info
Comment 10 Jesse Brandeburg 2021-01-19 21:50:51 UTC
you might want to use ethtool -C ethX rx-usecs 0 instead of module parameter, as the instructions you gave for InterruptThrottleRate=0 means only the first e1000e instance will get the =0. The correct module parameter usage for your setup would be =0,0,0,0

Your system is definitely sleeping a lot in a sleep state that has a maximum wake latency of 133us. Leaving the driver's default dynamic interrupt moderation means that your CPUs might be going back to sleep after/between an interrupt.

If you think the configuration you've settled upon fixes the issue do you want to just close this?
Comment 11 Theodor Milkov 2021-01-19 22:03:42 UTC
I have my fix, so you can close this.

Still, it works differently before/after kernel 5.8.
Comment 12 Theodor Milkov 2021-02-04 21:44:49 UTC
Unfortunately the InterruptThrottleRate workaround is not perfect. Occasionally latency is spiking quite a bit above what it used to be pre 5.8:

# 5.7  rtt min/avg/max/mdev = 0.147/0.181/0.233/0.024 ms
# 5.9  rtt min/avg/max/mdev = 0.101/0.516/3.536/0.441 ms

Still better than default settings.

I've just tested 5.11-rc6 but no improvement.

I'm just going back to a previous LTS kernel for the time being - either 5.4 or 4.19.
Comment 13 Theodor Milkov 2023-07-12 11:15:47 UTC
I'm no longer experiencing this problem with Debian's 5.10.0-23-amd64 #1 SMP Debian 5.10.179-1 (2023-05-12) kernel.

Note You need to log in before you can comment on or make changes to this bug.