Bug 210855
Summary: | Increased latency & jitter with e1000e on Linux 5.8 | ||
---|---|---|---|
Product: | Networking | Reporter: | Theodor Milkov (tm) |
Component: | Other | Assignee: | Stephen Hemminger (stephen) |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | jbrandeb, peterz |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 5.8 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg
lspci -vvv ethtool -i ethtool -S ethtool -c cpupower idle-info |
Description
Theodor Milkov
2020-12-22 16:59:27 UTC
bisect session completed earlier than expected and the suspected commit is: commit 126c2092e5c8b28623cb890cd2930aa292410676 Author: Peter Zijlstra <peterz@infradead.org> Date: Tue May 26 18:11:03 2020 +0200 sched: Add rq::ttwu_pending In preparation of removing rq->wake_list, replace the !list_empty(rq->wake_list) with rq->ttwu_pending. This is not fully equivalent as this new variable is racy. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200526161908.070399698@infradead.org And just in case here's the full bisect log: git bisect start # bad: [bcf876870b95592b52519ed4aafcf9d95999bc9c] Linux 5.8 git bisect bad bcf876870b95592b52519ed4aafcf9d95999bc9c # good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7 git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162 # bad: [694b5a5d313f3997764b67d52bab66ec7e59e714] Merge tag 'arm-soc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc git bisect bad 694b5a5d313f3997764b67d52bab66ec7e59e714 # bad: [2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63] Merge branch 'uaccess.comedi' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs git bisect bad 2e63f6ce7ed2c4ff83ba30ad9ccad422289a6c63 # good: [cfa3b8068b09f25037146bfd5eed041b78878bee] Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma git bisect good cfa3b8068b09f25037146bfd5eed041b78878bee # good: [c41219fda6e04255c44d37fd2c0d898c1c46abf1] Merge tag 'drm-intel-next-fixes-2020-05-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-next git bisect good c41219fda6e04255c44d37fd2c0d898c1c46abf1 # good: [f3cdc8ae116e27d84e1f33c7a2995960cebb73ac] Merge tag 'for-5.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux git bisect good f3cdc8ae116e27d84e1f33c7a2995960cebb73ac # bad: [f1e455352b6f503532eb3637d0a6d991895e7856] Merge tag 'kgdb-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux git bisect bad f1e455352b6f503532eb3637d0a6d991895e7856 # good: [9bd0bd264578fe191bf5d2ff23f9887b91862536] MIPS: ralink: drop ralink_clk_init for mt7621 git bisect good 9bd0bd264578fe191bf5d2ff23f9887b91862536 # good: [f6aee505c71bbb035dde146caf5a6abbf3ccbe47] Merge tag 'x86-timers-2020-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect good f6aee505c71bbb035dde146caf5a6abbf3ccbe47 # bad: [d479c5a1919b4e569dcd3ae9c84ed74a675d0b94] Merge tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad d479c5a1919b4e569dcd3ae9c84ed74a675d0b94 # good: [9013196a467e770e1470cccee6c0fe435ef37c66] Merge branch 'sched/urgent' git bisect good 9013196a467e770e1470cccee6c0fe435ef37c66 # good: [58ef57b16d9e91cce1c640a6fe8a21d53a85181d] Merge branch 'core/rcu' into sched/core, to pick up dependency git bisect good 58ef57b16d9e91cce1c640a6fe8a21d53a85181d # good: [4b44a21dd640b692d4e9b12d3e37c24825f90baa] irq_work, smp: Allow irq_work on call_single_queue git bisect good 4b44a21dd640b692d4e9b12d3e37c24825f90baa # bad: [a148866489fbe243c936fe43e4525d8dbfa0318f] sched: Replace rq::wake_list git bisect bad a148866489fbe243c936fe43e4525d8dbfa0318f # bad: [126c2092e5c8b28623cb890cd2930aa292410676] sched: Add rq::ttwu_pending git bisect bad 126c2092e5c8b28623cb890cd2930aa292410676 # first bad commit: [126c2092e5c8b28623cb890cd2930aa292410676] sched: Add rq::ttwu_pending Thanks for doing the bisect! That is a completely unexpected result (I figured we broke something) please attach output from dmesg lspci -vvv ethtool -i ethtool -S ethtool -c cpupower idle-info I wonder if your system is now doing more power management (staying in deeper sleep states) and that is causing interrupt delays. I'm attaching the requested information. I just re-tested with e1000e module parameter InterruptThrottleRate=0 and it is working much better this way! I thought I tested that before, but somehow I messed up (forgot to update initrd?) and missed it. If I set InterruptThrottleRate to 3 (default) or 1 -- there's jitter. Setting it to 0 or high enough fixed interrupt rate (e.g. 4000) makes it smooth again. Created attachment 294755 [details]
dmesg
Created attachment 294757 [details]
lspci -vvv
Created attachment 294759 [details]
ethtool -i
Created attachment 294761 [details]
ethtool -S
Created attachment 294763 [details]
ethtool -c
Created attachment 294765 [details]
cpupower idle-info
you might want to use ethtool -C ethX rx-usecs 0 instead of module parameter, as the instructions you gave for InterruptThrottleRate=0 means only the first e1000e instance will get the =0. The correct module parameter usage for your setup would be =0,0,0,0 Your system is definitely sleeping a lot in a sleep state that has a maximum wake latency of 133us. Leaving the driver's default dynamic interrupt moderation means that your CPUs might be going back to sleep after/between an interrupt. If you think the configuration you've settled upon fixes the issue do you want to just close this? I have my fix, so you can close this. Still, it works differently before/after kernel 5.8. Unfortunately the InterruptThrottleRate workaround is not perfect. Occasionally latency is spiking quite a bit above what it used to be pre 5.8: # 5.7 rtt min/avg/max/mdev = 0.147/0.181/0.233/0.024 ms # 5.9 rtt min/avg/max/mdev = 0.101/0.516/3.536/0.441 ms Still better than default settings. I've just tested 5.11-rc6 but no improvement. I'm just going back to a previous LTS kernel for the time being - either 5.4 or 4.19. I'm no longer experiencing this problem with Debian's 5.10.0-23-amd64 #1 SMP Debian 5.10.179-1 (2023-05-12) kernel. |