Most recent kernel where this bug did *NOT* occur: 2.6.17.13
Distribution: Crux
Hardware Environment: Does not seem to be hardware dependent. Tested on Intel P4, PII and AMD Opteron based machines. We use Intel or 3Com NICs; the problem seems to be the same regardless of which NIC the traffic runs through.
Software Environment: Pure Linux router with iptables, or file server with Samba. Also tested on a server with Apache and a single NIC, but same issue.
Problem Description: Since 2.6.18.x there seems to be a problem with network performance. This problem might have existed for a while, but it became more obvious when I updated the main router. Traffic is now much slower, even when testing locally between two machines with the same kernel version. I also tested between machines with an older kernel on the same network connection; this shows normal transfer speed and ping.
Steps to reproduce: I have tested on a local network with no load.

THIS IS PART OF THE PROBLEM
Ping is fluctuating; there is no other traffic to or from the machine during the test. The test is made between two 2.6.18.3 servers.
64 octets from 192.168.117.254: icmp_seq=0 ttl=64 time=9.3 ms
64 octets from 192.168.117.254: icmp_seq=1 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=2 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=3 ttl=64 time=1.3 ms
64 octets from 192.168.117.254: icmp_seq=4 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=5 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=6 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=7 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=8 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=9 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=10 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=11 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=12 ttl=64 time=1.3 ms
64 octets from 192.168.117.254: icmp_seq=13 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=14 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=15 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=16 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=17 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=18 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=19 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=20 ttl=64 time=0.1 ms
64 octets from 192.168.117.254: icmp_seq=21 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=22 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=23 ttl=64 time=1.3 ms
64 octets from 192.168.117.254: icmp_seq=24 ttl=64 time=1.3 ms

This is between two 2.6.17.13 machines. This is a very normal ping for our network.
64 octets from 192.168.117.229: icmp_seq=0 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=1 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=2 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=3 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=4 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=5 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=6 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=7 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=8 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=9 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=10 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=11 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=12 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=13 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=14 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=15 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=16 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=17 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=18 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=19 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=20 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=21 ttl=64 time=0.2 ms
64 octets from 192.168.117.229: icmp_seq=22 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=23 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=24 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=25 ttl=64 time=0.1 ms

COPY TEST:
The copies are generally made between RAM disks. I have used scp as well as other copy methods such as Samba and NFS, and the problem is similar with all of them, so I use the results from the scp test here.
TEST 1
This is between two servers with kernel 2.6.17.13.
scp sysop@229.ndc:/usr/src/linux-2.6.18.3.tar.bz2 /home/sysop
sysop@229.ndc's password:
linux-2.6.18.3.tar.bz2 100% 40MB 13.3MB/s 00:03
The same speed all the time.

TEST 2
This is between two servers with kernel 2.6.16.7 and 2.6.17.13.
scp sysop@229.ndc:/usr/src/linux-2.6.18.3.tar.bz2 /home/sysop
sysop@229.ndc's password:
linux-2.6.18.3.tar.bz2 100% 40MB 10.0MB/s 00:04
The same speed all the time.

TEST 3 (HERE THE PROBLEM IS SHOWN)
This is between two servers with kernel 2.6.18.3. The speed fluctuates.
scp sysop@217.25.252.230:/usr/src/linux-2.6.18.1.tar.bz2 /home/sysop
sysop@217.25.252.230's password:
linux-2.6.18.1.tar.bz2 100% 40MB 8.0MB/s 00:05

TEST 3a
scp sysop@217.25.252.230:/usr/src/linux-2.6.18.1.tar.bz2 /home/sysop
sysop@217.25.252.230's password:
linux-2.6.18.1.tar.bz2 100% 40MB 4.6MB/s 00:09

TEST 3b
scp sysop@217.25.252.230:/usr/src/linux-2.6.18.1.tar.bz2 /home/sysop
sysop@217.25.252.230's password:
linux-2.6.18.1.tar.bz2 100% 40MB 7.6MB/s 00:07

TEST 4a
This is between two other servers with kernel 2.6.18.3, on different machines with different NICs.
scp sysop@77.ndc:/usr/src/linux-2.6.17.13.tar.bz2 /home/sysop
sysop@77.ndc's password:
linux-2.6.17.13.tar.bz2 100% 39MB 7.9MB/s 00:05

TEST 4b
scp sysop@77.ndc:/usr/src/linux-2.6.17.13.tar.bz2 /home/sysop
sysop@77.ndc's password:
linux-2.6.17.13.tar.bz2 100% 39MB 6.6MB/s 00:06

I'm not a programmer, so I can't dig any deeper into the problem than this.
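To make the scp results easier to compare, here is a minimal Python sketch that pulls the transfer rate out of each scp summary line and reports min, mean and max per kernel pairing. The summary lines are copied from the tests in this report; the grouping labels and the helper names (`rates`, `summarize`) are made up for illustration only.

```python
import re
from statistics import mean

# scp summary lines copied from the tests in this report.
# The dictionary keys are illustrative labels, not part of the scp output.
RESULTS = {
    "2.6.17.13 <-> 2.6.17.13": [
        "linux-2.6.18.3.tar.bz2 100% 40MB 13.3MB/s 00:03",
    ],
    "2.6.16.7 <-> 2.6.17.13": [
        "linux-2.6.18.3.tar.bz2 100% 40MB 10.0MB/s 00:04",
    ],
    "2.6.18.3 <-> 2.6.18.3 (first pair)": [
        "linux-2.6.18.1.tar.bz2 100% 40MB 8.0MB/s 00:05",
        "linux-2.6.18.1.tar.bz2 100% 40MB 4.6MB/s 00:09",
        "linux-2.6.18.1.tar.bz2 100% 40MB 7.6MB/s 00:07",
    ],
    "2.6.18.3 <-> 2.6.18.3 (second pair)": [
        "linux-2.6.17.13.tar.bz2 100% 39MB 7.9MB/s 00:05",
        "linux-2.6.17.13.tar.bz2 100% 39MB 6.6MB/s 00:06",
    ],
}

# Matches the transfer rate field, e.g. "13.3MB/s".
RATE_RE = re.compile(r"(\d+(?:\.\d+)?)MB/s")


def rates(lines):
    """Return the transfer rate in MB/s found in each scp summary line."""
    return [float(RATE_RE.search(line).group(1)) for line in lines]


def summarize(results):
    """Map each label to (min, mean, max) of its transfer rates in MB/s."""
    summary = {}
    for label, lines in results.items():
        r = rates(lines)
        summary[label] = (min(r), mean(r), max(r))
    return summary


if __name__ == "__main__":
    for label, (lo, avg, hi) in summarize(RESULTS).items():
        print(f"{label}: min {lo:.1f}  mean {avg:.1f}  max {hi:.1f} MB/s")
```

With only one to three runs per pairing this is not statistically strong, but it puts the 2.6.18.3 runs side by side with the 13.3 MB/s seen between the two 2.6.17.13 machines.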
NEW TEST
I found this out: if I run the test from a machine with a 2.6.18.2 kernel to a machine with an older kernel (2.6.16.20 in this test), it seems to be OK.

64 octets from 192.168.39.1: icmp_seq=10 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=11 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=12 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=13 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=14 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=15 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=16 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=17 ttl=64 time=0.5 ms
64 octets from 192.168.39.1: icmp_seq=18 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=19 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=20 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=21 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=22 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=23 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=24 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=25 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=26 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=27 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=28 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=29 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=30 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=31 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=32 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=33 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=34 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=35 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=36 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=37 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=38 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=39 ttl=64 time=0.2 ms

But if I do it the other way around, there is a problem.

64 octets from 192.168.117.5: icmp_seq=0 ttl=63 time=0.5 ms
64 octets from 192.168.117.5: icmp_seq=1 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=2 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=3 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=4 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=5 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=6 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=7 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=8 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=9 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=10 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=11 ttl=63 time=0.5 ms
64 octets from 192.168.117.5: icmp_seq=12 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=13 ttl=63 time=0.7 ms
64 octets from 192.168.117.5: icmp_seq=14 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=15 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=16 ttl=63 time=1.4 ms
64 octets from 192.168.117.5: icmp_seq=17 ttl=63 time=0.7 ms
64 octets from 192.168.117.5: icmp_seq=18 ttl=63 time=1.4 ms
64 octets from 192.168.117.5: icmp_seq=19 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=20 ttl=63 time=0.8 ms
64 octets from 192.168.117.5: icmp_seq=21 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=22 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=23 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=24 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=25 ttl=63 time=0.9 ms

The problem seems to be more obvious when a route is involved. But that could simply be because there is some general problem, and each routing hop then adds more latency. Through our main outgoing internet connection, the latency increased from 0.6 ms to 1.9 ms. I checked our continuous log, which shows that we used to have an average of 0.6 ms to our external routing point. Now we have an average of 1.9 ms.
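For anyone who wants to put numbers on the fluctuation, here is a minimal Python sketch that extracts the `time=` values from ping reply lines and computes min, mean, max and standard deviation. The two samples are short excerpts (the first six replies) of the ping runs in this comment; the function and variable names are made up for illustration, and the regular expression assumes the reply format shown in this report.

```python
import re
from statistics import mean, pstdev

# Matches the round-trip time field of a reply line, e.g. "time=1.5 ms".
TIME_RE = re.compile(r"time=(\d+(?:\.\d+)?) ms")


def rtt_values(ping_output):
    """Extract the round-trip times (in ms) from ping reply lines."""
    return [float(t) for t in TIME_RE.findall(ping_output)]


def rtt_summary(ping_output):
    """Return count, min, mean, max and population std. deviation of the RTTs."""
    rtts = rtt_values(ping_output)
    return {
        "n": len(rtts),
        "min": min(rtts),
        "avg": mean(rtts),
        "max": max(rtts),
        "stddev": pstdev(rtts),
    }


# Excerpt: 2.6.18.2 machine pinging the 2.6.16.20 machine (looks OK).
NEW_TO_OLD = """\
64 octets from 192.168.39.1: icmp_seq=10 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=11 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=12 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=13 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=14 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=15 ttl=64 time=0.3 ms
"""

# Excerpt: the reverse direction (shows the fluctuation).
OLD_TO_NEW = """\
64 octets from 192.168.117.5: icmp_seq=0 ttl=63 time=0.5 ms
64 octets from 192.168.117.5: icmp_seq=1 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=2 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=3 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=4 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=5 ttl=63 time=1.5 ms
"""

if __name__ == "__main__":
    for label, sample in (("new -> old", NEW_TO_OLD), ("old -> new", OLD_TO_NEW)):
        s = rtt_summary(sample)
        print(f"{label}: n={s['n']} min={s['min']:.1f} avg={s['avg']:.2f} "
              f"max={s['max']:.1f} stddev={s['stddev']:.2f} ms")
```

Feeding in the full ping output instead of these excerpts gives a single stddev figure per direction, which is easier to compare across kernels than eyeballing the individual replies.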
I will downgrade to 2.6.17.13 and see if the problem then disappears.
Could you try to narrow down the kernel change that causes this with git bisect? I know it will be a nuisance to build all those kernels, but if it is a simple regression, it is much easier to know which change caused it than to play "guess the needle in the haystack".
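To give a feel for how many kernels that means building, here is a small Python sketch of the idea behind bisection: a binary search over an ordered list of revisions between a known-good and a known-bad one. This is only an illustration under simplifying assumptions (a linear history, a single change that introduces the problem, and a reliable good/bad test); it is not how git implements `git bisect`, and the revision numbers and the `is_bad` check are hypothetical.

```python
def find_first_bad(revisions, is_bad):
    """Binary search for the first bad revision.

    Assumes revisions[0] is known good, revisions[-1] is known bad, and
    that every revision after the first bad one is also bad.
    Returns (first_bad_revision, number_of_revisions_tested).
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # in real life: build and boot this kernel, run the test
        if is_bad(revisions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return revisions[hi], tests


if __name__ == "__main__":
    # Hypothetical history: 1000 changes after the good kernel,
    # with the regression introduced by change number 637.
    history = list(range(1001))
    first_bad, tests = find_first_bad(history, lambda rev: rev >= 637)
    print(f"first bad change: {first_bad}, kernels tested: {tests}")
```

The number of kernels to build grows with the logarithm of the number of changes, so even a few thousand changes come down to roughly a dozen builds. With the real tool the workflow is `git bisect start`, then marking one revision with `git bisect good` and one with `git bisect bad`, and repeating the ping or scp test on each kernel git checks out.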