Bug 7551 - Fluctuating and slow network performance
Summary: Fluctuating and slow network performance
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4
Hardware: i386 Linux
Importance: P2 high
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-11-19 07:12 UTC by Charlie G Mentorez
Modified: 2007-02-20 13:41 UTC
CC List: 1 user

See Also:
Kernel Version: 2.6.18.x
Subsystem:
Regression: ---
Bisected commit-id:

Description Charlie G Mentorez 2006-11-19 07:12:31 UTC
Most recent kernel where this bug did *NOT* occur:
2.6.17.13

Distribution:
Crux

Hardware Environment:
The problem does not seem to depend on hardware: tested on Intel P4, PII and AMD
Opteron based machines. We use Intel or 3Com NICs; the problem seems to be the
same regardless of which NIC the traffic runs through.

Software Environment:
Pure Linux router with iptables, or file server with Samba.
Also tested on a server with Apache and a single NIC, but same issue.

Problem Description:
Since 2.6.18.x there seems to be a problem with network performance.
This problem might have existed for a while, but it became more obvious when I
updated the main router. Traffic is now much slower, even when testing locally
between two machines with the same kernel version.
I also tested between machines with an older kernel on the same network
connection; this shows normal transfer speed and ping.

Steps to reproduce:
I have tested on an unloaded local network.

THIS IS PART OF THE PROBLEM
Ping is fluctuating; there was no other traffic to or from the machine during the test.
The tests were made between two 2.6.18.3 servers.
64 octets from 192.168.117.254: icmp_seq=0 ttl=64 time=9.3 ms
64 octets from 192.168.117.254: icmp_seq=1 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=2 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=3 ttl=64 time=1.3 ms
64 octets from 192.168.117.254: icmp_seq=4 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=5 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=6 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=7 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=8 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=9 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=10 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=11 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=12 ttl=64 time=1.3 ms
64 octets from 192.168.117.254: icmp_seq=13 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=14 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=15 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=16 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=17 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=18 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=19 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=20 ttl=64 time=0.1 ms
64 octets from 192.168.117.254: icmp_seq=21 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=22 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=23 ttl=64 time=1.3 ms
64 octets from 192.168.117.254: icmp_seq=24 ttl=64 time=1.3 ms

This is between two 2.6.17.13 machines.
This is a very normal ping for our network.
64 octets from 192.168.117.229: icmp_seq=0 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=1 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=2 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=3 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=4 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=5 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=6 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=7 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=8 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=9 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=10 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=11 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=12 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=13 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=14 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=15 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=16 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=17 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=18 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=19 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=20 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=21 ttl=64 time=0.2 ms
64 octets from 192.168.117.229: icmp_seq=22 ttl=64 time=0.0 ms
64 octets from 192.168.117.229: icmp_seq=23 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=24 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=25 ttl=64 time=0.1 ms
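
To put numbers on the difference between the two ping runs above, the replies can be summarised as min/mean/max and standard deviation. This is only a minimal sketch: the function name rtt_summary and the two sample variables are made up for illustration, and the samples are just the first few replies copied from each run, not the full output.

```python
import re
import statistics

# Matches the "time=<value> ms" field of a ping reply line.
RTT_PATTERN = re.compile(r"time=([0-9.]+) ms")


def rtt_summary(ping_output):
    """Return (min, mean, max, stdev) of the round-trip times in ms."""
    rtts = [float(m.group(1)) for m in RTT_PATTERN.finditer(ping_output)]
    if len(rtts) < 2:
        raise ValueError("need at least two replies to compute a spread")
    return (min(rtts), statistics.mean(rtts), max(rtts), statistics.stdev(rtts))


# Replies icmp_seq=1..5 from the 2.6.18.3 <-> 2.6.18.3 run.
sample_2_6_18 = """\
64 octets from 192.168.117.254: icmp_seq=1 ttl=64 time=0.2 ms
64 octets from 192.168.117.254: icmp_seq=2 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=3 ttl=64 time=1.3 ms
64 octets from 192.168.117.254: icmp_seq=4 ttl=64 time=1.2 ms
64 octets from 192.168.117.254: icmp_seq=5 ttl=64 time=0.2 ms
"""

# Replies icmp_seq=0..4 from the 2.6.17.13 <-> 2.6.17.13 run.
sample_2_6_17 = """\
64 octets from 192.168.117.229: icmp_seq=0 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=1 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=2 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=3 ttl=64 time=0.1 ms
64 octets from 192.168.117.229: icmp_seq=4 ttl=64 time=0.0 ms
"""

for label, sample in (("2.6.18.3", sample_2_6_18), ("2.6.17.13", sample_2_6_17)):
    lo, mean, hi, dev = rtt_summary(sample)
    print(f"{label}: min={lo:.2f} mean={mean:.2f} max={hi:.2f} stdev={dev:.2f} ms")
```

Feeding the complete ping output of each run into rtt_summary would give a fairer comparison than these short excerpts.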


COPY TEST:
Copies are generally made between RAM disks.
I have used both scp and other copy methods, such as Samba and NFS, but the
problem is similar; the results below are from the scp tests.

TEST 1
This is between two servers with kernel 2.6.17.13
scp sysop@229.ndc:/usr/src/linux-2.6.18.3.tar.bz2 /home/sysop
sysop@229.ndc's password:
linux-2.6.18.3.tar.bz2            100%   40MB  13.3MB/s   00:03
The same speed all the time

TEST 2
This is between two servers with kernels 2.6.16.7 and 2.6.17.13
scp sysop@229.ndc:/usr/src/linux-2.6.18.3.tar.bz2 /home/sysop
sysop@229.ndc's password:
linux-2.6.18.3.tar.bz2            100%   40MB  10.0MB/s   00:04
The same speed all the time

TEST 3
HERE THE PROBLEM IS SHOWN
This is between two servers with kernel 2.6.18.3
The speed fluctuates
scp sysop@217.25.252.230:/usr/src/linux-2.6.18.1.tar.bz2 /home/sysop
sysop@217.25.252.230's password:
linux-2.6.18.1.tar.bz2            100%   40MB   8.0MB/s   00:05

TEST 3a
scp sysop@217.25.252.230:/usr/src/linux-2.6.18.1.tar.bz2 /home/sysop
sysop@217.25.252.230's password:
linux-2.6.18.1.tar.bz2            100%   40MB   4.6MB/s   00:09

TEST 3b
scp sysop@217.25.252.230:/usr/src/linux-2.6.18.1.tar.bz2 /home/sysop
sysop@217.25.252.230's password:
linux-2.6.18.1.tar.bz2            100%   40MB   7.6MB/s   00:07

Test 4a
This is between two other servers with kernel 2.6.18.3, on two other machines
with a different NIC.

scp sysop@77.ndc:/usr/src/linux-2.6.17.13.tar.bz2 /home/sysop
sysop@77.ndc's password:
linux-2.6.17.13.tar.bz2           100%   39MB   7.9MB/s   00:05

Test 4b
scp sysop@77.ndc:/usr/src/linux-2.6.17.13.tar.bz2 /home/sysop
sysop@77.ndc's password:
linux-2.6.17.13.tar.bz2           100%   39MB   6.6MB/s   00:06
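
The run-to-run variation in the scp tests above can be expressed as the gap between the fastest and slowest run relative to the mean. A rough sketch only: the rates are the MB/s figures scp reported above, while the grouping labels and the function name spread_percent are my own and not part of any tool.

```python
# Transfer rates in MB/s as reported by scp in the tests above,
# grouped by the kernel versions on the two ends.
runs = {
    "TEST 1 (2.6.17.13 both ends)": [13.3],
    "TEST 2 (2.6.16.7 and 2.6.17.13)": [10.0],
    "TEST 3 runs (2.6.18.3 both ends)": [8.0, 4.6, 7.6],
    "Test 4 runs (2.6.18.3, other machines)": [7.9, 6.6],
}


def spread_percent(rates):
    """Gap between fastest and slowest run, as a percentage of the mean rate."""
    mean = sum(rates) / len(rates)
    return 100.0 * (max(rates) - min(rates)) / mean


for label, rates in runs.items():
    mean = sum(rates) / len(rates)
    print(f"{label}: mean {mean:.1f} MB/s, spread {spread_percent(rates):.0f}%")
```

With a single run per group, as in TEST 1 and TEST 2, the spread is zero by construction, so several runs per kernel combination would be needed for a meaningful comparison.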

I'm not a programmer, so I can't dig any deeper into the problem than this.
Comment 1 Charlie G Mentorez 2006-11-19 09:32:20 UTC
NEW TEST
I found this out:
If I run the test from a machine with a 2.6.18.2 kernel to a machine with an
older kernel (2.6.16.20 in this test), it seems to be OK.

64 octets from 192.168.39.1: icmp_seq=10 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=11 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=12 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=13 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=14 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=15 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=16 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=17 ttl=64 time=0.5 ms
64 octets from 192.168.39.1: icmp_seq=18 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=19 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=20 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=21 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=22 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=23 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=24 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=25 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=26 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=27 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=28 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=29 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=30 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=31 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=32 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=33 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=34 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=35 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=36 ttl=64 time=0.3 ms
64 octets from 192.168.39.1: icmp_seq=37 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=38 ttl=64 time=0.2 ms
64 octets from 192.168.39.1: icmp_seq=39 ttl=64 time=0.2 ms

But if I do it the other way round, there is a problem.

64 octets from 192.168.117.5: icmp_seq=0 ttl=63 time=0.5 ms
64 octets from 192.168.117.5: icmp_seq=1 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=2 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=3 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=4 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=5 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=6 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=7 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=8 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=9 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=10 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=11 ttl=63 time=0.5 ms
64 octets from 192.168.117.5: icmp_seq=12 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=13 ttl=63 time=0.7 ms
64 octets from 192.168.117.5: icmp_seq=14 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=15 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=16 ttl=63 time=1.4 ms
64 octets from 192.168.117.5: icmp_seq=17 ttl=63 time=0.7 ms
64 octets from 192.168.117.5: icmp_seq=18 ttl=63 time=1.4 ms
64 octets from 192.168.117.5: icmp_seq=19 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=20 ttl=63 time=0.8 ms
64 octets from 192.168.117.5: icmp_seq=21 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=22 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=23 ttl=63 time=0.4 ms
64 octets from 192.168.117.5: icmp_seq=24 ttl=63 time=1.5 ms
64 octets from 192.168.117.5: icmp_seq=25 ttl=63 time=0.9 ms

The problem seems to be more obvious when a router is involved.
But that could simply be because there is a general problem, and each routed
hop then adds more latency.
Through our main outgoing internet connection, latency increased from 0.6 ms to
1.9 ms. I checked our continuous log, which shows we used to average 0.6 ms to
our external routing point; now we average 1.9 ms.

I will downgrade to 2.6.17.13 and see if the problem then disappears.
Comment 2 Stephen Hemminger 2006-12-18 13:53:20 UTC
Could you try to narrow down the kernel change that caused this, with
  git bisect
I know it will be a nuisance to build all those kernels, but if it is a
simple regression, it is much easier to know which change caused it than playing
"guess the needle in the haystack".
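
For a sense of how much building that involves: git bisect does a binary search between a known-good and a known-bad revision (started with "git bisect start", then marking revisions with "git bisect good" and "git bisect bad"), so the number of kernels to build grows only with the logarithm of the number of changes in between. The sketch below illustrates the search idea only and is not git itself; the function first_bad, the 1000 numbered revisions and the position of the regression are all invented for illustration.

```python
def first_bad(revisions, is_bad):
    """Return (index of the first bad revision, number of revisions tested).

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad (a simple regression).
    """
    good, bad = 0, len(revisions) - 1
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1  # stands in for one build-boot-measure cycle
        if is_bad(revisions[mid]):
            bad = mid
        else:
            good = mid
    return bad, tested


# Hypothetical history: 1000 numbered revisions, regression introduced at 613.
revisions = list(range(1000))
index, builds = first_bad(revisions, lambda rev: rev >= 613)
print(f"first bad revision: {revisions[index]}, kernels tested: {builds}")
```

Here is_bad stands in for the manual step of booting the kernel and checking whether ping and scp show the fluctuation.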
