Bug 6177 - Java remote debugging is slow due to apparent networking bug
Summary: Java remote debugging is slow due to apparent networking bug
Status: REJECTED WILL_NOT_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-03-06 17:46 UTC by Eric Molitor
Modified: 2006-03-24 15:15 UTC

See Also:
Kernel Version: 2.6.15
Subsystem:
Regression: ---
Bisected commit-id:


Attachments

Description Eric Molitor 2006-03-06 17:46:11 UTC
Most recent kernel where this bug did not occur:
Distribution: Suse 10.0, Suse 10.1, Debian 3.1
Hardware Environment: ix86
Software Environment: 2.6.14
Problem Description:
Sometime between 2.6.14 and 2.6.15, remote Java debugging slowed to a
crawl. Users have reported the problem on both Debian and Suse. Downgrading to
2.6.14 solves the problem. The problem occurs with IDEA IntelliJ, JBuilder,
and even the Sun JDAPI examples. I've talked with many people about this, and
what is known so far is at http://www.jetbrains.net/jira/browse/IDEA-6540
The best quote I know of to summarize it is "tcpdump shows tons and tons of
packets going back and forth, none of which individually look strange, but the
fact that it took somewhere around the neighborhood of 2500 packets to open
the key/value for a single Hash element was weird. Each packet has a very
small payload of only a few bytes of information. (I will happily send the
tcpdump dump files if anyone wants.)" I know that this bug report sucks
because of the limited information, but it is a real issue and it is somewhat
hard to provide a better test case. Eugene Zhuravlev (jeka@intellij.com) of IDEA
(IntelliJ's publisher) has offered to help track this problem down.

Steps to reproduce:
Make sure you are running 2.6.15 or higher (it occurs in 2.6.16 pre as well),
install Eclipse, and start a remote debugging session. Downgrading to 2.6.14
will cause the app to run at its normal speed. It's probably easiest to install
Tomcat and attach Eclipse to that.
Comment 1 Andrew Morton 2006-03-06 18:03:37 UTC
Yes, if you can get the net guys a full tcpdump it would really help, thanks.

(Please respond via email rather than via bugzilla so the
non-bugzilla-capable net developers get to see it, thanks ;))

Comment 2 Eric Molitor 2006-03-07 11:36:11 UTC
Attached is a TCP Dump, SYS output was...

 tcpdump -i lo > debug.dump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
31066 packets captured
93658 packets received by filter
459 packets dropped by kernel

This was while debugging iteration over a list of 10 items. A simple example
like this runs basically instantly on 2.6.14. On 2.6.15 it took several
minutes.

Cheers,
   Eric Molitor



Comment 3 Eric Molitor 2006-03-07 12:08:39 UTC
Here is non-html version, sorry about that.

On 3/7/06, Eric Molitor <eric.molitor@gmail.com> wrote:
>
> Attached is a TCP Dump, SYS output was...
>
>  tcpdump -i lo > debug.dump
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
> 31066 packets captured
> 93658 packets received by filter
> 459 packets dropped by kernel
>
> This was while debugging iteration over a list of 10 items. A simple example like this runs basically instantly on 2.6.14. On 2.6.15 it took several minutes.
>
> Cheers,
>
>    Eric Molitor
Comment 4 Andrew Morton 2006-03-07 12:09:42 UTC
"Eric Molitor" <eric.molitor@gmail.com> wrote:
>
> Attached is a TCP Dump, SYS output was...
> 
>  tcpdump -i lo > debug.dump
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
> 31066 packets captured
> 93658 packets received by filter
> 459 packets dropped by kernel
> 
> This was while debugging iteration over a list of 10 items. A simple example
> like this runs basically instantly on 2.6.14. On 2.6.15 it took several
> minutes.
> 

Thanks.

The attachment was probably too large for the mailing list, so I've
uploaded it to
http://www.zip.com.au/~akpm/linux/patches/stuff/debug.dump.gz


Comment 5 Stephen Hemminger 2006-03-08 09:08:34 UTC
It would be really useful to know more about the system and environment.

What is the size of the data being sent in each system call? This can be easily
obtained by using strace to attach to the sending process.  Is it being stupid
and writing in small chunks?

Does the application call setsockopt(s, IPPROTO_TCP, TCP_NODELAY, ...)
and turn off the Nagle algorithm?

Is the interface being used (eth0) using TCP Segmentation Offload (TSO)?
Check with:
ethtool -k eth0
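
For reference, the setsockopt() call asked about above has roughly the shape
shown here. This is a minimal sketch in C, not code taken from the JVM; the
helper names set_nodelay and nodelay_enabled are made up for illustration.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: turn the Nagle algorithm off (on = 1) or back on
 * (on = 0) for a TCP socket. Returns 0 on success, -1 on error. */
int set_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: report whether TCP_NODELAY is currently set.
 * Returns 1 if set, 0 if not set, -1 on error. */
int nodelay_enabled(int fd)
{
    int flag = 0;
    socklen_t len = sizeof(flag);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) < 0)
        return -1;
    return flag != 0;
}
```

In an strace of the sending process, a call like set_nodelay(fd, 1) shows up
as a setsockopt on IPPROTO_TCP with TCP_NODELAY, followed in the problem case
by many small writes.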
Comment 6 Eric Molitor 2006-03-08 13:23:06 UTC
The system the tcpdump is from is Suse 10.1 Beta6 with
2.6.16-rc5-git2-3, but the same results occur with Debian 3.1 with
2.6.15. Both are 32-bit x86. Downgrading to 2.6.14 solves the slowdowns
in both cases.

The interface being used is lo, so I'm not sure whether TSO applies to localhost.

I will gather more information later today.

Comment 7 Eric Molitor 2006-03-08 13:33:20 UTC
The app is basically Sun's JDK, but it seems to occur with all JDKs.
(I've only tested JDK 1.4.2_06 and JDK 1.5.0_06.) This dump was done
while IDEA IntelliJ was running against JDK 1.5.0_06. The same issues
occur with Eclipse, JBuilder, and all other IDEs.

Comment 8 Stephen Hemminger 2006-03-08 23:24:45 UTC
I am not sure if it is the same problem, but I am now able to reproduce
slowness if I use Eclipse and debug something. It is annoying, but not fatal.

If I turn off TCP Appropriate Byte Counting:
    sudo sysctl -w net.ipv4.tcp_abc=0
then the problem goes away. See RFC 3465
     http://www.apps.ietf.org/rfc/rfc3465.html
for a description.

I have gotten massive straces, and the Java VM is:
    1) Turning on TCP_NODELAY
    2) Sending small packets.

So I think the small packets are now being counted against it and it is
getting blocked.  There are several possible options:
1) Ship with TCP ABC off (tcp_abc=0) -- bad because no one ever changes things
    to be more fair.
2) Ship with TCP ABC set to 2 -- makes it more aggressive, that may work.
3) Tweak TCP to know more about the loopback interface so it has a bigger cwnd
4) Fix Java

Comment 9 Anonymous Emailer 2006-03-08 23:29:51 UTC
Reply-To: davem@davemloft.net

From: Stephen Hemminger <shemminger@osdl.org>
Date: Wed, 08 Mar 2006 23:24:22 -0800

> I have gotten massive strace's and the java VM is:
>     1) Turning on TCP_NODELAY
>     2) Sending small packets.

Java is doing the wrong thing, obviously.

> 4) Fix java

And this is the only reasonable recourse.

You cannot turn on TCP_NODELAY and expect good performance
when sending out small packets.  You are asking for low
latency and no delaying of packets in order to allow larger
ones to accumulate.

The kernel is doing exactly what Java is asking it to do.

In fact I consider the new behavior of the kernel a bug fix.

Comment 10 Thomas Hartwig 2006-03-09 00:29:04 UTC
Dear kernel hackers,

Can you give us a clear but short problem description so we can raise this in
the Sun bug database? In detail, what has changed in the kernel, and why is the
"wrong" behaviour of Java becoming a problem in the new kernel? Unfortunately I
don't understand the problem from your words ;-) - but I would like to send it
to Sun.

Thanks
Thomas
Comment 11 Thomas Hartwig 2006-03-09 00:48:33 UTC
Sorry for my unclear previous post; I will try to summarize:

1. Java always sends a huge number of small packets over the TCP stack in a
debugger session.
2. In the new kernel there is a new algorithm which counts the packets and
makes adjustments to get better network performance, as described in the RFC:
http://www.apps.ietf.org/rfc/rfc3465.html
3. This leads to a performance loss, and only Java is at fault here.

Is this right?
Comment 12 Stephen Hemminger 2006-03-09 08:29:20 UTC
On Wed, 08 Mar 2006 23:29:48 -0800 (PST)
"David S. Miller" <davem@davemloft.net> wrote:

> From: Stephen Hemminger <shemminger@osdl.org>
> Date: Wed, 08 Mar 2006 23:24:22 -0800
> 
> > I have gotten massive strace's and the java VM is:
> >     1) Turning on TCP_NODELAY
> >     2) Sending small packets.
> 
> Java is doing the wrong thing, obviously.
> 
> > 4) Fix java
> 
> And this is the only reasonable recourse.
> 
> You cannot turn on TCP_NODELAY and expect good performance
> when sending out small packets.  You are asking for low
> latency and no delaying of packets in order to allow larger
> ones to accumulate.
> 
> The kernel is doing exactly what Java is asking it to do.
> 
> In fact I consider the new behavior of the kernel a bug fix.

A possible solution would be to set cwnd bigger for loopback.
If there were a clean way to know that the connection was over loopback,
then something like this in tcp_init_metrics() could set INIT_CWND:

	if (IsLoopback(sk))
		dst->metrics[RTAX_INIT_CWND-1] = 10;

Then tcp_init_cwnd() would return a bigger congestion window.

Comment 13 Eric Molitor 2006-03-09 10:29:11 UTC
Just out of curiosity was the window size changed in 2.6.15? Just
trying to get an idea of what might have changed in 2.6.15 that
triggered this. (In 2.6.14 and 2.4.27 things run very fast)


Comment 14 Stephen Hemminger 2006-03-09 11:33:40 UTC
On Thu, 9 Mar 2006 12:29:08 -0600
"Eric Molitor" <eric.molitor@gmail.com> wrote:

> Just out of curiosity was the window size changed in 2.6.15? Just
> trying to get an idea of what might have changed in 2.6.15 that
> triggered this. (In 2.6.14 and 2.4.27 things run very fast)

No, window size hasn't changed, but how we account for it has.
Appropriate Byte Counting changes what constitutes a packet for increasing the
congestion window.

Without ABC, the congestion window is increased by one after each successful
acknowledgement during slow start.  With ABC, we don't increase the congestion
window until after you get an acknowledgement for the number of bytes in a full
TCP packet.  This means if you send small packets, the window will increase more
slowly; read the RFC.
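
The difference can be illustrated with a toy model. The sketch below is not
the kernel's code: it only counts how far the congestion window (in packets)
grows during slow start under the two accounting rules described above. The
function name cwnd_after_acks and all of its parameters are made up for
illustration, and the model assumes each ACK covers at most one full segment.

```c
#include <assert.h>

/*
 * Toy model of slow start, NOT the kernel implementation.
 *
 * cwnd          - starting congestion window, in packets
 * n_acks        - number of ACKs received
 * bytes_per_ack - bytes newly acknowledged by each ACK
 * mss           - size of a full TCP segment, in bytes
 * use_abc       - 0: grow by one per ACK; nonzero: grow by one per
 *                 full segment's worth of acknowledged bytes
 */
unsigned cwnd_after_acks(unsigned cwnd, unsigned n_acks,
                         unsigned bytes_per_ack, unsigned mss, int use_abc)
{
    unsigned bytes_acked = 0;
    unsigned i;

    /* Simplifying assumption: an ACK never covers more than one segment. */
    assert(mss > 0 && bytes_per_ack <= mss);

    for (i = 0; i < n_acks; i++) {
        if (!use_abc) {
            cwnd++;                 /* packet counting: every ACK counts */
            continue;
        }
        bytes_acked += bytes_per_ack;
        if (bytes_acked >= mss) {   /* byte counting: need a full segment */
            bytes_acked -= mss;
            cwnd++;
        }
    }
    return cwnd;
}
```

With 10-byte payloads and an illustrative 1460-byte segment size, 100 ACKs take
the window from 2 to 102 under packet counting, while under byte counting only
1000 bytes have been acknowledged and the window is still 2.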




Comment 15 Anonymous Emailer 2006-03-09 11:56:48 UTC
Reply-To: davem@davemloft.net

From: Stephen Hemminger <shemminger@osdl.org>
Date: Thu, 9 Mar 2006 08:33:15 -0800

> A possible solution would be to set cwnd bigger for loopback.
> If there was a clean way to know that connection was over loopback,
> then doing something in tcp_init_metrics() to set INIT_CWND 
> 
> 	if (IsLoopback(sk))
> 		dst->metrics[RTAX_INIT_CWND-1] = 10;
> then tcp_init_cwnd() would return a bigger congestion window.

I'm not even going to entertain workaround for applications
that set socket options and then things go wrong because
the kernel actually does what the application has asked for.

Comment 16 Eric Molitor 2006-03-09 13:13:42 UTC
I did open up a bug with Sun about this. It looks like most clients
don't set TCP_NODELAY on debug sockets, but the JDK itself has
TCP_NODELAY hardcoded.

In the meantime, is there a way to set or disable Appropriate Byte
Counting on a per-interface basis? (I know that it's a protocol, but the
ability to set protocol options on a per-interface basis would seem
nice.)



Comment 17 Stephen Hemminger 2006-03-09 13:27:46 UTC
On Thu, 9 Mar 2006 15:13:39 -0600
"Eric Molitor" <eric.molitor@gmail.com> wrote:

> I did open up a bug with Sun about this. It looks like most clients
> don't set TCP_NODELAY on debug sockets, but the JDK itself has
> TCP_NODELAY hardcoded.
> 
> In the meantime, is there a way to set or disable Appropriate Byte
> Counting on a per-interface basis? (I know that it's a protocol, but the
> ability to set protocol options on a per-interface basis would seem
> nice.)
>

No, but you may be able to set a bigger initial cwnd by altering
the route for the loopback interface.

Comment 18 Stephen Hemminger 2006-03-09 15:28:07 UTC
It is a JVM problem. The JVM does a setsockopt(TCP_NODELAY) that turns off the
Nagle algorithm, then does calls to send() with small packets.  With the
RFC 3465 ABC algorithm in TCP, we no longer increase the window based on the
number of packets acknowledged but on the number of bytes acknowledged.  The
net result is that the kernel holds off the third send until the first one is
acknowledged.

The JVM needs to be fixed to either not set NODELAY, or aggregate requests.
Two ways to do that would be to use a syscall with scatter/gather (writev,
sendmsg) or use TCP_CORK.
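
The two aggregation approaches mentioned above could look roughly like the
following. This is a minimal sketch in C under stated assumptions, not the
JVM's code and not the eventual fix: the helper names send_gathered and
set_cork, and the split of a request into a header and a body, are made up for
illustration. TCP_CORK is Linux-specific.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: hand a header and a body to the kernel in one
 * writev() call instead of two small send() calls. Returns the number of
 * bytes written or -1 on error; a complete implementation would also loop
 * on short writes. */
ssize_t send_gathered(int fd, const void *hdr, size_t hdr_len,
                      const void *body, size_t body_len)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;   /* writev() does not modify the data */
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = body_len;
    return writev(fd, iov, 2);
}

/* Hypothetical helper: with on = 1 the kernel holds back partial segments
 * while the application issues several writes; with on = 0 whatever has
 * been queued is released. Returns 0 on success, -1 on error. */
int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}
```

Either way the pieces of one request leave the sender together rather than as
a series of tiny segments.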
Comment 19 Tim Bell 2006-03-20 20:03:52 UTC
As far as this issue affects Java Platform Debugger Architecture (JPDA) use
during debug sessions, reference:

  http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6401245

Comment 20 Eric Neilsen 2006-03-21 08:52:05 UTC
A bug has been opened with Java.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6401245

I need everyone to vote on this bug so Sun will fix the problem ASAP.

You will need to create a Sun Developer Network account, unless you already have
one, in order to vote and watch the bug. It's free, quick, and easy.
Comment 21 Eric Molitor 2006-03-24 15:15:35 UTC
Sun has acknowledged the issue and is fixing it. For more information see
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6401245
