Bug 15507 - kernel misses 3rd part of tcp handshake (ACK), stays in SYN_RECV state
Status: RESOLVED OBSOLETE
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4
Hardware: All Linux
Importance: P1 normal
Assignee: Stephen Hemminger
Reported: 2010-03-10 14:50 UTC by Joshua Roys
Modified: 2012-07-05 16:07 UTC

Kernel Version: 2.6.32.9-67.fc12
Regression: Yes

Attachments
tcpdump from virtual guest over ::1 showing dup synacks (1.39 KB, application/octet-stream)
2010-03-10 14:50 UTC, Joshua Roys

Description Joshua Roys 2010-03-10 14:50:49 UTC
Created attachment 25450
tcpdump from virtual guest over ::1 showing dup synacks

Setup:
64-bit dual-core, latest F12.  (I've experienced this on .31 and .32 kernels.)

I have also experienced this on 2 other machines, all 64-bit.  One was a single-core virtual guest.

1. Install httpd.
2. Start it.  Don't bother configuring anything unless you need to.
3. tcpdump .....
4. Connect from localhost or (possibly) a nearby host.  I'm not sure yet whether this is timing related.
5. Note the duplicate SYN-ACKs sent out with exponential backoff.  Some setups exhibit this more severely than others, and will actually time out the connection.  (Others will only send one or two duplicate SYN-ACKs, and then enter the ESTABLISHED state.)  However, if any data is sent, the kernel enters ESTABLISHED.

Note that this isn't IPv4-specific.  I have seen it under IPv4 and IPv6 (via loopback).
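
For anyone who wants to reproduce step 4 without a browser, a client that completes the handshake and then sends nothing could look roughly like the sketch below. The helper name open_idle_connection is made up for illustration and is not part of the original report; it only covers IPv4, and the caller is assumed to keep the returned descriptor open (for example with sleep()) while tcpdump runs.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to ip:port over IPv4 and return the socket
 * without ever writing to it.  Returns -1 on any failure. */
int open_idle_connection(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;              /* not a valid dotted-quad address */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* connect() returns once the client side of the handshake is done,
     * i.e. after the client has sent its final ACK. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;                  /* caller keeps it idle, then close()s it */
}
```

Calling open_idle_connection("127.0.0.1", 80) and then sleeping for a couple of minutes before close() should be enough to watch for the SYN-ACK retransmissions described in step 5.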
Comment 1 Andrew Morton 2010-03-18 23:01:42 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

Comment 2 Eric Dumazet 2010-03-18 23:13:12 UTC
I would say this is expected if the httpd server sets the DEFER_ACCEPT
socket option.
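
For readers unfamiliar with the option: on Linux it is spelled TCP_DEFER_ACCEPT and is set on the listening socket with an integer number of seconds (see tcp(7)). A minimal sketch of a listener that enables it might look like this. The function name listen_with_defer_accept and the choice to bind to loopback only are illustrative assumptions, not how httpd itself does it.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a loopback TCP listener with
 * TCP_DEFER_ACCEPT set to defer_secs.  Port 0 asks the kernel for an
 * ephemeral port.  Returns the listening fd, or -1 on failure. */
int listen_with_defer_accept(uint16_t port, int defer_secs)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* With this option set, accept() is not woken by a bare ACK that
     * completes the handshake; the listener waits for data instead. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                   &defer_secs, sizeof(defer_secs)) < 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A client that connects to such a listener and stays silent is exactly the case in which the server side can remain in SYN_RECV and keep retransmitting SYN-ACKs.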
Comment 3 Joshua Roys 2010-03-19 11:54:21 UTC
On 03/18/2010 07:12 PM, Eric Dumazet wrote:
>
> I would say this is expected if httpd server set DEFER_ACCEPT socket
> option.
>
>
>

Gah!  You've got to be kidding :)  If that's what it is, close the 
bug...  and I'll go read some man-pages!

Josh
Comment 4 Eric Dumazet 2010-03-19 13:42:54 UTC
On Friday, 19 March 2010 at 07:53 -0400, Joshua Roys wrote:
> On 03/18/2010 07:12 PM, Eric Dumazet wrote:
> >
> > I would say this is expected if httpd server set DEFER_ACCEPT socket
> > option.
> >
> >
> >
> 
> Gah!  You've got to be kidding :)  If that's what it is, close the 
> bug...  and I'll go read some man-pages!
> 

So can you _confirm_ that DEFER_ACCEPT is not used? It's _important_
for us.

I don't believe my comment was trivial at all :)

I gave a hint to myself and the other network guys, because we made some
changes in this area lately (commit b103cf34, "tcp: fix TCP_DEFER_ACCEPT
retrans calculation", from Julian Anastasov).

git describe b103cf34
v2.6.31-9056-gb103cf3

Please boot a fresh VM, then try exactly 10 'bad connects', and make
each attempt last at least 2 minutes. (You can start all of them in
parallel.)

Then report the output of:

netstat -s


(My FC12 copy doesn't exhibit this problem.)

Thanks
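
As a complement to the netstat -s counters (not a replacement for them), the number of connections sitting in SYN_RECV during such a test can be read from /proc/net/tcp, where the "st" column holds the state in hex and 03 is the kernel's TCP_SYN_RECV value. The sketch below assumes the usual /proc/net/tcp column layout; count_tcp_state and TCP_STATE_SYN_RECV are names made up here.

```c
#include <stdio.h>

#define TCP_STATE_SYN_RECV 0x03  /* kernel's TCP_SYN_RECV, as printed in "st" */

/* Hypothetical helper: count the entries in a /proc/net/tcp-style stream
 * whose "st" column equals the given state. */
int count_tcp_state(FILE *fp, unsigned int state)
{
    char line[512];
    int count = 0;

    while (fgets(line, sizeof(line), fp)) {
        unsigned int st;

        /* Entry layout: "sl: local_addr:port remote_addr:port st ...".
         * The header line does not match this pattern and is skipped. */
        if (sscanf(line, " %*d: %*[0-9A-Fa-f]:%*x %*[0-9A-Fa-f]:%*x %x",
                   &st) == 1 && st == state)
            count++;
    }
    return count;
}
```

Opening /proc/net/tcp (or /proc/net/tcp6 for the ::1 case) with fopen() and passing the stream together with TCP_STATE_SYN_RECV gives the count at that instant.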
Comment 5 Eric Dumazet 2010-03-19 14:47:21 UTC
On Friday, 19 March 2010 at 07:53 -0400, Joshua Roys wrote:

> 
> Gah!  You've got to be kidding :)  If that's what it is, close the 
> bug...  and I'll go read some man-pages!

So I checked FC12 httpd and yes, it does use DEFER_ACCEPT (setting val
to 1 second).


So you hit a problem that was corrected by the following commit.

Time to bug Red Hat, I suppose...



commit d1b99ba41d6c5aa1ed2fc634323449dd656899e9
Author: Julian Anastasov <ja@ssi.bg>
Date:   Mon Oct 19 10:01:56 2009 +0000

    tcp: accept socket after TCP_DEFER_ACCEPT period
    
    Willy Tarreau and many other folks in recent years
    were concerned what happens when the TCP_DEFER_ACCEPT period
    expires for clients which sent ACK packet. They prefer clients
    that actively resend ACK on our SYN-ACK retransmissions to be
    converted from open requests to sockets and queued to the
    listener for accepting after the deferring period is finished.
    Then application server can decide to wait longer for data
    or to properly terminate the connection with FIN if read()
    returns EAGAIN which is an indication for accepting after
    the deferring period. This change still can have side effects
    for applications that expect always to see data on the accepted
    socket. Others can be prepared to work in both modes (with or
    without TCP_DEFER_ACCEPT period) and their data processing can
    ignore the read=EAGAIN notification and to allocate resources for
    clients which proved to have no data to send during the deferring
    period. OTOH, servers that use TCP_DEFER_ACCEPT=1 as flag (not
    as a timeout) to wait for data will notice clients that didn't
    send data for 3 seconds but that still resend ACKs.
    Thanks to Willy Tarreau for the initial idea and to
    Eric Dumazet for the review and testing the change.
    
    Signed-off-by: Julian Anastasov <ja@ssi.bg>
    Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>


diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 624c3c9..4c03598 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -641,8 +641,8 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
        if (!(flg & TCP_FLAG_ACK))
                return NULL;
 
-       /* If TCP_DEFER_ACCEPT is set, drop bare ACK. */
-       if (inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
+       /* While TCP_DEFER_ACCEPT is active, drop bare ACK. */
+       if (req->retrans < inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
            TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1) {
                inet_rsk(req)->acked = 1;
                return NULL;
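
The effect of the one-line change can be sketched in user space as two predicates. This is only a model of the condition in the hunk above, not kernel code: the function names are made up, bare_ack stands for the test TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1, retrans for req->retrans, and defer_accept for rskq_defer_accept.

```c
#include <stdbool.h>

/* Model of the old check: while the option is set, every bare ACK is
 * dropped, no matter how many SYN-ACKs have already been retransmitted. */
bool drops_bare_ack_before(unsigned int defer_accept, bool bare_ack)
{
    return defer_accept != 0 && bare_ack;
}

/* Model of the new check: a bare ACK is dropped only while the number of
 * SYN-ACK retransmissions is still below the deferral limit. */
bool drops_bare_ack_after(unsigned int retrans, unsigned int defer_accept,
                          bool bare_ack)
{
    return retrans < defer_accept && bare_ack;
}
```

With defer_accept equal to 1, the new predicate stops dropping bare ACKs once the first SYN-ACK retransmission has happened, so a client that re-ACKs that retransmission can be accepted; under the old predicate it never could be without sending data.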
