Created attachment 25450
tcpdump from virtual guest over ::1 showing dup synacks

Setup:
64-bit dual-core, latest F12. (I've experienced this on .31 and .32 kernels.)

I have also experienced this on 2 other machines, all 64-bit. One was a single-core virtual guest.

1. Install httpd.
2. Start it. Don't bother configuring anything unless you need to.
3. tcpdump .....
4. connect from localhost or (possibly) a nearby host. I'm not sure if this is timing related yet.
5. note duplicate SYN-ACKs sent out at the exponential backoff. Some setups exhibit this more/worse than others, and will actually time out the connection. (Others will only send one or two duplicate SYN-ACKs, and then enter the ESTABLISHED state.) However, if any data is sent the kernel enters ESTABLISHED.

Note that this isn't ipv4 specific. I have seen it under ipv4 and ipv6 (via loopback).
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 10 Mar 2010 14:50:51 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=15507
>
>            Summary: kernel misses 3rd part of tcp handshake (ACK), stays
>                     in SYN_RECV state
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.32.9-67.fc12
>           Platform: All
>         OS/Version: Linux
>               Tree: Fedora
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV4
>         AssignedTo: shemminger@linux-foundation.org
>         ReportedBy: roysjosh@gmail.com
>         Regression: Yes
>
> Created an attachment (id=25450)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=25450)
> tcpdump from virtual guest over ::1 showing dup synacks
On Thursday, 18 March 2010 at 16:01 -0700, Andrew Morton wrote:
> On Wed, 10 Mar 2010 14:50:51 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=15507
> >
> >            Summary: kernel misses 3rd part of tcp handshake (ACK), stays
> >                     in SYN_RECV state

I would say this is expected if the httpd server sets the DEFER_ACCEPT socket option.
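For readers unfamiliar with the option: on Linux a server opts in with a single setsockopt() call on its listening socket. The following is only a minimal sketch of that call, assuming Linux headers; the helper name enable_defer_accept is made up for illustration and is not taken from httpd's source.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from httpd): ask the kernel to defer accept()
 * on a TCP socket until the client has sent data, for up to roughly
 * `secs` seconds. The kernel converts the value into a number of SYN-ACK
 * retransmissions internally.
 *
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 */
int enable_defer_accept(int fd, int secs)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                      &secs, sizeof(secs));
}
```

A server would typically call this on the socket it passes to listen(). With the option set, a client that completes the handshake but sends nothing is not handed to accept() right away, which matches the SYN_RECV state seen in the report.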
On 03/18/2010 07:12 PM, Eric Dumazet wrote:
>
> I would say this is expected if httpd server set DEFER_ACCEPT socket
> option.
>

Gah! You've got to be kidding :) If that's what it is, close the bug... and I'll go read some man-pages!

Josh
On Friday, 19 March 2010 at 07:53 -0400, Joshua Roys wrote:
> On 03/18/2010 07:12 PM, Eric Dumazet wrote:
> >
> > I would say this is expected if httpd server set DEFER_ACCEPT socket
> > option.
> >
>
> Gah! You've got to be kidding :) If that's what it is, close the
> bug... and I'll go read some man-pages!
>

So you _confirm_ DEFER_ACCEPT is not used? It's _important_ for us.

I don't believe my comment was trivial at all :)

I gave a hint to myself and other network guys, because we made some changes in this area lately (commit b103cf34 tcp: fix TCP_DEFER_ACCEPT retrans calculation) from Julian Anastasov.

    git describe b103cf34
    v2.6.31-9056-gb103cf3

Please boot a fresh vm, then try exactly 10 'bad connects'; please make each attempt last at least 2 minutes. (You can start all of these in parallel.)

And report: netstat -s

(My FC12 copy doesn't exhibit this problem)

Thanks
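One way to produce such an attempt is sketched below. It takes 'bad connect' to mean a connection that completes the handshake and then sends no data, which is an assumption based on the original report, not something spelled out in this message. The function name connect_and_idle and its parameters are hypothetical.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical test client: connect to host:port, send nothing, keep the
 * connection open for hold_secs seconds, then close it.
 *
 * Returns 0 if the connection was established and held, -1 otherwise.
 */
int connect_and_idle(const char *host, const char *port, unsigned hold_secs)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6; the report saw both */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Try each resolved address until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);

    if (fd < 0)
        return -1;

    sleep(hold_secs);   /* deliberately send no data */
    close(fd);
    return 0;
}
```

Under those assumptions, ten parallel invocations with hold_secs set to 120 or more against the httpd port would correspond to the request above, with tcpdump running alongside and netstat -s collected afterwards.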
On Friday, 19 March 2010 at 07:53 -0400, Joshua Roys wrote:
>
> Gah! You've got to be kidding :) If that's what it is, close the
> bug... and I'll go read some man-pages!

So I checked FC12 httpd and yes, it does use DEFER_ACCEPT (setting val to 1 second).

So you hit a problem that was corrected by the following commit.

Time to bug RedHat I suppose...

commit d1b99ba41d6c5aa1ed2fc634323449dd656899e9
Author: Julian Anastasov <ja@ssi.bg>
Date:   Mon Oct 19 10:01:56 2009 +0000

    tcp: accept socket after TCP_DEFER_ACCEPT period

    Willy Tarreau and many other folks in recent years
    were concerned what happens when the TCP_DEFER_ACCEPT period
    expires for clients which sent ACK packet. They prefer clients
    that actively resend ACK on our SYN-ACK retransmissions to be
    converted from open requests to sockets and queued to the
    listener for accepting after the deferring period is finished.
    Then application server can decide to wait longer for data
    or to properly terminate the connection with FIN if read()
    returns EAGAIN which is an indication for accepting after
    the deferring period.

    This change still can have side effects for applications that
    expect always to see data on the accepted socket. Others can be
    prepared to work in both modes (with or without TCP_DEFER_ACCEPT
    period) and their data processing can ignore the read=EAGAIN
    notification and to allocate resources for clients which proved
    to have no data to send during the deferring period.

    OTOH, servers that use TCP_DEFER_ACCEPT=1 as flag (not as a timeout)
    to wait for data will notice clients that didn't send data for
    3 seconds but that still resend ACKs.

    Thanks to Willy Tarreau for the initial idea and to
    Eric Dumazet for the review and testing the change.

    Signed-off-by: Julian Anastasov <ja@ssi.bg>
    Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 624c3c9..4c03598 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -641,8 +641,8 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 	if (!(flg & TCP_FLAG_ACK))
 		return NULL;
 
-	/* If TCP_DEFER_ACCEPT is set, drop bare ACK. */
-	if (inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
+	/* While TCP_DEFER_ACCEPT is active, drop bare ACK. */
+	if (req->retrans < inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
 	    TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1) {
 		inet_rsk(req)->acked = 1;
 		return NULL;
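The effect of the one-line change can be seen by pulling the condition out into a small user-space model. This is only an illustrative sketch: struct req_model and both function names are invented here, and the fields are simplified stand-ins for req->retrans, tcp_rsk(req)->rcv_isn and rskq_defer_accept from the diff above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the request-sock fields used in tcp_check_req(). */
struct req_model {
    uint8_t  retrans;   /* SYN-ACK retransmissions sent so far */
    uint32_t rcv_isn;   /* client's initial sequence number */
};

/* A bare ACK carries no payload, so its end_seq is exactly rcv_isn + 1. */
static bool is_bare_ack(const struct req_model *req, uint32_t end_seq)
{
    return end_seq == (uint32_t)(req->rcv_isn + 1);
}

/* Before the patch: any non-zero defer_accept drops every bare ACK. */
bool drops_bare_ack_old(const struct req_model *req, uint8_t defer_accept,
                        uint32_t end_seq)
{
    return defer_accept && is_bare_ack(req, end_seq);
}

/*
 * After the patch: bare ACKs are dropped only while fewer SYN-ACK
 * retransmissions have been sent than the deferring period allows.
 */
bool drops_bare_ack_new(const struct req_model *req, uint8_t defer_accept,
                        uint32_t end_seq)
{
    return req->retrans < defer_accept && is_bare_ack(req, end_seq);
}
```

In this model, with defer_accept of 1 and one retransmission already sent, the old condition still drops the client's bare ACK while the new one lets it through. That is consistent with the report, where a client that sends no data stays in SYN_RECV, and with the commit message's description of accepting the socket once the deferring period is over.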