Bug 33902

Summary: tcpi_state field in tcp_info structure reports TCP_CLOSE instead of TCP_TIME_WAIT state
Product: Networking
Reporter: Dmitry Izbitsky (Dmitry.Izbitsky)
Component: IPV4
Assignee: Stephen Hemminger (stephen)
Status: RESOLVED WILL_NOT_FIX
Severity: normal
CC: alan
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.38
Subsystem:
Regression: No
Bisected commit-id:

Description Dmitry Izbitsky 2011-04-25 08:08:35 UTC
Setup: a TCP connection in the ESTABLISHED state. The local socket calls shutdown(SHUT_RDWR). After that, the peer calls shutdown(SHUT_RDWR).

The local socket should now be in the TIME_WAIT state (from the specification's point of view). And it is indeed in TIME_WAIT (TCP_TIME_WAIT) if we look at /proc/net/tcp (or netstat -t). However, if one tries to get the connection state via tcp_info (getsockopt(TCP_INFO)), the reported state is CLOSED (TCP_CLOSE).

It looks like the problem is in the tcp_time_wait() function (net/ipv4/tcp_minisocks.c).
It is called with state=TCP_TIME_WAIT and sets the tw_state field of the inet_timewait_sock (tw->tw_state) to TCP_TIME_WAIT. That is why the state is reported correctly in /proc. However, at the end it calls tcp_done(sk), which in turn calls tcp_set_state(TCP_CLOSE), so sk->sk_state is set to TCP_CLOSE instead of TCP_TIME_WAIT, and that is the value reported via the TCP_INFO socket option.

The problem reproduces on 2.6.26 and 2.6.38 and is probably present on earlier kernels as well.
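
The scenario above can be sketched as a small loopback test. This is only an illustration under stated assumptions, not code from the report: the helper name tcpi_state_after_mutual_shutdown() is hypothetical, both ends of the connection live in one process on 127.0.0.1, and the short polling loop is just a way to wait for the FIN exchange to finish.

```c
#define _DEFAULT_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of the report): runs the shutdown sequence
 * over loopback and returns tcpi_state of the local socket, or -1 on error. */
int tcpi_state_after_mutual_shutdown(void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int lsn, local, peer = -1, state = -1, i;
    char c;

    lsn = socket(AF_INET, SOCK_STREAM, 0);
    local = socket(AF_INET, SOCK_STREAM, 0);
    if (lsn < 0 || local < 0)
        goto out;

    /* Listen on an ephemeral loopback port and connect to it. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(lsn, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lsn, 1) < 0 ||
        getsockname(lsn, (struct sockaddr *)&addr, &alen) < 0)
        goto out;
    if (connect(local, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;
    peer = accept(lsn, NULL, NULL);
    if (peer < 0)
        goto out;

    /* Local side closes first, then the peer, as in the description. */
    if (shutdown(local, SHUT_RDWR) < 0)
        goto out;
    if (read(peer, &c, 1) != 0)      /* 0 means EOF: peer saw the local FIN */
        goto out;
    if (shutdown(peer, SHUT_RDWR) < 0)
        goto out;

    /* Poll TCP_INFO until the local socket has left FIN_WAIT1/FIN_WAIT2. */
    for (i = 0; i < 100; i++) {
        struct tcp_info info;
        socklen_t ilen = sizeof(info);

        memset(&info, 0, sizeof(info));
        if (getsockopt(local, IPPROTO_TCP, TCP_INFO, &info, &ilen) < 0) {
            state = -1;
            break;
        }
        state = info.tcpi_state;
        if (state != TCP_FIN_WAIT1 && state != TCP_FIN_WAIT2)
            break;
        usleep(10 * 1000);
    }
out:
    if (peer >= 0)
        close(peer);
    if (local >= 0)
        close(local);
    if (lsn >= 0)
        close(lsn);
    return state;
}
```

Calling this helper from main() and comparing the result against TCP_CLOSE and TCP_TIME_WAIT from <netinet/tcp.h> shows which state TCP_INFO reports; on the kernels named in this report the expectation is TCP_CLOSE.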
Comment 1 Andrew Morton 2011-04-25 21:35:05 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 25 Apr 2011 08:08:36 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=33902
> 
>            Summary: tcpi_state field in tcp_info structure reports
>                     TCP_CLOSE instead of TCP_TIME_WAIT state
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.38
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV4
>         AssignedTo: shemminger@linux-foundation.org
>         ReportedBy: Dmitry.Izbitsky@oktetlabs.ru
>         Regression: No
Comment 2 David S. Miller 2011-06-07 00:06:21 UTC
From: Andrew Morton <akpm@linux-foundation.org>
Date: Mon, 25 Apr 2011 14:34:21 -0700

> On Mon, 25 Apr 2011 08:08:36 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
>> Setup - TCP connection in ESTABLISHED state. Local socket calls
>> shutdown(SHUT_RDWR). After that peer calls shutdown(SHUT_RDWR).
>> 
>> Local socket should now be in TIME_WAIT state (from specification point 
>> of view). And it's indeed in TIME_WAIT (TCP_TIME_WAIT) state if we look at 
>> /proc/net/tcp (or netstat -t). However, if one tries to get connection state
>> via tcp_info (getsockopt(TCP_INFO)) the reported state is CLOSED
>> (TCP_CLOSE).
>> 
>> Looks like the problem is in tcp_time_wait() function
>> (net/ipv4/tcp_minisocks.c).
>> It's called with state=TCP_TIME_WAIT, and sets inet_timewait_sock
>> *tw->tw_state field to TCP_TIME_WAIT. That's why the state is reported
>> correctly when looking into /proc. However, at the end it calls
>> tcp_done(sk),
>> which itself calls tcp_set_state(TCP_CLOSE), so sk->sk_state is set to
>> TCP_CLOSE instead of TCP_TIME_WAIT. And it's reported this way via TCP_INFO
>> socket option.
>> 
>> Problem is reproduced on 2.6.26, 2.6.38 and is probably observed on earlier
>> kernels.

As far as the user side of the socket is concerned, it is TCP_CLOSE.

For timewait connections we create a completely separate light-weight object
to manage the network-side visible state of the TCP flow.  This is not
accessible from, and is entirely different from, the heavy-weight full
socket we keep around until the user gives up his final reference.

So I do not see this behavior changing; it would be quite invasive and
expensive to make this work as you expect, and only for marginal gain.
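
For anyone who needs the network-side state of the flow, the timewait object is still visible through /proc/net/tcp, as the original report notes. The following is a minimal sketch of reading it from userspace, assuming the usual IPv4 line layout of that file (slot, local address:port, remote address:port, state, all in hex). The function names parse_proc_tcp_line() and proc_net_tcp_state() are hypothetical helpers, not kernel or libc APIs.

```c
#include <stdio.h>

/* Hypothetical helper: extract the local port, remote port and state from
 * one line of /proc/net/tcp. Returns 0 on success, -1 if the line does not
 * look like a socket entry (for example the column header). */
int parse_proc_tcp_line(const char *line, unsigned *lport,
                        unsigned *rport, unsigned *state)
{
    int n = sscanf(line, " %*d: %*[0-9A-Fa-f]:%x %*[0-9A-Fa-f]:%x %x",
                   lport, rport, state);
    return n == 3 ? 0 : -1;
}

/* Hypothetical helper: return the state that /proc/net/tcp lists for the
 * flow with the given local and remote ports (host byte order), or -1 if
 * no such entry is found or the file cannot be opened. */
int proc_net_tcp_state(unsigned lport, unsigned rport)
{
    char line[512];
    int found = -1;
    FILE *f = fopen("/proc/net/tcp", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        unsigned lp, rp, st;

        if (parse_proc_tcp_line(line, &lp, &rp, &st) == 0 &&
            lp == lport && rp == rport) {
            found = (int)st;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A caller would take the ports from getsockname() and getpeername() while the socket is still connected, then call proc_net_tcp_state() after the shutdown sequence; a value of 6 corresponds to TCP_TIME_WAIT. Matching on ports alone is a simplification and ignores the addresses.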