When the network interface is taken down (in this case eth0, by unplugging the cable), an established socket goes into the zero window probing state (netstat -o shows unkn-4). This state ignores TCP_USER_TIMEOUT (and any keep-alive timeouts), and the result is that the socket takes ~12 minutes to time out instead of the interval specified with TCP_USER_TIMEOUT.

There seem to be two issues:
1. Why does the socket go into zero window probing in this case?
2. Why does zero window probing not respect TCP_USER_TIMEOUT?

Expected behavior: TCP_USER_TIMEOUT can be used to limit how long it takes a socket to time out.
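For reference, a minimal sketch of how an application would typically request the timeout described in the report. The helper names set_tcp_user_timeout() and get_tcp_user_timeout() are hypothetical and not part of any library; the sketch assumes Linux with a libc whose <netinet/tcp.h> defines TCP_USER_TIMEOUT. The option value is an unsigned int in milliseconds.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_USER_TIMEOUT (Linux-specific) */
#include <sys/socket.h>

/* Hypothetical helper: cap how long transmitted data may remain
 * unacknowledged before the kernel aborts the connection.
 * timeout_ms is in milliseconds; 0 selects the system default. */
static int set_tcp_user_timeout(int fd, unsigned int timeout_ms)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
			  &timeout_ms, sizeof(timeout_ms));
}

/* Hypothetical helper: read the configured value back.
 * Returns the timeout in milliseconds, or -1 on error. */
static long get_tcp_user_timeout(int fd)
{
	unsigned int timeout_ms = 0;
	socklen_t len = sizeof(timeout_ms);

	if (getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
		       &timeout_ms, &len) < 0)
		return -1;
	return (long)timeout_ms;
}
```

Calling set_tcp_user_timeout(fd, 30000) on a connected socket asks for a 30 second limit; the report above is that a socket in the zero window probing state does not honor it.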
On issue 2, I guess we should not require that any probe has been attempted / sent?

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 286227a..8f52c40 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -610,8 +610,7 @@ static void tcp_keepalive_timer (unsigned long data)
 		 * to determine when to timeout instead.
 		 */
 		if ((icsk->icsk_user_timeout != 0 &&
-		    elapsed >= icsk->icsk_user_timeout &&
-		    icsk->icsk_probes_out > 0) ||
+		    elapsed >= icsk->icsk_user_timeout) ||
 		    (icsk->icsk_user_timeout == 0 &&
 		    icsk->icsk_probes_out >= keepalive_probes(tp))) {
 			tcp_send_active_reset(sk, GFP_ATOMIC);
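To make the effect of that hunk easier to see, here is a small user-space model of the condition before and after the change. It is only a sketch: struct ka_state and both function names are hypothetical, the fields merely mirror the kernel variables named in the diff, and nothing here is kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the values tested in tcp_keepalive_timer(). */
struct ka_state {
	unsigned int user_timeout; /* icsk->icsk_user_timeout, 0 = not set */
	unsigned int elapsed;      /* idle time, same unit as user_timeout */
	unsigned int probes_out;   /* icsk->icsk_probes_out */
	unsigned int max_probes;   /* keepalive_probes(tp) */
};

/* Condition as it reads before the patch: the user timeout only
 * takes effect once at least one probe has been sent. */
static bool should_reset_before(const struct ka_state *s)
{
	return (s->user_timeout != 0 &&
		s->elapsed >= s->user_timeout &&
		s->probes_out > 0) ||
	       (s->user_timeout == 0 &&
		s->probes_out >= s->max_probes);
}

/* Condition as it reads after the patch: the probes_out > 0
 * requirement is dropped from the user-timeout branch. */
static bool should_reset_after(const struct ka_state *s)
{
	return (s->user_timeout != 0 &&
		s->elapsed >= s->user_timeout) ||
	       (s->user_timeout == 0 &&
		s->probes_out >= s->max_probes);
}
```

The two versions differ only when a user timeout is set, it has already elapsed, and no probe has gone out yet; the branch used when no user timeout is set is unchanged.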
Networking patches should go to netdev@vger.kernel.org with a Signed-off-by: line. See Documentation/SubmittingPatches.