Bug 15487

Summary: System hangs during retransmit of TCP ACK or SACK
Product: Networking
Component: IPV4
Reporter: Yuriy Yevtukhov (yuriy)
Assignee: Stephen Hemminger (stephen)
Status: RESOLVED DUPLICATE    
Severity: high    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.32.8
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Crash dump, custom kernel config

Description Yuriy Yevtukhov 2010-03-09 12:05:50 UTC
Created attachment 25419: Crash dump

The system (a heavily loaded web server) occasionally hangs after several days of operation, after printing a crash dump to the console. It then does not react to the keyboard and nothing is recorded in the system logs. For this reason only the last part of the crash dump is available; a screen copy is attached.
The problem was first noticed on kernel 2.6.31.6. The kernel was then upgraded to 2.6.32.8.
The problem seems to be connected with my sysctl tuning. The values are:

net.ipv4.tcp_syncookies=1
net.ipv4.tcp_max_syn_backlog=8192
net.core.rmem_max=4194304
net.core.wmem_max=4194304
net.ipv4.tcp_mem=786432 1048576 1572864
net.ipv4.tcp_rmem=4096 87380 1048576
net.ipv4.tcp_wmem=4096 32768 1048576
net.ipv4.tcp_orphan_retries=4
net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_syn_retries=5
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_sack=1
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_retries1=3

The other changed values are unrelated to TCP.
The problem has been seen on different hardware (multiple Xeons with Intel chipsets and network cards, multiple Opterons with NVIDIA network cards) with 16-32 GB of memory. It was never seen before the sysctl values listed above were applied.
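
For anyone trying to reproduce this, it may help to confirm which of the values listed above are actually in effect on the affected machine. The following is a minimal sketch, not part of the original report: it assumes a Linux host with /proc/sys mounted, and the helper names `sysctl_path` and `read_sysctl` are made up for illustration.

```python
from pathlib import Path

# Keys copied from the sysctl list in the description above.
TUNED_KEYS = [
    "net.ipv4.tcp_syncookies",
    "net.ipv4.tcp_max_syn_backlog",
    "net.core.rmem_max",
    "net.core.wmem_max",
    "net.ipv4.tcp_mem",
    "net.ipv4.tcp_rmem",
    "net.ipv4.tcp_wmem",
    "net.ipv4.tcp_orphan_retries",
    "net.ipv4.tcp_fin_timeout",
    "net.ipv4.tcp_syn_retries",
    "net.ipv4.tcp_synack_retries",
    "net.ipv4.tcp_sack",
    "net.ipv4.tcp_timestamps",
    "net.ipv4.tcp_retries1",
]


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys (hypothetical helper)."""
    # Plain dot-to-slash mapping; sufficient for the keys above, which
    # contain no interface names with embedded dots.
    return Path(root) / key.replace(".", "/")


def read_sysctl(key, root="/proc/sys"):
    """Return the current value as a single-spaced string, or None if unreadable."""
    try:
        return " ".join(sysctl_path(key, root).read_text().split())
    except OSError:
        return None


if __name__ == "__main__":
    for key in TUNED_KEYS:
        print(f"{key}={read_sysctl(key)}")
```

Multi-value settings such as tcp_mem are tab-separated in /proc/sys, so the values are normalised to single spaces to make them directly comparable with the list above.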
Comment 1 Yuriy Yevtukhov 2010-03-09 12:11:12 UTC
Created attachment 25420: custom kernel config
Comment 2 Yuriy Yevtukhov 2010-03-09 17:13:46 UTC
Checked something. It seems that 
net.ipv4.tcp_sack=1
enables this problem.
Comment 3 Yuriy Yevtukhov 2010-03-09 17:42:58 UTC
(In reply to comment #2)
> Checked something. It seems that 
> net.ipv4.tcp_sack=1
> enables this problem.

I meant together with my other tunings, as I never noticed this problem on a server with
net.ipv4.tcp_rmem=4096 87380 2097152
net.ipv4.tcp_wmem=4096 131072 2097152
net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_syn_retries=5
net.ipv4.tcp_synack_retries=2
net.ipv4.ip_local_port_range=4096 65000
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_retries1=2



Or maybe these tunings are unrelated.
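
To make the comparison between the two servers easier to read, here is a rough sketch that diffs the two sysctl lists quoted in this report. The names `HANGING`, `STABLE`, `parse_sysctl` and `diff_settings` are illustrative only. A value of None means the key was simply not listed for that server in this report; its real value on that machine is unknown.

```python
# Settings from the description (server that hangs).
HANGING = """
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_max_syn_backlog=8192
net.core.rmem_max=4194304
net.core.wmem_max=4194304
net.ipv4.tcp_mem=786432 1048576 1572864
net.ipv4.tcp_rmem=4096 87380 1048576
net.ipv4.tcp_wmem=4096 32768 1048576
net.ipv4.tcp_orphan_retries=4
net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_syn_retries=5
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_sack=1
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_retries1=3
"""

# Settings from comment 3 (server where the problem was never seen).
STABLE = """
net.ipv4.tcp_rmem=4096 87380 2097152
net.ipv4.tcp_wmem=4096 131072 2097152
net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_syn_retries=5
net.ipv4.tcp_synack_retries=2
net.ipv4.ip_local_port_range=4096 65000
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_retries1=2
"""


def parse_sysctl(text):
    """Parse 'key=value' lines into a dict, collapsing whitespace in values."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = " ".join(value.split())
    return settings


def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    changes = diff_settings(parse_sysctl(HANGING), parse_sysctl(STABLE))
    for key, (hanging, stable) in changes.items():
        print(f"{key}: hanging={hanging!r} stable={stable!r}")
```

Besides tcp_sack, the two lists also differ in tcp_rmem, tcp_wmem and tcp_retries1, and several keys appear in only one of them, so the lists alone do not isolate tcp_sack as the trigger.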
Comment 4 Yuriy Yevtukhov 2010-03-09 23:40:22 UTC

*** This bug has been marked as a duplicate of bug 14470 ***