Bug 15487 - System hangs during retransmit of TCP ACK or SACK
Status: RESOLVED DUPLICATE of bug 14470
Product: Networking
Classification: Unclassified
Component: IPV4
Hardware: All Linux
Importance: P1 high
Assigned To: Stephen Hemminger
Reported: 2010-03-09 12:05 UTC by Yuriy Yevtukhov
Modified: 2010-03-09 23:40 UTC (History)
Kernel Version: 2.6.32.8
Tree: Mainline
Regression: No


Attachments
Crash dump (117.98 KB, image/png)
2010-03-09 12:05 UTC, Yuriy Yevtukhov
custom kernel config (16.14 KB, application/gzip)
2010-03-09 12:11 UTC, Yuriy Yevtukhov

Description Yuriy Yevtukhov 2010-03-09 12:05:50 UTC
Created attachment 25419
Crash dump

The system (a heavily loaded web server) hangs unexpectedly after several days of uptime, right after printing a crash dump to the console. It does not react to the keyboard and nothing is written to the system logs. For this reason only the last part of the crash dump is available; a screenshot is attached.
The problem was first noticed on kernel 2.6.31.6. The system was then upgraded to 2.6.32.8.
The problem seems to be connected with my sysctl tuning. These are the values I set:

net.ipv4.tcp_syncookies=1
net.ipv4.tcp_max_syn_backlog=8192
net.core.rmem_max=4194304
net.core.wmem_max=4194304
net.ipv4.tcp_mem=786432 1048576 1572864
net.ipv4.tcp_rmem=4096 87380 1048576
net.ipv4.tcp_wmem=4096 32768 1048576
net.ipv4.tcp_orphan_retries=4
net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_syn_retries=5
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_sack=1
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_retries1=3

The other values I changed are unrelated to TCP.
The problem was seen on different hardware (multiple Xeons with an Intel chipset and Intel network cards, multiple Opterons with NVIDIA network cards) with 16-32 GB of memory. It was never seen before the sysctl values above were applied.
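To record what a running machine actually has set, the values can be read back from procfs. The following is a minimal sketch, assuming the usual Linux mapping of a sysctl name such as net.ipv4.tcp_sack to the file /proc/sys/net/ipv4/tcp_sack; the helper names `sysctl_path` and `read_sysctl` are hypothetical and only the TCP-related names from this report are queried.

```python
from pathlib import Path

# Names taken from the list in this report.
NAMES = [
    "net.ipv4.tcp_syncookies",
    "net.ipv4.tcp_max_syn_backlog",
    "net.core.rmem_max",
    "net.core.wmem_max",
    "net.ipv4.tcp_mem",
    "net.ipv4.tcp_rmem",
    "net.ipv4.tcp_wmem",
    "net.ipv4.tcp_orphan_retries",
    "net.ipv4.tcp_fin_timeout",
    "net.ipv4.tcp_syn_retries",
    "net.ipv4.tcp_synack_retries",
    "net.ipv4.tcp_sack",
    "net.ipv4.tcp_timestamps",
    "net.ipv4.tcp_retries1",
]


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its procfs path (hypothetical helper)."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value with whitespace normalized, or None if unreadable."""
    try:
        return " ".join(sysctl_path(name, root).read_text().split())
    except OSError:
        # Missing file, permission problem, or a non-Linux system.
        return None


for name in NAMES:
    print(f"{name}={read_sysctl(name)}")
```

On a machine without /proc/sys the script prints None for every name instead of failing.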
Comment 1 Yuriy Yevtukhov 2010-03-09 12:11:12 UTC
Created attachment 25420 [details]
custom kernel config
Comment 2 Yuriy Yevtukhov 2010-03-09 17:13:46 UTC
Checked something. It seems that 
net.ipv4.tcp_sack=1
enables this problem.
Comment 3 Yuriy Yevtukhov 2010-03-09 17:42:58 UTC
(In reply to comment #2)
> Checked something. It seems that 
> net.ipv4.tcp_sack=1
> enables this problem.

I meant in combination with my other tuning, since I never noticed this problem on a server with:
net.ipv4.tcp_rmem=4096 87380 2097152
net.ipv4.tcp_wmem=4096 131072 2097152
net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_syn_retries=5
net.ipv4.tcp_synack_retries=2
net.ipv4.ip_local_port_range=4096 65000
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_retries1=2



Or maybe these settings are unrelated.
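To see at a glance which settings differ between the two servers, the two lists quoted in this report can be compared key by key. This is only a sketch: the labels HANGING and STABLE and the helpers `parse` and `diff_settings` are names made up for illustration, and a key shown as None is simply not listed for that server in this report, so its actual value there is unknown.

```python
# Settings from the description (servers that hang).
HANGING = """\
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_max_syn_backlog=8192
net.core.rmem_max=4194304
net.core.wmem_max=4194304
net.ipv4.tcp_mem=786432 1048576 1572864
net.ipv4.tcp_rmem=4096 87380 1048576
net.ipv4.tcp_wmem=4096 32768 1048576
net.ipv4.tcp_orphan_retries=4
net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_syn_retries=5
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_sack=1
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_retries1=3
"""

# Settings from comment 3 (server where the hang was never seen).
STABLE = """\
net.ipv4.tcp_rmem=4096 87380 2097152
net.ipv4.tcp_wmem=4096 131072 2097152
net.ipv4.tcp_fin_timeout=15
net.ipv4.tcp_syn_retries=5
net.ipv4.tcp_synack_retries=2
net.ipv4.ip_local_port_range=4096 65000
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_retries1=2
"""


def parse(text):
    """Parse 'key=value' lines into a dict, skipping blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ.

    None marks a key that is not listed on that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


differences = diff_settings(parse(HANGING), parse(STABLE))
for key, (hanging, stable) in differences.items():
    print(f"{key}: hanging={hanging!r} stable={stable!r}")
```

Besides net.ipv4.tcp_sack, the comparison also flags tcp_rmem, tcp_wmem, tcp_retries1 and the keys listed for only one server, so the two lists alone do not single out SACK as the cause.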
Comment 4 Yuriy Yevtukhov 2010-03-09 23:40:22 UTC

*** This bug has been marked as a duplicate of bug 14470 ***
