Bug 219849

Summary: Missing final ACK when gracefully closing TCP connections – Regression in 5.15.123 (15251e78)?
Product: Networking
Component: IPV4
Reporter: Marko (marko.pacaric)
Assignee: Stephen Hemminger (stephen)
Status: NEW
Severity: normal
Priority: P3
CC: marko.pacaric
Hardware: All
OS: Linux
Kernel Version: 5.15.123
Regression: No
Attachments: Wireshark_cutout

Description Marko 2025-03-07 19:35:26 UTC
Created attachment 307779: Wireshark_cutout

Dear Linux Community,

I’m currently investigating a complex issue related to TCP connection termination and would appreciate your insights to help clarify the situation.

Background

We have a complex network setup involving multiple hops and a special implementation on the mobile provider side.

Given this setup, we rely heavily on TCP connections being closed gracefully, following the standard FIN → FIN-ACK → ACK sequence.
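For readers who want to exercise the close sequence themselves, here is a minimal user-space sketch in Python. It is not our production code; it runs over loopback only, and the function name `graceful_close_demo` and the event labels are made up for illustration. The client half-closes with `shutdown(SHUT_WR)` (which sends its FIN), then waits for end-of-file from the server; the kernel is expected to acknowledge the server's FIN at that point.

```python
import socket
import threading


def graceful_close_demo():
    """Run one client/server exchange on loopback and record how each side saw the close."""
    events = []
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def server():
        conn, _ = listener.accept()
        conn.settimeout(5)
        with conn:
            chunks = []
            while True:
                chunk = conn.recv(1024)
                if not chunk:  # b"" means the client's FIN has arrived
                    break
                chunks.append(chunk)
            events.append(("server_payload", b"".join(chunks)))
            events.append(("server_saw_fin", True))
        # Leaving the with-block closes conn, which sends the server's FIN.

    thread = threading.Thread(target=server)
    thread.start()

    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(b"ping")
    client.shutdown(socket.SHUT_WR)  # half-close: sends the client's FIN
    eof = client.recv(1024)          # b"" once the server's FIN arrives
    events.append(("client_saw_fin", eof == b""))
    client.close()

    thread.join()
    listener.close()
    return events


if __name__ == "__main__":
    for name, value in graceful_close_demo():
        print(name, value)
```

On loopback this only confirms that the application side of the close behaves as expected. To check for the final ACK itself, the same client logic would have to be pointed at the real peer while capturing traffic on the affected device.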

The Problem

For the past few months, we have been struggling with an issue where the final ACK from our client is not being sent, leaving the connection stuck in CLOSE_WAIT or LAST_ACK state on the receiver's side. This behavior is causing significant issues in our system.
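One way to watch for sockets piling up in these states is to count the entries in `/proc/net/tcp` by state. The sketch below is a minimal Python helper written for this report, not a tool we ship; the function names are hypothetical, and it covers IPv4 sockets only (`/proc/net/tcp6` uses the same state column). The hex state codes follow the kernel's TCP state enumeration.

```python
from collections import Counter

# Hex state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED",
    "02": "SYN_SENT",
    "03": "SYN_RECV",
    "04": "FIN_WAIT_1",
    "05": "FIN_WAIT_2",
    "06": "TIME_WAIT",
    "07": "CLOSE",
    "08": "CLOSE_WAIT",
    "09": "LAST_ACK",
    "0A": "LISTEN",
    "0B": "CLOSING",
}


def count_tcp_states(proc_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        # fields: slot, local address, remote address, state, ...
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


def read_tcp_states(path="/proc/net/tcp"):
    """Read the live table on a Linux host and count sockets per state."""
    with open(path) as f:
        return count_tcp_states(f.read())


if __name__ == "__main__":
    for state, n in sorted(read_tcp_states().items()):
        print(f"{state:12} {n}")
```

Running this periodically on both endpoints shows whether the number of sockets in CLOSE_WAIT, LAST_ACK, or FIN_WAIT_2 keeps growing after connections are closed.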

What We've Tried

1. Application-Level Investigation
   - Initially, we suspected an issue in our implementation.
   - We rebuilt several applications, but the final ACK was consistently missing.
   - We even tested with different programming languages and libraries, with the same result.
2. Kernel & TCP Stack Configuration
   - We modified various TCP stack parameters, but none of the changes resolved the problem.

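To keep track of which TCP parameters were in effect for a given test run, a snapshot helper along the following lines can be useful. This is a sketch only: the parameter names listed are examples of close-related sysctls under `/proc/sys/net/ipv4` and are an assumption, not the list of values we actually changed; the function names are hypothetical as well.

```python
import os

# Assumption: example close-related sysctls; the report does not name the ones changed.
CLOSE_RELATED_SYSCTLS = (
    "tcp_fin_timeout",
    "tcp_orphan_retries",
    "tcp_retries2",
    "tcp_tw_reuse",
)


def snapshot_sysctls(names=CLOSE_RELATED_SYSCTLS, base="/proc/sys/net/ipv4"):
    """Return {name: value} for each sysctl, or None where it cannot be read."""
    snapshot = {}
    for name in names:
        try:
            with open(os.path.join(base, name)) as f:
                snapshot[name] = f.read().strip()
        except OSError:
            snapshot[name] = None  # missing on this kernel or not readable
    return snapshot


def diff_snapshots(before, after):
    """Return {name: (old, new)} for every sysctl whose value differs."""
    changed = {}
    for name in sorted(set(before) | set(after)):
        if before.get(name) != after.get(name):
            changed[name] = (before.get(name), after.get(name))
    return changed


if __name__ == "__main__":
    for name, value in snapshot_sysctls().items():
        print(f"{name} = {value}")
```

Taking a snapshot before and after each configuration change, and storing it next to the packet capture, makes it easier to rule individual parameters in or out later.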
Findings

To isolate the issue, we started testing with different Linux kernel versions:

- Our current version, 5.15.123 (KERNEL.PLATFORM.2.0.r10-05800-kernel.0) → Final ACK is missing, connections remain in FIN_WAIT_2.
- 5.15.104 (KERNEL.PLATFORM.2.0.r10-04600-kernel.0) → Final ACK is sent, connections close correctly.

Through systematic downgrading, we identified 5.15.104 as the last version in which the issue does not occur. We then analyzed the commits between these two versions, and the issue appears to have been introduced by the following commit:

🔗 Commit 15251e783a4b
https://git.codelinaro.org/clo/la/kernel/msm-5.15/-/commit/15251e783a4b

What We Did Next

We built a small patch that reverts the commit, so that we could check whether the faulty behavior goes away.

After applying the following patch, we no longer observe the missing ACKs:

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 420d3bdeaa1b..0580d8719f37 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -923,7 +923,6 @@ static void tcp_v4_send_ack(const struct sock *sk,
                              &arg, arg.iov[0].iov_len,
                              transmit_time);
 
-       sock_net_set(ctl_sk, &init_net);
        __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
        local_bh_enable();
}
 
Next Steps & Questions

1. Could this commit be responsible for suppressing the final ACK in certain network conditions?
2. Has anyone else observed similar behavior in recent kernel versions?
3. Are there any known workarounds or patches addressing this issue?

Any insights or suggestions on how to proceed would be greatly appreciated!

Thank you very much,

Marko