Created attachment 280501 [details] Tpr Browser and kernels run log Since kernel 4.20 there is a problem with Tor Browser (TB) on close. The last good kernel was 4.19.12 . Tested: kernel 4.20.1 and 4.19.12 . Tor Browser 8.0.4 (prod version). Tor Browser (TB) and kernel run logs (as attached): tor-netfilter-4.20.1-and-4.19.12.txt kernel 4.20.1 Results: There is a problem with TB on close: - TB process left behind - INVALID messages (ACK FIN and ACK PSH FIN) cought in OUTPUT chain of iptables, on lo interface, and logged to system log/journal. -A OUTPUT -m conntrack --ctstate INVALID -j loginv kernel 4.19.12 Results: There is no problem with TB, i.e. clean close. There is an old reference to this problem https://www.linode.com/community/questions/2495/iptables-drop-ackfin-as-invalid
Created attachment 281145 [details] git bisect log
Tested on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Per 'git bisect' (see attached log): first bad commit: [6f41617bf23a17d9cb7cc6ca8161534f05f80293] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net $ git show 6f41617bf23a17d9cb7cc6ca8161534f05f80293 commit 6f41617bf23a17d9cb7cc6ca8161534f05f80293 (refs/bisect/bad) Merge: 7bdaae270cb5 cec4de302c5f Author: David S. Miller <davem@davemloft.net> Date: Wed Oct 3 21:00:17 2018 -0700 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net' overlapped the renaming of a netlink attribute in net-next. Signed-off-by: David S. Miller <davem@davemloft.net> ... Can you or David S. Miller figure it out ?
Created attachment 281191 [details] git bisect log
In ref to Tor Browser (TB) and kernel run logs (as attached): The 5 min (300 secs) timeout is a deciding factor. The Tor Browser has to be open (just leave it idle or do some browsing) longer than 5 min. After that, closing the browser causes INVALID messages on lo interface and a browser process left behind. net/netfilter/nf_conntrack_proto_tcp.c ... [TCP_CONNTRACK_RETRANS] = 5 MINS, [TCP_CONNTRACK_UNACK] = 5 MINS, ... # sysctl -a net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300 net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300 Here is my second bisect, with wider source code scope, and thus more reliable. Attached: bisect-2.log Per 'git bisect': # first bad commit: [a337531b942bd8a03e7052444d7e36972aac2d92] tcp: up initial rmem to 128KB and SYN rwin to around 64KB $ git show a337531b9 commit a337531b942bd8a03e7052444d7e36972aac2d92 (refs/bisect/bad) Author: Yuchung Cheng <ycheng@google.com> Date: Thu Sep 27 11:21:19 2018 -0700 tcp: up initial rmem to 128KB and SYN rwin to around 64KB Previously TCP initial receive buffer is ~87KB by default and the initial receive window is ~29KB (20 MSS). This patch changes the two numbers to 128KB and ~64KB (rounding down to the multiples of MSS) respectively. The patch also simplifies the calculations s.t. the two numbers are directly controlled by sysctl tcp_rmem[1]: 1) Initial receiver buffer budget (sk_rcvbuf): while this should be configured via sysctl tcp_rmem[1], previously tcp_fixup_rcvbuf() always override and set a larger size when a new connection establishes. 2) Initial receive window in SYN: previously it is set to 20 packets if MSS <= 1460. The number 20 was based on the initial congestion window of 10: the receiver needs twice amount to avoid being limited by the receive window upon out-of-order delivery in the first window burst. But since this only applies if the receiving MSS <= 1460, connection using large MTU (e.g. to utilize receiver zero-copy) may be limited by the receive window. With this patch TCP memory configuration is more straight-forward and more properly sized to modern high-speed networks by default. Several popular stacks have been announcing 64KB rwin in SYNs as well. ... Host data - for comparison: bisect good 1288aa7a -------------------- # sysctl -a |grep rmem net.core.rmem_default = 212992 net.core.rmem_max = 212992 net.ipv4.tcp_rmem = 4096 87380 6291456 net.ipv4.udp_rmem_min = 4096 # sysctl -a |grep wmem net.core.wmem_default = 212992 net.core.wmem_max = 212992 net.ipv4.tcp_wmem = 4096 16384 4194304 net.ipv4.udp_wmem_min = 4096 # sysctl -a |grep rwin # Other 300 secs net values: # sysctl -a |grep 300 net.core.netdev_budget = 300 net.ipv4.route.error_cost = 300 net.ipv4.route.gc_timeout = 300 net.ipv4.tcp_max_reordering = 300 net.ipv4.tcp_min_rtt_wlen = 300 net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300 net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300 bisect bad a337531b94 [first bad commit] ---------------------------------------- # sysctl -a |grep rmem net.core.rmem_default = 212992 net.core.rmem_max = 212992 net.ipv4.tcp_rmem = 4096 131072 6291456 <--------------- net.ipv4.udp_rmem_min = 4096 # sysctl -a |grep wmem net.core.wmem_default = 212992 net.core.wmem_max = 212992 net.ipv4.tcp_wmem = 4096 16384 4194304 net.ipv4.udp_wmem_min = 4096 # sysctl -a |grep rwin # Other 300 secs net values: # sysctl -a |grep 300 net.core.netdev_budget = 300 net.ipv4.route.error_cost = 300 net.ipv4.route.gc_timeout = 300 net.ipv4.tcp_max_reordering = 300 net.ipv4.tcp_min_rtt_wlen = 300 net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300 net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300
Hi the patch is indeed buggy. The fix was merged subsequently commit 041a14d2671573611ffd6412bc16e2f64469f7fb Author: Yuchung Cheng <ycheng@google.com> Date: Mon Oct 1 15:42:32 2018 -0700 tcp: start receiver buffer autotuning sooner
I verified your fix commit 041a14d26715: $ git log --oneline v4.19..v4.20 -- include net ... 041a14d26715 tcp: start receiver buffer autotuning sooner <- BAD ... a337531b942b tcp: up initial rmem to 128KB and SYN rwin to around 64KB <- BAD 9e288cefcc55 clk: renesas: Convert to SPDX identifiers <- GOOD $ In theory, there could be a GOOD commit in this range, but branch deteriorated later: $ git log a337531b..041a14d26 -- include net |wc -l 146 $ So, I bisected it and all commits are BAD. ------ I also verified wider scope, just in case: $ git log --oneline v4.19..v4.20 ... 041a14d26715 tcp: start receiver buffer autotuning sooner <- BAD ... a337531b942b tcp: up initial rmem to 128KB and SYN rwin to around 64KB <- BAD 3ff6cde84685 hns3: Another build fix. <- GOOD ... 9e288cefcc55 clk: renesas: Convert to SPDX identifiers <- GOOD $ So, we are back to this faulty commit: a337531b942b tcp: up initial rmem to 128KB and SYN rwin to around 64KB
Can you give this patch a try: https://lore.kernel.org/netfilter-devel/20190802132302.13537-4-sashal@kernel.org/ ? There was a bug related to TCP window scaling causing the wrong conntrack timeout to be set for TCP connections (300s instead of 5d). It might be related since your bisect points to changes of the window scaling handling. (I found your bugzilla entry while tracing down the exact issue the mentioned patch fixes) Here's the thread how this was diagnosed: https://lore.kernel.org/netfilter-devel/alpine.DEB.2.20.1907101242560.17522@blackhole.kfki.hu/
Great. Tested on 5.3.0-rc4 from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git It is OK now. Thanks to Florian Westphal for the fix and all other users/devs involved. If there are no comments I will close it in a couple of days.
Code fix present in upcoming 5.3 kernel: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.3-rc4&id=959b69ef57db00cb33e9c4777400ae7183ebddd3