Bug 111041 - random openssh connection failure during connection to server
Status: RESOLVED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4
Hardware: All Linux
Importance: P1 normal
Assignee: Stephen Hemminger
Reported: 2016-01-20 10:14 UTC by Tomas Mozes
Modified: 2016-06-24 07:20 UTC
Kernel Version: 4.4.1
Tree: Mainline
Regression: No


Description Tomas Mozes 2016-01-20 10:14:17 UTC
Connecting over ssh to a server running Gentoo Linux with kernel 4.4 fails intermittently. Only connection setup is affected: a few attempts fail, then one succeeds and ssh works as expected.

OpenSSH_7.1p2-hpn14v10, OpenSSL 1.0.2e 3 Dec 2015
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: ssh_connect: needpriv 0
debug1: Connecting to 10.0.0.5 [10.0.0.5] port 22.
debug1: Connection established.
debug1: Enabling compatibility mode for protocol 2.0
write: Connection reset by peer

Or alternatively it fails with "ssh_exchange_identification: read: Connection reset by peer".
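
To put a number on how often the reset occurs, a small probe along the following lines could be run from a client. This is only a sketch under assumptions: the names probe_once and probe are made up for illustration, and it only opens a TCP connection and reads the server's identification banner rather than performing the full ssh identification exchange, so it may not catch every failure the real client sees.

```python
import socket
from collections import Counter


def probe_once(host, port=22, timeout=5.0):
    """Open one TCP connection and try to read the server's SSH banner.

    Returns one of "ok", "reset", "closed", "timeout", "error".
    (Hypothetical helper, not part of OpenSSH.)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            banner = sock.recv(256)
    except ConnectionResetError:
        # Corresponds to "Connection reset by peer" in the ssh output.
        return "reset"
    except socket.timeout:
        return "timeout"
    except OSError:
        # Refused, unreachable, and similar failures.
        return "error"
    if not banner:
        # Peer closed the connection without sending anything.
        return "closed"
    return "ok" if banner.startswith(b"SSH-") else "error"


def probe(host, port=22, attempts=50, timeout=5.0):
    """Run probe_once repeatedly and tally the outcomes."""
    return Counter(probe_once(host, port, timeout) for _ in range(attempts))
```

For example, dict(probe("10.0.0.5", attempts=50)) would report how many of 50 attempts ended in "ok" versus "reset"; comparing that tally across kernel versions gives a rough measure of whether the problem is still present.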

We were able to reproduce this problem with a variety of clients (Gentoo, Ubuntu, Debian) running various kernels. Interestingly, only one client has no trouble connecting to the server: a system running Gentoo Linux Hardened 4.1 (using the same openssh version).

The server is a Supermicro X10DRW with Intel PCI-Express Gigabit Ethernet (igb) networking. Two Ethernet ports are joined in a bond, and the bond is attached to a bridge for Xen. I've also tried without the bond and without any iptables rules; neither helped. The same setup works without any modification on kernel 4.1.15.

A Gentoo forum post: https://forums.gentoo.org/viewtopic.php?p=7868744
Comment 1 hannes 2016-01-20 18:06:42 UTC
Please follow this patch, it might solve the problem:
https://patchwork.ozlabs.org/patch/570514/
Comment 2 hannes 2016-01-20 18:08:48 UTC
Hmm, on second thought it probably won't help, sorry.
Comment 3 Tomas Mozes 2016-02-17 05:49:15 UTC
(In reply to hannes from comment #1)
> Please follow this patch, it might solve the problem:
> https://patchwork.ozlabs.org/patch/570514/

This happens both with and without bonding, so I'm not sure the patch really helps.
Comment 4 Tomas Mozes 2016-02-17 05:49:37 UTC
Just tested with 4.4.1, still the same.
Comment 5 Tomas Mozes 2016-03-24 08:46:25 UTC
Just tried with 4.4.6 and the problem seems to have gone away; I cannot reproduce it after numerous login retries.
Comment 6 ganthore 2016-06-24 06:10:32 UTC
Happens for me using kernel 4.6.2.
Comment 7 ganthore 2016-06-24 06:11:13 UTC
> ssh 10.0.0.200 -vvv
OpenSSH_7.2p2lpk, OpenSSL 1.0.2h  3 May 2016
debug1: Reading configuration data /home/ganthore/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: auto-mux: Trying existing master
debug1: Control socket "/tmp/ganthore@10.0.0.200:22" does not exist
debug2: resolving "10.0.0.200" port 22
debug2: ssh_connect_direct: needpriv 0
debug1: Connecting to 10.0.0.200 [10.0.0.200] port 22.
debug1: Connection established.
debug1: identity file /home/ganthore/.ssh/id_rsa type 1
debug1: key_load_public: No such file or directory
debug1: identity file /home/ganthore/.ssh/id_rsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/ganthore/.ssh/id_dsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/ganthore/.ssh/id_dsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/ganthore/.ssh/id_ecdsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/ganthore/.ssh/id_ecdsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/ganthore/.ssh/id_ed25519 type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/ganthore/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_7.2
ssh_exchange_identification: read: Connection reset by peer
Comment 8 ganthore 2016-06-24 07:20:31 UTC
Disregard my post; it was a different issue entirely (bad config from NetworkManager on a static route).
