SSH connection unstable or broken

I deploy a Java application from Octopus 3.1.3 to a CentOS 6.7 VM. It worked fine for a while, but since today I keep getting the error “Could not connect to SSH endpoint”. I can connect to that VM successfully using PuTTY and other SSH clients.

Most of the time, the failure happens during “check environment health”. Occasionally the health check passes, but even then I can’t deploy to the Linux VM; I still get the SSH connection problem. Here is the error message I get:

A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
Renci.SshNet.Common.SshConnectionException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond --->

Any idea why? How can I tackle this problem?

Again, I tested the SSH connection using PuTTY and it was fine.

Thanks,

Edward

Hi Edward,

Thanks for getting in touch. It seems a bit odd that it was working and then stopped suddenly. Do you know if any changes were made to the sshd config on the CentOS server, or any password/permission changes? Octopus connects to Linux boxes in a non-interactive fashion, so a successful connection with PuTTY or another SSH client doesn’t always mean the client/server configuration is correct.

In any event, can you send me the raw log of the Linux server’s health check? It should help us figure out why it’s failing. The following URL provides instructions on how to obtain a raw task log.

Looking forward to your reply.

Thanks

Rob

I’m having exactly the same issue.
Octopus Server 3.3.8 attempting to run a script on a CentOS 6.8 server. Works perfectly for a while, then goes into a period where this error is thrown consistently for at least several hours:

Could not connect to SSH endpoint
|     Octopus.Worker.Ssh.SshEndpointConnectionException: Could not connect to SSH endpoint ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond xxxxxxxxxxx:22
|     at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
|     at Renci.SshNet.Session.SocketConnect(String host, Int32 port)
|     at Renci.SshNet.Session.Connect()
|     at Renci.SshNet.BaseClient.Connect()
|     at Octopus.Worker.Ssh.Connectivity.SshClientFactory.EstablishConnection[TClient](Func`2 clientConstrutor) in Y:\work\refs\tags\3.3.8\source\Octopus.Worker\Ssh\Connectivity\SshClientFactory.cs:line 66

So far it has fixed itself overnight once, worked again for half a day, then started failing again.

I’ve RDP’d onto the Octopus server and used the SSH client that ships with Git Bash for Windows (MINGW64); it always works fine for the same server and the same credentials.

Then I opened a PowerShell prompt on the Octopus server and ran the following:

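# Load the Renci.SshNet assembly from the Octopus install folder
[Void][Reflection.Assembly]::LoadFrom('C:\Program Files\Octopus Deploy\Octopus\Renci.SshNet.dll')
# Placeholder host and credentials; substitute your own
$client=New-Object Renci.SshNet.SshClient('server_name','username','password')
$client.Connect()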

Got the same error (“A connection attempt failed…”) as in the original Octopus log.

So the problem is isolated to the way the Renci SSH client creates the connection.

Will see if I can narrow it down any further.

Turns out in my case the problem was lower down the networking stack. There was a network interface whose address was being registered in DNS but could not be routed to from the Octopus server.

I think the confusion arose because the SSH client I was using must use a different name resolution strategy (or cache) from the .NET library.
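
For anyone who suspects the same thing, here is a minimal sketch of how it could be checked from the Octopus server. It resolves the name through .NET and attempts a TCP connection to each returned address on port 22, so an address that is registered in DNS but not routable shows up as unreachable. Test-ResolvedAddress is a hypothetical helper name, the 3-second timeout is an arbitrary choice, and the sketch assumes the SSH library resolves names through the same .NET call. It is not what was actually run to find the problem above.

```powershell
# Hypothetical helper (not part of Octopus or Renci.SshNet): resolve a host
# name with .NET, then try a TCP connect to every address that comes back.
function Test-ResolvedAddress {
    param(
        [string]$HostName,
        [int]$Port = 22,          # SSH port by default
        [int]$TimeoutMs = 3000    # arbitrary timeout chosen for this sketch
    )
    # Assumption: the SSH library resolves names via the same .NET resolver
    foreach ($address in [System.Net.Dns]::GetHostAddresses($HostName)) {
        $client = New-Object System.Net.Sockets.TcpClient($address.AddressFamily)
        $reachable = $false
        try {
            # Start an asynchronous connect and wait up to the timeout
            $async = $client.BeginConnect($address, $Port, $null, $null)
            if ($async.AsyncWaitHandle.WaitOne($TimeoutMs)) {
                # EndConnect throws if the connection was refused or failed
                $client.EndConnect($async)
                $reachable = $true
            }
        } catch {
            $reachable = $false
        } finally {
            $client.Close()
        }
        # Emit one result object per resolved address
        New-Object PSObject -Property @{
            Address   = $address.ToString()
            Reachable = $reachable
        }
    }
}
```

Running Test-ResolvedAddress -HostName 'server_name' lists each resolved address with a Reachable flag. Any address showing False is a candidate for the kind of unroutable DNS entry described above.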

Hi Jimmy,

Thanks for getting in touch. Sometimes the exceptions can be a bit misleading but you did a great job digging into it.

Let me know if you have any other issues.

Thanks

Rob

Hi Rob, no problem.

The PowerShell snippet I posted should be useful for anyone else having the same issue: running it from the Octopus server at least takes the Octopus application out of the equation and narrows the problem down to an inability to connect via the SSH.NET (Renci) library.

With that in mind (and I know this is not my post), I think we can consider Octopus vindicated and close this one off.

Hi Jimmy,

Great point, as that’s a simple and handy way to diagnose similar issues. As for closing this post, we generally leave them open in case anyone has similar issues, but I’m happy to consider it closed. :)

Happy deploying!

Rob