We have been using Octopus successfully to deploy our application to virtual machines in the AWS environment as they are initiated. Recently we have encountered an issue where the new machine will register with the Octopus server but then fail to install the application.
We have two Octopus servers (one production and one QA environment) configured identically. The likely difference is the number of tenants (1 where it succeeds, ~20 where it fails).
The only error I see on the client is in the attached log; essentially it is this:
2017-06-28 11:39:19.5884 10 INFO listen://[::]:10933/ 10 Unhandled error when handling request from client: [::ffff:10.200.11.150]:53483
System.IO.IOException: Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offse
OctopusTentacle.txt (3 KB)
Hi Dave, thanks for reaching out.
The exception you have described looks like a network saturation problem. There are a few things that we can look at to get some more information.
Can you please supply the server logs, as these may also have some details that relate to this issue. https://octopus.com/docs/reference/log-files has details on where to find the server log files.
What type of EC2 instances are you using? Different instance types have different network characteristics that may impact bandwidth.
You can also keep a continuous ping running between the two servers to see whether it is affected when the exception is thrown. From a Windows host you can run a continuous ping with the command:
ping destination_or_ip -t
From a Linux host, where ping runs continuously by default, you can run:
ping destination_or_ip
If the pings are interrupted at the same time as the Tentacle communication has issues, this will narrow the problem down for us.
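If ICMP is blocked between the hosts (AWS security groups may not allow it), a TCP probe against the Tentacle's listening port can serve the same purpose. The following is a minimal sketch in Python, not an Octopus tool: the function names are made up for illustration, and port 10933 is assumed from the listen://[::]:10933/ line in the log above. Each probe prints a timestamp so the output can be lined up against the Tentacle log.

```python
import socket
import time
from datetime import datetime


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def monitor(host: str, port: int = 10933, interval: float = 1.0,
            attempts: int = 10) -> list:
    """Probe host:port repeatedly and return (timestamp, reachable) tuples.

    The default port is an assumption taken from the Tentacle log line;
    adjust it if your Tentacle listens elsewhere.
    """
    results = []
    for _ in range(attempts):
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")
        ok = tcp_port_reachable(host, port)
        print(f"{stamp} {host}:{port} {'ok' if ok else 'FAILED'}")
        results.append((stamp, ok))
        time.sleep(interval)
    return results
```

Run from the Octopus server host, something like monitor("destination_or_ip", attempts=600) would cover roughly ten minutes of a deployment attempt at the default one-second interval.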
Finally, what version of Octopus are you running?
We discovered that the root of this issue was our use of farms, release versions and lifecycles. A PowerShell script was querying for the latest version, and because the release had already moved from TEST to PROD, the script received an error saying that no releases were available. It had nothing to do with the error found in the Octopus log.
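The failure mode can be sketched roughly as follows. This is an illustration in Python rather than the original PowerShell script, and it does not call the Octopus API: the record shape (version, environment), the function names and the fallback behaviour are all hypothetical. It shows how a lookup limited to TEST finds nothing once every release has been promoted to PROD, and one way a script could handle that case explicitly instead of failing.

```python
from typing import Optional


def parse_version(version: str) -> tuple:
    """Turn '1.4.10' into (1, 4, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_release(releases: list, environment: str) -> Optional[dict]:
    """Return the highest-versioned release in `environment`, or None."""
    candidates = [r for r in releases if r["environment"] == environment]
    if not candidates:
        # This is the "no releases available" case described above.
        return None
    return max(candidates, key=lambda r: parse_version(r["version"]))


def release_to_install(releases: list, preferred: str = "TEST",
                       fallback: str = "PROD") -> Optional[dict]:
    """Prefer a release in `preferred`; otherwise try `fallback` (hypothetical policy)."""
    return latest_release(releases, preferred) or latest_release(releases, fallback)


# Hypothetical data: every release has already been promoted to PROD.
releases = [
    {"version": "1.4.9", "environment": "PROD"},
    {"version": "1.4.10", "environment": "PROD"},
]
in_test = latest_release(releases, "TEST")   # None: nothing left in TEST
chosen = release_to_install(releases)        # falls back to the PROD lookup
```

Whether falling back to another environment is appropriate depends on the lifecycle in use; the point is only that the empty result is a normal outcome the script has to expect.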
I think this discussion can be closed as we located the real source of our issue.
Glad to hear you found the root cause. Thanks for letting us know.