When I deploy to a machine in Octopus and immediately realize that I needed to deploy to a different machine, I cancel the task while it is still in step 1. This causes the task to get stuck in a “Cancelling” state, and if I press cancel again, it goes into a “Timed out” state. However, any deployment I then start, to any machine, gets stuck in a “Queued” state.
To fix this, I have to back up Octopus (which doesn’t always work, because the backup task can also get stuck in a queued state), then run the repair script.
Has anyone ever experienced this issue? If so, is there a fix?
Thanks for your quick response! I am running Octopus Deploy 2.5.12.666. I plan to upgrade to 2.6.0.778 soon, just want to get all of my ducks in a row first.
+1 this
We see this problem almost daily in our environment, which is also running 2.5.12.666.
We have had to automate the cancelling of all other deployments as well as the restarting of the service to get Octopus running smoothly.
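For anyone wanting to script a similar workaround, here is a rough sketch of what it could look like. It is not the script described above. The server URL, the API key, the `/api/tasks` and `/api/tasks/{id}/cancel` endpoints, and the `OctopusDeploy` service name are all assumptions that should be checked against your own installation and Octopus version. `Select-UnfinishedTask` and `Reset-OctopusQueue` are hypothetical helper names.

```powershell
# Hypothetical helper: pick out tasks that are still occupying the queue.
function Select-UnfinishedTask {
    param([object[]] $Tasks)
    $Tasks | Where-Object { @('Queued', 'Executing', 'Cancelling') -contains $_.State }
}

# Placeholders, not real values.
$octopusUrl = 'http://your-octopus-server'
$headers    = @{ 'X-Octopus-ApiKey' = 'API-XXXXXXXXXXXX' }

function Reset-OctopusQueue {
    # Assumed endpoints: GET /api/tasks and POST /api/tasks/{id}/cancel.
    $page = Invoke-RestMethod -Uri "$octopusUrl/api/tasks" -Headers $headers
    foreach ($task in (Select-UnfinishedTask -Tasks $page.Items)) {
        Invoke-RestMethod -Method Post -Uri "$octopusUrl/api/tasks/$($task.Id)/cancel" -Headers $headers
    }
    # Assumed name of the Octopus Server Windows service.
    Restart-Service -Name 'OctopusDeploy'
}
```

Nothing runs until `Reset-OctopusQueue` is called, so the filter can be tried against sample task objects first.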
Vanessa,
We have seen this with variable time frames. We tell users not to hit cancel a second time at all and leave that to our Ops team, so the delay can be as much as an hour or two before we force-cancel the task with the second click. It may be helpful to add a click timer to the button with a 15-second delay, so that users cannot make the second click immediately.
I agree with Jeff, the delay can be an hour or two before I hit cancel
again to time it out. As far as environments, I’ve seen this happen in at
least 2 different environments and at least 5 different machines have
triggered it at different times. On a similar note, the users are hitting
cancel because they accidentally forgot to select a machine within the
environment before deploying and end up deploying to the whole
environment. Is there a trick to make the 2nd green "Deploy Release"
button gray out until the user selects a specific machine? In my case, we
don’t have any reason to deploy to all machines in an environment at once.
We have a similar issue and created a custom step template for it. We use it in our general development environment to stop users from deploying to all machines in the environment at once, so they can only deploy to a specific machine selection. Hope this helps you out.
-Jeff
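# Machines picked on the deploy screen; empty when none were selected.
$deploymachineList = $OctopusParameters['Octopus.Deployment.SpecificMachines']
# $environment was never assigned in the original snippet; reading it from
# the built-in Octopus.Environment.Name variable is an assumption.
$environment = $OctopusParameters['Octopus.Environment.Name']
Write-Output "Specific machine list = $deploymachineList"
# Fail the step when the whole environment would be targeted.
if (($environment -eq 'EnvironmentName') -and [string]::IsNullOrEmpty($deploymachineList))
{
    Write-Output "Deployment stopped. You must select a specific machine target when deploying to the EnvironmentName environment."
    Exit -1
}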
I’m having trouble reproducing this on 2.6.3, and I’d really appreciate some more information:
Does this happen every single time you cancel a deployment, or just occasionally?
Does it eventually return to normal if you click cancel twice (the task should be marked as timed out, but other tasks should then be able to run)?
When it does happen, can you look in RavenDB (http://localhost:10931 on the Octopus server) and see if there are any stale indexes or errors?
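If it is easier than browsing the RavenDB studio, the same check can probably be scripted. The sketch below assumes that the embedded RavenDB answers `GET /stats` on that port and that the response carries `StaleIndexes` and `Errors` properties; both are worth verifying on your server. `Get-RavenIndexHealth` is a hypothetical helper name.

```powershell
# Hypothetical helper: summarise index health from a RavenDB stats document.
function Get-RavenIndexHealth {
    param([Parameter(Mandatory = $true)] $Stats)

    # Drop nulls so a missing property counts as "nothing to report".
    $stale  = @($Stats.StaleIndexes | Where-Object { $null -ne $_ })
    $errors = @($Stats.Errors       | Where-Object { $null -ne $_ })

    New-Object psobject -Property @{
        StaleIndexes = $stale
        ErrorCount   = $errors.Count
        Healthy      = ($stale.Count -eq 0 -and $errors.Count -eq 0)
    }
}

# Assumed usage on the Octopus server itself (left commented out on purpose):
# $stats = Invoke-RestMethod -Uri 'http://localhost:10931/stats'
# Get-RavenIndexHealth -Stats $stats
```

Running it while tasks are stuck and pasting the output here would show whether stale indexes or indexing errors are involved.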
If this is something you can reproduce easily/on demand, we’d love to do a screen sharing session to see it and try to get to the bottom of it. You can pick a time that works here:
Today I saw that I had a task that had been sitting in a “deploying” state
for one month. I was afraid to cancel it because then I would have to shut
down the Octopus server and run the repair script. Today, I got tired of
seeing it sitting there and took the chance of cancelling it. It worked!
I had no issues with it getting stuck in a queued state, didn’t have to
click ‘cancel’ again to get it to actually cancel, and I was able to
successfully deploy other projects after I cancelled it.
So it seems that this is no longer an issue for me. I didn’t do any
updates to Octopus, other than a normal Windows update a few days ago.