We have two or more web projects that are deployed to separate folders on an IIS server, using the same site and application pool. In the process for every project we first stop the application pool so that no files are locked and the new files can be copied successfully. After the files are deployed we start the pool and run a simple smoke test (make an HTTP GET request and expect status code 200). However, it seems that multiple deployments are running at the same time: sometimes (randomly) the first project has been deployed and is about to run its smoke test when another project’s deployment stops the application pool, which makes the smoke test fail for the first project.
Is there any way to make only one project deployment run at a time or should we somehow change the process completely?
Thanks for reaching out. Are all projects under that same site always deployed together? If that is the case, I’d recommend deploying them in the same Octopus project, under different NuGet deploy steps. This will of course force you to deploy all the projects on each run.
If you are not deploying them all together, then you could have each of them in a different Octopus project, but always deploy them from the same Tentacle. By default, a Tentacle will run only one process at a time unless you change that. So if ProjectA is being deployed and you trigger a deployment for ProjectB, the latter will wait for the first one to finish before starting.
The projects are not always deployed together, but the deployments can be triggered simultaneously, say if there have been code changes to a shared library that two or more projects use. So we cannot have them all in a single Octopus project.
We only have one Tentacle doing the deployment for all projects, and I also understood that the Tentacle will only do one deployment at a time unless otherwise specified. But when I checked the dashboard there were three deployments going on: the first one had finished the NuGet deployment and was doing the smoke test, the second one had run its first step (stop IIS application pool) and was waiting to do the NuGet deployment, and the third one, I think, had not yet run a single step, which is correct.
So the smoke test for the first project failed because Project2 had stopped the application pool, although it shouldn’t have. While Project2 was waiting to deploy the NuGet package, there was a message saying “This Tentacle is currently busy performing a task that cannot be run in conjunction with any other task. Please wait…”
The process for all projects is something like this:
Stop application pool
Deploy nuget package
Clean config transforms
Convert folder to IIS application
Start the application pool
Smoke test (retry for 10 minutes if the site does not return code 200)
Send email
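For reference, a smoke test step like the one in this list could be sketched roughly as follows. This is only an illustration and not our actual script: the function name Invoke-SmokeTest, its parameters, and the 15-second retry interval are assumptions made for the example.

```powershell
# Minimal sketch of a retrying smoke test. The function name, parameters,
# and default timings are hypothetical, chosen only for illustration.
function Invoke-SmokeTest {
    param(
        [string]$Url,
        [int]$TimeoutSeconds = 600,   # retry for 10 minutes by default
        [int]$RetrySeconds = 15
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        try {
            # Invoke-WebRequest raises an error on connection failures and
            # on non-success status codes, so both cases land in the catch block.
            $response = Invoke-WebRequest -Uri $Url -UseBasicParsing -TimeoutSec 30 -ErrorAction Stop
            if ($response.StatusCode -eq 200) { return $true }
        }
        catch {
            # Write-Host keeps log text out of the function's return value.
            Write-Host "Smoke test attempt failed: $($_.Exception.Message)"
        }
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $RetrySeconds
    }
}
```

A step could call Invoke-SmokeTest with the site URL and throw when it returns $false, so that the deployment is marked as failed.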
Ideally we wouldn’t have to stop the application pool at all, but I think we do this because otherwise IIS might lock files and the NuGet deployment would then fail.
I’ll attach some log files in case they help to understand what happened.
Thanks for sending over the logs and sorry for the delay. I’ve looked at your deployment logs and I found something weird:
Project2 starts deploying after Project1, but Project2 only realizes there is another deployment going on during its 2nd step, when it prints the message “This Tentacle is currently busy performing a task that cannot be run in conjunction with any other task. Please wait…”. By that point it has already run the 1st step (stopping the app pool), which seems to be causing this issue.
I tried to reproduce this on my end, but in my case Project2 realizes Project1 is deploying and waits for it to finish, which is the result we are looking for here.
Is it consistently working like this for you? Meaning that Project2 runs the first step in parallel, and only during the 2nd one realizes it should wait for Project1 to finish?
That is exactly what I have observed. It does not happen every time, roughly every 10th deployment or once a day. And it is not always Project1 that fails; sometimes it is the other way around (Project2 fails because Project1 has stopped the application pool), or it might be Project3, 4, 5 or 6 that fails.
Here are the timestamps from yesterday when it failed again:
Project1:
16:19:39 Stop application pool
16:19:49 Deploy package
16:20:00 Clean configurations
16:20:04 Convert folder to application
16:20:08 Start application pool
16:20:12 - 16:30:21 Smoke test failed
Project2:
16:20:15 Stop application pool
16:30:30 Deploy package (busy waiting 10 minutes)
16:31:08 Clean configurations
16:31:12 Convert folder to application
16:31:17 Start application pool
16:31:47 Smoke test (successful)
So Project2 is clearly running Step1 when it shouldn’t. Is there something we could do to try to fix this? There is a “Wait for packages to be downloaded before running” option on the first step; maybe checking this could help? Or adding a dummy Step1 so that stopping the application pool becomes Step2? Any other suggestions?
This is not a show stopper for us, but it can be a bit irritating in the long run. Your help is greatly appreciated!
I’m gonna report this behavior to the rest of the team and let you know what we find.
In the meantime, since the deployments are taking place on the same server, we could add a 1st step that checks the value of a file before moving on, and a last step that updates that file. I’ve written 2 code snippets that will do this for you:
This first step will check the content of $lockfile. If the content is “Available”, it will continue with the deployment. If it’s not “Available”, it will stall and check again in 15 seconds.
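The original snippet is not reproduced in this thread. A minimal sketch of such a first step, assuming the lockfile path is passed in by each project and that a missing file counts as free, could look like this (the function name Wait-ForLockFile and its parameters are hypothetical):

```powershell
# Sketch only: polls a shared lockfile until its first line is "Available".
# The function name, parameters, and missing-file behaviour are assumptions.
function Wait-ForLockFile {
    param(
        [string]$LockFile,
        [int]$PollSeconds = 15
    )
    # Treat a missing file as free so the very first deployment can proceed.
    if (-not (Test-Path $LockFile)) { 'Available' | Out-File $LockFile }

    $continue = $false
    while (-not $continue) {
        $content = Get-Content $LockFile | Select-Object -First 1
        if ($content -eq 'Available') {
            Write-Output "Content of $LockFile is 'Available'. Moving on to next step"
            $continue = $true
        }
        else {
            Write-Output "Content of $LockFile is '$content'. Checking again in $PollSeconds seconds"
            Start-Sleep -Seconds $PollSeconds
        }
    }
}
```

Note that this loop has no timeout, so it keeps polling for as long as the file says anything other than “Available”.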
This should be the last step of your deployment. It’ll set the content of $lockfile to “Available” so the other deployments can move on. Make sure to set this step’s condition to “Run Always”.
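Again, the original snippet is not reproduced here. Under the same assumptions as the first step, the release step could be as small as the following sketch (the function name Set-LockFileAvailable is hypothetical):

```powershell
# Sketch only: marks the shared lockfile as free again.
# The function name and parameter are assumptions for illustration.
function Set-LockFileAvailable {
    param([string]$LockFile)
    'Available' | Out-File $LockFile
    Write-Output "Content of $LockFile set to 'Available'"
}
```

With the “Run Always” condition, this step also runs when an earlier step fails, which is what keeps a failed deployment from leaving the lock taken.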
I have now tried checking the “Wait for packages to be downloaded before running” checkbox on Step1 in all projects, but this did not seem to help: a smoke test failed today because another project had stopped the application pool.
Thanks for your suggestion about creating and checking a lockfile. I have now implemented this and will wait to see what happens over the next few days. I added a line that writes “Not available” to the lockfile when a project gets the lock:
if ($content -eq "Available") {
    Write-Output "Content of $lockfile is 'Available'. Moving on to next step"
    # Mark the lock as taken so other deployments keep waiting
    "Not available" | Out-File $lockfile
    $continue = $true
}
The lockfile method did not work that well after all. Apparently the system went into a deadlock: Project1 got the lock from the file and was about to download the new package, but stalled (“This Tentacle is busy performing a task…”) because Project2 was now on Step1, checking the lock file every 15 seconds forever.
Is there anything else I can do to help this matter forward?
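One way the waiting step could avoid polling forever is to give up after a deadline. The following is only a sketch of that idea, not something that was tried in this thread; the function name Wait-ForLockFileWithTimeout and the default timings are hypothetical, and whether a failed step actually frees the Tentacle for the deployment that holds the lock is an assumption that has not been verified.

```powershell
# Sketch only: waits for the lockfile, but throws once the deadline passes
# instead of polling indefinitely. Names and timings are assumptions.
function Wait-ForLockFileWithTimeout {
    param(
        [string]$LockFile,
        [int]$TimeoutSeconds = 300,
        [int]$PollSeconds = 15
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-Date) -lt $deadline) {
        # A missing or unreadable file yields $null here and counts as "not available".
        $content = Get-Content $LockFile -ErrorAction SilentlyContinue | Select-Object -First 1
        if ($content -eq 'Available') {
            # Take the lock, as in the modified snippet earlier in the thread.
            'Not available' | Out-File $LockFile
            return
        }
        Start-Sleep -Seconds $PollSeconds
    }
    throw "Gave up waiting for $LockFile after $TimeoutSeconds seconds"
}
```

The check and the write are still two separate operations, so two deployments polling at the same moment could both take the lock; the sketch only addresses the endless wait.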
It’s probably better not to use that workaround then. Please remove it from your process so we can reproduce your issue consistently. I’ve discussed this with one of my teammates, and even though we couldn’t reproduce it, we believe it should be investigated. I’ve created a GitHub issue for this: https://github.com/OctopusDeploy/Issues/issues/1950
Once you remove the workaround I proposed, are you still able to consistently reproduce this issue?
I experience the exact same issue on our end. We have one TeamCity build for 4 projects that triggers the deployment process on all four Octopus projects to the same Tentacles. This sometimes becomes a deadlock.
If we turn on the monitor deployment flag in TeamCity there is no problem, but then the build agents are held up for 4 minutes waiting for the deployment to finish. Not good for TeamCity.
So maybe there should be an improved queue system for Tentacles?
Something happened: we removed the workaround and at the same time changed some global variables:
OctopusPrintEvaluatedVariables: False
OctopusPrintVariables: False
(they were both “True” before)
In three weeks the deadlock issue has only occurred once, instead of about once a day. So setting those variables to “True” might be something that helps reproduce the error.