Thanks for getting in touch and sorry that you are facing this issue.
I’m unable to determine precisely what’s causing this issue, or whether it’s a bug (it seems to work as expected in my testing, but I may be misinterpreting something). Would you be willing to elaborate a bit further on whether you used Service Principals or Management Certificates, and whether you created the Azure Service Principal Account or credential using PowerShell or the Azure Portal? Could you also please send some screenshots showing the Azure Portal, the resource permissions, and the error on the Octopus Server? This will help me better understand your scenario and test it the way you have it set up.
I look forward to hearing back and getting to the bottom of this one!
I am hoping you can give me some tips on troubleshooting the issue. Is there any way to work out what server it is trying to connect to? Are there any logs I can look at, or tracing I can turn on? At the moment all I get is a generic “Cannot connect” message.
Thanks for the above info and your patience while I debugged this further.
I tried to recreate the error that you are getting and was able to do so when I switched my Virtual Machine to a host-only network. This indicates that something is indeed being blocked by your firewall or proxy. Can I please ask which server you whitelisted? I would suggest trying to access your Azure portal from the server that Octopus is running on.
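A quick reachability check from the Octopus Server machine could look something like the sketch below. The endpoint list is an assumption: it names the public Azure AD sign-in host and the Azure Resource Manager host, but nothing in this thread confirms which endpoint is actually being blocked. Test-NetConnection makes a direct TCP connection, so if the server only reaches the internet through a proxy, a failure here does not by itself prove that proxied requests fail.

```powershell
# Hypothetical endpoint list: the public Azure AD sign-in and Azure Resource Manager hosts.
# The thread does not say which endpoint is blocked, so treat this list as an assumption.
$endpoints = 'login.microsoftonline.com', 'management.azure.com'

foreach ($endpoint in $endpoints) {
    # Test-NetConnection attempts a direct TCP connection; it does not go through a web proxy.
    $result = Test-NetConnection -ComputerName $endpoint -Port 443 -WarningAction SilentlyContinue

    # Emit one summary row per endpoint.
    [PSCustomObject]@{
        Endpoint  = $endpoint
        Reachable = $result.TcpTestSucceeded
        RemoteIP  = $result.RemoteAddress
    }
}
```

Running this both on a workstation where things work and on the Octopus Server should show whether the two machines differ in what they can reach.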
Sorry that you are still facing this issue. I had a discussion with the team about the problem you are facing and we have a couple of steps you could try to help narrow down the root cause of the issue.
Step 1: We suggest running a PowerShell script locally, instead of using the UI, to test authentication with your Azure credentials. You will need to install the Azure PowerShell module using the command Install-Module -Name Az -AllowClobber. (Here is Microsoft’s documentation on how to do this).
Next, please run the attached TestAzure.ps1 script. You will have to enter your SubscriptionId, TenantId, ClientId and Secret in the script. If it works, you should see the Azure Resource Groups you have access to.
Step 2: If Step 1 works, could you please RDP into the server running your Octopus instance and try running the script there? Please note that the script has to be run in a PowerShell window under the same user account that your Octopus Server runs as. This helps us rule out both a problem with your Azure credentials and something on the Octopus Server machine blocking the connection.
Step 3: If that works we recommend trying to connect to Azure using a local instance of Octopus Server.
Step 4: If the above steps work, we recommend next logging the outgoing requests rejected by your firewall/proxy. We think there might be an endpoint used by the Azure authentication library that may be blocked.
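For anyone following along without the attachment, the credential test from Steps 1 and 2 can be sketched roughly as below. This is not the attached TestAzure.ps1, only a minimal sketch that assumes the Az module is installed; the four placeholder values are hypothetical and must be replaced with your own.

```powershell
# Minimal sketch of a Service Principal authentication test (not the attached TestAzure.ps1).
# Assumes the Az module is installed; all four values below are placeholders.
$subscriptionId = '<your-subscription-id>'
$tenantId       = '<your-tenant-id>'
$clientId       = '<your-client-id>'
$secret         = '<your-client-secret>'

# Build a credential from the application (client) ID and its secret.
$securePassword = ConvertTo-SecureString -String $secret -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential ($clientId, $securePassword)

# Sign in as the Service Principal; -ErrorAction Stop makes a failed sign-in halt the script with its error.
Connect-AzAccount -ServicePrincipal -Credential $credential -Tenant $tenantId -Subscription $subscriptionId -ErrorAction Stop | Out-Null

# List the resource groups this Service Principal can see.
Get-AzResourceGroup | Select-Object ResourceGroupName, Location
```

If the sign-in fails, the error text from Connect-AzAccount is usually more specific than the generic message shown in the UI, which is the main reason to test this way.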
I managed to get your script working on our server. It was super helpful because it returns a meaningful error.
If I try to use those settings on the account page it still doesn’t work.
Same error as before.
If I try to run your PowerShell script from the Octopus script console it also doesn’t work. But the script console will work if I add this to your script:
Sorry that you are facing this issue. I tried reproducing the steps to add an Azure Web App on my local instance and was able to get through the health check successfully. We will need some more information to investigate what is going on there. Can you please attach the full task log? Details on how to do this are available here https://octopus.com/docs/support/get-the-raw-output-from-a-task
Alternatively, you can send it to support@octopus.com. Only Octopus staff will be able to access it.
We wanted to confirm if you have modified the default machine policy to use a custom script? If so, can you please send us the script? At the moment, the Azure targets only run the default machine policy. This is a known issue which can be found here: https://github.com/OctopusDeploy/Issues/issues/5341. The script might help us identify why the Azure target is running into trouble here.
This is proving to be quite the mystery, made all the more awkward by our poor timezone overlap.
Good: At this point it sounds like you can set up and test your Azure Service Principal Account through your normal Octopus Server.
Bad: It sounds like Health Checks are failing, which is strange. The health check just checks that the Azure Web App actually exists and can be accessed using the Service Principal. What is also strange is that there is no helpful logging or error message, just an exit code of 1.
Zoom call?
The easiest way forward would be to set up a Zoom call where we can screenshare and get to the bottom of this more quickly. The problem is our time zone overlap again, so we may not be able to get this together until next week. Can you confirm some times that may work for you? I can join a call most nights of the week after 8PM AEST.
In the meantime
In the meantime there are still some things you can do to isolate what is going wrong:
Upgrade to the latest release of Octopus Server. If you want to stay on the 2019.6 LTS, please install the latest patch of 2019.6.x. If you are happy to go onto the fast lane, upgrade to the latest possible version of Octopus Server. Learn about release lanes. I’m doubtful this will fix the problem, but it will make it much easier for me to support you moving forward.
Install an instance of Octopus Server on your own computer and try doing the same thing which is failing on your “real” Octopus Server. This will help isolate the problem down to the environment or the software or the configuration. You can download the latest MSI and set up a trial of Octopus Server in just a few minutes as long as you have SQL Server (even express) accessible on/from your computer.
Thanks for getting back to me. I’ve booked another session with our network team to make sure they’re not blocking anything when I do the health check.
After that I’m definitely keen on the Zoom call. Can we book in an evening later this week, say Wednesday or Thursday?
I am a Continuous Delivery Architect based in the UK and only got back from Annual Leave today. If you have time, book in some time with me and we can jump in and get this working for you. You can book some time with me here.
It looks like the proxy in place is blocking the health checks, and these IPs need to be whitelisted in your proxy configuration to allow the health check to run successfully.