Hi Everyone
TLDR
I am trying to work out if I have an AWS CLI problem or an Octopus problem. Any help is appreciated.
Details
I have a scheduled release that runs at 03:00 every night. The release invokes an AWS Lambda function using the AWS CLI. The Lambda takes a while to run depending on the tenant: some take 1 minute, some take 13 minutes. If the Lambda takes longer than 10 minutes (but less than the maximum of 15 minutes), a strange thing occurs. The Lambda runs to completion (I can see this in the CloudWatch logs), but the Octopus step just sits there. After a minute or two the Lambda starts again. This happens a few times. This is strange because the AWS CLI command is invoked with a timeout (--cli-read-timeout 900), so I would expect it to wait until that timeout was reached.
A few things to note:
- I can run this AWS CLI command without any problems from my local machine. The problem only occurs in the Octopus step, which is why I am asking here.
- The max timeout on a Lambda is 15 minutes. I only mention this to acknowledge that although a Lambda might not be the best candidate for a long-running operation, it is technically allowed, so it should work.
- I was wondering if Octopus does a retry behind the scenes or some sort of continuation that is adding events to the lambda invocation queue?
- EDIT: running the command asynchronously doesn’t have this problem either but I lose the error handling so I would rather run this synchronously
- I am a bit new to the whole DevOps game, so it is possible I have misunderstood something
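
For reference, here is a minimal sketch of the kind of call involved, plus one way to check whether the repeat invocations come from the CLI itself. The `invoke_once` wrapper, the function name, and the output path are placeholders, not the actual step script. Setting `AWS_MAX_ATTEMPTS=1` limits the AWS CLI to a single attempt. If the duplicate runs stop with this set, the CLI's own retry handling (for example after a dropped idle connection) would be a more likely source than Octopus. That is an assumption to test, not a confirmed diagnosis.

```shell
# Sketch only: "invoke_once" is a hypothetical helper, and the function name
# and output file passed to it are placeholders.
invoke_once() {
  # $1 = Lambda function name, $2 = file to write the response payload to.
  # AWS_MAX_ATTEMPTS=1 tells the AWS CLI to make one attempt only (no retries).
  # --cli-read-timeout 900 matches the timeout described above.
  AWS_MAX_ATTEMPTS=1 aws lambda invoke \
    --function-name "$1" \
    --cli-read-timeout 900 \
    "$2"
}

# Example usage (commented out so nothing is invoked by accident):
# invoke_once my-nightly-function response.json
```

If the step then fails with a timeout or connection error instead of hanging, that would point at the connection between the Octopus worker and AWS rather than at the Lambda.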
Thanks in advance