A sample implementation of an evolutionary architecture for a serverless application using safe deployments, automatically computing the fitness function at deployment time, with the possibility to roll back if fitness is not improving.
Made with ❤️ by Danilo Poccia.
To build this implementation, I started from the sample code in the AWS SAM repository: https://github.com/awslabs/serverless-application-model/tree/master/examples/2016-10-31/lambda_safe_deployments
I updated the Node.js runtime to version 8.10, so that I could make use of the new `async`/`await` syntax.
The AWS SAM `template.yaml` creates:
- two Lambda functions (`myFirstFunction` and `mySecondFunction`) that implement a basic API (using Amazon API Gateway)
- a `preTrafficHook` Lambda function that measures the fitness of the architecture and posts the result as a CloudWatch metric that you can monitor, alarm on, or visualize in a dashboard
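As a minimal sketch of how a hook could post the fitness value as a CloudWatch metric, the snippet below builds the parameters for `putMetricData` and sends them through an injected client. The namespace, metric name, and dimension are hypothetical choices made for this illustration, and the function names are not taken from the repository.

```javascript
// Sketch only: the namespace, metric name and dimension below are
// hypothetical, not the values used by the original project.
function buildFitnessMetric(fitness, stackName) {
  return {
    Namespace: 'EvolutionaryArchitecture', // hypothetical namespace
    MetricData: [{
      MetricName: 'Fitness',               // hypothetical metric name
      Dimensions: [{ Name: 'StackName', Value: stackName }],
      Timestamp: new Date(),
      Value: fitness,
      Unit: 'None'
    }]
  };
}

// `cloudwatch` is expected to behave like `new AWS.CloudWatch()` from the
// AWS SDK for JavaScript (v2), where `.promise()` turns a request into a promise.
async function publishFitness(cloudwatch, fitness, stackName) {
  const params = buildFitnessMetric(fitness, stackName);
  await cloudwatch.putMetricData(params).promise();
  return params;
}

// Possible usage inside a Lambda function:
//   const AWS = require('aws-sdk');
//   await publishFitness(new AWS.CloudWatch(), 42, 'evolutionary-deployment');
```

Passing the client as an argument keeps the metric-building logic easy to exercise without calling AWS.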
To test the deployment, you can use the SAM CLI and run the following `build`/`package`/`deploy` commands two times:
sam build
sam package --s3-bucket <YOUR_BUCKET> \
--output-template-file packaged.yaml
sam deploy --template-file packaged.yaml \
--stack-name evolutionary-deployment \
--capabilities CAPABILITY_IAM
You can follow the initial creation of the stack, and the subsequent updates, from the CloudFormation console. The previous commands use the default region configured for the AWS CLI.
For the two Lambda functions providing an API, different deployment strategies are implemented:
- `myFirstFunction` uses a Linear deployment, adding 10% of the invocations to the new version every minute (`Linear10PercentEvery1Minute`) and taking 10 minutes to complete
- `mySecondFunction` uses a Canary deployment, with 10% of the invocations going to the new version for 5 minutes, followed by a rollout to 100% (`Canary10Percent5Minutes`)

The `preTrafficHook` function runs some tests to check whether the deployment must `Succeed` or `Fail`, and at the same time computes the value of the fitness function for this deployment:
- To simplify and reuse atomic tests on single resources, the SAM template passes the `StackId` to the `preTrafficHook` function as an environment variable.
- Using the `StackId`, the function gets the list of the resources in the stack, over which it can iterate with a `switch` that applies specific tests depending on the resource type.
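The resource iteration described above could look roughly like the following sketch. It assumes the AWS SDK for JavaScript (v2) `listStackResources` call on CloudFormation; the test functions, their `{ passed, fitness }` result shape, and the helper names are hypothetical placeholders rather than the author's code.

```javascript
// Hypothetical atomic tests: each receives one stack resource and resolves
// to { passed, fitness }. The bodies are placeholders for real checks.
async function testLambdaFunction(resource) {
  return { passed: true, fitness: 1 };
}
async function testRestApi(resource) {
  return { passed: true, fitness: 1 };
}

// Pick the test for a resource type; null means there is nothing to check.
function selectTest(resourceType) {
  switch (resourceType) {
    case 'AWS::Lambda::Function':
      return testLambdaFunction;
    case 'AWS::ApiGateway::RestApi':
      return testRestApi;
    default:
      return null;
  }
}

// `cloudformation` is expected to behave like `new AWS.CloudFormation()`
// from the AWS SDK for JavaScript (v2). NextToken is followed so that
// stacks with more than one page of resources are read completely.
async function listStackResources(cloudformation, stackId) {
  const resources = [];
  let nextToken;
  do {
    const params = { StackName: stackId };
    if (nextToken) params.NextToken = nextToken;
    const page = await cloudformation.listStackResources(params).promise();
    resources.push(...page.StackResourceSummaries);
    nextToken = page.NextToken;
  } while (nextToken);
  return resources;
}

// Pair every testable resource with the test selected for its type.
async function planTests(cloudformation, stackId) {
  const resources = await listStackResources(cloudformation, stackId);
  return resources
    .map(resource => ({ resource, test: selectTest(resource.ResourceType) }))
    .filter(entry => entry.test !== null);
}
```

In a Lambda function, `stackId` would be read from the environment variable set by the SAM template; the variable name is not shown in this section, so it is left as a parameter here.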
Most of the tests involve invocations to AWS services, so to make it more efficient and reduce the overall duration of this function:
- tests are written as `async` functions (so that they are automatically wrapped as promises)
- tests are run concurrently and awaited together with `Promise.all()`
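A minimal sketch of this pattern follows, under stated assumptions: each test is an `async` function resolving to a hypothetical `{ passed, fitness }` object, and the overall fitness is simply the sum of the individual values (the actual formula is not described here). The status is then reported with the CodeDeploy `putLifecycleEventHookExecutionStatus` call, using the `DeploymentId` and `LifecycleEventHookExecutionId` fields that CodeDeploy passes in the hook event.

```javascript
// Run all tests concurrently: every async function is started by map(),
// then Promise.all() waits for the whole batch. A test that throws is
// recorded as failed instead of rejecting the entire batch.
async function runTests(tests) {
  const results = await Promise.all(
    tests.map(test =>
      test().catch(error => ({ passed: false, fitness: 0, error }))
    )
  );
  const passed = results.every(result => result.passed);
  // Assumption: fitness is the plain sum of the per-test values.
  const fitness = results.reduce((sum, result) => sum + result.fitness, 0);
  return { status: passed ? 'Succeeded' : 'Failed', fitness, results };
}

// `codedeploy` is expected to behave like `new AWS.CodeDeploy()` from the
// AWS SDK for JavaScript (v2); `event` is the payload received by the hook.
async function reportStatus(codedeploy, event, status) {
  await codedeploy.putLifecycleEventHookExecutionStatus({
    deploymentId: event.DeploymentId,
    lifecycleEventHookExecutionId: event.LifecycleEventHookExecutionId,
    status // 'Succeeded' or 'Failed'
  }).promise();
}
```

Because the calls to AWS services mostly wait on the network, starting them together means the total duration is close to that of the slowest test rather than the sum of all of them.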
For example, some of the tests that can be implemented on non-functional requirements, such as security and scalability, are:
Those checks contribute to the measurement of the fitness function, so that if you change your architecture (and possibly your application) to be more secure or scalable, you automatically increase the resulting fitness.
Instead of implementing all tests, you can leverage the existing AWS Config managed rules, such as:
A full list of AWS Config managed rules is available in the AWS Config documentation. To check compliance with one or more of those rules, I am using the AWS Config `getComplianceDetailsByResource` API.
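As an illustration, a compliance check built on `getComplianceDetailsByResource` might look like the sketch below. It assumes the AWS SDK for JavaScript (v2) and reads only the first page of results; the helper names and the scoring rule (one fitness point per compliant rule) are hypothetical.

```javascript
// `configService` is expected to behave like `new AWS.ConfigService()` from
// the AWS SDK for JavaScript (v2). Only the first page of results is read;
// a complete version would also follow NextToken.
async function getRuleCompliance(configService, resourceType, resourceId) {
  const data = await configService.getComplianceDetailsByResource({
    ResourceType: resourceType, // for example 'AWS::Lambda::Function'
    ResourceId: resourceId,
    ComplianceTypes: ['COMPLIANT', 'NON_COMPLIANT']
  }).promise();
  return data.EvaluationResults.map(result => ({
    rule: result.EvaluationResultIdentifier.EvaluationResultQualifier.ConfigRuleName,
    compliant: result.ComplianceType === 'COMPLIANT'
  }));
}

// Hypothetical scoring: one fitness point per compliant rule, and the
// check passes only if no rule is non-compliant.
function scoreCompliance(evaluations) {
  return {
    passed: evaluations.every(evaluation => evaluation.compliant),
    fitness: evaluations.filter(evaluation => evaluation.compliant).length
  };
}
```

The `{ passed, fitness }` result uses the same illustrative shape as the other sketches, so a check like this could be run alongside custom tests.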