A serverless application that automatically backs up instances and volumes on AWS on a regular schedule and deletes the backups after a configurable number of days
With serverless!
Found at: https://github.com/AndrewFarley/AWS-Automated-Daily-Instance-AMI-Snapshots
# Make sure your CLI has default AWS credentials set up; if not, run this...
aws configure
# Clone this repository with...
git clone git@github.com:AndrewFarley/AWS-Automated-Daily-Instance-AMI-Snapshots.git
cd AWS-Automated-Daily-Instance-AMI-Snapshots
# Deploy it with...
serverless deploy
# Run it manually with...
serverless invoke --function execute_handler --log
Now go tag your instances or volumes (manually, or automatically if you manage your infrastructure with a tool like Terraform or CloudFormation) with the key "backup" (with any value). This tag tells the script to back that instance or volume up.
If you'd like to specify the number of days to retain backups, set the key "Retention" with a numeric value. If you do not specify this, AMIs are kept for 7 days by default.
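The tagging rules above can be sketched in a few lines of Python. This is an illustration only, not the Lambda's actual code: the function names are hypothetical, the accepted keys are taken from the log output shown in this README, and falling back to the default for a non-numeric "Retention" value is an assumption.

```python
DEFAULT_RETENTION_DAYS = 7               # default stated in this README
BACKUP_TAG_KEYS = ("backup", "Backup")   # keys shown in this README's log output


def tags_to_dict(tags):
    """Convert the EC2 tag list format [{'Key': k, 'Value': v}, ...] to a dict."""
    return {tag["Key"]: tag["Value"] for tag in (tags or [])}


def wants_backup(tags):
    """True if any accepted backup key is present, whatever its value."""
    tag_map = tags_to_dict(tags)
    return any(key in tag_map for key in BACKUP_TAG_KEYS)


def retention_days(tags, default=DEFAULT_RETENTION_DAYS):
    """Return the numeric 'Retention' tag value, or the default.

    ASSUMPTION: missing, non-numeric, or non-positive values fall back
    to the default; the real script may behave differently.
    """
    raw = tags_to_dict(tags).get("Retention")
    try:
        days = int(raw)
    except (TypeError, ValueError):
        return default
    return days if days > 0 else default
```

For example, an instance tagged `backup=yes` and `Retention=30` would be selected and kept for 30 days, while one tagged only `backup=yes` would be kept for 7.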
After tagging some servers, run the Lambda manually again and check the log output to confirm it detected them. If you tagged some instances and it ran successfully, your output will look something like this...
bash-3.2$ serverless invoke --function execute_handler --log
--------------------------------------------------------------------
Scanning region: eu-central-1
Scanning for instances with tags (backup,Backup)
Found 2 instances to backup...
Instance: i-00001111222233334
Name: jenkins-build-server
Time: 7 days
AMI: ami-00112233445566778
Instance: i-55556666777788889
Name: primary-webserver
Time: 7 days
AMI: ami-11223344556677889
Scanning for AMIs with tags (AWSAutomatedDailySnapshots)
Found AMI to consider: ami-008e6cb79f78f1469
Delete After: 06-12-2018
This item is too new, skipping...
Scanning region: eu-west-1
Scanning for instances with tags (backup,Backup)
Found 0 instances to backup...
Scanning for AMIs with tags (AWSAutomatedDailySnapshots)
Scanning region: eu-west-2
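The pruning pass in the log above ("Delete After", then "This item is too new, skipping...") comes down to a date comparison. Here is a minimal sketch of that idea, not the script's real implementation. The function names are hypothetical, and the date format is an assumption: "06-12-2018" in the log could be month-day-year or day-month-year, and this sketch picks month-day-year.

```python
from datetime import date, datetime, timedelta

# ASSUMPTION: month-day-year. The "06-12-2018" in the log is ambiguous.
DATE_FORMAT = "%m-%d-%Y"


def delete_after(created, retention_days):
    """Return the 'Delete After' string for a backup created on `created`."""
    return (created + timedelta(days=retention_days)).strftime(DATE_FORMAT)


def is_expired(delete_after_str, today):
    """True once `today` is strictly later than the delete-after date.

    ASSUMPTION: a backup is kept through its delete-after day.
    """
    expiry = datetime.strptime(delete_after_str, DATE_FORMAT).date()
    return today > expiry


stamp = delete_after(date(2018, 6, 5), 7)
print(stamp, is_expired(stamp, date(2018, 6, 10)))
```

An AMI for which `is_expired` returns `False` corresponds to the "too new, skipping" case in the log.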
This Lambda will now run once a day and automatically make no-downtime snapshots of your servers and/or volumes.
If you'd like to tweak this function, you can do so without editing code or redeploying it. Open the Lambda in the AWS Console (in the region you deployed to), update any of its environment variables, and hit save.
If you wish to schedule the time of your AMI backups, edit the rate in serverless.yml and use the cron syntax as follows.
# Replace this line...
rate: rate(1 day)
# With this...
rate: cron(0 0 * * ? *)
For reference on the cron format, see: Amazon Lambda Scheduling with Rate or Cron
NOTE: Keep in mind Amazon uses UTC time, so the above runs at midnight UTC. UTC is 8 hours ahead of California standard time (PST), or 7 hours ahead during daylight saving time. If you wanted midnight in PST, you'd need to add 8 hours to this, making the line cron(0 8 * * ? *)
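The UTC arithmetic in the note above is easy to get backwards, so here is a small sketch of it. The helper name is hypothetical and it only handles whole-hour offsets; it simply builds the same kind of `cron(...)` string used in serverless.yml.

```python
def daily_cron_for_local_hour(local_hour, utc_offset_hours, minute=0):
    """Build a daily cron() expression that fires at a given local hour.

    utc_offset_hours is local time minus UTC, e.g. -8 for PST.
    Only whole-hour offsets are handled in this sketch.
    """
    utc_hour = (local_hour - utc_offset_hours) % 24
    return f"cron({minute} {utc_hour} * * ? *)"


# Midnight PST (UTC-8) is 08:00 UTC
print(daily_cron_for_local_hour(0, -8))
```

Because the schedule is fixed in UTC, the local run time shifts by an hour when daylight saving time starts or ends.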
If you want to run this script in an "alternate" mode where it snapshots once a week and expires backups after one month, you can do this. Please run these four commands on a freshly checked out copy of this repo. They will run on OS-X or Linux.
# First, replace our rate of once a day with once a week on Saturday
sed 's/rate(1 day)/cron(0 0 ? * SAT *)/' < serverless.yml > serverless.yml.tmp && cat serverless.yml.tmp > serverless.yml
# Second, replace our stack name, so it makes sense (and we can deploy this multiple times)
sed 's/daily-instance-snapshot/weekly-instance-snapshot/' < serverless.yml > serverless.yml.tmp && cat serverless.yml.tmp > serverless.yml
# Third, set our retention time to 30 days
sed 's/DEFAULT_RETENTION_TIME: "7"/DEFAULT_RETENTION_TIME: "30"/' < serverless.yml > serverless.yml.tmp && cat serverless.yml.tmp > serverless.yml
# Fourth, change the name of the key to tag on so we can deploy this at the same time as the daily snapshot (default) deployment
sed 's/KEY_TO_TAG_ON: "AWSAutomatedDailySnapshots"/KEY_TO_TAG_ON: "AWSAutomatedWeeklySnapshots"/' < serverless.yml > serverless.yml.tmp && cat serverless.yml.tmp > serverless.yml
And yes, I know you could use in-place sed, but it works differently on OS-X.
Feel free to adjust the above to any other specifications you desire. Some good examples might be: once a month, expiring after a year; once a week, expiring after 6 months; or once every 3 days, expiring after a month.
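If you would rather avoid sed entirely, the same four substitutions can be sketched in Python. The search and replacement strings are copied from the sed commands above; the function name is hypothetical. Like sed's `s/old/new/` without the `g` flag, this replaces only the first match on each line.

```python
# (old, new) pairs copied from the four sed commands above
SUBSTITUTIONS = [
    ("rate(1 day)", "cron(0 0 ? * SAT *)"),
    ("daily-instance-snapshot", "weekly-instance-snapshot"),
    ('DEFAULT_RETENTION_TIME: "7"', 'DEFAULT_RETENTION_TIME: "30"'),
    ('KEY_TO_TAG_ON: "AWSAutomatedDailySnapshots"',
     'KEY_TO_TAG_ON: "AWSAutomatedWeeklySnapshots"'),
]


def to_weekly(config_text):
    """Apply each substitution to the first match on every line."""
    lines = config_text.split("\n")
    for old, new in SUBSTITUTIONS:
        lines = [line.replace(old, new, 1) for line in lines]
    return "\n".join(lines)
```

To apply it to a fresh checkout, read serverless.yml with `pathlib.Path("serverless.yml").read_text()`, pass the text through `to_weekly`, and write the result back with `write_text`.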
To validate that images have been created you can view your AMIs section under the AWS Console in EC2. Alternatively, you can use the following command-line example.
aws ec2 describe-images --owners self --filters "Name=tag:Backup,Values=true" \
--query 'Images[*].{ID:ImageId, ImgName:Name, Owner:OwnerId, Tag:Description, CreationDate:CreationDate}' | jq .
[
{
"ID": "ami-123c8a43",
"ImgName": "myserver.mydomain.com-backup-2018-07-02-09-00-34",
"Owner": "012345678901",
"Tag": "Automatic Daily Backup of myserver.mydomain.com from i-098765b1a132aa1b",
"CreationDate": "2018-07-02T09:00:34.000Z"
},
...
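If you prefer Python over the CLI and jq, the same projection can be sketched as below. The function name is hypothetical. The boto3 call is left commented out so the block runs without AWS credentials; it uses the same owner and tag filter as the CLI example above.

```python
def summarize_images(response):
    """Mirror the --query projection from the CLI example above."""
    return [
        {
            "ID": image.get("ImageId"),
            "ImgName": image.get("Name"),
            "Owner": image.get("OwnerId"),
            "Tag": image.get("Description"),
            "CreationDate": image.get("CreationDate"),
        }
        for image in response.get("Images", [])
    ]


# With boto3 installed and credentials configured:
# import boto3
# ec2 = boto3.client("ec2")
# response = ec2.describe_images(
#     Owners=["self"],
#     Filters=[{"Name": "tag:Backup", "Values": ["true"]}],
# )
# print(summarize_images(response))
```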
PLEASE NOTE: This script will NOT restart your instances nor interrupt your servers as this may anger you or your client, and I wouldn't want to be responsible for that.
Because of this, Amazon can't guarantee the file system integrity of the created image, but generally most backups are perfectly fine. Of the thousands of AMIs I've made over the course of the last 8 years, almost every single one I've tested has been perfectly fine. I've only had a handful of bad eggs, and if you use these backups with something like autoscaling with health checks, then any issues in AMIs should be rooted out fairly quickly (as the instances never become healthy).
In practice, this only ever causes a problem if you have heavy disk I/O, for example on heavily loaded database servers. For these types of servers, you are better off running a daily cron job on them that forces your database to sync to disk (e.g. CHECKPOINT in PostgreSQL) and then initiates an AMI snapshot.
If you want this, you'll have to do this yourself or scrounge the net for example scripts.
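As a starting point, here is a rough sketch of what such a job could look like. It is not part of this project. The function and parameter names are hypothetical: `run_sql` stands in for whatever you use to execute SQL on the database (for example a psycopg2 cursor's `execute`), and `ec2_client` is assumed to be a boto3 EC2 client. The image name imitates the naming seen in the example output above.

```python
from datetime import datetime, timezone


def checkpoint_then_snapshot(run_sql, ec2_client, instance_id, name_prefix):
    """Flush the database to disk, then request a no-reboot AMI.

    run_sql:    callable that executes one SQL statement (hypothetical hook)
    ec2_client: object with a boto3-style create_image method
    """
    # Ask PostgreSQL to write dirty pages to disk before imaging
    run_sql("CHECKPOINT")

    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d-%H-%M-%S")
    response = ec2_client.create_image(
        InstanceId=instance_id,
        Name=f"{name_prefix}-backup-{stamp}",
        NoReboot=True,  # do not restart the instance
    )
    return response["ImageId"]
```

A checkpoint narrows the window for inconsistency but does not eliminate it, since writes can still land between the checkpoint and the snapshot.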
Simply remove it with the serverless remove command. Please keep in mind that any AMIs this script has created will still be in place; you will need to delete those yourself.
serverless remove
Date | Features / Milestones |
---|---|
June 2018 | Initial public release, moved configuration to env variables, bugfixes, exception handling |
September 2018 | Bugfix, internal AWS tags prefixed with aws: caused failures, renaming those tag keys |
November 2018 | Feature: Snapshot Volumes added, thanks @milvain for the idea |
November 2018 | Feature: Documentation for Weekly Snapshots, thanks @ChampionWolf for the idea |
July 2022 | Updating for Serverless 2.0 framework |
November 2022 | Adjusting IAM roles and further adjustments/standards for Serverless 2.0 framework, validating this tool still works great having just installed it on a few clients (it does!) |
December 2022 | Updating to Python 3.9 and adding new AWS regions (ap-south-2, me-central-1, eu-south-2, eu-central-2) |
This script is in use at a number of my clients including OlinData, Shake-On, Xeelas, RVillage, Pharos, Diversigen, Orasure, RogersPOS and others.
If you're happily using this script somewhere for a client to make them super happy let me know so I can add a section here for shoutouts to happy customers. +1 to open source eh?
Please feel free to file GitHub issues if you find any bugs or have suggestions for features! If you're technically minded, please feel free to fork and make your own modifications. If you make any fixes/changes that are awesome, please send me pull requests or patches.
If you have any questions/problems beyond that, feel free to email me at one of the emails in author above.