A sample AWS Step Functions (SFN) state machine, designed to one-way synchronize an Amazon S3 source bucket into another S3 destination bucket.
Copyright 2015 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
http://aws.amazon.com/apache2.0/
or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This AWS Step Functions (SFN) state machine is designed to one-way synchronize an Amazon S3 source bucket into an Amazon S3 destination bucket as follows:
This code is purely meant for illustration/demonstration purposes, please use it at your own risk. Although it has been developed carefully and with the best intentions in mind, there is no guarantee and there may be bugs. It will copy real files and it will delete real files in your Amazon S3 buckets in case it deems so necessary. To avoid any damage, please use it only with Amazon S3 buckets that contain purely test and demonstration data.
You will need a system with Python 2.7 and virtualenv (https://virtualenv.pypa.io/en/stable/installation/) installed, and an AWS account that is configured on your system to be ready to use with the AWS CLI.
(We won't use the AWS CLI but will use the AWS credentials stored in its configuration files.)
> sudo yum install -y gcc libffi-devel openssl-devel # Make sure some prerequisites are installed.
> virtualenv env # Create a Python virtual environment.
> cd env; . ./bin/activate # Activate the Python virtual environment.
> git clone https://github.com/awslabs/sync-buckets-state-machine # Clone the software from this Git repository.
> cd sync-buckets-state-machine
> pip install -r requirements.txt # This will also install the "fab" utility from http://www.fabfile.org.
> cp fabfile_config_template.py fabfile_config.py
> vi fabfile_config.py # Fill in your own values.
> fab # Install everything into your AWS account.
Start the Amazon Step Functions console in your chosen region and start a new execution with an input like:
{
"source": "your-source-bucket-name",
"destination": "your-destination-bucket-name"
}
Optionally sync based on a prefix:
{
"source": "...",
"destination": "...",
"prefix": "images/"
}
Optionally sync after a specific key:
{
"source": "...",
"destination": "...",
"startAfter": "images/1000"
}
This assumes that you're still working from the sync-buckets-state-machine that you installed into in the steps above.
> fab delete # Delete the CloudFormation stack and its resources.
> deactivate # Deactivate the Python virtual environment
> cd ../..; /bin/rm -rf env # Clean up.
Please send feedback, suggestions, etc. to [email protected] (Constantin Gonzalez)