DEPRECATED: Collection of Python scripts to run failure injection on AWS infrastructure
All scripts here have been ported to AWS FIS - See https://github.com/adhorn/aws-fis-templates-cdk
⚠️USE AT YOUR OWN RISK⚠️
Using these scripts may create an unreasonable risk. If you choose to use the scripts provided here in your own activities, you do so at your own risk. None of the authors or contributors, or anyone else connected with these scripts, in any way whatsoever, can be held responsible for your use of the scripts contained in this repository. Use these scripts only if you understand what the code does.
script-fail-az: simulate the loss of an Availability Zone (AZ) in a VPC.
❯ script-fail-az --help
usage: script-fail-az [-h] --region REGION --vpc-id VPC_ID --az-name AZ_NAME
[--duration DURATION] [--limit-asg] [--failover-rds]
[--failover-elasticache] [--log-level LOG_LEVEL]
Simulate AZ failure: associate subnet(s) with a Chaos NACL that deny ALL
Ingress and Egress traffic - blackhole
optional arguments:
-h, --help show this help message and exit
--region REGION The AWS region of choice (default: None)
--vpc-id VPC_ID The VPC ID of choice (default: None)
--az-name AZ_NAME The name of the availability zone to blackout
(default: None)
--duration DURATION The duration, in seconds, of the blackout (default:
60)
--limit-asg Remove "failed" AZ from Auto Scaling Group (ASG)
(default: False)
--failover-rds Failover RDS if master in the blackout subnet
(default: False)
--failover-elasticache
Failover Elasticache if primary in the blackout subnet
(default: False)
--log-level LOG_LEVEL
Python log level. INFO, DEBUG, etc. (default: INFO)
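The blackhole described in the help text above can be sketched roughly as follows. This is an illustration, not the repository's code: the function names `blackhole_az` and `rollback` are hypothetical, `ec2` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2", region_name=...)`), and the sketch ignores the `--limit-asg`, `--failover-rds` and `--failover-elasticache` options.

```python
def blackhole_az(ec2, vpc_id, az_name):
    """Associate every subnet of az_name in vpc_id with a deny-all NACL.

    ec2 is assumed to be a boto3 EC2 client (or a stand-in exposing the
    same methods). Returns (chaos_nacl_id, saved), where saved is a list
    of (new_association_id, original_nacl_id) pairs used for the rollback.
    """
    subnets = ec2.describe_subnets(
        Filters=[
            {"Name": "vpc-id", "Values": [vpc_id]},
            {"Name": "availability-zone", "Values": [az_name]},
        ]
    )["Subnets"]
    subnet_ids = {s["SubnetId"] for s in subnets}
    if not subnet_ids:
        return None, []  # nothing to black out, so create nothing

    # Create the "Chaos" NACL and deny all ingress and egress traffic.
    chaos_nacl_id = ec2.create_network_acl(VpcId=vpc_id)["NetworkAcl"]["NetworkAclId"]
    for egress in (False, True):
        ec2.create_network_acl_entry(
            NetworkAclId=chaos_nacl_id,
            RuleNumber=100,
            Protocol="-1",  # all protocols
            RuleAction="deny",
            Egress=egress,
            CidrBlock="0.0.0.0/0",
        )

    # Swap each affected subnet's NACL association over to the Chaos NACL.
    acls = ec2.describe_network_acls(
        Filters=[{"Name": "association.subnet-id", "Values": sorted(subnet_ids)}]
    )["NetworkAcls"]
    saved = []
    for acl in acls:
        for assoc in acl["Associations"]:
            if assoc["SubnetId"] in subnet_ids:
                new_id = ec2.replace_network_acl_association(
                    AssociationId=assoc["NetworkAclAssociationId"],
                    NetworkAclId=chaos_nacl_id,
                )["NewAssociationId"]
                saved.append((new_id, acl["NetworkAclId"]))
    return chaos_nacl_id, saved


def rollback(ec2, chaos_nacl_id, saved):
    """Restore the original NACL associations, then delete the Chaos NACL."""
    for assoc_id, original_nacl_id in saved:
        ec2.replace_network_acl_association(
            AssociationId=assoc_id, NetworkAclId=original_nacl_id
        )
    ec2.delete_network_acl(NetworkAclId=chaos_nacl_id)
```

A caller would typically run `blackhole_az`, sleep for the requested duration, and call `rollback` in a `finally` block so the subnets are restored even if the experiment is interrupted.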
script-stop-instance: randomly stop an instance in a particular AZ, chosen among the instances that carry the given tag.
❯ script-stop-instance --help
usage: script-stop-instance [-h] [--log-level LOG_LEVEL] --region REGION
--az-name AZ_NAME [--tag TAG]
[--duration DURATION]
Script to randomly stop instance in AZ filtered by tag
optional arguments:
-h, --help show this help message and exit
--log-level LOG_LEVEL
Python log level. INFO, DEBUG, etc. (default: INFO)
--region REGION The AWS region of choice (default: None)
--az-name AZ_NAME The name of the availability zone of choice (default:
None)
--tag TAG Filter instances by tag name:value (default:
SSMTag:chaos-ready)
--duration DURATION Duration (s) before restarting the instance (default:
60)
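As a rough illustration of the selection step, the `name:value` tag and the AZ name can be turned into EC2 filters and one matching instance picked at random. The helper names below (`tag_filters`, `pick_instance`, `stop_instance`) are hypothetical, `ec2` is assumed to be a boto3 EC2 client, and the sketch does not paginate or restart the instance after the duration elapses.

```python
import random


def tag_filters(tag, az_name):
    """Turn a 'name:value' tag and an AZ name into describe_instances filters."""
    name, sep, value = tag.partition(":")
    if not sep or not name or not value:
        raise ValueError(f"expected tag as name:value, got {tag!r}")
    return [
        {"Name": f"tag:{name}", "Values": [value]},
        {"Name": "availability-zone", "Values": [az_name]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]


def pick_instance(ec2, tag, az_name, rng=random):
    """Return the ID of one random matching instance, or None if none match."""
    resp = ec2.describe_instances(Filters=tag_filters(tag, az_name))
    ids = [
        inst["InstanceId"]
        for reservation in resp["Reservations"]
        for inst in reservation["Instances"]
    ]
    return rng.choice(ids) if ids else None


def stop_instance(ec2, instance_id):
    """Stop the chosen instance (it can later be restarted with start_instances)."""
    ec2.stop_instances(InstanceIds=[instance_id])
```

Restricting the filter to running instances avoids picking one that is already stopped; passing `rng` explicitly makes the random choice reproducible in tests.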
script-fail-rds: force an RDS failover if the master is in a particular AZ, or if a database ID is provided.
❯ script-fail-rds --help
usage: script-fail-rds [-h] --region REGION --rds-id RDS_ID --vpc-id VPC_ID
--az-name AZ_NAME [--log-level LOG_LEVEL]
Force RDS failover if master is in a particular AZ or if database ID provided
optional arguments:
-h, --help show this help message and exit
--region REGION The AWS region of choice. (default: None)
--rds-id RDS_ID The Id of the RDS database to failover. (default:
None)
--vpc-id VPC_ID The VPC ID of where the DB is. (default: None)
--az-name AZ_NAME The name of the AZ where the DB master is. (default:
None)
--log-level LOG_LEVEL
Python log level. INFO, DEBUG, etc. (default: INFO)
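One plausible way to implement the AZ-based variant is to list the DB instances, keep the Multi-AZ ones whose current primary sits in the target AZ and VPC, and reboot them with a forced failover. The helpers `find_failover_candidates` and `failover_rds` are hypothetical names, `rds` is assumed to be a boto3 RDS client, and pagination is left out for brevity.

```python
def find_failover_candidates(rds, vpc_id, az_name):
    """Return IDs of Multi-AZ DB instances whose primary is in az_name."""
    candidates = []
    for db in rds.describe_db_instances()["DBInstances"]:
        in_vpc = db.get("DBSubnetGroup", {}).get("VpcId") == vpc_id
        # A forced failover only makes sense for Multi-AZ deployments.
        if in_vpc and db.get("MultiAZ") and db.get("AvailabilityZone") == az_name:
            candidates.append(db["DBInstanceIdentifier"])
    return candidates


def failover_rds(rds, db_id):
    """Reboot the instance and force a failover to the standby."""
    rds.reboot_db_instance(DBInstanceIdentifier=db_id, ForceFailover=True)
```

When a database ID is given directly, only `failover_rds` is needed; the lookup is there for the case where the caller knows the AZ but not which databases live in it.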
script-fail-elasticache: force an ElastiCache failover if the primary node is in a particular AZ, or if a cluster name is provided.
❯ script-fail-elasticache --help
usage: script-fail-elasticache [-h] --region REGION --elasticache-cluster-name
ELASTICACHE_CLUSTER_NAME --vpc-id VPC_ID --az-name
AZ_NAME [--log-level LOG_LEVEL]
Force ElastiCache failover if master is in a particular AZ or if master node
ID provided
optional arguments:
-h, --help show this help message and exit
--region REGION The AWS region of choice. (default: None)
--elasticache-cluster-name ELASTICACHE_CLUSTER_NAME
The cache cluster name to failover. (default: None)
--vpc-id VPC_ID The VPC ID where the primary node (master) is.
(default: None)
--az-name AZ_NAME The AZ where the primary node (master) is. (default:
None)
--log-level LOG_LEVEL
Python log level. INFO, DEBUG, etc. (default: INFO)
There are two ways to run the scripts. Choose one of the options below.
Build a wheel.
pip install wheel
python setup.py bdist_wheel
The wheel file chaos_aws-1.0.0-py3-none-any.whl is in the dist folder:
cd dist
Install the wheel with pip.
pip install chaos_aws-1.0.0-py3-none-any.whl
Run the script with its console script:
script-fail-az --region eu-west-3 --vpc-id vpc-2719dc4e --az-name eu-west-3a --duration 60 --limit-asg --failover-rds --failover-elasticache
script-stop-instance --region eu-west-3 --az-name eu-west-3a --tag "chaos:ready"
script-fail-rds --region eu-west-3 --rds-id database-1
script-fail-rds --region eu-west-3 --vpc-id vpc-2719dc4e --az-name eu-west-3c
script-fail-elasticache --region eu-west-3 --vpc-id vpc-2719dc4e --az-name eu-west-3c
script-fail-elasticache --region eu-west-3 --elasticache-cluster-name chaoscluster
Install the requirements.
pip install -r requirements.txt
Run the scripts directly with Python:
python scripts/fail_az.py --region eu-west-3 --vpc-id vpc-2719dc4e --az-name eu-west-3c --duration 60 --limit-asg --failover-rds --failover-elasticache
python scripts/stop_random_instance.py --region eu-west-3 --az-name eu-west-3a --tag "chaos:ready"
python scripts/fail_rds.py --region eu-west-3 --rds-id database-1
python scripts/fail_rds.py --region eu-west-3 --vpc-id vpc-2719dc4e --az-name eu-west-3c
python scripts/fail_elasticache.py --region eu-west-3 --vpc-id vpc-2719dc4e --az-name eu-west-3c
python scripts/fail_elasticache.py --region eu-west-3 --elasticache-cluster-name chaoscluster