Chaos Framework proposes a unified API for vendors to provide solutions to various aspects of applying the principles of chaos engineering in a cloud-native environment. Its built-in modules rigorously test the reliability, availability, and resilience of distributed systems. Currently, the community supports the following platforms:
Taking RocketMQ as an example, build the project and run a test:
mvn clean install
bin/chaos.sh --driver driver-rocketmq/rocketmq.yaml --install
In one shell, start some cluster nodes and the controller using Docker Compose:
cd docker
./up.sh --dev
In another shell, enter the controller:
docker exec -it chaos-control bash
Then, inside the controller, build the project and run the test:
mvn clean install
bin/chaos.sh --driver driver-rocketmq/rocketmq.yaml --install --restart
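Once the first run has installed the package, later runs can probably drop --install. The sketch below shows one way to run several fault types back to back from inside the controller. The names run_fault, DRIVER and DRY_RUN are hypothetical and exist only in this example; the chaos.sh options used (--driver, --restart, -f, -i, -t) are the ones listed in the usage text, and the choice of faults and timings is an assumption.

```shell
#!/bin/sh
# Sketch: run a few fault types in sequence against one driver.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0
# inside the controller to actually execute bin/chaos.sh.
DRIVER="driver-rocketmq/rocketmq.yaml"
DRY_RUN="${DRY_RUN:-1}"

run_fault() {
  # $1 is the fault type passed to -f.
  # --install is left out on the assumption that an earlier run already
  # installed the package on each cluster node.
  set -- bin/chaos.sh --driver "$DRIVER" --restart -f "$1" -i 30 -t 60
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stop at the first run that fails.
for fault in noop random-kill random-partition; do
  run_fault "$fault" || break
done
```

Starting with noop gives a fault-free baseline to compare the later runs against.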
Usage: messaging-chaos [options]
  Options:
    --agent
      Run program as a http agent.
      Default: false
    -c, --concurrency
      The number of clients. eg: 5
      Default: 4
  * -d, --driver
      Driver. eg.: driver-rocketmq/rocketmq.yaml
    -f, --fault
      Fault type to be injected. eg: noop, minor-kill, major-kill,
      random-kill, fixed-kill, random-partition, fixed-partition,
      partition-majorities-ring, bridge, random-loss, minor-suspend,
      major-suspend, random-suspend, fixed-suspend, leader-kill, leader-suspend
      Default: noop
    -i, --fault-interval
      Fault injection interval. eg: 30
      Default: 30
    -n, --fault-nodes
      The nodes need to be fault injection. The nodes are separated by
      semicolons. eg: 'n1;n2;n3' Note: this parameter must be used with
      fixed-xxx faults such as fixed-kill, fixed-partition, fixed-suspend.
    -h, --help
      Help message
    --install
      Whether to install program. It will download the installation package on
      each cluster node. When you first use OpenChaos to test a
      distributed system, it should be true.
      Default: false
    --restart
      Whether to restart program. If you want the nodes to be restarted, and
      shut down after the experiment, it should be true.
      Default: false
    -t, --limit-time
      Chaos execution time in seconds (excluding check time and recovery
      time). eg: 60
      Default: 60
    -m, --model
      Test model. Currently queue model and kv model are supported.
      Default: queue
    --output-dir
      The directory of history files and the output files
    -p, --port
      The listening port of http agent.
      Default: 8080
    --pull
      Driver use pull consumer, default is push consumer. Just for queue model.
      Default: false
    -r, --rate
      Approximate number of requests per second. eg: 20
      Default: 20
    --recovery
      Calculate failure recovery time.
      Default: false
    --rto
      Calculate failure recovery time in fault.
      Default: false
    -u, --username
      User name for ssh remote login. eg: admin
      Default: root
    --password
      User password for ssh remote login. eg: admin
      Default: null
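The note on --fault-nodes is easy to miss, so a small wrapper can check the pairing before anything is launched. The following is a minimal sketch, not part of the tool: check_fault_args and chaos_cmd are hypothetical helpers, and the sketch assumes the note means both that fixed-xxx faults need a node list and that a node list is only valid with fixed-xxx faults. It only prints the command line it would run.

```shell
#!/bin/sh
# Sketch: validate the -f / -n pairing described in the usage text.
# check_fault_args and chaos_cmd are hypothetical helpers for illustration.

check_fault_args() {
  # $1 = fault type, $2 = semicolon-separated node list (may be empty)
  case "$1" in
    fixed-*)
      if [ -z "$2" ]; then
        echo "error: $1 requires --fault-nodes, e.g. 'n1;n2;n3'" >&2
        return 1
      fi
      ;;
    *)
      # Assumption: a node list only makes sense with fixed-xxx faults.
      if [ -n "$2" ]; then
        echo "error: --fault-nodes is only used with fixed-xxx faults" >&2
        return 1
      fi
      ;;
  esac
  return 0
}

chaos_cmd() {
  # Print the command line for a fault type and optional node list.
  check_fault_args "$1" "$2" || return 1
  if [ -n "$2" ]; then
    # Quote the node list: an unquoted ';' would end the shell command.
    echo "bin/chaos.sh --driver driver-rocketmq/rocketmq.yaml -f $1 -n '$2'"
  else
    echo "bin/chaos.sh --driver driver-rocketmq/rocketmq.yaml -f $1"
  fi
}

chaos_cmd fixed-kill 'n1;n2'
chaos_cmd random-partition ''
```

The node list is kept in single quotes because the shell treats an unquoted semicolon as a command separator, which would cut the argument short.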
The following fault types are currently supported: