AI RC Car Agent using deep reinforcement learning on Jetson Nano
Overview
This software lets your AI Robocar teach itself to drive with deep reinforcement learning in a few minutes.
You can use it with a real Robocar and with DonkeySim.
Many DIY self-driving cars, such as JetBot, JetRacer, and DonkeyCar, use behavior cloning, a supervised-learning method. It needs a large amount of labeled data collected from human demonstrations, so the quality of the human driving matters a great deal.
In contrast, this software uses deep reinforcement learning (DRL). The agent acquires driving behavior automatically through interaction with the environment and does not need human-labeled sample data.
In addition, the agent runs on the Jetson Nano. Short training time on such a small device is possible because the agent combines SAC (soft actor-critic) with a VAE. SAC is a state-of-the-art off-policy reinforcement learning method. The VAE is trained beforehand on a cloud server and serves as the CNN layer of SAC. (This approach is called state representation learning.)
This method was devised by Antonin Raffin.
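As a rough illustration of state representation learning, the sketch below builds the observation that a SAC policy could receive: the VAE latent vector concatenated with the most recent steering and throttle commands. It is a minimal sketch, not the project's implementation. The names `build_observation` and `new_history`, the latent size of 32, and the history length of 3 are assumptions chosen for the example; in this software the corresponding values are read from config.yml (VARIANTS_SIZE and N_COMMAND_HISTORY).

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple

# Hypothetical values for illustration; the real ones come from config.yml
# (VARIANTS_SIZE and N_COMMAND_HISTORY).
LATENT_SIZE = 32
N_COMMAND_HISTORY = 3

Command = Tuple[float, float]  # (steering, throttle)


def new_history() -> Deque[Command]:
    """Start with zero commands so the observation size is fixed from step one."""
    return deque([(0.0, 0.0)] * N_COMMAND_HISTORY, maxlen=N_COMMAND_HISTORY)


def build_observation(latent: Sequence[float], history: Deque[Command]) -> List[float]:
    """Concatenate the VAE latent vector with the recent commands."""
    if len(latent) != LATENT_SIZE:
        raise ValueError(f"expected {LATENT_SIZE} latent values, got {len(latent)}")
    flat_history = [value for command in history for value in command]
    return list(latent) + flat_history


history = new_history()
history.append((0.25, 0.5))  # the newest command pushes out the oldest one
observation = build_observation([0.0] * LATENT_SIZE, history)
print(len(observation))
```

Because the policy sees a short vector instead of a raw camera image, the SAC networks can stay small, which is one plausible reason training fits on a Jetson Nano.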
Details of SAC are here:
This demo video shows a JetBot learning a policy for driving along the road in under 30 minutes, using only the Jetson Nano.
Jetbot or JetRacer
JetPack>=4.2
Python>=3.6
pip>=19.3.1
pytorch>=1.8.0
Windows, macOS or Ubuntu (DonkeySim only)
x86-64 arch
Python>=3.6
pip>=19.3.1
DonkeySim
Optional: CUDA 10.1 (Windows, when using a GPU)
pytorch>=1.8.0
Set up JetBot using the following SDCard image. [https://jetbot.org/v0.4.3/software_setup/sd_card.html]
Check your JetBot environment and write down JETBOT_VERSION and L4T_VERSION.
#JETBOT_VERSION
$ sudo docker images jetbot/jetbot | grep jupyter | cut -f 8 -d ' ' | cut -f 2 -d '-'
#L4T_VERSION
$ sudo docker images jetbot/jetbot | grep jupyter | cut -f 8 -d ' ' | cut -f 3 -d '-'
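The two commands above cut the same image tag at its dashes, which implies a tag of the form jupyter-<JETBOT_VERSION>-<L4T_VERSION>. Under that assumption, the following sketch splits such a tag in one step. The function name `parse_jupyter_tag` and the sample tag are hypothetical and only serve the illustration.

```python
from typing import NamedTuple


class JetBotVersions(NamedTuple):
    jetbot_version: str
    l4t_version: str


def parse_jupyter_tag(tag: str) -> JetBotVersions:
    """Split a tag of the assumed form 'jupyter-<JETBOT_VERSION>-<L4T_VERSION>'."""
    parts = tag.strip().split("-")
    if len(parts) != 3 or parts[0] != "jupyter":
        raise ValueError(f"unexpected tag format: {tag!r}")
    return JetBotVersions(jetbot_version=parts[1], l4t_version=parts[2])


# Hypothetical tag, used only for illustration.
versions = parse_jupyter_tag("jupyter-0.4.3-32.5.0")
print(versions.jetbot_version, versions.l4t_version)
```

If your tag does not match this shape, the function raises an error instead of returning the wrong field, which the whitespace-based cut commands cannot do.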
Then set up the LearningRacer Docker container image.
$ cd ~/ && git clone https://github.com/masato-ka/airc-rl-agent.git
$ cd airc-rl-agent/docker/jetbot && sh build.sh
$ sh enable.sh /home/jetbot
# Disable the jetbot/jetbot container. Adjust the tag name for your system using JETBOT_VERSION and L4T_VERSION.
$ sudo docker update --restart=no jetbot_jupyter
$ sudo reboot
JetBot images (JetPack>=4.4) run their software in Docker containers, so the application is built on a Docker container as well. Allocate the maximum memory to the container.
You can use the racer command inside the Docker container. Access Jupyter Notebook on the container[http://
Because of a torch version problem, you need to train your own VAE model. Change the save call in the training cell of VAE_CNN.ipynb to torch.save(vae.state_dict(), 'vae.torch', _use_new_zipfile_serialization=True).
First, set up your JetRacer software with JetPack 4.5.1 by following this link. Then run the commands below in your JetRacer terminal.
$ cd ~/ && git clone https://github.com/masato-ka/airc-rl-agent.git
$ cd airc-rl-agent
$ sh install_jetpack.sh
Sometimes PyTorch cannot recognize your GPU because of a CUDA driver problem. In this situation, install PyTorch by following this link. For details, see this
$ cd ~/ && git clone https://github.com/masato-ka/airc-rl-agent.git
$ cd airc-rl-agent
$ sudo pip3 install .\[choose platform\]
When the installation completes, check that the command runs.
$ racer --version
learning_racer version 1.5.0 .
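If you want to check the installed version from a script rather than by eye, a small parser like the one below can read the line printed by racer --version. This is only a sketch: `parse_racer_version` is a hypothetical helper, and it assumes the output keeps the "version X.Y.Z" shape shown above.

```python
import re
from typing import Tuple


def parse_racer_version(output: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch) from text such as 'learning_racer version 1.5.0 .'."""
    match = re.search(r"version\s+(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return tuple(int(group) for group in match.groups())


version = parse_racer_version("learning_racer version 1.5.0 .")
print(version)
```

Comparing the resulting tuple, for example `version >= (1, 5, 0)`, avoids the pitfalls of comparing version strings character by character.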
Collect camera images with data_collection.ipynb or data_collection_without_gamepad.ipynb in notebook/utility/jetbot. If you use JetRacer, use notebook/utility/jetracer/data_collection.ipynb.
Train the VAE with VAE CNN.ipynb on Google Colaboratory. When your robot is a JetBot, modify VAE_CNN.ipynb.
A. Offline check
When you run VAE_CNN.ipynb, you can inspect a projection of the latent space in the TensorBoard Projector tab. The latent vectors are labeled by K-means. If similar images cluster together, the latent space is good.
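To make the K-means labeling concrete, here is a small pure-Python sketch that assigns cluster labels to a handful of two-dimensional points standing in for latent vectors. It is a stand-in under stated assumptions, not the notebook's code: the notebook most likely relies on a library implementation, and the sample points, the choice of two clusters, and the initial centroids are all invented for the example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to the point."""
    distances = [distance(point, c) for c in centroids]
    return distances.index(min(distances))


def mean(points: Sequence[Point]) -> Point:
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points: Sequence[Point], centroids: Sequence[Point], iterations: int = 10) -> List[int]:
    """Alternate between assigning labels and moving centroids."""
    centroids = list(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean(members)
    return labels


# Made-up "latent vectors": two tight groups, like images from two parts of a course.
latents = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels = kmeans(latents, centroids=[latents[0], latents[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In the projection, points with the same label correspond to the same color. A good latent space shows each color forming its own compact group, as the two groups do here.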
B. Online check
Run notebooks/util/jetbot_vae_viewer.ipynb and check the reconstructed image. Check that the image is reconstructed correctly at several places on the course.
If you use JetRacer, use jetracer_vae_viewer.ipynb.
user_interface_without_gamepad.ipynb
$ racer train -robot jetbot
# On JetRacer, use "-robot jetracer". The default is jetbot.
After a few minutes, the AI car starts running. Push the STOP button just before the car leaves the course. Then, after RESET is displayed at the prompt, press the START button. Repeat this.
When you use the without_gamepad notebook, you can check the status in the Validation box.
Can run | Waiting learning |
---|---|
Name | description | Default |
---|---|---|
-config(--config-path) | Specify the file path of config.yml. | config.yml |
-vae(--vae-path) | Specify the file path of the trained VAE model. | vae.torch |
-device(--device) | Specifies whether Pytorch uses CUDA. Set 'cuda' to use. Set 'cpu' when using CPU. | cuda |
-robot(--robot-driver) | Specify the type of car to use. choose from jetbot, jetracer, jetbot-auto, jetracer-auto and sim. | JetBot |
-steps(--time-steps) | Specify the maximum learning step for reinforcement learning. Modify the values according to the size and complexity of the course. | 5000 |
-save_freq(--save_freq_episode) | Specify how many episodes to save the policy model. The policy starts saving after the gradient calculation starts. | 10 |
-s(--save) | Specify the path and file name to save the model file of the training result. | model |
-l(--load-model) | Specify the path of a pre-trained model to load. | - |
If you choose jetracer-auto or jetbot-auto for the -robot option, auto train mode starts. In this mode, the robot stops without human control and moves back to the position where learning started.
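The option table above maps naturally onto a command-line parser. The sketch below rebuilds it with Python's argparse to show how the short and long names and the defaults fit together. It is an illustration, not the parser that racer actually uses: the function name `build_train_parser` is hypothetical, and the lowercase default jetbot and the restriction of -device to cuda or cpu are assumptions drawn from the descriptions in the table.

```python
import argparse


def build_train_parser() -> argparse.ArgumentParser:
    """Recreate the documented 'racer train' options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="racer train")
    parser.add_argument("-config", "--config-path", default="config.yml",
                        help="File path of config.yml.")
    parser.add_argument("-vae", "--vae-path", default="vae.torch",
                        help="File path of the trained VAE model.")
    parser.add_argument("-device", "--device", default="cuda", choices=["cuda", "cpu"],
                        help="Device used by PyTorch.")
    parser.add_argument("-robot", "--robot-driver", default="jetbot",
                        choices=["jetbot", "jetracer", "jetbot-auto", "jetracer-auto", "sim"],
                        help="Type of car to use.")
    parser.add_argument("-steps", "--time-steps", type=int, default=5000,
                        help="Maximum number of learning steps.")
    parser.add_argument("-save_freq", "--save_freq_episode", type=int, default=10,
                        help="Save the policy model every this many episodes.")
    parser.add_argument("-s", "--save", default="model",
                        help="Path and file name for the trained model.")
    parser.add_argument("-l", "--load-model", default=None,
                        help="Path of a pre-trained model to load.")
    return parser


# Equivalent to: racer train -robot jetracer -steps 3000
args = build_train_parser().parse_args(["-robot", "jetracer", "-steps", "3000"])
print(args.robot_driver, args.time_steps, args.device)
```

Options that are not given on the command line fall back to the defaults in the table, so the call above still trains on cuda and saves every 10 episodes.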
For inference only, run the command below. The script loads the VAE model and the RL model and starts driving your car.
$ racer demo -robot jetbot
Name | description | Default |
---|---|---|
-config(--config-path) | Specify the file path of config.yml. | config.yml |
-vae(--vae-path) | Specify the file path of the trained VAE model. | vae.torch |
-model(--model-path) | Specify the file to load the trained reinforcement learning model. | model |
-device(--device) | Specifies whether Pytorch uses CUDA. Set 'cuda' to use. Set 'cpu' when using CPU. | cuda |
-robot(--robot-driver) | Specify the type of car to use. JetBot and JetRacer can be specified. | JetBot |
-steps(--time-steps) | Specify the maximum step for demo. Modify the values according to the size and complexity of the course. | 5000 |
-tblog(--tb-log) | Specify the logging directory name. If not set, nothing is logged. | None |
The command below runs the demo for 1000 steps with the model file named model.
$ racer demo -robot jetbot -steps 1000 -model model
You can get a pre-trained VAE model from here
$ wget "https://drive.google.com/uc?export=download&id=19r1yuwiRGGV-BjzjoCzwX8zmA8ZKFNcC" -O vae.torch
$ racer train -robot sim -vae <downloaded vae model path> -device cpu -host <DonkeySim IP>
Name | description | Default |
---|---|---|
-config(--config-path) | Specify the file path of config.yml. | config.yml |
-vae(--vae-path) | Specify the file path of the trained VAE model. | vae.torch |
-device(--device) | Specifies whether Pytorch uses CUDA. Set 'cuda' to use. Set 'cpu' when using CPU. | cuda |
-robot(--robot-driver) | Specify the type of car to use. JetBot and JetRacer can be specified. | JetBot |
-steps(--time-steps) | Specify the maximum learning step for reinforcement learning. Modify the values according to the size and complexity of the course. | 5000 |
-save_freq(--save_freq_episode) | Specify how many steps to save the policy model. The policy starts saving after the gradient calculation starts. | 10 |
-save_path(--save-model-path) | Specify the path for saved model file. | model_log |
-s(--save) | Specify the path and file name to save the model file of the training result. | model |
-l(--load-model) | Specify the path of a pre-trained model to load. | - |
$ racer demo -robot sim -model <own trained model path> -vae <downloaded vae model path> -steps 1000 -device cpu -host <DonkeySim IP> -user <your own name>
Name | description | Default |
---|---|---|
-config(--config-path) | Specify the file path of config.yml. | config.yml |
-vae(--vae-path) | Specify the file path of the trained VAE model. | vae.torch |
-model(--model-path) | Specify the file to load the trained reinforcement learning model. | model |
-device(--device) | Specifies whether Pytorch uses CUDA. Set 'cuda' to use. Set 'cpu' when using CPU. | cuda |
-robot(--robot-driver) | Specify the type of car to use. JetBot and JetRacer can be specified. | JetBot |
-steps(--time-steps) | Specify the maximum step for demo. Modify the values according to the size and complexity of the course. | 5000 |
-user(--sim-user) | Specify the user name shown for your car in DonkeySim. | anonymous |
-car(--sim-car) | Specify the car model type shown for your car in DonkeySim. | Donkey |
You can configure some hyperparameters using config.yml.
Section | Parameter | Description |
---|---|---|
SAC_SETTING | LOG_INTERVAL | See the Stable Baselines documentation. |
^ | VERBOSE | ^ |
^ | LERNING_RATE | ^ |
^ | ENT_COEF | ^ |
^ | TRAIN_FREQ | ^ |
^ | BATCH_SIZE | ^ |
^ | GRADIENT_STEPS | ^ |
^ | LEARNING_STARTS | ^ |
^ | BUFFER_SIZE | ^ |
^ | GAMMA | ^ |
^ | TAU | ^ |
^ | USER_SDE | ^ |
^ | USER_SDE_AT_WARMUP | ^ |
^ | SDE_SAMPLE_FREQ | ^ |
^ | VARIANTS_SIZE | Size of the VAE latent vector. |
^ | IMAGE_CHANNELS | Number of image channels. |
REWARD_SETTING | REWARD_CRASH | Reward given on a crash. |
^ | CRASH_REWARD_WEIGHT | Weight of the crash reward. |
^ | THROTTLE_REWARD_WEIGHT | Weight of the reward for speed. |
AGENT_SETTING | N_COMMAND_HISTORY | Length of the command history included in the observation. |
^ | MIN_STEERING | Minimum value of agent steering. |
^ | MAX_STEERING | Maximum value of agent steering. |
^ | MIN_THROTTLE | Minimum value of agent throttle. |
^ | MAX_THROTTLE | Maximum value of agent throttle. |
^ | MAX_STEERING_DIFF | Maximum change in steering between consecutive steps. |
JETRACER_SETTING | STEERING_CHANNEL | Steering PWM pin number. |
^ | THROTTLE_CHANNEL | Throttle PWM pin number. |
^ | STEERING_GAIN | value of steering gain for NvidiaCar. |
^ | STEERING_OFFSET | value of steering offset for NvidiaCar. |
^ | THROTTLE_GAIN | value of throttle gain for NvidiaCar. |
^ | THROTTLE_OFFSET | value of throttle offset for NvidiaCar. |
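To show how the reward and agent settings might interact, the sketch below applies MAX_STEERING_DIFF as a per-step limit on steering changes and combines the three reward settings into a single reward value. Treat it as one plausible reading of the parameter descriptions, not as the project's code: the reward formula follows the style commonly used in "learning to drive" SAC plus VAE agents, and every numeric value in `CONFIG`, as well as the function names, is a hypothetical placeholder.

```python
# Hypothetical values standing in for entries of config.yml.
CONFIG = {
    "REWARD_CRASH": -10.0,
    "CRASH_REWARD_WEIGHT": 5.0,
    "THROTTLE_REWARD_WEIGHT": 0.1,
    "MIN_STEERING": -1.0,
    "MAX_STEERING": 1.0,
    "MAX_STEERING_DIFF": 0.25,
}


def limit_steering(previous: float, requested: float, config: dict = CONFIG) -> float:
    """Move toward the requested steering by at most MAX_STEERING_DIFF per step."""
    max_diff = config["MAX_STEERING_DIFF"]
    step = max(-max_diff, min(max_diff, requested - previous))
    steering = previous + step
    return max(config["MIN_STEERING"], min(config["MAX_STEERING"], steering))


def compute_reward(throttle: float, crashed: bool, config: dict = CONFIG) -> float:
    """Assumed shaping: reward speed while driving, penalize fast crashes more."""
    if crashed:
        return config["REWARD_CRASH"] - config["CRASH_REWARD_WEIGHT"] * throttle
    return 1.0 + config["THROTTLE_REWARD_WEIGHT"] * throttle


print(limit_steering(previous=0.0, requested=1.0))
print(compute_reward(throttle=0.5, crashed=False))
print(compute_reward(throttle=0.5, crashed=True))
```

Under this reading, a larger THROTTLE_REWARD_WEIGHT pushes the agent to drive faster, while a larger CRASH_REWARD_WEIGHT makes leaving the course at speed more costly. A smaller MAX_STEERING_DIFF gives smoother but slower steering corrections.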
2020/03/08 Alpha release
2020/03/16 Alpha-0.0.1 release
2020/03/23 Beta release
2020/03/23 Beta-0.0.1 release
2020/04/26 v1.0.0 release
2020/06/30 v1.0.5 release
2020/10/11 v1.5.0 release
2021/01/09 v1.5.1 release
2021/04/11 v1.5.2 release
2021/12/26 v1.6.0 release
2022/03/27 v1.7.0 release
2022/07/03 v1.7.1 release
This software is licensed under the MIT License.