# (ROS, C++) YOLOv9 detection using TensorRT

Demo video (ROS):

https://github.com/engcang/TensorRT_YOLOv9_ROS/assets/34734707/0dff22cc-ec12-45fb-a931-fb0c90181fd7

## Requirements
+ ROS (currently supporting only ROS1)
+ C++ >= 17
+ cmake >= 3.14
+ OpenCV >= 4.2
+ TensorRT, CUDA, cuDNN
+ `.engine` file generated with TensorRT
+ Tested versions: CUDA 11.5, cuDNN 8.3.2.44, TensorRT 8.4.0.6

## How to install CUDA and cuDNN
+ Install CUDA following the instructions here: https://developer.nvidia.com/cuda-downloads
+ Install cuDNN following the instructions here: https://developer.nvidia.com/cudnn-downloads
+ Add CUDA to your environment paths:
```bash
gedit ~/.bashrc
# *** Type and save below; CUDA_PATH should be like /usr/local/cuda-11.5, depending on your version ***
export PATH=CUDA_PATH/bin:$PATH
export LD_LIBRARY_PATH=CUDA_PATH/lib64:$LD_LIBRARY_PATH
. ~/.bashrc

gedit ~/.profile
# *** Type and save below; CUDA_PATH should be like /usr/local/cuda-11.5, depending on your version ***
export PATH=CUDA_PATH/bin:$PATH
export LD_LIBRARY_PATH=CUDA_PATH/lib64:$LD_LIBRARY_PATH
. ~/.profile

# Verify
dpkg -l | grep cuda
dpkg -l | grep cudnn
nvcc --version
```
## How to install TensorRT
+ Download and install TensorRT here: https://developer.nvidia.com/tensorrt-download
```bash
sudo apt install tensorrt
sudo apt install python3-libnvinfer-dev
sudo apt install onnx-graphsurgeon
```
+ Installed TensorRT packages: `tensorrt`, `python3-libnvinfer-dev`, `onnx-graphsurgeon`
## How to prepare the `.engine` file
+ Set up a Python3 virtual env:
```bash
python3 -m pip install virtualenv virtualenvwrapper
cd <PATH YOU WANT TO SAVE VIRTUAL ENVIRONMENT>
virtualenv -p python3 <NAME YOU WANT>

# *** Now you can activate with
source <PATH YOU SAVED>/<NAME YOU WANT>/bin/activate
# *** Deactivate with
deactivate
```
+ Clone the YOLOv9 repo and install the requirements:
```bash
git clone https://github.com/WongKinYiu/yolov9
cd yolov9
pip install -r requirements.txt
```
+ Get a YOLOv9 weight file as `.pt` by training on your own data or downloading a pre-trained model here: https://github.com/WongKinYiu/yolov9/releases
+ Reparameterize the `.pt` file (saving computation, memory, and size by trimming parts that are unnecessary for inference and needed only for training):
```bash
cd yolov9 # cloned at the step above
wget https://raw.githubusercontent.com/engcang/TensorRT_YOLOv9_ROS/main/reparameterize.py
# *** Change the number of classes in reparameterize.py at line 8 (nc=80)
python reparameterize.py yolov9-c.pt yolov9-c-reparameterized.pt # input.pt output.pt
```
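Conceptually, reparameterization keeps only the weights needed at inference time and drops training-only parts of the checkpoint. A minimal sketch of that idea with plain dicts (the key names below are hypothetical, not the actual YOLOv9 state-dict layout):

```python
# Conceptual sketch: drop training-only entries from a checkpoint so only
# inference weights remain. Key names are hypothetical illustrations.
def strip_training_branches(state_dict, training_prefixes=("aux.", "ema_")):
    """Return a copy of the checkpoint without training-only entries."""
    return {k: v for k, v in state_dict.items()
            if not k.startswith(training_prefixes)}

checkpoint = {
    "backbone.conv1.weight": [0.1, 0.2],  # needed for inference
    "aux.head.weight": [0.3],             # auxiliary branch, training only
    "ema_decay": 0.999,                   # training bookkeeping
}
slim = strip_training_branches(checkpoint)
print(sorted(slim))  # ['backbone.conv1.weight']
```

The actual `reparameterize.py` additionally fuses layers; this sketch only illustrates why the resulting file is smaller.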
+ Export the reparameterized `.pt` file as `.onnx`:
```bash
python export.py --weights yolov9-c-reparameterized.pt --include onnx
```
+ Convert the `.onnx` file to a `.engine` file:
```bash
/usr/src/tensorrt/bin/trtexec --onnx=yolov9-c-reparameterized.onnx --saveEngine=yolov9-c.engine

# for faster, less accurate
/usr/src/tensorrt/bin/trtexec --onnx=yolov9-c-reparameterized.onnx --saveEngine=yolov9-c-fp16.engine --fp16

# not recommended - much faster, much less accurate
/usr/src/tensorrt/bin/trtexec --onnx=yolov9-c-reparameterized.onnx --saveEngine=yolov9-c-int8.engine --int8
```
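The accuracy cost of `--fp16` and `--int8` comes from lower-precision weight storage. A self-contained illustration (not TensorRT itself) of the representation error each precision introduces:

```python
import struct

# Illustration of the fp16/int8 trade-off: lower-precision storage cannot
# represent fp32 values exactly, and int8 loses more than fp16.
def to_fp16(x: float) -> float:
    """Round-trip a float through IEEE 754 half precision."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_int8(x: float, scale: float) -> float:
    """Symmetric int8 quantization: store x as an integer in [-127, 127]."""
    q = max(-127, min(127, round(x / scale)))
    return q * scale

weights = [0.12345678, -0.98765432, 0.00054321]
scale = max(abs(w) for w in weights) / 127  # one scale per tensor

fp16_err = max(abs(w - to_fp16(w)) for w in weights)
int8_err = max(abs(w - to_int8(w, scale)) for w in weights)
print(f"max fp16 error: {fp16_err:.2e}")
print(f"max int8 error: {int8_err:.2e}")
```

In practice TensorRT calibrates int8 scales per layer from sample data, which is why int8 engines need careful validation.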
## How to train your own data (YOLOv9 is cloned and the requirements are installed already)
+ Prepare a dataset in YOLO format, e.g. with Roboflow: https://docs.ultralytics.com/yolov5/tutorials/roboflow_datasets_integration/
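In YOLO format, each image has a `.txt` label file with one `class cx cy w h` line per object, where the box center and size are normalized to [0, 1] by the image dimensions. A small sketch of the conversion from pixel coordinates (the helper name is illustrative):

```python
# YOLO-format labels: one "class cx cy w h" line per object, normalized
# to [0, 1] by the image width and height.
def to_yolo_line(cls, x_min, y_min, x_max, y_max, img_w, img_h):
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# A 100x200-pixel box centered at (320, 240) in a 640x480 image:
line = to_yolo_line(0, 270, 140, 370, 340, 640, 480)
print(line)  # 0 0.500000 0.500000 0.156250 0.416667
```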
+ Make a `data.yaml` file by copying and editing `yolov9/data/coco.yaml` as follows:
```yaml
path: training # dataset root dir (relative from train.py file)
train: train # train images folder (relative to 'path')
val: val # val images folder (relative to 'path')
test: test # test images folder (relative to 'path')

# Classes
names:
  0: Transmission tower
  1: Insulator
```
+ Make a `yolov9.yaml` file by copying and editing `yolov9/models/detect/yolov9.yaml` (or `yolov9-c.yaml`, `yolov9-e.yaml`, etc.):
```yaml
# parameters
nc: 2 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
#activation: nn.LeakyReLU(0.1)
#activation: nn.ReLU()

# anchors
anchors: 3

# YOLOv9 backbone
backbone:
  [
   [-1, 1, Silence, []],

   # conv down
   [-1, 1, Conv, [64, 3, 2]], # 1-P1/2
   ...
  ]
```
+ Use `yolov9/data/hyps/hyp.scratch-high.yaml` as the hyperparameter file
+ Place the files within the `yolov9` folder. If they are outside the `yolov9` folder, an error occurs!
```
yolov9
│ ...
├─ data                  # Reference folder
│  ├─ coco.yaml
│  └─ hyps
│     └─ hyp.scratch-high.yaml
├─ models                # Reference folder
│  │ ...
│  ├─ detect
│  │  │ ...
│  │  ├─ yolov9-c.yaml
│  │  ├─ yolov9-e.yaml
│  │  └─ yolov9.yaml
├─ runs                  # Output saved folder
│  │ ...
├─ train.py              # Using this file for GELAN
├─ train_dual.py         # Using this file for YOLOv9
└─ training              # Using this folder
   ├─ yolov9-c.pt
   ├─ data.yaml
   ├─ yolov9.yaml
   ├─ test
   │  ├─ 02001.jpg
   │  ├─ 02001.txt
   │  └─ ...
   ├─ train
   │  ├─ 00001.jpg
   │  ├─ 00001.txt
   │  └─ ...
   └─ val
      ├─ 04000.jpg
      ├─ 04000.txt
      └─ ...
```
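A common training failure is an image without a matching label file. A quick sanity check for the layout above (the helper name is illustrative; it assumes `.jpg` images with `.txt` labels side by side):

```python
from pathlib import Path
import tempfile

# Sanity check for the layout above: every image in train/val/test should
# have a matching YOLO .txt label file next to it.
def missing_labels(dataset_root):
    root = Path(dataset_root)
    missing = []
    for split in ("train", "val", "test"):
        for img in sorted((root / split).glob("*.jpg")):
            if not img.with_suffix(".txt").exists():
                missing.append(img)
    return missing

# Build a toy "training" folder to demonstrate:
tmp = Path(tempfile.mkdtemp())
for split in ("train", "val", "test"):
    (tmp / split).mkdir()
(tmp / "train" / "00001.jpg").touch()
(tmp / "train" / "00001.txt").touch()
(tmp / "val" / "04000.jpg").touch()   # label deliberately missing

print([p.name for p in missing_labels(tmp)])  # ['04000.jpg']
```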
```bash
cd yolov9

# *** Using a pretrained model (yolov9-c.pt here), fine-tuning:
python train_dual.py --batch-size 4 --epochs 100 --img 640 --device 0 --close-mosaic 15 \
--data training/data.yaml --weights training/yolov9-c.pt --cfg training/yolov9.yaml --hyp data/hyps/hyp.scratch-high.yaml

# *** From scratch:
python train_dual.py --batch-size 4 --epochs 100 --img 640 --device 0 --close-mosaic 15 \
--data training/data.yaml --weights '' --cfg training/yolov9.yaml --hyp data/hyps/hyp.scratch-high.yaml
```
### Troubleshooting
+ `AttributeError: 'FreeTypeFont' object has no attribute 'getsize'`:
```bash
pip install Pillow==9.5.0
```
+ Training prints `Killed` and does not train: reduce the `batch-size` a lot
+ `AssertionError: Invalid CUDA '--device 0' requested, use '--device cpu' or pass valid CUDA device(s)`: your `torch` and `torchvision` are not CUDA versions.
```bash
# *** Check the versions at https://download.pytorch.org/whl/torch_stable.html
# *** torch >= 1.7.0, torchvision >= 0.8.1
pip install torch==1.11.0+cu115 torchvision==0.12.0+cu115 -f https://download.pytorch.org/whl/torch_stable.html
```
+ `RuntimeError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 9.76 GiB total capacity; 6.68 GiB already allocated; 45.00 MiB free; 6.82 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF`: reduce the `batch-size` a lot

## How to use this ROS package
```bash
cd ~/<your_workspace>/src
git clone https://github.com/engcang/TensorRT_YOLOv9_ROS.git
# *** Check the paths of TensorRT in CMakeLists.txt ***
cd ~/<your_workspace>
catkin build -DCMAKE_BUILD_TYPE=Release
```
+ Edit `config/config.yaml` as needed, then launch:
```bash
roslaunch tensorrt_yolov9_ros run.launch
```
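`config/config.yaml` holds the node's runtime parameters. As a hypothetical illustration only (the key names below are invented; use the actual keys shipped in the repo's `config/config.yaml`):

```yaml
# Hypothetical sketch - see the repo's config/config.yaml for the real keys
engine_path: "yolov9-c.engine"    # .engine file built in the steps above
image_topic: "/camera/image_raw"  # input image topic to subscribe to
confidence_threshold: 0.25        # drop detections below this score
```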
## Reference
+ YOLO (v3, v4, v7) accelerated with TensorRT using tkdnn