BEVDet implemented by TensorRT, C++; Achieving real-time performance on Orin
English | 简体中文
This project is a C++/TensorRT implementation of BEVDet inference. It can be evaluated on the nuScenes dataset and also ships with a single test sample. BEVDet is a multi-camera 3D object detection model operating in bird's-eye view (BEV); for more details, see the BEVDet repository. The script for exporting the ONNX model is included in this repository.
NEWS: A new branch, "one", has been released. This branch implements the TensorRT plugins: bevdet-tensorrt-cpp.
This project implements the following:
The features of this project are as follows:
The following parts need to be implemented:
All times are in milliseconds (ms); nearest interpolation is used by default.
| | Preprocess | Image stage | BEV pool | Align feature | BEV stage | Postprocess | Mean total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NVIDIA A4000 FP32 | 0.478 | 16.559 | 0.151 | 0.899 | 6.848 | 0.558 | 25.534 |
| NVIDIA A4000 FP16 | 0.512 | 8.627 | 0.168 | 0.925 | 2.966 | 0.619 | 13.817 |
| NVIDIA A4000 Int8 | 0.467 | 3.929 | 0.143 | 0.885 | 1.771 | 0.631 | 7.847 |
| Jetson AGX Orin FP32 | 2.800 | 38.090 | 0.620 | 2.018 | 11.893 | 1.065 | 55.104 |
| Jetson AGX Orin FP16 | 2.816 | 17.025 | 0.571 | 2.111 | 5.747 | 0.919 | 29.189 |
| Jetson AGX Orin Int8 | 2.924 | 10.340 | 0.596 | 1.861 | 4.004 | 0.982 | 20.959 |
Note: per-module times are measured on a single frame, while the total is averaged over 200 frames.
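As a rough illustration of how the averages above can be produced, the sketch below times a callable stage over N frames with `std::chrono`. The `meanLatencyMs` helper is illustrative only, not the project's actual API; the real project times GPU work, which would need CUDA events (`cudaEventRecord`/`cudaEventElapsedTime`) rather than wall-clock time:

```cpp
#include <chrono>
#include <functional>
#include <numeric>
#include <vector>

// Run `stage` for `frames` iterations and return the mean latency in ms.
// Hypothetical helper for illustration; the project itself uses CUDA
// event timing for device-side work.
double meanLatencyMs(const std::function<void()>& stage, int frames) {
    std::vector<double> samples;
    samples.reserve(frames);
    for (int i = 0; i < frames; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        stage();
        auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    return std::accumulate(samples.begin(), samples.end(), 0.0) / frames;
}
```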
| Model | Description | mAP | NDS | Infer time (ms) |
| --- | --- | --- | --- | --- |
| PyTorch | | 0.3972 | 0.5074 | 96.052 |
| PyTorch | LSS accelerate 1 | 0.3787 | 0.4941 | 86.236 |
| TRT FP32 | Python preprocess 2 | 0.3776 | 0.4936 | 25.534 |
| TRT FP32 | Bicubic sampler 3 | 0.3723 | 0.3895 | 33.960 |
| TRT FP32 | Nearest sampler 4 | 0.3703 | 0.4884 | 25.534 |
| TRT FP16 | Nearest sampler | 0.3702 | 0.4883 | 13.817 |
| PyTorch | Nearest sampler 5 | 0.3989 | 0.5169 | — |
| PyTorch | LSS accelerate 5 | 0.3800 | 0.4997 | — |
| TRT FP16 | 5 | 0.3785 | 0.5013 | 12.738 |
Note: the PyTorch timings do not include preprocessing, and all models were tested on an NVIDIA A4000 GPU.
The project provides a test sample and can also run inference on the nuScenes dataset. When testing on nuScenes, you need the data_infos folder provided by this project. The data folder should have the following structure:
```
└── data
    ├── nuscenes
    │   ├── samples
    │   ├── sweeps
    │   └── ...
    └── data_infos
        ├── samples_infos
        │   ├── sample0000.yaml
        │   ├── sample0001.yaml
        │   └── ...
        ├── samples_info.yaml
        └── time_sequence.yaml
```
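Given this layout, the per-sample annotation files can be located programmatically. The sketch below (file and directory names are assumptions taken from the tree above) composes the path of the i-th sample YAML:

```cpp
#include <cstdio>
#include <string>

// Build the path of the i-th per-sample YAML under
// data/data_infos/samples_infos, following the layout shown above,
// e.g. index 0 -> "data/data_infos/samples_infos/sample0000.yaml".
// Illustrative helper, not part of the project's API.
std::string sampleInfoPath(int index, const std::string& dataRoot = "data") {
    char name[32];
    std::snprintf(name, sizeof(name), "sample%04d.yaml", index);
    return dataRoot + "/data_infos/samples_infos/" + name;
}
```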
The data_infos folder can be downloaded from Google Drive or Baidu Netdisk.
For desktop or server:
For Jetson AGX Orin:
Please use the ONNX files provided by this project to generate the TRT engines with the following script:
```shell
python tools/export_engine.py cfgs/bevdet_lt_depth.yaml model/img_stage_lt_d.onnx model/bev_stage_lt_d.engine --postfix="_lt_d_fp16" --fp16=True
```
The ONNX files can be downloaded from Baidu Netdisk or Google Drive.
```shell
mkdir build && cd build
cmake .. && make
./bevdemo ../configure.yaml
```