[ICCVW-2021] SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection
By Prarthana Bhattacharyya, Chengjie Huang and Krzysztof Czarnecki.
We provide code support and configuration files to reproduce the results in the paper:
SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection.
Our code is based on OpenPCDet, a clean open-source project for benchmarking 3D object detection methods.
Fig.1. Self-Attention augmented global-context aware backbone networks.
In this paper, we explore variations of
self-attention for contextual modeling in 3D object
detection by augmenting convolutional features with
self-attention features.
We first incorporate the pairwise self-attention mechanism into the current
state-of-the-art BEV, voxel, point and point-voxel based detectors and show
consistent improvement over strong baseline models while
significantly reducing their parameter footprint and computational cost.
We call this variant full self-attention (FSA).
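As a rough illustration of the idea, the sketch below computes pairwise (scaled dot-product) self-attention over a small set of feature vectors and appends the result to the original convolutional features. It is a minimal NumPy sketch, not the repository's implementation: the function names, the random projection matrices, the toy sizes, and the choice of concatenation as the augmentation step are all assumptions made for this example.

```python
import numpy as np


def full_self_attention(features, w_q, w_k, w_v):
    """Pairwise scaled dot-product self-attention over N feature vectors.

    features: (N, C) array; w_q, w_k, w_v: (C, D) projection matrices.
    Returns the attended features (N, D) and the attention weights (N, N).
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    # Every feature attends to every other feature: an N x N score matrix.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable row-wise softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


def augment(conv_features, attn_features):
    """Assumed augmentation: concatenate along the channel axis."""
    return np.concatenate([conv_features, attn_features], axis=-1)


# Toy input: 6 feature vectors with 8 channels (hypothetical sizes).
rng = np.random.default_rng(0)
n, c, d = 6, 8, 4
conv_features = rng.standard_normal((n, c))
w_q, w_k, w_v = (rng.standard_normal((c, d)) for _ in range(3))

attn_features, weights = full_self_attention(conv_features, w_q, w_k, w_v)
augmented = augment(conv_features, attn_features)
print(augmented.shape)  # (6, 12)
```

The N x N score matrix is what makes this variant "full": its cost grows quadratically with the number of features.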
We also propose a self-attention variant that
samples a subset of the most representative features by
learning deformations over randomly sampled locations.
This not only allows us to scale explicit global contextual
modeling to larger point clouds,
but also leads to more discriminative and informative feature
descriptors. We call this variant deformable self-attention (DSA).
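The sampling idea can be sketched as follows: pick a random subset of locations, shift each by an offset predicted from its feature, gather features at the shifted locations, and run self-attention over that subset only. This is a simplified NumPy sketch under several assumptions that do not come from the paper or the repository: the offset predictor is a single hypothetical linear map (`w_offset`, random here rather than learned), features at deformed locations are gathered by nearest-neighbour lookup, and the attention uses no learned projections.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def deformable_subset(points, features, w_offset, num_samples, rng):
    """Sample locations at random, deform them, and gather their features.

    points: (N, 3), features: (N, C), w_offset: (C, 3) hypothetical weights.
    """
    idx = rng.choice(points.shape[0], size=num_samples, replace=False)
    anchors = points[idx]                    # (M, 3) randomly sampled locations
    offsets = features[idx] @ w_offset       # (M, 3) offsets predicted from features
    deformed = anchors + offsets
    # Assumption: take the feature of the nearest original point.
    dists = np.linalg.norm(deformed[:, None, :] - points[None, :, :], axis=-1)
    nearest = dists.argmin(axis=-1)          # (M,)
    return deformed, features[nearest]


def self_attention(x):
    """Projection-free self-attention over the sampled subset."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ x


rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(64, 3))
features = rng.standard_normal((64, 8))
w_offset = 0.1 * rng.standard_normal((8, 3))  # stand-in for learned weights

deformed, subset = deformable_subset(points, features, w_offset, 16, rng)
context = self_attention(subset)
print(deformed.shape, context.shape)  # (16, 3) (16, 8)
```

With 16 sampled locations out of 64 points, the attention matrix is 16 x 16 instead of 64 x 64, which is the scaling benefit the text describes.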
Fig.2. 3D Car AP with respect to params and FLOPs of baseline and proposed self-attention variants.
Fig.3. Visualizing qualitative results between baseline and our proposed self-attention module.
We provide our proposed detection models in this section. The table below reports 3D AP (R-40) for the Car class at moderate difficulty on the KITTI 3D Object Detection validation set.
Notes:
Model | Car 3D AP | Params (M) | G-FLOPs | Download |
---|---|---|---|---|
PointPillar_baseline | 78.39 | 4.8 | 63.4 | PointPillar |
PointPillar_red | 78.07 | 1.5 | 31.5 | PointPillar-red |
PointPillar_DSA | 78.94 | 1.1 | 32.4 | PointPillar-DSA |
PointPillar_FSA | 79.04 | 1.0 | 31.7 | PointPillar-FSA |
SECOND_baseline | 81.61 | 4.6 | 76.7 | SECOND |
SECOND_red | 81.11 | 2.5 | 51.2 | SECOND-red |
SECOND_DSA | 82.03 | 2.2 | 52.6 | SECOND-DSA |
SECOND_FSA | 81.86 | 2.2 | 51.9 | SECOND-FSA |
Point-RCNN_baseline | 80.52 | 4.0 | 27.4 | Point-RCNN |
Point-RCNN_red | 80.40 | 2.2 | 24 | Point-RCNN-red |
Point-RCNN_DSA | 81.80 | 2.3 | 19.3 | Point-RCNN-DSA |
Point-RCNN_FSA | 82.10 | 2.5 | 19.8 | Point-RCNN-FSA |
PV-RCNN_baseline | 84.83 | 12 | 89 | PV-RCNN |
PV-RCNN_DSA | 84.71 | 10 | 64 | PV-RCNN-DSA |
PV-RCNN_FSA | 84.95 | 10 | 64.3 | PV-RCNN-FSA |
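To make the trade-off in the table easier to read, the short script below computes the AP change and the relative reduction in parameters and G-FLOPs of each FSA model against its baseline. The numbers are copied from the table; the helper names and the choice to compare only the FSA rows are assumptions of this example.

```python
# (Car 3D AP, params in millions, G-FLOPs), copied from the table above.
ROWS = {
    "PointPillar": {"baseline": (78.39, 4.8, 63.4), "FSA": (79.04, 1.0, 31.7)},
    "SECOND": {"baseline": (81.61, 4.6, 76.7), "FSA": (81.86, 2.2, 51.9)},
    "Point-RCNN": {"baseline": (80.52, 4.0, 27.4), "FSA": (82.10, 2.5, 19.8)},
    "PV-RCNN": {"baseline": (84.83, 12.0, 89.0), "FSA": (84.95, 10.0, 64.3)},
}


def percent_reduction(baseline, variant):
    """Relative reduction of `variant` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - variant) / baseline


def summarize(rows):
    """Compare each FSA model with its baseline."""
    summary = {}
    for name, variants in rows.items():
        base_ap, base_params, base_flops = variants["baseline"]
        ap, params, flops = variants["FSA"]
        summary[name] = {
            "ap_gain": round(ap - base_ap, 2),
            "params_reduction_pct": round(percent_reduction(base_params, params), 1),
            "flops_reduction_pct": round(percent_reduction(base_flops, flops), 1),
        }
    return summary


summary = summarize(ROWS)
for name, stats in summary.items():
    print(name, stats)
```

For example, PointPillar_FSA gains 0.65 AP over its baseline while using about 79% fewer parameters and about 50% fewer G-FLOPs.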
a. Clone the repo:
git clone --recursive https://github.com/AutoVision-cloud/SA-Det3D
b. Copy SA-Det3D src into OpenPCDet:
sh ./init.sh
c. Install OpenPCDet and prepare KITTI data:
Please refer to INSTALL.md for installation and dataset preparation.
d. Run experiments with a specific configuration file:
Please refer to GETTING_STARTED.md to learn more about how to train and run inference on this detector.
If you find this project useful in your research, please consider citing:
@misc{bhattacharyya2021sadet3d,
title={SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection},
author={Prarthana Bhattacharyya and Chengjie Huang and Krzysztof Czarnecki},
year={2021},
eprint={2101.02672},
archivePrefix={arXiv},
primaryClass={cs.CV}
}