MinkLoc3D: Point Cloud Based Large-Scale Place Recognition
Paper: MinkLoc3D: Point Cloud Based Large-Scale Place Recognition, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), arXiv
Warsaw University of Technology
The paper presents a learning-based method for computing a discriminative 3D point cloud descriptor for place recognition purposes. Existing methods, such as PointNetVLAD, are based on an unordered point cloud representation. They use PointNet as the first processing step to extract local features, which are later aggregated into a global descriptor. The PointNet architecture is not well suited to capturing local geometric structures. Thus, state-of-the-art methods enhance the vanilla PointNet architecture by adding different mechanisms to capture local contextual information, such as graph convolutional networks or hand-crafted features. We present an alternative approach, dubbed MinkLoc3D, to compute a discriminative 3D point cloud descriptor, based on a sparse voxelized point cloud representation and sparse 3D convolutions. The proposed method has a simple and efficient architecture. Evaluation on standard benchmarks proves that MinkLoc3D outperforms the current state of the art.
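To illustrate the sparse voxelized representation mentioned above, the sketch below quantizes points to a voxel grid and keeps only the occupied voxels. This is a conceptual NumPy illustration, not the repository's code: MinkowskiEngine performs this quantization internally, and the voxel size here is an arbitrary illustrative value.

```python
import numpy as np

def sparse_quantize(points, voxel_size=0.01):
    # Map each point to an integer voxel coordinate.
    coords = np.floor(points / voxel_size).astype(np.int32)
    # Keep one entry per occupied voxel (first occurrence).
    _, idx = np.unique(coords, axis=0, return_index=True)
    return coords[np.sort(idx)]

pts = np.array([[0.001, 0.002, 0.003],
                [0.004, 0.001, 0.002],   # falls into the same voxel as the first point
                [0.055, 0.055, 0.055]])
print(sparse_quantize(pts))  # two occupied voxels remain
```

Sparse convolutions then operate only on these occupied voxels, which keeps memory and compute proportional to the number of points rather than the volume of the grid.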
If you find this work useful, please consider citing:
```
@INPROCEEDINGS{9423215,
  author={Komorowski, Jacek},
  booktitle={2021 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  title={MinkLoc3D: Point Cloud Based Large-Scale Place Recognition},
  year={2021},
  volume={},
  number={},
  pages={1789-1798},
  doi={10.1109/WACV48630.2021.00183}}
```
Code was tested using Python 3.8 with PyTorch 1.9.1 and MinkowskiEngine 0.5.4 on Ubuntu 20.04 with CUDA 10.2. Note: CUDA 11.1 is not recommended, as MinkowskiEngine 0.5.4 has known issues on CUDA 11.1.
The following Python packages are required:
Modify the `PYTHONPATH` environment variable to include the absolute path to the project root folder:

```
export PYTHONPATH=$PYTHONPATH:/home/.../MinkLoc3D
```
MinkLoc3D is trained on a subset of Oxford RobotCar and In-house (U.S., R.A., B.D.) datasets introduced in PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition paper (link). There are two training datasets:
For dataset description see PointNetVLAD paper or github repository (link).
You can download training and evaluation datasets from here (alternative link).
Before training or evaluating the network, run the code below to generate pickles with positive and negative point clouds for each anchor point cloud. NOTE: the format of the training and evaluation pickles has changed in this release of the MinkLoc3D code. If you created these files with a previous version of the code, remove and re-create them.
```
cd generating_queries/

# Generate training tuples for the Baseline Dataset
python generate_training_tuples_baseline.py --dataset_root <dataset_root_path>

# Generate training tuples for the Refined Dataset
python generate_training_tuples_refine.py --dataset_root <dataset_root_path>

# Generate evaluation tuples
python generate_test_sets.py --dataset_root <dataset_root_path>
```
`<dataset_root_path>` is the path to the dataset root folder, e.g. `/data/pointnetvlad/benchmark_datasets/`. Before running the code, ensure you have read/write rights to `<dataset_root_path>`, as training and evaluation pickles are saved there.
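The generated pickles map each anchor point cloud to its positive and negative examples. The toy round-trip below illustrates reading such a file; the exact field names (`query`, `positives`, `negatives`) follow the PointNetVLAD dataset convention and are assumptions here — check the output of the `generating_queries` scripts for the authoritative structure.

```python
import os
import pickle
import tempfile

# Toy dict mimicking the anchor/positives/negatives tuple structure
# (field names are assumptions based on the PointNetVLAD format).
queries = {
    0: {'query': 'oxford/run1/cloud_0000.bin',
        'positives': [1, 2],      # indices of structurally similar clouds
        'negatives': [3, 4, 5]},  # indices of dissimilar clouds
}

path = os.path.join(tempfile.gettempdir(), 'toy_training_queries.pickle')
with open(path, 'wb') as f:
    pickle.dump(queries, f)

# Loading works the same way for the real generated pickles.
with open(path, 'rb') as f:
    loaded = pickle.load(f)
print(loaded[0]['positives'])  # [1, 2]
```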
To train the MinkLoc3D network, download and decompress the dataset and generate training pickles as described above. Then edit the configuration file (`config_baseline.txt` or `config_refined.txt`):

- Set the `dataset_folder` parameter to the dataset root folder.
- Modify the `batch_size_limit` parameter depending on available GPU memory. The default limit (256) requires at least 11 GB of GPU RAM.
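For reference, a minimal fragment touching these two parameters might look like the following; the INI-style section names are assumptions, so follow the structure of the shipped `config_baseline.txt` rather than this sketch:

```ini
[DEFAULT]
dataset_folder = /data/pointnetvlad/benchmark_datasets

[TRAIN]
batch_size_limit = 256
```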
To train the network, run:
```
cd training

# To train minkloc3d model on the Baseline Dataset
python train.py --config ../config/config_baseline.txt --model_config ../models/minkloc3d.txt

# To train minkloc3d model on the Refined Dataset
python train.py --config ../config/config_refined.txt --model_config ../models/minkloc3d.txt
```
Pretrained models are available in the `weights` directory:

- `minkloc3d_baseline.pth`: trained on the Baseline Dataset
- `minkloc3d_refined.pth`: trained on the Refined Dataset

To evaluate the pretrained models, run the following commands:
```
cd eval

# To evaluate the model trained on the Baseline Dataset
python evaluate.py --config ../config/config_baseline.txt --model_config ../models/minkloc3d.txt --weights ../weights/minkloc3d_baseline.pth

# To evaluate the model trained on the Refined Dataset
python evaluate.py --config ../config/config_refined.txt --model_config ../models/minkloc3d.txt --weights ../weights/minkloc3d_refined.pth
```
MinkLoc3D performance (measured by Average Recall@1%) compared to the state of the art:

Trained on the Baseline Dataset:

Method | Oxford | U.S. | R.A. | B.D. |
---|---|---|---|---|
PointNetVLAD [1] | 80.3 | 72.6 | 60.3 | 65.3 |
PCAN [2] | 83.8 | 79.1 | 71.2 | 66.8 |
DAGC [3] | 87.5 | 83.5 | 75.7 | 71.2 |
LPD-Net [4] | 94.9 | 96.0 | 90.5 | 89.1 |
EPC-Net [5] | 94.7 | 96.5 | 88.6 | 84.9 |
SOE-Net [6] | 96.4 | 93.2 | 91.5 | 88.5 |
NDT-Transformer [7] | 97.7 | | | |
MinkLoc3D (ours) | 97.9 | 95.0 | 91.2 | 88.5 |
Trained on the Refined Dataset:

Method | Oxford | U.S. | R.A. | B.D. |
---|---|---|---|---|
PointNetVLAD [1] | 80.1 | 94.5 | 93.1 | 86.5 |
PCAN [2] | 86.4 | 94.1 | 92.3 | 87.0 |
DAGC [3] | 87.8 | 94.3 | 93.4 | 88.5 |
LPD-Net [4] | 94.9 | 98.9 | 96.4 | 94.4 |
SOE-Net [6] | 96.4 | 97.7 | 95.9 | 92.6 |
MinkLoc3D (ours) | 98.5 | 99.7 | 99.3 | 96.7 |
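The Recall@N metric used in the tables above counts a query as correctly localized if at least one true positive appears among its N nearest database descriptors; Recall@1% sets N to 1% of the database size. The sketch below is an illustrative NumPy implementation of this idea, not the repository's evaluation code, and the function and variable names are our own:

```python
import numpy as np

def recall_at_n(query_desc, db_desc, gt, n):
    """Fraction of queries whose n nearest database descriptors
    contain a true positive. gt[i] is the set of database indices
    that are true positives for query i."""
    hits = 0
    for i, q in enumerate(query_desc):
        dist = np.linalg.norm(db_desc - q, axis=1)  # Euclidean distances to all database descriptors
        top_n = np.argsort(dist)[:n]                # indices of the n nearest descriptors
        if gt[i] & set(top_n):
            hits += 1
    return hits / len(query_desc)

db = np.array([[0., 0.], [1., 0.], [0., 1.], [5., 5.]])
queries = np.array([[0.1, 0.], [4.9, 5.]])
gt = [{0}, {3}]
print(recall_at_n(queries, db, gt, 1))  # 1.0: each query's nearest neighbour is its positive

# Recall@1% uses n = 1% of the database size (at least 1):
n = max(round(len(db) / 100), 1)
```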
Our code is released under the MIT License (see LICENSE file for details).