BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
By Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue.
This repo is the official implementation of BigDetection. It is based on mmdetection and CBNetV2.
We construct a new large-scale benchmark termed BigDetection. Our goal is to leverage the training data from existing datasets (LVIS, OpenImages and Objects365) with carefully designed principles, and to curate a larger dataset for improved detector pre-training. The BigDetection dataset has 600 object categories and contains 3.4M training images with 36M object bounding boxes. We show some important statistics of BigDetection in the following figure.
Left: Number of images per category of BigDetection. Right: Number of instances in different object sizes.
We show the evaluation results on BigDetection Validation. We hope BigDetection could serve as a new challenging benchmark for evaluating next-level object detection methods.
Method | mAP (bigdet val) | Links |
---|---|---|
YOLOv3 | 9.7 | model/config |
Deformable DETR | 13.1 | model/config |
Faster R-CNN (C4)* | 18.9 | model |
Faster R-CNN (FPN)* | 19.4 | model |
CenterNet2* | 23.1 | model |
Cascade R-CNN* | 24.1 | model |
CBNetV2-Swin-Base | 35.1 | model/config |
We show the finetuning performance on COCO minival/test-dev. Results show that BigDetection pre-training provides significant benefits for different detector architectures. We achieve 59.8 mAP on COCO test-dev with a single model.
Method | mAP (coco minival/test-dev) | Links |
---|---|---|
YOLOv3 | 30.5/- | config |
Deformable DETR | 39.9/- | model/config |
Faster R-CNN (C4)* | 38.8/- | model |
Faster R-CNN (FPN)* | 40.5/- | model |
CenterNet2* | 45.3/- | model |
Cascade R-CNN* | 45.1/- | model |
CBNetV2-Swin-Base | 59.1/59.5 | model/config |
CBNetV2-Swin-Base (TTA) | 59.5/59.8 | config |
Following STAC and SoftTeacher, we also evaluate on COCO under different partial annotation settings (1%, 2%, 5% and 10% of the labels).
Method | mAP (1%) | mAP (2%) | mAP (5%) | mAP (10%) |
---|---|---|---|---|
Baseline | 9.8 | 14.3 | 21.2 | 26.2 |
STAC | 14.0 | 18.3 | 24.4 | 28.6 |
SoftTeacher (ICCV 21) | 20.5 | 26.5 | 30.7 | 34.0 |
Ours | 25.3 | 28.1 | 31.9 | 34.1 |
Ours (checkpoints) | model | model | model | model |
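To make the margins in the partial-annotation table concrete, the following minimal Python sketch computes the per-setting gain of our model over SoftTeacher. The numbers are copied from the table; the variable and function names are illustrative only.

```python
# mAP values copied from the partial-annotation table above.
SETTINGS = ["1%", "2%", "5%", "10%"]
SOFT_TEACHER = [20.5, 26.5, 30.7, 34.0]
OURS = [25.3, 28.1, 31.9, 34.1]


def gains(ours, reference):
    """Return the mAP difference per setting, rounded to one decimal."""
    return [round(a - b, 1) for a, b in zip(ours, reference)]


gain_by_setting = dict(zip(SETTINGS, gains(OURS, SOFT_TEACHER)))
print(gain_by_setting)  # {'1%': 4.8, '2%': 1.6, '5%': 1.2, '10%': 0.1}
```

The gain is largest (4.8 mAP) with 1% of the labels and shrinks to 0.1 mAP at 10%.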
Notes
Methods marked with * are implemented on another detection codebase, Detectron2. Here we provide the pretrained checkpoints. The results can be reproduced by following the installation of the CenterNet2 codebase.
8X: schedule on BigDetection.
1X: schedule on COCO.
TTA denotes test-time augmentation.
Requirements
Ubuntu 16.04
CUDA 10.2
# Create conda environment
conda create -n bigdet python=3.7 -y
conda activate bigdet
# Install Pytorch
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=10.2 -c pytorch
# Install mmcv
pip install mmcv-full==1.3.9 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
# Clone and install
git clone https://github.com/amazon-research/bigdetection.git
cd bigdetection
pip install -r requirements/build.txt
pip install -v -e .
# Install Apex (optional)
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
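After installation, it can help to confirm that the pinned versions were actually picked up. The following is a minimal sketch, not part of this repo: the version pins are copied from the commands above, while the helper names `installed_version` and `check_versions` are hypothetical. It assumes each package exposes a `__version__` attribute and that the `mmcv-full` wheel is imported as `mmcv`.

```python
import importlib

# Pins copied from the install commands above; keys are import names
# (assumption: the mmcv-full wheel is imported as `mmcv`).
EXPECTED = {"torch": "1.8.0", "torchvision": "0.9.0", "mmcv": "1.3.9"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_versions(expected=EXPECTED, get_version=installed_version):
    """Map each package name to a tuple (expected, found, ok).

    Local build tags such as '+cu102' are ignored when comparing.
    """
    report = {}
    for name, want in expected.items():
        found = get_version(name)
        base = found.split("+")[0] if found else None
        report[name] = (want, found, base == want)
    return report
```

Calling `check_versions()` inside the `bigdet` environment returns one `(expected, found, ok)` tuple per package; a `found` value of `None` means the package could not be imported.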
BigDetection involves 3 datasets, and the train/val data can be downloaded from their official websites (Objects365, OpenImages v6, LVIS v1.0). All datasets should be placed under $bigdetection/data/ as shown below. The synsets (600 class names in total) of the BigDetection dataset can be downloaded here: bigdetection_synsets. Contact us at [email protected] to get access to our pre-processed annotation files.
bigdetection/data
└── BigDetection
├── annotations
│ ├── bigdet_obj_train.json
│ ├── bigdet_oid_train.json
│ ├── bigdet_lvis_train.json
│ ├── bigdet_val.json
│ └── cas_weights.json
├── train
│ ├── Objects365
│ ├── OpenImages
│ └── LVIS
└── val
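Before launching a long training job, the layout above can be checked with a short script. This is a minimal sketch, not part of this repo: the relative paths are copied from the tree above, the helper names are hypothetical, and `summarize_annotations` assumes the annotation files are COCO-style JSON with top-level `images`, `annotations` and `categories` lists, which is not confirmed here.

```python
import json
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_PATHS = [
    "annotations/bigdet_obj_train.json",
    "annotations/bigdet_oid_train.json",
    "annotations/bigdet_lvis_train.json",
    "annotations/bigdet_val.json",
    "annotations/cas_weights.json",
    "train/Objects365",
    "train/OpenImages",
    "train/LVIS",
    "val",
]


def missing_paths(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


def summarize_annotations(json_path):
    """Count images, boxes and categories in one annotation file.

    Assumption: the file is COCO-style JSON with `images`,
    `annotations` and `categories` lists at the top level.
    """
    with open(json_path) as f:
        data = json.load(f)
    return {
        "images": len(data["images"]),
        "boxes": len(data["annotations"]),
        "categories": len(data["categories"]),
    }


if __name__ == "__main__":
    # Run from the repository root.
    for rel in missing_paths("data/BigDetection"):
        print("missing:", rel)
```

Loading a full training annotation file this way reads the whole JSON into memory, so it is best tried on `bigdet_val.json` first.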
To train a detector with pre-trained models, run:
# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> --cfg-options load_from=<PRETRAIN_MODEL>
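In mmdetection, `--cfg-options` takes `key=value` pairs and merges them into the loaded config, so `load_from=<PRETRAIN_MODEL>` overrides the `load_from` field of `<CONFIG_FILE>`. The sketch below is a simplified, pure-Python stand-in for that merge, written only to illustrate the idea. It is not mmcv's implementation, the helper name and the example config are hypothetical, and values are kept as plain strings for brevity.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with 'dotted.key=value' overrides applied.

    Simplified illustration only: values stay strings and intermediate
    dicts are created on demand.
    """
    merged = copy.deepcopy(config)
    for text in overrides:
        key, _, value = text.partition("=")
        *parents, leaf = key.split(".")
        node = merged
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return merged


# Hypothetical config fragment and checkpoint name, for illustration.
base_config = {"load_from": None, "optimizer": {"type": "AdamW"}}
new_config = apply_overrides(base_config, ["load_from=pretrained.pth"])
print(new_config["load_from"])  # pretrained.pth
```

The original config is left untouched, which mirrors how a command-line override affects only the current run and not the config file on disk.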
Pre-training
To pre-train a CBNetV2 with a Swin-Base backbone on BigDetection using 8 GPUs, run the following (PRETRAIN_MODEL should be the pre-trained checkpoint of Base-Swin-Transformer: model):
tools/dist_train.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py 8 \
--cfg-options load_from=<PRETRAIN_MODEL>
To pre-train a Deformable-DETR with a ResNet-50 backbone on BigDetection, run:
tools/dist_train.sh configs/BigDetection/deformable_detr/deformable_detr_r50_16x2_8x_bigdet.py 8
Fine-tuning
To fine-tune a BigDetection pre-trained CBNetV2 (with Swin-Base backbone) on COCO, run the following (PRETRAIN_MODEL should be the BigDetection pre-trained checkpoint of CBNetV2: model):
tools/dist_train.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco.py 8 \
--cfg-options load_from=<PRETRAIN_MODEL>
To evaluate a detector with pre-trained checkpoints, run:
tools/dist_test.sh <CONFIG_FILE> <CHECKPOINT> <GPU_NUM> --eval bbox
BigDetection evaluation
To evaluate pre-trained CBNetV2 on BigDetection validation, run:
tools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_bigdet.py \
<BIGDET_PRETRAIN_CHECKPOINT> 8 --eval bbox
COCO evaluation
To evaluate COCO-finetuned CBNetV2 on COCO validation, run:
# without test-time-augmentation
tools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco.py \
<COCO_FINETUNE_CHECKPOINT> 8 --eval bbox mask
# with test-time-augmentation
tools/dist_test.sh configs/BigDetection/cbnetv2/htc_cbv2_swin_base_giou_4conv1f_adamw_20e_coco_tta.py \
<COCO_FINETUNE_CHECKPOINT> 8 --eval bbox mask
Other configurations based on Detectron2 can be found at detectron2-probject.
If you use our dataset or pretrained models in your research, please consider citing the following paper.
@article{bigdetection2022,
title={BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training},
author={Likun Cai and Zhi Zhang and Yi Zhu and Li Zhang and Mu Li and Xiangyang Xue},
journal={arXiv preprint arXiv:2203.13249},
year={2022}
}
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.
We thank the authors of mmdetection and CBNetV2 for releasing their code to the object detection research community.