# A Unified Efficient Pyramid Transformer for Semantic Segmentation (ICCVW 2021)

Code for the ICCV 2021 Workshop paper: "A Unified Efficient Pyramid Transformer for Semantic Segmentation".
## Installation

Create a conda environment:

```shell
conda create -n unept python=3.7 pip
```

Then, activate the environment:

```shell
conda activate unept
```

Install PyTorch, torchvision, and the remaining requirements. For example, for CUDA 10.2:

```shell
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt
```
## Data Preparation

### ADE20K

Please follow the code from openseg to generate the ground truth for boundary refinement. The data format should be as follows; you can download the processed `dt_offset` files here.
```
path/to/ADEChallengeData2016/
  images/
    training/
    validation/
  annotations/
    training/
    validation/
  dt_offset/
    training/
    validation/
```
### PASCAL-Context

You can download the processed dataset here.
```
path/to/PASCAL-Context/
  train/
    image/
    label/
    dt_offset/
  val/
    image/
    label/
    dt_offset/
```
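To catch path mistakes before training starts, the two layouts above can be checked with a small stdlib-only script. The directory names come straight from the trees in this README; the helper itself (`missing_dirs`) is an illustrative sketch, not part of this repo.

```python
import os
import tempfile

# Expected sub-directories, taken from the layout trees above.
ADE20K = [
    "images/training", "images/validation",
    "annotations/training", "annotations/validation",
    "dt_offset/training", "dt_offset/validation",
]
PASCAL_CONTEXT = [
    "train/image", "train/label", "train/dt_offset",
    "val/image", "val/label", "val/dt_offset",
]

def missing_dirs(root, expected):
    """Return every expected sub-directory that is absent under root."""
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]

# Demo on a throwaway tree; point `root` at your real dataset root instead.
root = tempfile.mkdtemp()
for d in ADE20K:
    os.makedirs(os.path.join(root, d))
print(missing_dirs(root, ADE20K))          # [] -> ADE20K layout is complete
print(missing_dirs(root, PASCAL_CONTEXT))  # non-empty: no PASCAL dirs here
```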
## Training

The default setup is multi-GPU training with DistributedDataParallel:

```shell
# --nproc_per_node specifies the number of GPUs
python -m torch.distributed.launch --nproc_per_node=8 \
    --master_port=29500 \
    train.py --launcher pytorch \
    --config /path/to/config_file
```
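For reference, `torch.distributed.launch` coordinates the workers it spawns through environment variables, which the `--launcher pytorch` path reads to join the process group. The sketch below is stdlib-only and illustrative; `worker_env` is a hypothetical helper, not part of this repo, and older launcher versions pass `--local_rank` as a CLI argument instead of setting `LOCAL_RANK` unless `--use_env` is given.

```python
# torch.distributed.launch spawns nproc_per_node workers per node and hands
# each one its identity through environment variables. `worker_env` is an
# illustrative helper showing what a single worker receives.
def worker_env(local_rank, nproc_per_node, node_rank=0, nnodes=1,
               master_addr="127.0.0.1", master_port=29500):
    """Environment a single worker receives from the launcher."""
    return {
        "MASTER_ADDR": master_addr,                  # rendezvous host
        "MASTER_PORT": str(master_port),             # from --master_port
        "WORLD_SIZE": str(nnodes * nproc_per_node),  # total worker count
        "RANK": str(node_rank * nproc_per_node + local_rank),  # global rank
        "LOCAL_RANK": str(local_rank),               # GPU index on this node
    }

# With --nproc_per_node=8 --master_port=29500, the last worker sees:
print(worker_env(local_rank=7, nproc_per_node=8))
```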
Before training, set `data_root` and the `pretrained` weight path in the config file. Results are saved to `./work_dirs` by default.

## Testing

```shell
# single-GPU testing
python test.py --checkpoint /path/to/checkpoint \
    --config /path/to/config_file \
    --eval mIoU \
    [--out ${RESULT_FILE}] [--show] \
    --aug-test  # for multi-scale flip aug

# multi-GPU testing (4 GPUs, 1 sample per GPU)
python -m torch.distributed.launch --nproc_per_node=4 --master_port=29500 \
    test.py --launcher pytorch --eval mIoU \
    --config /path/to/config_file \
    --checkpoint /path/to/checkpoint \
    --aug-test  # for multi-scale flip aug
```
## Results

We report results on the validation sets.

| Backbone | Crop Size | Batch Size | Dataset | Lr schd | Mem (GB) | mIoU (ms+flip) | Config |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Res-50 | 480x480 | 16 | ADE20K | 160K | 7.0 | 46.1 | config |
| DeiT | 480x480 | 16 | ADE20K | 160K | 8.5 | 50.5 | config |
| DeiT | 480x480 | 16 | PASCAL-Context | 160K | 8.5 | 55.2 | config |
## Contributing

See CONTRIBUTING for more information.

## License

This project is licensed under the Apache-2.0 License.

## Citation

If you use this code and these models in your research, please consider citing:
```bibtex
@article{zhu2021unified,
  title={A Unified Efficient Pyramid Transformer for Semantic Segmentation},
  author={Zhu, Fangrui and Zhu, Yi and Zhang, Li and Wu, Chongruo and Fu, Yanwei and Li, Mu},
  journal={arXiv preprint arXiv:2107.14209},
  year={2021}
}
```
## Acknowledgments

We thank the authors and contributors of MMCV, MMSegmentation, timm, and Deformable DETR.