CAIRI Supervised, Semi- and Self-Supervised Visual Representation Learning Toolbox and Benchmark
A collection of weights and logs for image classification experiments with modern Transformer architectures on CIFAR-100. These benchmarks are provided to make research on Mixup augmentations with Transformers more convenient, since most published benchmarks of Mixup variants with ViTs are based on ImageNet-1K. Please refer to our tech report for more details.
Backbones | $Beta$ | DeiT-S(/16) | DeiT-S(/16) | Swin-T | Swin-T |
---|---|---|---|---|---|
Epoch | $\alpha$ | 200 epochs | 600 epochs | 200 epochs | 600 epochs |
Vanilla | - | 65.81 | 68.50 | 78.41 | 81.29 |
MixUp | 0.8 | 69.98 | 76.35 | 76.78 | 83.67 |
CutMix | 2 | 74.12 | 79.54 | 80.64 | 83.38 |
DeiT | 0.8,1 | 75.92 | 79.38 | 81.25 | 84.41 |
SmoothMix | 0.2 | 67.54 | 80.25 | 66.69 | 81.18 |
SaliencyMix | 0.2 | 69.78 | 76.60 | 80.40 | 82.58 |
AttentiveMix+ | 2 | 75.98 | 80.33 | 81.13 | 83.69 |
FMix* | 1 | 70.41 | 74.31 | 80.72 | 82.82 |
GridMix | 1 | 68.86 | 74.96 | 78.54 | 80.79 |
PuzzleMix | 2 | 73.60 | 81.01 | 80.44 | 84.74 |
ResizeMix* | 1 | 68.45 | 71.95 | 80.16 | 82.36 |
AlignMix | 1 | - | - | 78.91 | 83.34 |
TransMix | 0.8,1 | 76.17 | 79.33 | 81.33 | 84.45 |
AutoMix | 2 | 76.24 | 80.91 | 82.67 | 84.70 |
SAMix* | 2 | 77.94 | 82.49 | 82.62 | 84.85 |
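
The $\alpha$ column above is the parameter of the Beta distribution from which the mixing ratio is drawn. As a minimal sketch of how vanilla MixUp uses it (plain Python on flat lists, not the OpenMixup implementation; the function name `mixup_pair` is hypothetical), a ratio $\lambda \sim Beta(\alpha, \alpha)$ blends two inputs and their one-hot labels:

```python
import random

def mixup_pair(x_a, x_b, y_a, y_b, alpha, rng=random):
    """Blend two samples and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)  # mixing ratio in [0, 1]
    # The same ratio is applied to the inputs and to the labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam

# Toy usage: two 2-element "images" with one-hot labels for 2 classes.
rng = random.Random(0)
x_mix, y_mix, lam = mixup_pair(
    [0.0, 2.0], [2.0, 0.0], [1.0, 0.0], [0.0, 1.0], alpha=0.8, rng=rng
)
```

Smaller values of $\alpha$ concentrate $\lambda$ near 0 or 1 (samples stay close to one of the originals), while larger values push it toward 0.5.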
We provide a collection of model weights and logs for image classification networks on ImageNet-1K (download), reproduced with the OpenMixup or MMLab frameworks. You can view the training settings in the config files and README pages of the related models. You can download all files from Baidu Cloud (cicj).
If you would like us to reproduce a new model, or can contribute reproduced results to OpenMixup, please contact us via GitHub issues or e-mail. This release will continue to be updated over the long term.
Model | Paper | Pretrain | Params (M) | FLOPs (G) | Top-1 (%) | Top-5 (%) | Config | Weights | Log |
---|---|---|---|---|---|---|---|---|---|
DeiT-S | ICML'2021 | From scratch | 22.05 | 4.24 | 80.28 | 95.07 | config | model | log |
DeiT-B | ICML'2021 | From scratch | 86.57 | 16.86 | 81.82 | 95.57 | config | model | log |
Swin-T | ICCV'2021 | From scratch | 28.29 | 4.36 | 81.18 | 95.61 | config | model | log |
ConvNeXt-T | CVPR'2022 | From scratch | 28.59 | 4.46 | 82.16 | 95.81 | config | model | log |
UniFormer-T | ICLR'2022 | From scratch | 5.55 | 0.88 | 78.02 | 94.14 | config | model | log |
UniFormer-S | ICLR'2022 | From scratch | 21.5 | 3.44 | 82.29 | 95.91 | config | model | log |
VAN-T (B0) | arXiv'2022 | From scratch | 4.11 | 0.88 | 75.77 | 92.99 | config | model | log |
VAN-S (B1) | arXiv'2022 | From scratch | 13.86 | 2.52 | 81.03 | 95.56 | config | model | log |
VAN-B (B2) | arXiv'2022 | From scratch | 26.58 | 5.03 | 82.65 | 96.17 | config | model | log |
LITv2-S | NIPS'2022 | From scratch | 27.85 | 3.52 | 81.74 | 95.59 | config | model | log |
CoC-T | ICLR'2023 | From scratch | 5.60 | 1.10 | 72.70 | 91.26 | config | model | log |
CoC-T-plain | ICLR'2023 | From scratch | 5.60 | 1.10 | 73.16 | 95.48 | config | model | log |
CoC-S | ICLR'2023 | From scratch | 14.7 | 2.78 | 77.71 | 93.87 | config | model | log |
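
For readers less familiar with the Top-1 and Top-5 columns: a prediction counts as correct under top-k if the true label is among the k highest-scoring classes. The sketch below shows this in plain Python; `topk_accuracy` is an illustrative helper, not a function from OpenMixup.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores."""
    correct = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        correct += int(label in topk)
    return 100.0 * correct / len(labels)

# Toy scores for 4 samples over 3 classes.
scores = [
    [0.10, 0.70, 0.20],  # highest score: class 1
    [0.50, 0.30, 0.20],  # highest score: class 0
    [0.20, 0.30, 0.50],  # highest score: class 2
    [0.40, 0.35, 0.25],  # highest score: class 0
]
labels = [1, 1, 2, 2]
top1 = topk_accuracy(scores, labels, k=1)  # 2 of 4 samples are correct
top2 = topk_accuracy(scores, labels, k=2)  # 3 of 4 samples are correct
```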
A collection of weights and logs for image classification experiments with the RSB A3 training setting on ImageNet-1K (download). You can view the training setting in ResNet strikes back and find the full results in MogaNet (Appendix Table A.7). You can download all files from Baidu Cloud (ss3j).
Model | Date | Train / Test | Params (M) | Top-1 (%) | Top-5 (%) | Config | Weights | Log |
---|---|---|---|---|---|---|---|---|
ResNet-50 | CVPR'2016 | 160 / 224 | 26 | 78.1 | 93.8 | config | model | log |
ResNet-101 | CVPR'2016 | 160 / 224 | 45 | 79.9 | 94.9 | config | model | log |
ResNet-152 | CVPR'2016 | 160 / 224 | 60 | 80.7 | 95.2 | config | model | log |
ViT-T | ICLR'2021 | 160 / 224 | 6 | 66.7 | 87.7 | config | model | log |
ViT-S | ICLR'2021 | 160 / 224 | 22 | 73.8 | 91.2 | config | model | log |
ViT-B | ICLR'2021 | 160 / 224 | 87 | 76.0 | 91.8 | config | model | log |
PVT-T | ICCV'2021 | 160 / 224 | 13 | 71.5 | 89.8 | config | model | log |
PVT-S | ICCV'2021 | 160 / 224 | 25 | 72.1 | 90.2 | config | model | log |
Swin-T | ICCV'2021 | 160 / 224 | 28 | 77.7 | 93.7 | config | model | log |
Swin-S | ICCV'2021 | 160 / 224 | 50 | 80.2 | 95.1 | config | model | log |
Swin-B | ICCV'2021 | 160 / 224 | 88 | 80.5 | 95.4 | config | model | log |
LITv2-T | NIPS'2022 | 160 / 224 | 28 | 79.7 | 94.7 | config | model | log |
LITv2-M | NIPS'2022 | 160 / 224 | 49 | 80.5 | 95.2 | config | model | log |
LITv2-B | NIPS'2022 | 160 / 224 | 87 | 81.3 | 95.5 | config | model | log |
ConvMixer-768-d32 | arXiv'2022 | 160 / 224 | 21 | 77.6 | 93.5 | config | model | log |
PoolFormer-S12 | CVPR'2022 | 160 / 224 | 12 | 69.3 | 88.7 | config | model | log |
PoolFormer-S24 | CVPR'2022 | 160 / 224 | 21 | 74.1 | 91.8 | config | model | log |
PoolFormer-S36 | CVPR'2022 | 160 / 224 | 31 | 74.6 | 92.0 | config | model | log |
PoolFormer-M36 | CVPR'2022 | 160 / 224 | 56 | 80.7 | 95.2 | config | model | log |
PoolFormer-M48 | CVPR'2022 | 160 / 224 | 73 | 81.2 | 95.3 | config | model | log |
ConvNeXt-T | CVPR'2022 | 160 / 224 | 29 | 78.8 | 94.2 | config | model | log |
ConvNeXt-S | CVPR'2022 | 160 / 224 | 50 | 81.7 | 95.7 | config | model | log |
ConvNeXt-B | CVPR'2022 | 160 / 224 | 89 | 82.1 | 95.9 | config | model | log |
ConvNeXt-L | CVPR'2022 | 160 / 224 | 189 | 82.8 | 96.0 | config | model | log |
VAN-B0 | arXiv'2022 | 160 / 224 | 4 | 72.6 | 94.2 | config | model | log |
VAN-B2 | arXiv'2022 | 160 / 224 | 27 | 81.0 | 91.5 | config | model | log |
VAN-B3 | arXiv'2022 | 160 / 224 | 45 | 81.9 | 95.7 | config | model | log |
HorNet-T (7×7) | NIPS'2022 | 160 / 224 | 22 | 80.1 | 95.0 | config | model | log |
HorNet-S (7×7) | NIPS'2022 | 160 / 224 | 50 | 81.2 | 95.4 | config | model | log |
MogaNet-XT | arXiv'2022 | 160 / 224 | 3 | 72.8 | 91.3 | config | model | log |
MogaNet-T | arXiv'2022 | 160 / 224 | 5 | 75.4 | 92.6 | config | model | log |
MogaNet-S | arXiv'2022 | 160 / 224 | 25 | 81.1 | 95.5 | config | model | log |
MogaNet-B | arXiv'2022 | 160 / 224 | 44 | 82.2 | 95.9 | config | model | log |
MogaNet-L | arXiv'2022 | 160 / 224 | 83 | 83.2 | 96.4 | config | model | log |
A collection of weights and logs for image classification experiments of MogaNet on ImageNet-1K (download). You can download all files from Baidu Cloud (z8mf) at MogaNet/Classification_OpenMixup.
You can evaluate these models with `tools/dist_test.sh` for classification performance, or fine-tune them on downstream tasks (e.g., COCO detection and ADE20K segmentation) by loading only the encoder weights.
Note on `attn_force_fp32`: during fp16 training, we force the gating functions to run in fp32 to avoid inf or nan values. We found that a model trained with `attn_force_fp32=True` should also be evaluated with `attn_force_fp32=True`, probably because the outputs differ slightly depending on whether `attn_force_fp32` is used. This does not affect full fine-tuning, but it can affect transfer learning results (e.g., COCO Mask R-CNN freezes the parameters of the first stage). The option is set to true by default in OpenMixup and is removed in the MogaNet implementation. For example, use moga_small_ema_sz224_8xb128_ep300.pth with `attn_force_fp32=True`, and moga_small_ema_sz224_8xb128_no_forcefp32_ep300.pth with `attn_force_fp32=False`.
Model | Pretrain | Setting | Resolution | Params (M) | FLOPs (G) | Top-1 (%) | Config | Weights | Log |
---|---|---|---|---|---|---|---|---|---|
MogaNet-XT | From scratch | DeiT | 224x224 | 2.97 | 0.80 | 76.5 | config | model | log |
MogaNet-XT | From scratch | DeiT | 256x256 | 2.97 | 1.04 | 77.2 | config | model | log |
MogaNet-XT* | From scratch | DeiT-3 | 256x256 | 2.97 | 1.04 | 77.6 | config | model | log |
MogaNet-T | From scratch | DeiT | 224x224 | 5.20 | 1.10 | 79.0 | config | model | log |
MogaNet-T | From scratch | DeiT | 256x256 | 5.20 | 1.44 | 79.6 | config | model | log |
MogaNet-T* | From scratch | DeiT-3 | 256x256 | 5.20 | 1.44 | 80.0 | config | model | log |
MogaNet-S | From scratch | DeiT | 224x224 | 25.3 | 4.97 | 83.4 | config | model | log |
MogaNet-B | From scratch | DeiT | 224x224 | 43.9 | 9.93 | 84.3 | config | model | log |
MogaNet-L | From scratch | DeiT | 224x224 | 82.5 | 15.9 | 84.7 | config | model | log |
MogaNet-XL | From scratch | DeiT | 224x224 | 180.8 | 34.5 | 85.1 | config | model | log |
MogaNet-XT | From scratch | RSB A3 | 160x160 | 2.97 | 0.80 | 72.8 | config | model | log |
MogaNet-T | From scratch | RSB A3 | 160x160 | 5.20 | 1.10 | 75.4 | config | model | log |
MogaNet-S | From scratch | RSB A3 | 160x160 | 25.3 | 4.97 | 81.1 | config | model | log |
MogaNet-B | From scratch | RSB A3 | 160x160 | 43.9 | 9.93 | 82.2 | config | model | log |
MogaNet-L | From scratch | RSB A3 | 160x160 | 82.5 | 15.9 | 83.2 | config | model | log |
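
To illustrate why the gating functions are kept out of fp16 (see the `attn_force_fp32` note above), here is a small sketch of half-precision overflow. It emulates fp16 rounding with the standard `struct` module instead of a deep learning framework. The helper names (`to_fp16`, `gate_fp16`, `gate_upcast`) and the use of a bare element-wise product as the "gate" are assumptions made for illustration; this is not MogaNet code.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite half-precision value

def to_fp16(x):
    """Round a Python float to the nearest fp16 value (overflow becomes +/-inf)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)

def gate_fp16(a, b):
    """Element-wise gate with the product rounded back to (emulated) fp16."""
    return to_fp16(to_fp16(a) * to_fp16(b))

def gate_upcast(a, b):
    """Same gate with fp16 inputs, but the product kept in higher precision."""
    return to_fp16(a) * to_fp16(b)

a, b = 300.0, 300.0        # both inputs are exactly representable in fp16
low = gate_fp16(a, b)      # 90000 exceeds FP16_MAX, so the result is inf
high = gate_upcast(a, b)   # 90000.0, still finite
diff = low - low           # inf - inf gives nan, which then propagates
```

Because the two paths can produce different values, it is plausible that a model trained with one setting behaves slightly differently when evaluated with the other, which matches the recommendation to keep the flag consistent.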
A collection of weights and logs for self-supervised learning benchmark on ImageNet-1K (download). You can find pre-training codes of compared methods in OpenMixup, VISSL, solo-learn, and the official repositories. You can download all files from Baidu Cloud: A2MIM (3q5i).
You can evaluate the pre-trained models with `tools/dist_test.sh`, or fine-tune them using `tools/dist_train.sh` with `--load_checkpoint` (which loads the full checkpoint). Note that pre-trained weights whose names start with `full_` contain all keys of the pre-trained model, while those starting with `backbone_` contain only the encoder weights, which can be used for downstream tasks, e.g., COCO detection and ADE20K segmentation.
We provide the source of pre-trained weights, pre-training epochs, fine-tuning epochs and protocol, and top-1 accuracy in the following table.
Methods | Source | PT epoch | FT protocol | FT top-1 |
---|---|---|---|---|
PyTorch | PyTorch | 90 | RSB A3 | 78.8 |
Inpainting | OpenMixup | 70 | RSB A3 | 78.4 |
Relative-Loc | OpenMixup | 70 | RSB A3 | 77.8 |
Rotation | OpenMixup | 70 | RSB A3 | 77.7 |
SimCLR | VISSL | 100 | RSB A3 | 78.5 |
MoCoV2 | OpenMixup | 100 | RSB A3 | 78.5 |
BYOL | OpenMixup | 100 | RSB A3 | 78.7 |
BYOL | Official | 300 | RSB A3 | 78.9 |
BYOL | Official | 300 | RSB A2 | 80.1 |
SwAV | VISSL | 100 | RSB A3 | 78.9 |
SwAV | Official | 400 | RSB A3 | 79.0 |
SwAV | Official | 400 | RSB A2 | 80.2 |
BarlowTwins | solo-learn | 100 | RSB A3 | 78.5 |
BarlowTwins | Official | 300 | RSB A3 | 78.8 |
MoCoV3 | Official | 100 | RSB A3 | 78.7 |
MoCoV3 | Official | 300 | RSB A3 | 79.0 |
MoCoV3 | Official | 300 | RSB A2 | 80.1 |
A2MIM | OpenMixup | 100 | RSB A3 | 78.8 |
A2MIM | OpenMixup | 300 | RSB A3 | 78.9 |
A2MIM | OpenMixup | 300 | RSB A2 | 80.4 |
We provide the source of pre-trained weights, pre-training epochs, fine-tuning epochs and protocol, and top-1 accuracy in the following table.
Methods | Source | PT epoch | FT protocol | FT top-1 |
---|---|---|---|---|
SimMIM | Official | 800 | BEiT (SimMIM) | 83.8 |
SimMIM (RGB mean) | OpenMixup | 800 | BEiT (SimMIM) | 84.0 |
A2MIM | OpenMixup | 800 | BEiT (SimMIM) | 84.3 |
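
The difference between the `full_` and `backbone_` checkpoints mentioned above can be pictured as a filter over the checkpoint keys. The sketch below uses a plain dictionary in place of a real state dict, and it assumes that encoder parameters are stored under a `backbone.` key prefix. That prefix, the key names, and the helper `extract_backbone` are hypothetical, so inspect the actual checkpoint keys before relying on it.

```python
def extract_backbone(state_dict, prefix="backbone."):
    """Keep only entries under `prefix` and strip the prefix from their keys."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }

# Toy stand-in for a full pre-training checkpoint (values would be tensors).
full_ckpt = {
    "backbone.conv1.weight": [0.1, 0.2],
    "backbone.layer1.0.bn1.bias": [0.0],
    "neck.fc.weight": [0.3],
    "head.predictor.weight": [0.4],
}
encoder_only = extract_backbone(full_ckpt)
```

The filtered dictionary keeps only what a downstream detector or segmenter would need from the encoder, dropping pre-training-specific parts such as necks and heads.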
A collection of weights and logs for mixup classification benchmark on iNaturalist-2018 (download, config). You can download all files from Baidu Cloud: iNaturalist-2018 (wy2v).
You can evaluate the models with `tools/dist_test.sh`, or fine-tune pre-trained models using `tools/dist_train.sh` with `--load_checkpoint`.
Backbones | ResNet-50 top-1 | ResNeXt-101 top-1 |
---|---|---|
Vanilla | 62.53 | 66.94 |
MixUp [ICLR'2018] | 62.69 | 67.56 |
CutMix [ICCV'2019] | 63.91 | 69.75 |
ManifoldMix [ICML'2019] | 63.46 | 69.30 |
SaliencyMix [ICLR'2021] | 64.27 | 70.01 |
FMix [arXiv'2020] | 63.71 | 69.46 |
PuzzleMix [ICML'2020] | 64.36 | 70.12 |
ResizeMix [arXiv'2020] | 64.12 | 69.30 |
AutoMix [ECCV'2022] | 64.73 | 70.49 |
SAMix [arXiv'2021] | 64.84 | 70.54 |
A collection of weights and logs for mixup classification benchmark on iNaturalist-2017 (download, config). You can download all files from Baidu Cloud: iNaturalist-2017 (1e7w).
You can evaluate the models with `tools/dist_test.sh`, or fine-tune pre-trained models using `tools/dist_train.sh` with `--load_checkpoint`.
Backbones | ResNet-18 top-1 | ResNet-50 top-1 | ResNeXt-101 top-1 |
---|---|---|---|
Vanilla | 51.79 | 60.23 | 63.70 |
MixUp [ICLR'2018] | 51.40 | 61.22 | 66.27 |
CutMix [ICCV'2019] | 51.24 | 62.34 | 67.59 |
ManifoldMix [ICML'2019] | 51.83 | 61.47 | 66.08 |
SaliencyMix [ICLR'2021] | 51.29 | 62.51 | 67.20 |
FMix [arXiv'2020] | 52.01 | 61.90 | 66.64 |
PuzzleMix [ICML'2020] | - | 62.66 | 67.72 |
ResizeMix [arXiv'2020] | 51.21 | 62.29 | 66.82 |
AutoMix [ECCV'2022] | 52.84 | 63.08 | 68.03 |
SAMix [arXiv'2021] | 53.42 | 63.32 | 68.26 |
A collection of weights and logs for mixup classification benchmark on Places205 (download, config). You can download all files from Baidu Cloud (4m94).
You can evaluate the models with `tools/dist_test.sh`, or fine-tune pre-trained models using `tools/dist_train.sh` with `--load_checkpoint`.
Backbones | ResNet-18 top-1 | ResNet-50 top-1 |
---|---|---|
Vanilla | 59.63 | 63.10 |
MixUp [ICLR'2018] | 59.33 | 63.01 |
CutMix [ICCV'2019] | 59.21 | 63.75 |
ManifoldMix [ICML'2019] | 59.46 | 63.23 |
SaliencyMix [ICLR'2021] | 59.50 | 63.33 |
FMix [arXiv'2020] | 59.51 | 63.63 |
PuzzleMix [ICML'2020] | 59.62 | 63.91 |
ResizeMix [arXiv'2020] | 59.66 | 63.88 |
AutoMix [ECCV'2022] | 59.74 | 64.06 |
SAMix [arXiv'2021] | 59.86 | 64.27 |
`OpenMixup` documentation (built on Read the Docs).
`README` and updated configs for self-supervised and supervised methods.
`openmixup.models.utils`, with support for more network layers.
`DropPath` (using the stochastic depth rule) in `ResNet` for RSB A1/A2 training settings.
`PreciseBNHook` (updates all BN stats) and `RepeatSampler` (sets sync_random_seed) for RSB A1/A2.
`l1_loss` is supported by FP16 training; other regression losses (e.g., MSE and Smooth_L1 losses) will cause NaN in FP16 training when the target and prediction are not normalized.
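
As a rough illustration of the `DropPath` item above, stochastic depth randomly skips a residual branch during training and rescales the surviving branch so that its expected value is unchanged. The sketch below works on plain floats. The linearly increasing per-block rate is a common convention and is assumed here, and `linear_drop_rates`, `drop_path`, and `residual_block` are illustrative names rather than OpenMixup functions.

```python
import random

def linear_drop_rates(max_rate, depth):
    """Per-block drop rates that grow linearly from 0 to max_rate."""
    if depth == 1:
        return [0.0]
    return [max_rate * i / (depth - 1) for i in range(depth)]

def drop_path(branch_out, drop_prob, training=True, rng=random):
    """Zero the residual branch with probability drop_prob, else rescale it."""
    if not training or drop_prob == 0.0:
        return branch_out
    keep_prob = 1.0 - drop_prob
    if rng.random() < keep_prob:
        return branch_out / keep_prob  # keeps the expected value unchanged
    return 0.0

def residual_block(x, branch, drop_prob, training=True, rng=random):
    """y = x + drop_path(branch(x)); the identity shortcut is never dropped."""
    return x + drop_path(branch(x), drop_prob, training, rng)

# Drop rates for a toy 5-block network with a maximum rate of 0.1.
rates = linear_drop_rates(0.1, depth=5)
```

At evaluation time (`training=False`) every block is kept and no rescaling is applied, so the network is deterministic.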