OpenMixup Versions

CAIRI Supervised, Semi- and Self-Supervised Visual Representation Learning Toolbox and Benchmark

vits-mix-cifar100-weights

10 months ago

A collection of weights and logs for image classification experiments with modern Transformer architectures on CIFAR-100. These benchmarks are intended to make research on Mixup augmentations with Transformers more convenient, since most published benchmarks of Mixup variants with ViTs are based on ImageNet-1K. Please refer to our tech report for more details.

Since the original resolution of CIFAR-100 images ($32\times 32$) is too small for ViTs, we resize the input images to $224\times 224$ for both training and testing and leave the ViT architectures unmodified. This benchmark uses the DeiT setup and trains each model for 200 or 600 epochs with a batch size of 100 on CIFAR-100. The base learning rates of DeiT and Swin are $1e-3$ and $5e-4$, respectively, which were the best settings in our experiments. We search for and report $\alpha$ in $Beta(\alpha, \alpha)$ for all compared methods. View config files in mixups/vits.
The best top-1 accuracy in the last 10 training epochs is reported for ViT architectures. We released the trained models and logs in vits-mix-cifar100-weights.
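The $\alpha$ values reported for each method parameterize the $Beta(\alpha, \alpha)$ distribution from which a mixing ratio is drawn. As a rough illustration only, standard MixUp can be sketched in plain Python as below. This is not the OpenMixup implementation: the function names `sample_lambda` and `mixup_pair` and the list-based image representation are hypothetical, chosen to keep the example dependency-free.

```python
import random
from typing import List, Tuple


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Draw a mixing ratio lambda ~ Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(
    x_a: List[float],
    x_b: List[float],
    y_a: List[float],
    y_b: List[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Convexly combine two flattened images and their one-hot labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


rng = random.Random(0)
lam = sample_lambda(0.8, rng)  # alpha = 0.8, the value listed for MixUp
# Toy two-pixel "images" with two classes (hypothetical data).
mixed_x, mixed_y = mixup_pair([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam)
```

A small $\alpha$ concentrates the ratio near 0 or 1 (mild mixing), while a larger $\alpha$ pushes it toward 0.5, which is why the value is searched per method.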

ViTs' Mixup Benchmark on CIFAR-100

Backbones	$Beta$	DeiT-S(/16)	DeiT-S(/16)	Swin-T	Swin-T
Epoch	$\alpha$	200 epochs	600 epochs	200 epochs	600 epochs
Vanilla	-	65.81	68.50	78.41	81.29
MixUp	0.8	69.98	76.35	76.78	83.67
CutMix	2	74.12	79.54	80.64	83.38
DeiT	0.8,1	75.92	79.38	81.25	84.41
SmoothMix	0.2	67.54	80.25	66.69	81.18
SaliencyMix	0.2	69.78	76.60	80.40	82.58
AttentiveMix+	2	75.98	80.33	81.13	83.69
FMix*	1	70.41	74.31	80.72	82.82
GridMix	1	68.86	74.96	78.54	80.79
PuzzleMix	2	73.60	81.01	80.44	84.74
ResizeMix*	1	68.45	71.95	80.16	82.36
AlignMix	1	-	-	78.91	83.34
TransMix	0.8,1	76.17	79.33	81.33	84.45
AutoMix	2	76.24	80.91	82.67	84.70
SAMix*	2	77.94	82.49	82.62	84.85

open-in1k-weights

1 year ago

We provide a collection of model weights and logs for image classification networks on ImageNet-1K (download) reproduced with OpenMixup or MMLab frameworks. You can view the training setting in config files and README pages of related models. You can download all files from Baidu Cloud (cicj).

If you would like us to reproduce a new model, or can contribute reproduced results to OpenMixup, please contact us via GitHub issues or e-mail. This release will continue to be updated over time.

ImageNet Classification with OpenMixup

Model	Paper	Pretrain	Params(M)	Flops(G)	Top-1(%)	Top-5(%)	Config	Download
DeiT-S	ICML'2021	From scratch	22.05	4.24	80.28	95.07	config	model \| log
DeiT-B	ICML'2021	From scratch	86.57	16.86	81.82	95.57	config	model \| log
Swin-T	ICCV'2021	From scratch	28.29	4.36	81.18	95.61	config	model \| log
ConvNeXt-T	CVPR'2022	From scratch	28.59	4.46	82.16	95.81	config	model \| log
UniFormer-T	ICLR'2022	From scratch	5.55	0.88	78.02	94.14	config	model \| log
UniFormer-S	ICLR'2022	From scratch	21.5	3.44	82.29	95.91	config	model \| log
VAN-T (B0)	arXiv'2022	From scratch	4.11	0.88	75.77	92.99	config	model \| log
VAN-S (B1)	arXiv'2022	From scratch	13.86	2.52	81.03	95.56	config	model \| log
VAN-B (B2)	arXiv'2022	From scratch	26.58	5.03	82.65	96.17	config	model \| log
LITv2-S	NIPS'2022	From scratch	27.85	3.52	81.74	95.59	config	model \| log
CoC-T	ICLR'2023	From scratch	5.60	1.10	72.70	91.26	config	model \| log
CoC-T-plain	ICLR'2023	From scratch	5.60	1.10	73.16	95.48	config	model \| log
CoC-S	ICLR'2023	From scratch	14.7	2.78	77.71	93.87	config	model \| log

rsb-a3-weights

1 year ago

A collection of weights and logs for image classification experiments with RSB A3 training setting on ImageNet-1K (download). You can view the training setting in ResNet strikes back and find the full results in MogaNet (Appendix Table A.7). You can download all files from Baidu Cloud (ss3j).

We train all models for 100 epochs according to the RSB A3 setting on ImageNet-1K. We tune the base learning rate in {8e-3, 6e-3} to get better performance.
The best top-1 accuracy of image classification in the last 10 training epochs is reported for all experiments.
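The reporting rule above (best top-1 accuracy within the last 10 training epochs) can be sketched as a small helper. The function name `best_of_last` and the accuracy values are hypothetical and are not taken from any released log.

```python
from typing import Sequence


def best_of_last(top1_per_epoch: Sequence[float], k: int = 10) -> float:
    """Return the highest top-1 accuracy among the final k epochs."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not top1_per_epoch:
        raise ValueError("no epochs logged")
    return max(top1_per_epoch[-k:])


# Hypothetical per-epoch accuracies; the early spike falls outside the window.
history = [80.0, 75.0, 77.9, 78.1, 78.0, 77.8, 78.1, 78.0, 77.9, 78.1, 78.0, 78.05]
reported = best_of_last(history, k=10)
```

Restricting the maximum to the final epochs avoids reporting an early, unrepresentative peak.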

RSB A3 Image Classification on ImageNet-1K

Model	Date	Train / Test	Params (M)	Top-1 (%)	Top-5 (%)	Config	Download
ResNet-50	CVPR'2016	160 / 224	26	78.1	93.8	config	model \| log
ResNet-101	CVPR'2016	160 / 224	45	79.9	94.9	config	model \| log
ResNet-152	CVPR'2016	160 / 224	60	80.7	95.2	config	model \| log
ViT-T	ICLR'2021	160 / 224	6	66.7	87.7	config	model \| log
ViT-S	ICLR'2021	160 / 224	22	73.8	91.2	config	model \| log
ViT-B	ICLR'2021	160 / 224	87	76.0	91.8	config	model \| log
PVT-T	ICCV'2021	160 / 224	13	71.5	89.8	config	model \| log
PVT-S	ICCV'2021	160 / 224	25	72.1	90.2	config	model \| log
Swin-T	ICCV'2021	160 / 224	28	77.7	93.7	config	model \| log
Swin-S	ICCV'2021	160 / 224	50	80.2	95.1	config	model \| log
Swin-B	ICCV'2021	160 / 224	50	80.5	95.4	config	model \| log
LITV2-T	NIPS'2022	160 / 224	28	79.7	94.7	config	model \| log
LITV2-M	NIPS'2022	160 / 224	49	80.5	95.2	config	model \| log
LITV2-B	NIPS'2022	160 / 224	87	81.3	95.5	config	model \| log
ConvMixer-768-d32	arXiv'2022	160 / 224	21	77.6	93.5	config	model \| log
PoolFormer-S12	CVPR'2022	160 / 224	12	69.3	88.7	config	model \| log
PoolFormer-S24	CVPR'2022	160 / 224	21	74.1	91.8	config	model \| log
PoolFormer-S36	CVPR'2022	160 / 224	31	74.6	92.0	config	model \| log
PoolFormer-M36	CVPR'2022	160 / 224	56	80.7	95.2	config	model \| log
PoolFormer-M48	CVPR'2022	160 / 224	73	81.2	95.3	config	model \| log
ConvNeXt-T	CVPR'2022	160 / 224	29	78.8	94.2	config	model \| log
ConvNeXt-S	CVPR'2022	160 / 224	50	81.7	95.7	config	model \| log
ConvNeXt-B	CVPR'2022	160 / 224	89	82.1	95.9	config	model \| log
ConvNeXt-L	CVPR'2022	160 / 224	189	82.8	96.0	config	model \| log
VAN-B0	arXiv'2022	160 / 224	4	72.6	94.2	config	model \| log
VAN-B2	arXiv'2022	160 / 224	27	81.0	91.5	config	model \| log
VAN-B3	arXiv'2022	160 / 224	45	81.9	95.7	config	model \| log
HorNet-T (7×7)	NIPS'2022	160 / 224	22	80.1	95.0	config	model \| log
HorNet-S (7×7)	NIPS'2022	160 / 224	50	81.2	95.4	config	model \| log
MogaNet-XT	arXiv'2022	160 / 224	3	72.8	91.3	config	model \| log
MogaNet-T	arXiv'2022	160 / 224	5	75.4	92.6	config	model \| log
MogaNet-S	arXiv'2022	160 / 224	25	81.1	95.5	config	model \| log
MogaNet-B	arXiv'2022	160 / 224	44	82.2	95.9	config	model \| log
MogaNet-L	arXiv'2022	160 / 224	83	83.2	96.4	config	model \| log

moganet-in1k-weights

1 year ago

A collection of weights and logs for image classification experiments of MogaNet on ImageNet-1K (download). You can download all files from Baidu Cloud (z8mf) at MogaNet/Classification_OpenMixup.

We train MogaNet for 100 and 300 epochs according to the RSB A3 and DeiT settings on ImageNet-1K. Note that * denotes the refined training setting of lightweight models with 3-Augment. Refer to the Appendix of MogaNet for more training details.
The best top-1 accuracy of image classification in the last 10 training epochs is reported for all experiments. Note that we report the classification accuracy of EMA weights for MogaNet-S, MogaNet-B, and MogaNet-L.
To evaluate the pre-trained weights, you can test classification performance with tools/dist_test.sh, or fine-tune them on downstream tasks (e.g., COCO detection and ADE20K segmentation) by loading only the encoder weights.
Warning about attn_force_fp32: During fp16 training, we force the gating functions to run in fp32 to avoid inf or nan. We found that if attn_force_fp32=True is used during training, it should also be kept as attn_force_fp32=True during evaluation. This is likely because the outputs differ depending on whether attn_force_fp32 is used. It does not affect the performance of full fine-tuning, but it can affect transfer learning results when some parameters are frozen (e.g., COCO Mask-RCNN freezes the parameters of the first stage). It is set to true by default in OpenMixup and removed in the MogaNet implementation. For example, use moga_small_ema_sz224_8xb128_ep300.pth with attn_force_fp32=True and moga_small_ema_sz224_8xb128_no_forcefp32_ep300.pth with attn_force_fp32=False.
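To make the pairing between the evaluation flag and the checkpoint explicit, a minimal sketch follows. Only the flag name and the two file names come from the note above; the helper `moga_small_checkpoint` and the `eval_settings` dictionary are hypothetical and are not part of OpenMixup or MogaNet.

```python
def moga_small_checkpoint(attn_force_fp32: bool) -> str:
    """Pick the MogaNet-S EMA checkpoint that matches the evaluation flag."""
    if attn_force_fp32:
        return "moga_small_ema_sz224_8xb128_ep300.pth"
    return "moga_small_ema_sz224_8xb128_no_forcefp32_ep300.pth"


# Hypothetical evaluation settings for a run without forced fp32 gating.
eval_settings = {"attn_force_fp32": False}
checkpoint = moga_small_checkpoint(eval_settings["attn_force_fp32"])
```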

Image Classification on ImageNet-1K

Model	Pretrain	Setting	Resolution	Params(M)	Flops(G)	Top-1 (%)	Config	Download
MogaNet-XT	From scratch	DeiT	224x224	2.97	0.80	76.5	config	model \| log
MogaNet-XT	From scratch	DeiT	256x256	2.97	1.04	77.2	config	model \| log
MogaNet-XT*	From scratch	DeiT-3	256x256	2.97	1.04	77.6	config	model \| log
MogaNet-T	From scratch	DeiT	224x224	5.20	1.10	79.0	config	model \| log
MogaNet-T	From scratch	DeiT	256x256	5.20	1.44	79.6	config	model \| log
MogaNet-T*	From scratch	DeiT-3	256x256	5.20	1.44	80.0	config	model \| log
MogaNet-S	From scratch	DeiT	224x224	25.3	4.97	83.4	config	model \| log
MogaNet-B	From scratch	DeiT	224x224	43.9	9.93	84.3	config	model \| log
MogaNet-L	From scratch	DeiT	224x224	82.5	15.9	84.7	config	model \| log
MogaNet-XL	From scratch	DeiT	224x224	180.8	34.5	85.1	config	model \| log
MogaNet-XT	From scratch	RSB A3	160x160	2.97	0.80	72.8	config	model \| log
MogaNet-T	From scratch	RSB A3	160x160	5.20	1.10	75.4	config	model \| log
MogaNet-S	From scratch	RSB A3	160x160	25.3	4.97	81.1	config	model \| log
MogaNet-B	From scratch	RSB A3	160x160	43.9	9.93	82.2	config	model \| log
MogaNet-L	From scratch	RSB A3	160x160	82.5	15.9	83.2	config	model \| log

a2mim-in1k-weights

1 year ago

A collection of weights and logs for self-supervised learning benchmark on ImageNet-1K (download). You can find pre-training codes of compared methods in OpenMixup, VISSL, solo-learn, and the official repositories. You can download all files from Baidu Cloud: A2MIM (3q5i).

All compared methods adopt ResNet-50 or ViT-B architectures and are pre-trained for 100/300 or 800 epochs on ImageNet-1K. The image size for pre-training, fine-tuning, and testing is $224\times 224$. The fine-tuning protocols are RSB A3 and RSB A2 for ResNet-50, and BEiT (SimMIM) for ViT-B. Refer to the A2MIM paper for more details.
The best top-1 accuracy of fine-tuning in the last 10 training epochs is reported for all self-supervised methods.
Visualizations of mixed samples of A2MIM are provided in zip files.
For the pre-training and fine-tuning weights, you can evaluate them with tools/dist_test.sh, or fine-tune pre-trained models using tools/dist_train.sh with --load_checkpoint (loading the full checkpoints). Note that pre-trained weights whose names start with full_ contain the full keys of the pre-trained models, while those starting with backbone_ contain only the encoder weights, which can be used for downstream tasks, e.g., COCO detection and ADE20K segmentation.
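The relationship between a full_ checkpoint and a backbone_ checkpoint can be pictured as filtering a state dictionary by key prefix. The sketch below assumes encoder keys are stored under a `backbone.` prefix, which is a common convention but is not confirmed by these notes. The key names, the helper `extract_backbone`, and the placeholder string values (standing in for tensors) are all hypothetical.

```python
from typing import Dict, TypeVar

T = TypeVar("T")


def extract_backbone(state_dict: Dict[str, T], prefix: str = "backbone.") -> Dict[str, T]:
    """Keep only entries under `prefix` and strip the prefix from their keys."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Hypothetical key names; placeholder strings stand in for weight tensors.
full_state = {
    "backbone.patch_embed.weight": "w0",
    "backbone.blocks.0.attn.weight": "w1",
    "neck.decoder.weight": "w2",
    "head.predictor.bias": "w3",
}
backbone_state = extract_backbone(full_state)
```

Dropping the non-encoder entries is what makes such weights loadable into a downstream detector or segmenter that defines only the encoder.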

Self-supervised Pre-training and Fine-tuning with ResNet-50 on ImageNet-1K

We provide the source of pre-trained weights, pre-training epochs, fine-tuning epochs and protocol, and top-1 accuracy in the following table.

Methods	Source	PT epoch	FT protocol	FT top-1
PyTorch	PyTorch	90	RSB A3	78.8
Inpainting	OpenMixup	70	RSB A3	78.4
Relative-Loc	OpenMixup	70	RSB A3	77.8
Rotation	OpenMixup	70	RSB A3	77.7
SimCLR	VISSL	100	RSB A3	78.5
MoCoV2	OpenMixup	100	RSB A3	78.5
BYOL	OpenMixup	100	RSB A3	78.7
BYOL	Official	300	RSB A3	78.9
BYOL	Official	300	RSB A2	80.1
SwAV	VISSL	100	RSB A3	78.9
SwAV	Official	400	RSB A3	79.0
SwAV	Official	400	RSB A2	80.2
BarlowTwins	solo-learn	100	RSB A3	78.5
BarlowTwins	Official	300	RSB A3	78.8
MoCoV3	Official	100	RSB A3	78.7
MoCoV3	Official	300	RSB A3	79.0
MoCoV3	Official	300	RSB A2	80.1
A2MIM	OpenMixup	100	RSB A3	78.8
A2MIM	OpenMixup	300	RSB A3	78.9
A2MIM	OpenMixup	300	RSB A2	80.4

Self-supervised Pre-training and Fine-tuning with ViT-B on ImageNet-1K

We provide the source of pre-trained weights, pre-training epochs, fine-tuning epochs and protocol, and top-1 accuracy in the following table.

Methods	Source	PT epoch	FT protocol	FT top-1
SimMIM	Official	800	BEiT (SimMIM)	83.8
SimMIM (RGB mean)	OpenMixup	800	BEiT (SimMIM)	84.0
A2MIM	OpenMixup	800	BEiT (SimMIM)	84.3

mixup-inat2018-weights

1 year ago

A collection of weights and logs for mixup classification benchmark on iNaturalist-2018 (download, config). You can download all files from Baidu Cloud: iNaturalist-2018 (wy2v).

All compared methods adopt ResNet-50 and ResNeXt-101 (32x4d) architectures and are trained for 100 epochs using the PyTorch training recipe. The training and testing image size is 224 with a CenterCrop ratio of 0.85. We search for $\alpha$ in $Beta(\alpha, \alpha)$ for all compared methods.
The median top-1 accuracy in the last 5 training epochs is reported for ResNet variants.
Visualizations of mixed samples from AutoMix and SAMix are provided in zip files. [2022-08-22] Update MixBlock keys in AutoMix and SAMix checkpoints.
Test pre-trained weights with tools/dist_test.sh, or fine-tune pre-trained models using tools/dist_train.sh with --load_checkpoint.
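As a worked example of the test-time geometry, the sketch below assumes that a "CenterCrop ratio" of 0.85 means resizing the short side to the crop size divided by the ratio and then cropping the center. That is a common convention, but it is an assumption here and the benchmark configs remain the authority. Both helper names are hypothetical.

```python
import math
from typing import Tuple


def resize_for_center_crop(crop_size: int, crop_ratio: float) -> int:
    """Short-side length to resize to before cropping `crop_size` pixels."""
    if not 0.0 < crop_ratio <= 1.0:
        raise ValueError("crop_ratio must be in (0, 1]")
    return math.floor(crop_size / crop_ratio)


def center_crop_box(width: int, height: int, crop_size: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered square crop."""
    left = (width - crop_size) // 2
    top = (height - crop_size) // 2
    return left, top, left + crop_size, top + crop_size


short_side = resize_for_center_crop(224, 0.85)  # floor(224 / 0.85) = 263
box = center_crop_box(short_side, short_side, 224)
```

Under this assumption, a square test image is resized to 263 pixels per side and the central 224-pixel window is kept.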

Mixup Classification Benchmark on iNaturalist-2018

Backbones	ResNet-50 top-1	ResNeXt-101 top-1
Vanilla	62.53	66.94
MixUp [ICLR'2018]	62.69	67.56
CutMix [ICCV'2019]	63.91	69.75
ManifoldMix [ICML'2019]	63.46	69.30
SaliencyMix [ICLR'2021]	64.27	70.01
FMix [arXiv'2020]	63.71	69.46
PuzzleMix [ICML'2020]	64.36	70.12
ResizeMix [arXiv'2020]	64.12	69.30
AutoMix [ECCV'2022]	64.73	70.49
SAMix [arXiv'2021]	64.84	70.54

mixup-inat2017-weights

1 year ago

A collection of weights and logs for mixup classification benchmark on iNaturalist-2017 (download, config). You can download all files from Baidu Cloud: iNaturalist-2017 (1e7w).

All compared methods adopt ResNet-18/50 and ResNeXt-101 (32x4d) architectures and are trained for 100 epochs using the PyTorch training recipe. The training and testing image size is 224 with a CenterCrop ratio of 0.85. We search for $\alpha$ in $Beta(\alpha, \alpha)$ for all compared methods.
The median top-1 accuracy in the last 5 training epochs is reported for ResNet variants.
Visualizations of mixed samples from AutoMix and SAMix are provided in zip files. [2022-08-22] Update MixBlock keys in AutoMix and SAMix checkpoints.
Test pre-trained weights with tools/dist_test.sh, or fine-tune pre-trained models using tools/dist_train.sh with --load_checkpoint.
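The median-of-last-5-epochs reporting rule used for the ResNet variants can be sketched as follows. The helper name `median_of_last` and the accuracy values are hypothetical and do not come from the released logs.

```python
from statistics import median
from typing import Sequence


def median_of_last(top1_per_epoch: Sequence[float], k: int = 5) -> float:
    """Return the median top-1 accuracy over the final k epochs."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not top1_per_epoch:
        raise ValueError("no epochs logged")
    return median(top1_per_epoch[-k:])


# Hypothetical per-epoch accuracies for a 7-epoch toy run.
history = [60.1, 62.0, 62.4, 62.9, 62.5, 62.6, 62.3]
reported = median_of_last(history)  # window: 62.4, 62.9, 62.5, 62.6, 62.3
```

Compared with taking the maximum over the window, the median is less sensitive to a single lucky epoch.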

Mixup Classification Benchmark on iNaturalist-2017

Backbones	ResNet-18 top-1	ResNet-50 top-1	ResNeXt-101 top-1
Vanilla	51.79	60.23	63.70
MixUp [ICLR'2018]	51.40	61.22	66.27
CutMix [ICCV'2019]	51.24	62.34	67.59
ManifoldMix [ICML'2019]	51.83	61.47	66.08
SaliencyMix [ICLR'2021]	51.29	62.51	67.20
FMix [arXiv'2020]	52.01	61.90	66.64
PuzzleMix [ICML'2020]	-	62.66	67.72
ResizeMix [arXiv'2020]	51.21	62.29	66.82
AutoMix [ECCV'2022]	52.84	63.08	68.03
SAMix [arXiv'2021]	53.42	63.32	68.26

mixup-place205-weights

1 year ago

A collection of weights and logs for mixup classification benchmark on Place205 (download, config). You can download all files from Baidu Cloud (4m94).

All compared methods adopt ResNet-18/50 architectures and are trained for 100 epochs using the PyTorch training recipe. The training and testing image size is 224 with a CenterCrop ratio of 0.85. We search for $\alpha$ in $Beta(\alpha, \alpha)$ for all compared methods.
The median top-1 accuracy in the last 5 training epochs is reported for ResNet variants.
Visualizations of mixed samples from AutoMix and SAMix are provided in zip files. [2022-08-22] Update MixBlock keys in AutoMix and SAMix checkpoints.
Test pre-trained weights with tools/dist_test.sh, or fine-tune pre-trained models using tools/dist_train.sh with --load_checkpoint.

Mixup Classification Benchmark on Place205

Backbones	ResNet-18 top-1	ResNet-50 top-1
Vanilla	59.63	63.10
MixUp [ICLR'2018]	59.33	63.01
CutMix [ICCV'2019]	59.21	63.75
ManifoldMix [ICML'2019]	59.46	63.23
SaliencyMix [ICLR'2021]	59.50	63.33
FMix [arXiv'2020]	59.51	63.63
PuzzleMix [ICML'2020]	59.62	63.91
ResizeMix [arXiv'2020]	59.66	63.88
AutoMix [ECCV'2022]	59.74	64.06
SAMix [arXiv'2021]	59.86	64.27

V0.2.3

1 year ago

Highlight

Support the online document of OpenMixup (built on Read the Docs).
Provide README and update configs for self-supervised and supervised methods.
Support new contrastive learning method (Barlow Twins) and Masked Image Modeling (MIM) methods (MAE, SimMIM, MaskFeat, CAE, A2MIM).
Support new backbone networks (ConvMixer, DenseNet, MLPMixer, ResNeSt, PoolFormer, UniFormer, VAN).
Support new fine-tuning method (HCR).
Support new mixup augmentation methods (SmoothMix, GridMix).
Support more regression losses (Charbonnier loss, Focal Frequency loss, Focal L1/L2 loss, Balanced L1 loss, Balanced MSE loss).
Support more regression metrics (regression errors and correlations) and the regression dataset.
Support more reweight classification losses (Gradient Harmonized loss, Varifocal Focal Loss) from MMDetection.
Model Zoos and lists of Awesome Mixups have been updated.

Bug Fixes

Refactor code structures of openmixup.models.utils and support more network layers.
Fix the bug of DropPath (using stochastic depth rule) in ResNet for RSB A1/A2 training settings.
Fix bugs in self-supervised classification benchmarks (configs and implementations of VisionTransformer).
Update INSTALL.md. We suggest installing PyTorch 1.8 or higher and mmcv-full for better usage of this repo. Because PyTorch 1.8 has bugs in the AdamW optimizer, do not use PyTorch 1.8 to fine-tune ViT-based methods.
Fix bugs in PreciseBNHook (update all BN stats) and RepeatSampler (set sync_random_seed) for RSB A1/A2.
Fix bugs in regression metrics, MIM dataset, and benchmark configs. Note that only l1_loss is supported in FP16 training; other regression losses (e.g., MSE and Smooth_L1 losses) will cause NaN when the target and prediction are not normalized.
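One plausible way to see why unnormalized targets are a problem for squared-error losses in FP16: the largest finite half-precision value is 65504, so squaring a moderately large error already overflows, whereas its absolute value does not. An overflow to inf can then turn into NaN in later arithmetic. The sketch below checks representability with Python's `struct` half-precision format. The helper `fits_in_fp16` and the error value are hypothetical, and this illustrates the number format only, not OpenMixup's loss code.

```python
import math
import struct


def fits_in_fp16(value: float) -> bool:
    """True if `value` can be stored as a finite IEEE 754 half-precision float."""
    if not math.isfinite(value):
        return False
    try:
        struct.pack("<e", value)  # "e" is the half-precision format code
    except OverflowError:
        return False
    return True


error = 300.0            # hypothetical unnormalized prediction error
l1_term = abs(error)     # 300.0, well inside the FP16 range
mse_term = error ** 2    # 90000.0, beyond the FP16 maximum of 65504
```

Normalizing targets and predictions keeps the squared term small enough to stay within range.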

release

2 years ago

Highlights

Support various popular backbones (ConvNets and ViTs), various image datasets, popular mixup methods, and benchmarks for supervised learning. Config files are available (reorganized).
Support popular self-supervised methods (e.g., BYOL, MoCoV3, MAE, SimMIM) on both large-scale and small-scale datasets, and self-supervised benchmarks (merged from MMSelfSup). Config files are available (reorganized).
Support analyzing tools for self-supervised learning (kNN/SVM/linear metrics and t-SNE/UMAP visualization).
Convenient usage of configs: fast config generation with 'auto_train.py' and config inheritance (MMCV).
Support mixed-precision training (NVIDIA Apex or MMCV Apex) for all methods.
Model Zoos and lists of Awesome Mixups have been released.
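The idea behind config inheritance, where a child config overrides only the fields that differ from a base config, can be illustrated with a recursive dictionary merge. This is a simplified stand-in and not MMCV's actual implementation. The function `merge_config` and all setting names and values below are hypothetical.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `override` applied recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections are merged field by field.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and new keys simply replace or extend the base.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base settings and a child that changes two fields.
base_cfg = {
    "optimizer": {"type": "AdamW", "lr": 1e-3, "weight_decay": 0.05},
    "total_epochs": 200,
}
child_cfg = merge_config(base_cfg, {"optimizer": {"lr": 5e-4}, "total_epochs": 600})
```

The child keeps every base field it does not mention, so a variant config only needs to state what changes.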

Bug Fixes

Refactor code following MMSelfSup and MMClassification #3.
Fix mixed-precision training overflow (NAN & INF in supervised mixup methods).
Fix fine-tuning settings (ViT and Swin Transformer) to follow MMSelfSup.