OpenMMLab Pre-training Toolbox and Benchmark
Fix some bugs and enhance the codebase.
Full Changelog: https://github.com/open-mmlab/mmpretrain/compare/1.0.0...v1.0.1
We are excited to announce that several advanced multi-modal methods are now supported! We integrated huggingface/transformers with vision backbones in MMPreTrain to run inference and training (training support is still in development).
Methods | Datasets |
---|---|
BLIP (arxiv'2022) | COCO (caption, retrieval, vqa) |
BLIP-2 (arxiv'2023) | Flickr30k (caption, retrieval) |
OFA (CoRR'2022) | GQA |
Flamingo (NeurIPS'2022) | NLVR2 |
Chinese CLIP (arxiv'2022) | NoCaps |
MiniGPT-4 (arxiv'2023) | OCR VQA |
LLaVA (arxiv'2023) | Text VQA |
Otter (arxiv'2023) | VG VQA |
 | VisualGenomeQA |
 | VizWiz |
 | VSR |
We tested DeepSpeed and FSDP with MMEngine. Below are the memory usage and training time for ViT-large, ViT-huge and an 8B multi-modal model; the left figure shows the memory data and the right figure shows the training time data.

Test environment: 8 x A100 (80G), PyTorch 2.0.0. Remark: both FSDP and DeepSpeed were tested with default configurations without tuning; manually tuning the FSDP wrap policy can further reduce training time and memory usage.
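For reference, enabling DeepSpeed through MMEngine looks roughly like the config fragment below. This is a hedged sketch: the option names follow MMEngine's `DeepSpeedStrategy` and `DeepSpeedOptimWrapper` (used with `FlexibleRunner`), and should be verified against the MMEngine version you have installed.

```python
# Sketch of an MMEngine config fragment enabling DeepSpeed ZeRO-3.
# Option names follow MMEngine's DeepSpeedStrategy / DeepSpeedOptimWrapper;
# verify them against your installed MMEngine version.
strategy = dict(
    type='DeepSpeedStrategy',
    fp16=dict(enabled=True),
    zero_optimization=dict(
        stage=3,
        offload_optimizer=dict(device='cpu'),  # trade speed for memory
    ),
)
optim_wrapper = dict(
    type='DeepSpeedOptimWrapper',
    optimizer=dict(type='AdamW', lr=1e-4),
)
```

An analogous `FSDPStrategy` exists for FSDP; the wrap policy mentioned above is configured there.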
- Support `out_type`. (#1570)

A total of 12 developers contributed to this release.
@XiudingCai @Ezra-Yu @KeiChiTse @mzr1996 @bobo0810 @wangbo-zhao @yuweihao @fangyixiao18 @YuanLiuuuuuu @MGAMZ @okotaku @zzc98
We are excited to announce that MMClassification and MMSelfSup have been merged into ONE codebase, named MMPreTrain, which has the following highlights:
- Self-supervised learning algorithms were integrated into `mmpretrain/models`, where a new folder `selfsup` was added, supporting 18 recent self-supervised learning algorithms:

Contrastive learning | Masked image modeling |
---|---|
MoCo series | BEiT series |
SimCLR | MAE |
BYOL | SimMIM |
SwAV | MaskFeat |
DenseCL | CAE |
SimSiam | MILAN |
BarlowTwins | EVA |
 | MixMIM
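The two columns correspond to two pretraining paradigms. Contrastive methods such as SimCLR and MoCo optimize an InfoNCE-style objective that pulls two augmented views of the same image together while pushing other images away. A minimal pure-Python sketch of that loss (the function and the toy vectors are illustrative, not MMPreTrain code):

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Minimal InfoNCE sketch: cross-entropy over cosine-style similarities,
    with the positive pair as the target class. Inputs are plain lists of
    floats, assumed L2-normalised."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    logits = [dot(anchor, positive) / temperature] + [
        dot(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)

anchor = [1.0, 0.0]
positive = [0.9, 0.436]                 # close to the anchor -> low loss
negatives = [[0.0, 1.0], [-1.0, 0.0]]   # dissimilar images
print(info_nce(anchor, positive, negatives))
```

Masked image modeling methods (MAE, SimMIM, ...) instead reconstruct masked patches with a pixel- or feature-level regression loss.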
- Support the LeViT, XCiT, ViG, and ConvNeXt-V2 backbones; MMPreTrain now supports 68 backbones or algorithms and 472 checkpoints.
- Add t-SNE visualization: users can visualize t-SNE embeddings to analyze the representation ability of a backbone. Visualization example: the left figure is from `MoCoV2_ResNet50` and the right is from `MAE_ViT-base`.
- Add `ImageClassificationInferencer`. (#1261)
- Use `scaled_dot_product_attention` to accelerate `MultiheadAttention`. (#1434)
- Support `--out` and `--dump` in `tools/test.py`. (#1307)

A total of 13 developers contributed to this release. Thanks to @techmonsterwang, @qingtian5, @mzr1996, @okotaku, @zzc98, @aso538, @szwlh-c, @fangyixiao18, @yukkyo, @Ezra-Yu, @csatsurnh, @2546025323, @GhaSiKey.
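For context on the `scaled_dot_product_attention` change: PyTorch 2.0's fused `torch.nn.functional.scaled_dot_product_attention` kernel computes `softmax(QK^T / sqrt(d)) V` faster and with less memory. The quantity it computes is sketched below in pure Python (illustrative only, not the MMPreTrain implementation):

```python
import math

def scaled_dot_product_attention(q, k, v):
    """Pure-Python sketch of attention(Q, K, V) = softmax(QK^T / sqrt(d)) V.

    q, k, v are lists of vectors (lists of floats); q entries have
    dimension d. PyTorch 2.0 computes the same quantity with a fused,
    memory-efficient kernel."""
    d = len(q[0])
    out = []
    for qi in q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)  # max-subtraction for a stable softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # weighted average of the value vectors
        out.append([sum(w * vj[t] for w, vj in zip(weights, v))
                    for t in range(len(v[0]))])
    return out

# A query aligned with the first key attends mostly to the first value.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(scaled_dot_product_attention(q, k, v))
```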
Full Changelog: https://github.com/open-mmlab/mmpretrain/compare/v1.0.0rc5...v1.0.0rc7
- Support `--tta` in `tools/test.py`. (#1161)

A total of 12 developers contributed to this release.
@marouaneamz @piercus @Ezra-Yu @mzr1996 @bobo0810 @suibe-qingtian @Scarecrow0 @tonysy @WINDSKY45 @wangbo-zhao @Francis777 @okotaku
- Add `dist_train_arm.sh` for ARM device and update NPU results. (#1218)
- Fix `MMClsWandbHook` getting stuck. (#1242)
- Fix `device_ids` in `tools/test.py`. (#1215)
- Update `pytorch2torchscript.md`. (#1173)
- Update `miscellaneous.md`. (#1137)
- Fix `ClassBalancedDataset`. (#901)

A total of 7 developers contributed to this release.
@nijkah @xiaoyuan0203 @mzr1996 @Qiza-lyhm @ganghe74 @unseenme @wangjiangben-hw
A total of 4 developers contributed to this release.
@techmonsterwang @mzr1996 @fangyixiao18 @kitecats
- Fix `num_classes`. (#1184)
- Fix that `inference_model` cannot get classes information from the checkpoint. (#1093)

A total of 14 developers contributed to this release.
@austinmw @Ezra-Yu @nijkah @yingfhu @techmonsterwang @mzr1996 @sanbuphy @tonysy @XingyuXie @gaoyang07 @kitecats @marouaneamz @okotaku @zzc98