MMClassification / MMPreTrain Release Notes

OpenMMLab Pre-training Toolbox and Benchmark

v1.0.1

9 months ago

Fix some bugs and enhance the codebase.


Full Changelog: https://github.com/open-mmlab/mmpretrain/compare/1.0.0...v1.0.1

v1.0.0

10 months ago

MMPreTrain Release v1.0.0: Backbones, Self-Supervised Learning and Multi-Modality

Support more multi-modal algorithms and datasets

We are excited to announce that several advanced multi-modal methods are now supported! We integrated huggingface/transformers with vision backbones in MMPreTrain to run inference; training support is in development.

| Methods | Datasets |
| :--- | :--- |
| BLIP (arXiv'2022) | COCO (caption, retrieval, vqa) |
| BLIP-2 (arXiv'2023) | Flickr30k (caption, retrieval) |
| OFA (CoRR'2022) | GQA |
| Flamingo (NeurIPS'2022) | NLVR2 |
| Chinese CLIP (arXiv'2022) | NoCaps |
| MiniGPT-4 (arXiv'2023) | OCR VQA |
| LLaVA (arXiv'2023) | Text VQA |
| Otter (arXiv'2023) | VG VQA |
| | VisualGenomeQA |
| | VizWiz |
| | VSR |

Add iTPN, SparK self-supervised learning algorithms.


Provide examples of New Config and DeepSpeed/FSDP

We benchmarked DeepSpeed and FSDP with MMEngine. Below are the memory usage and training time for ViT-large, ViT-huge and an 8B multi-modal model; the left figure shows memory and the right figure shows training time.

Test environment: 8×A100 (80G), PyTorch 2.0.0. Remark: both FSDP and DeepSpeed were tested with default configurations and not tuned; manually tuning the FSDP wrap policy can further reduce training time and memory usage.
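The DeepSpeed setup above is enabled through MMEngine's FlexibleRunner; a minimal config sketch, assuming MMEngine's `DeepSpeedStrategy` and `DeepSpeedOptimWrapper` (the field values below are illustrative, not the exact configs shipped in this release):

```python
# Illustrative fragment appended to an existing ViT training config.
# `runner_type`, `strategy` and `optim_wrapper` follow MMEngine's
# FlexibleRunner/DeepSpeed support; all values are examples only.
runner_type = 'FlexibleRunner'
strategy = dict(
    type='DeepSpeedStrategy',
    zero_optimization=dict(
        stage=3,  # ZeRO-3: partition params, gradients and optimizer states
        offload_optimizer=dict(device='cpu'),  # optional CPU offload
    ),
)
# DeepSpeed manages the optimizer itself, so a DeepSpeed-aware wrapper is used.
optim_wrapper = dict(
    type='DeepSpeedOptimWrapper',
    optimizer=dict(type='AdamW', lr=1e-4, weight_decay=0.05),
)
```

For the FSDP experiments, an analogous `FSDPStrategy` block can be swapped in for `strategy`.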

New Features

  • Transfer shape-bias tool from mmselfsup (#1658)
  • Download dataset by using MIM&OpenDataLab (#1630)
  • Support New Configs (#1639, #1647, #1665)
  • Support Flickr30k Retrieval dataset (#1625)
  • Support SparK (#1531)
  • Support LLaVA (#1652)
  • Support Otter (#1651)
  • Support MiniGPT-4 (#1642)
  • Add support for VizWiz dataset (#1636)
  • Add support for vsr dataset (#1634)
  • Add InternImage Classification project (#1569)
  • Support OCR-VQA dataset (#1621)
  • Support OK-VQA dataset (#1615)
  • Support TextVQA dataset (#1569)
  • Support iTPN and HiViT (#1584)
  • Add retrieval mAP metric (#1552)
  • Support NoCap dataset based on BLIP. (#1582)
  • Add GQA dataset (#1585)
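The "New Configs" items above refer to MMEngine's pure-Python config style; a hedged sketch, where the base-config module paths and class import paths are hypothetical examples:

```python
# New-style config sketch: base configs are imported as Python modules via
# `read_base`, and module types can be real classes instead of registry
# strings. The relative module paths below are hypothetical.
from mmengine.config import read_base
from mmpretrain.models import ImageClassifier, ResNet  # assumed import path

with read_base():
    from .._base_.datasets.imagenet_bs64 import *   # hypothetical base config
    from .._base_.schedules.imagenet_bs256 import *

model = dict(
    type=ImageClassifier,  # a class, not the string 'ImageClassifier'
    backbone=dict(type=ResNet, depth=50),
)
```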

Improvements

  • Update fsdp vit-huge and vit-large config (#1675)
  • Support deepspeed with flexible runner (#1673)
  • Update Otter and LLaVA docs and config. (#1653)
  • Add image_only param of ScienceQA (#1613)
  • Support to use "split" to specify training set/validation (#1535)
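The `split` argument from the last item above selects a dataset partition directly instead of requiring separate annotation files; a hedged sketch with example field values:

```python
# Config fragment: pick the training/validation split of a supported dataset
# via `split`; paths and batch sizes are examples only.
train_dataloader = dict(
    batch_size=64,
    dataset=dict(type='ImageNet', data_root='data/imagenet', split='train'),
)
val_dataloader = dict(
    batch_size=64,
    dataset=dict(type='ImageNet', data_root='data/imagenet', split='val'),
)
```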

Bug Fixes

  • Refactor _prepare_pos_embed in ViT (#1656, #1679)
  • Freeze pre norm in vision transformer (#1672)
  • Fix bug loading IN1k dataset (#1641)
  • Fix sam bug (#1633)
  • Fixed circular import error for new transform (#1609)
  • Update torchvision transform wrapper (#1595)
  • Set default out_type in CAM visualization (#1586)


v1.0.0rc8

11 months ago

Highlights

  • Support multiple multi-modal algorithms and inferencers. You can explore these features via the Gradio demo!
  • Add EVA-02, DINOv2, ViT-SAM and GLIP backbones.
  • Register torchvision transforms into MMPreTrain; you can now easily integrate torchvision's data augmentations.
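Registered torchvision transforms can then be mixed into an ordinary data pipeline; a hedged sketch assuming the `torchvision/` registry prefix for the wrapped transforms:

```python
# Pipeline mixing MMPreTrain transforms with registered torchvision ones.
# The `torchvision/` prefix is assumed to be how wrapped transforms are named;
# transform arguments are passed through to torchvision.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='torchvision/RandomResizedCrop', size=176),   # torchvision
    dict(type='torchvision/RandomHorizontalFlip', p=0.5),   # torchvision
    dict(type='PackInputs'),                                # MMPreTrain packing
]
```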

New Features

  • Support Chinese CLIP. (#1576)
  • Add ScienceQA Metrics (#1577)
  • Support multiple multi-modal algorithms and inferencers. (#1561)
  • add eva02 backbone (#1450)
  • Support dinov2 backbone (#1522)
  • Support some downstream classification datasets. (#1467)
  • Support GLIP (#1308)
  • Register torchvision transforms into mmpretrain (#1265)
  • Add ViT of SAM (#1476)

Improvements

  • [Refactor] Support to freeze channel reduction and add layer decay function (#1490)
  • [Refactor] Support resizing pos_embed while loading ckpt and format output (#1488)

Bug Fixes

  • Fix scienceqa (#1581)
  • Fix config of beit (#1528)
  • Incorrect stage freeze on RIFormer Model (#1573)
  • Fix ddp bugs caused by out_type. (#1570)
  • Fix multi-task-head loss potential bug (#1530)
  • Support bce loss without batch augmentations (#1525)
  • Fix clip generator init bug (#1518)
  • Fix the bug in binary cross entropy loss (#1499)

Docs Update

  • Update PoolFormer citation to CVPR version (#1505)
  • Refine Inference Doc (#1489)
  • Add doc for usage of confusion matrix (#1513)
  • Update MMagic link (#1517)
  • Fix example_project README (#1575)
  • Add NPU support page (#1481)
  • train cfg: Removed old description (#1473)
  • Fix typo in MultiLabelDataset docstring (#1483)

Contributors

A total of 12 developers contributed to this release.

@XiudingCai @Ezra-Yu @KeiChiTse @mzr1996 @bobo0810 @wangbo-zhao @yuweihao @fangyixiao18 @YuanLiuuuuuu @MGAMZ @okotaku @zzc98

v1.0.0rc7

1 year ago

MMPreTrain v1.0.0rc7 Release Notes

  • Highlights
  • New Features
  • Improvements
  • Bug Fixes
  • Docs Update

Highlights

We are excited to announce that MMClassification and MMSelfSup have been merged into ONE codebase, named MMPreTrain, which has the following highlights:

  • Integrated self-supervised learning algorithms from MMSelfSup, such as MAE, BEiT, etc. You can find them in the mmpretrain/models directory, where a new selfsup folder supports 18 recent self-supervised learning algorithms.

| Contrastive learning | Masked image modeling |
| :--- | :--- |
| MoCo series | BEiT series |
| SimCLR | MAE |
| BYOL | SimMIM |
| SwAV | MaskFeat |
| DenseCL | CAE |
| SimSiam | MILAN |
| BarlowTwins | EVA |
| DenseCL | MixMIM |
  • Support RIFormer, a way to keep a vision backbone effective while removing token mixers from its basic building blocks. Equipped with the proposed optimization strategy, it builds an extremely simple vision backbone with encouraging performance and high efficiency during inference.
  • Support LeViT, XCiT, ViG, and ConvNeXt-V2 backbones; we now support 68 backbones or algorithms and 472 checkpoints.

  • Add t-SNE visualization; you can use t-SNE to analyze the representation quality of your backbone. In the example visualization, the left is from MoCoV2_ResNet50 and the right is from MAE_ViT-base.

  • Refactor dataset pipeline visualization; it can now also visualize the pipeline of masked image modeling methods such as BEiT.

New Features

  • Support RIFormer. (#1453)
  • Support XCiT Backbone. (#1305)
  • Support calculate confusion matrix and plot it. (#1287)
  • Support RetrieverRecall metric & Add ArcFace config (#1316)
  • Add ImageClassificationInferencer. (#1261)
  • Support InShop Dataset (Image Retrieval). (#1019)
  • Support LeViT backbone. (#1238)
  • Support VIG Backbone. (#1304)
  • Support ConvNeXt-V2 backbone. (#1294)

Improvements

  • Use PyTorch official scaled_dot_product_attention to accelerate MultiheadAttention. (#1434)
  • Add ln to vit avg_featmap output (#1447)
  • Update analysis tools and documentations. (#1359)
  • Unify the --out and --dump in tools/test.py. (#1307)
  • Enable to toggle whether Gem Pooling is trainable or not. (#1246)
  • Update registries of mmcls. (#1306)
  • Add metafile fill and validation tools. (#1297)
  • Remove useless EfficientnetV2 config files. (#1300)

Bug Fixes

  • Fix precise bn hook (#1466)
  • Fix retrieval multi gpu bug (#1319)
  • Fix error repvgg-deploy base config path. (#1357)
  • Fix bug in test tools. (#1309)

Docs Update

  • Translate some tools tutorials to Chinese. (#1321)
  • Add Chinese translation for runtime.md. (#1313)

Contributors

A total of 13 developers contributed to this release. Thanks to @techmonsterwang, @qingtian5, @mzr1996, @okotaku, @zzc98, @aso538, @szwlh-c, @fangyixiao18, @yukkyo, @Ezra-Yu, @csatsurnh, @2546025323, @GhaSiKey.

Full Changelog: https://github.com/open-mmlab/mmpretrain/compare/v1.0.0rc5...v1.0.0rc7

v1.0.0rc5

1 year ago

Highlights

  • Support EVA, RevViT, EfficientnetV2, CLIP, TinyViT and MixMIM backbones.
  • Reproduce the training accuracy of ConvNeXt and RepVGG.
  • Support multi-task training and testing.
  • Support Test-time Augmentation.
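Test-time augmentation from the highlights above is driven by a TTA model plus a TTA pipeline; a minimal sketch, assuming an `AverageClsScoreTTA` wrapper and MMCV's `TestTimeAug` transform (names and fields may differ from the shipped configs):

```python
# TTA wrapper: averages classification scores over the augmented views.
tta_model = dict(type='AverageClsScoreTTA')

# Inside TestTimeAug, each inner list is one augmentation axis; the test views
# are the cross product (here: a flipped and an un-flipped copy per image).
tta_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='ResizeEdge', scale=256, edge='short'),
    dict(type='CenterCrop', crop_size=224),
    dict(type='TestTimeAug', transforms=[
        [dict(type='RandomFlip', prob=1.0), dict(type='RandomFlip', prob=0.0)],
        [dict(type='PackClsInputs')],
    ]),
]
```

Testing would then pick these up when `--tta` is passed to tools/test.py.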

New Features

  • [Feature] Add EfficientnetV2 Backbone. (#1253)
  • [Feature] Support TTA and add --tta in tools/test.py. (#1161)
  • [Feature] Support Multi-task. (#1229)
  • [Feature] Add clip backbone. (#1258)
  • [Feature] Add mixmim backbone with checkpoints. (#1224)
  • [Feature] Add TinyViT for dev-1.x. (#1042)
  • [Feature] Add some scripts for development. (#1257)
  • [Feature] Support EVA. (#1239)
  • [Feature] Implementation of RevViT. (#1127)

Improvements

  • [Reproduce] Reproduce RepVGG Training Accuracy. (#1264)
  • [Enhance] Support ConvNeXt More Weights. (#1240)
  • [Reproduce] Update ConvNeXt config files. (#1256)
  • [CI] Update CI to test PyTorch 1.13.0. (#1260)
  • [Project] Add ACCV workshop 1st Solution. (#1245)
  • [Project] Add Example project. (#1254)

Bug Fixes

  • [Fix] Fix imports in transforms. (#1255)
  • [Fix] Fix CAM visualization. (#1248)
  • [Fix] Fix the requirements and lazy register mmcls models. (#1275)

Contributors

A total of 12 developers contributed to this release.

@marouaneamz @piercus @Ezra-Yu @mzr1996 @bobo0810 @suibe-qingtian @Scarecrow0 @tonysy @WINDSKY45 @wangbo-zhao @Francis777 @okotaku

v0.25.0

1 year ago

Highlights

  • Support MLU backend.
  • Add dist_train_arm.sh for ARM device.

New Features

  • Support MLU backend. (#1159)
  • Support Activation Checkpointing for ConvNeXt. (#1152)

Improvements

  • Add dist_train_arm.sh for ARM device and update NPU results. (#1218)

Bug Fixes

  • Fix a bug caused MMClsWandbHook stuck. (#1242)
  • Fix the redundant device_ids in tools/test.py. (#1215)

Docs Update

  • Add version banner and version warning in master docs. (#1216)
  • Update NPU support doc. (#1198)
  • Fixed typo in pytorch2torchscript.md. (#1173)
  • Fix typo in miscellaneous.md. (#1137)
  • Add further detail to the doc for ClassBalancedDataset. (#901)

Contributors

A total of 7 developers contributed to this release.

@nijkah @xiaoyuan0203 @mzr1996 @Qiza-lyhm @ganghe74 @unseenme @wangjiangben-hw

v1.0.0rc4

1 year ago

Highlights

  • New API to get pre-defined models of MMClassification. See #1236 for more details.
  • Refactor BEiT backbone and support v1/v2 inference. See #1144.

New Features

  • Support getting models from the name defined in the model-index file. (#1236)

Improvements

  • Support evaluation on both EMA and non-EMA models. (#1204)
  • Refactor BEiT backbone and support v1/v2 inference. (#1144)

Bug Fixes

  • Fix reparameterize_model.py doesn't save meta info. (#1221)
  • Fix dict update in BEiT. (#1234)

Docs Update

  • Update install tutorial. (#1223)
  • Update MobileNetv2 & MobileNetv3 readme. (#1222)
  • Add version selection in the banner. (#1217)

Contributors

A total of 4 developers contributed to this release.

@techmonsterwang @mzr1996 @fangyixiao18 @kitecats

v1.0.0rc3

1 year ago

Highlights

  • Add Switch Recipe Hook. You can now modify the training pipeline, mixup, and loss settings during training; see #1101.
  • Add TIMM and HuggingFace wrappers. You can now train and use models from TIMM/HuggingFace directly; see #1102.
  • Support retrieval tasks, see #1055.
  • Reproduce MobileOne training accuracy. See #1191.
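The Switch Recipe Hook from the first highlight can be sketched as a custom-hook config; the schedule field names below (`action_epoch`, `train_pipeline`, `batch_augments`) are assumptions, not the verified signature:

```python
# Hypothetical SwitchRecipeHook config: at epoch 100, switch to a weaker
# augmentation pipeline and turn off mixup-style batch augmentations.
# Field names are assumptions based on the release note, not the real API.
custom_hooks = [
    dict(
        type='SwitchRecipeHook',
        schedule=[
            dict(
                action_epoch=100,
                train_pipeline=[
                    dict(type='LoadImageFromFile'),
                    dict(type='RandomResizedCrop', scale=224),
                    dict(type='PackClsInputs'),
                ],
                batch_augments=None,  # disable mixup/cutmix from this epoch on
            ),
        ],
    ),
]
```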

New Features

  • Add checkpoints from EfficientNets NoisyStudent & L2. (#1122)
  • Migrate CSRA head to 1.x. (#1177)
  • Support RepLKnet backbone. (#1129)
  • Add Switch Recipe Hook. (#1101)
  • Add adan optimizer. (#1180)
  • Support DaViT. (#1105)
  • Support Activation Checkpointing for ConvNeXt. (#1153)
  • Add TIMM and HuggingFace wrappers to build classifiers from them directly. (#1102)
  • Add reduction for neck (#978)
  • Support HorNet Backbone for dev1.x. (#1094)
  • Add arcface head. (#926)
  • Add Base Retriever and Image2Image Retriever for retrieval tasks. (#1055)
  • Support MobileViT backbone. (#1068)
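The TIMM wrapper above lets an entire classifier come from an external model zoo; a hedged sketch, assuming a `TimmClassifier` type whose extra keyword arguments are forwarded to `timm.create_model`:

```python
# Build the whole classifier from timm. `model_name` and `pretrained` are
# assumed to be forwarded to timm.create_model; the loss stays configurable
# on the MMPreTrain side. A HuggingFace wrapper would look analogous.
model = dict(
    type='TimmClassifier',
    model_name='resnet50',   # any model name known to timm
    pretrained=True,
    loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
)
```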

Improvements

  • [Enhance] Enhance ArcFaceClsHead. (#1181)
  • [Refactor] Refactor to use new fileio API in MMEngine. (#1176)
  • [Enhance] Reproduce mobileone training accuracy. (#1191)
  • [Enhance] add deleting params info in swinv2. (#1142)
  • [Enhance] Add more mobilenetv3 pretrains. (#1154)
  • [Enhancement] RepVGG for YOLOX-PAI for dev-1.x. (#1126)
  • [Improve] Speed up data preprocessor. (#1064)

Bug Fixes

  • Fix the torchserve. (#1143)
  • Fix configs due to api refactor of num_classes. (#1184)
  • Update mmcls2torchserve. (#1189)
  • Fix for inference_model cannot get classes information in checkpoint. (#1093)

Docs Update

  • Add not-found page extension. (#1207)
  • update visualization doc. (#1160)
  • Support sort and search the Model Summary table. (#1100)
  • Improve the ResNet model page. (#1118)
  • update the readme of convnext. (#1156)
  • Fix the installation docs link in README. (#1164)
  • Improve ViT and MobileViT model pages. (#1155)
  • Improve Swin doc and add Tabs extension. (#1145)
  • Add MMEval projects link in README. (#1162)
  • Add runtime configuration docs. (#1128)
  • Add custom evaluation docs (#1130)
  • Add custom pipeline docs. (#1124)
  • Add MMYOLO projects link in MMCLS1.x. (#1117)

Contributors

A total of 14 developers contributed to this release.

@austinmw @Ezra-Yu @nijkah @yingfhu @techmonsterwang @mzr1996 @sanbuphy @tonysy @XingyuXie @gaoyang07 @kitecats @marouaneamz @okotaku @zzc98

v0.24.1

1 year ago

New Features

  • [Feature] Support mmcls with NPU backend. (#1072)

Bug Fixes

  • [Fix] Fix performance issue in convnext DDP train. (#1098)

Contributors

A total of 3 developers contributed to this release.

@wangjiangben-hw @790475019 @mzr1996

v1.0.0rc2

1 year ago

Improvements

  • Update analyze_results.py for dev-1.x. (#1071)
  • Get scores from inference api. (#1070)

Bug Fixes

  • Update requirements. (#1083)

Docs Update

  • Add 1x docs schedule. (#1015)

Contributors

A total of 3 developers contributed to this release.

@mzr1996 @okotaku @yingfhu