MMagic Release Notes

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: generative AI (AIGC), easy-to-use APIs, an awesome model zoo, and diffusion models for text-to-image generation, image/video restoration/enhancement, and more.

v1.2.0

3 months ago

Highlights

  • PowerPaint, an advanced and powerful inpainting algorithm, has been released in our repository.

New Features & Improvements

Bug Fixes

New Contributors

Full Changelog: https://github.com/open-mmlab/mmagic/compare/v1.1.0...v1.2.0

v1.1.0

6 months ago

Highlights

In this new version of MMagic, we have added support for five new algorithms, including:


  • Support AnimateDiff, a popular text2animation method.


New Features & Improvements

CodeCamp Contributions

Bug Fixes

New Contributors

Full Changelog: https://github.com/open-mmlab/mmagic/compare/v1.0.2...v1.1.0

v1.0.2

7 months ago

Highlights

1. More detailed documentation

Thank you to the community contributors for helping us improve the documentation. We have improved many documents, including both Chinese and English versions. Please refer to the documentation for more details.

2. New algorithms

  • Support Prompt-to-prompt, DDIM Inversion, and Null-text Inversion.

From right to left: original image, DDIM inversion, Null-text inversion


Prompt-to-prompt Editing

  • cat -> dog
  • spider man -> iron man (attention replace)
  • Eiffel tower -> Eiffel tower at night (attention refine)
  • blossom sakura tree -> blossom(-3) sakura tree (attention reweight)
  • Support Attention Injection for more stable video generation with ControlNet.
  • Support Stable Diffusion Inpainting.
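DDIM Inversion, listed above, runs the deterministic DDIM update in the forward (noising) direction to recover a latent that regenerates the input image. The following is a minimal NumPy sketch of that idea only: `eps_model`, the schedule, and all names are toy stand-ins for the trained UNet and the real noise schedule, not MMagic's implementation.

```python
import numpy as np

def ddim_step(x, eps, a_cur, a_next):
    """One deterministic DDIM update between cumulative alpha-bar levels.

    The same formula denoises (a_next > a_cur) or inverts (a_next < a_cur),
    so inversion is just the sampler run along the schedule in reverse.
    """
    x0_pred = (x - np.sqrt(1.0 - a_cur) * eps) / np.sqrt(a_cur)
    return np.sqrt(a_next) * x0_pred + np.sqrt(1.0 - a_next) * eps

def eps_model(x, t):
    """Toy noise predictor standing in for the trained UNet."""
    return 0.1 * x

np.random.seed(0)
alphas = np.linspace(0.9999, 0.5, 10)   # alpha-bar schedule: clean -> noisy
x = np.random.randn(4)                  # stand-in for an image latent
traj = [x]

# Inversion: walk from clean to noisy to obtain a reusable starting latent.
for i in range(len(alphas) - 1):
    x = ddim_step(x, eps_model(x, i), alphas[i], alphas[i + 1])
    traj.append(x)

# Sampling the inverted latent back approximately reconstructs the input;
# the small residual is what Null-text Inversion then corrects for.
y = traj[-1]
for i in reversed(range(len(alphas) - 1)):
    y = ddim_step(y, eps_model(y, i), alphas[i + 1], alphas[i])
print(float(np.max(np.abs(y - traj[0]))))   # small reconstruction error
```

The round trip is only approximate because the noise prediction differs slightly between the forward and backward passes, which is the gap Null-text Inversion closes.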

New Features & Improvements

Bug Fixes

New Contributors

v1.0.1

10 months ago

New Features & Improvements

  • Support tomesd for StableDiffusion speed-up. #1801
  • Support all inpainting/matting/image restoration models inferencer. #1833, #1873
  • Support animated drawings at projects. #1837
  • Support Style-Based Global Appearance Flow for Virtual Try-On at projects. #1786
  • Support tokenizer wrapper and EmbeddingLayerWithFixes. #1846
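The tomesd speed-up listed above accelerates Stable Diffusion by merging redundant tokens before attention. Below is a toy NumPy sketch of the core idea only; tomesd itself uses an efficient bipartite soft matching with unmerging, and nothing here is its actual API.

```python
import numpy as np

def merge_tokens(tokens, r):
    """Toy token merging: repeatedly average the two most cosine-similar
    tokens, shrinking the sequence by r. Fewer tokens means cheaper
    attention, which is the source of the speed-up."""
    toks = list(tokens)
    for _ in range(r):
        x = np.stack(toks)
        unit = x / np.linalg.norm(x, axis=1, keepdims=True)
        sim = unit @ unit.T
        np.fill_diagonal(sim, -np.inf)          # ignore self-similarity
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        merged = (toks[i] + toks[j]) / 2.0
        toks = [t for k, t in enumerate(toks) if k not in (i, j)] + [merged]
    return np.stack(toks)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))           # 16 tokens of dimension 8
print(merge_tokens(tokens, 4).shape)        # (12, 8)
```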

Bug Fixes

  • Fix install requirements. #1819
  • Fix inst-colorization PackInputs. #1828, #1827
  • Fix inferencer in pip-install. #1875

New Contributors

v1.0.0

11 months ago

We are excited to announce the release of MMagic v1.0.0 that inherits from MMEditing and MMGeneration.


Since its inception, MMEditing has been the preferred algorithm library for many super-resolution, editing, and generation tasks, helping research teams win more than 10 top international competitions and supporting over 100 GitHub ecosystem projects. After iterative updates with the OpenMMLab 2.0 framework and a merge with MMGeneration, MMEditing has become a powerful tool that supports low-level vision algorithms based on both GANs and CNNs.

Today, MMEditing embraces Generative AI and transforms into a more advanced and comprehensive AIGC toolkit: MMagic (Multimodal Advanced, Generative, and Intelligent Creation).

In MMagic, we support 53+ models across tasks such as Stable Diffusion fine-tuning, text-to-image generation, image and video restoration, super-resolution, editing, and generation. With excellent training and experiment management support from MMEngine, MMagic provides agile and flexible experimental support for researchers and AIGC enthusiasts on their AIGC exploration journey. With MMagic, experience more magic in generation! Let's open a new era beyond editing together. More than editing, unlock the magic!

Highlights

1. New Models

We support 11 new models in 4 new tasks.

  • Text2Image / Diffusion
    • ControlNet
    • DreamBooth
    • Stable Diffusion
    • Disco Diffusion
    • GLIDE
    • Guided Diffusion
  • 3D-aware Generation
    • EG3D
  • Image Restoration
    • NAFNet
    • Restormer
    • SwinIR
  • Image Colorization
    • InstColorization

https://user-images.githubusercontent.com/49083766/233564593-7d3d48ed-e843-4432-b610-35e3d257765c.mp4

2. Magic Diffusion Model

For the Diffusion Model, we provide the following "magic":

3. Upgraded Framework

To improve your "spellcasting" efficiency, we have made the following adjustments to the "magic circuit":

  • By using MMEngine and MMCV from the OpenMMLab 2.0 framework, we decompose the editing framework into modules, so a customized editor can be built simply by combining them. Training processes can be defined as if playing with Legos, with rich components and strategies available, and different levels of APIs give full control over training.
  • Support 33+ algorithms accelerated by PyTorch 2.0.
  • Refactor DataSample to support the combination and splitting of batch dimensions.
  • Refactor DataPreprocessor and unify the data format for various tasks during training and inference.
  • Refactor MultiValLoop and MultiTestLoop, supporting the evaluation of both generation-type metrics (e.g. FID) and reconstruction-type metrics (e.g. SSIM), and supporting the evaluation of multiple datasets at once.
  • Support visualization on local files or using tensorboard and wandb.
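The modular design described above can be illustrated with a schematic OpenMMLab 2.0-style config, where each module is selected by its `type` field in a registry. The field and type names below are illustrative placeholders, not an exact MMagic model definition.

```python
# Schematic OpenMMLab 2.0-style config: an editor is assembled from
# interchangeable modules, each chosen by its `type` field.
model = dict(
    type='BaseEditModel',                        # top-level editor (placeholder)
    generator=dict(type='SRCNNNet'),             # swappable backbone
    pixel_loss=dict(type='L1Loss', loss_weight=1.0),
    data_preprocessor=dict(type='DataPreprocessor'),
)
train_cfg = dict(by_epoch=False, max_iters=100_000)
optim_wrapper = dict(optimizer=dict(type='Adam', lr=1e-4))

# Swapping a module is a one-line change, e.g. a different pixel loss:
model['pixel_loss'] = dict(type='CharbonnierLoss', loss_weight=1.0)
print(model['pixel_loss']['type'])               # prints CharbonnierLoss
```

Because every component is declared this way, the Lego-like recombination mentioned above amounts to editing plain dictionaries rather than code.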

New Features & Improvements

  • Support 53+ algorithms, 232+ configs, 213+ checkpoints, 26+ loss functions, and 20+ metrics.
  • Support ControlNet animation and a Gradio GUI.
  • Support Inferencer and Demo using high-level inference APIs.
  • Support a Gradio GUI for inpainting inference.
  • Support qualitative comparison tools.
  • Enable projects.
  • Improve converter scripts and documents for datasets.

v1.0.0rc7

11 months ago

Highlights

We are excited to announce the release of MMEditing 1.0.0rc7. This release supports 51+ models, 226+ configs and 212+ checkpoints in MMGeneration and MMEditing. We highlight the following new features:

  • Support DiffuserWrapper
  • Support ControlNet (training and inference).
  • Support PyTorch 2.0.

New Features & Improvements

  • Support DiffuserWrapper. #1692
  • Support ControlNet (training and inference). #1744
  • Support PyTorch 2.0 (successfully compile 33+ models on 'inductor' backend). #1742
  • Support Image Super-Resolution and Video Super-Resolution models inferencer. #1662, #1720
  • Refactor tools/get_flops script. #1675
  • Refactor dataset_converters and documents for datasets. #1690
  • Move stylegan ops to MMCV. #1383

Bug Fixes

  • Fix disco inferencer. #1673
  • Fix nafnet optimizer config. #1716
  • Fix tof typo. #1711

Contributors

A total of 8 developers contributed to this release. Thanks @LeoXing1996, @Z-Fran, @plyfager, @zengyh1900, @liuwenran, @ryanxingql, @HAOCHENYE, @VongolaWu

New Contributors

Full Changelog: https://github.com/open-mmlab/mmediting/compare/v1.0.0rc6...v1.0.0rc7

v1.0.0rc6

1 year ago

Highlights

We are excited to announce the release of MMEditing 1.0.0rc6. This release supports 50+ models, 222+ configs and 209+ checkpoints in MMGeneration and MMEditing. We highlight the following new features:

  • Support a Gradio GUI for inpainting inference.
  • Support Colorization, Translation and GAN models inference.

Backwards Incompatible changes

  1. GenValLoop and MultiValLoop have been merged into EditValLoop, and GenTestLoop and MultiTestLoop have been merged into EditTestLoop. Use case:
    Case 1: metrics on a single dataset

    >>> # add the following lines in your config
    >>> # 1. use `EditValLoop` instead of `ValLoop` in MMEngine
    >>> val_cfg = dict(type='EditValLoop')
    >>> # 2. specify EditEvaluator instead of Evaluator in MMEngine
    >>> val_evaluator = dict(
    >>>     type='EditEvaluator',
    >>>     metrics=[
    >>>         dict(type='PSNR', crop_border=2, prefix='Set5'),
    >>>         dict(type='SSIM', crop_border=2, prefix='Set5'),
    >>>     ])
    >>> # 3. define dataloader
    >>> val_dataloader = dict(...)

    Case 2: different metrics on different datasets

    >>> # add the following lines in your config
    >>> # 1. use `EditValLoop` instead of `ValLoop` in MMEngine
    >>> val_cfg = dict(type='EditValLoop')
    >>> # 2. specify a list of EditEvaluator
    >>> # do not forget to add prefix for each metric group
    >>> div2k_evaluator = dict(
    >>>     type='EditEvaluator',
    >>>     metrics=dict(type='SSIM', crop_border=2, prefix='DIV2K'))
    >>> set5_evaluator = dict(
    >>>     type='EditEvaluator',
    >>>     metrics=[
    >>>         dict(type='PSNR', crop_border=2, prefix='Set5'),
    >>>         dict(type='SSIM', crop_border=2, prefix='Set5'),
    >>>     ])
    >>> # define evaluator config
    >>> val_evaluator = [div2k_evaluator, set5_evaluator]
    >>> # 3. specify a dataloader for each metric group
    >>> div2k_dataloader = dict(...)
    >>> set5_dataloader = dict(...)
    >>> # define dataloader config
    >>> val_dataloader = [div2k_dataloader, set5_dataloader]
  2. Support stack and split for EditDataSample. Use case:
# Example for `split`
gen_sample = EditDataSample()
gen_sample.fake_img = outputs  # tensor
gen_sample.noise = noise  # tensor
gen_sample.sample_kwargs = deepcopy(sample_kwargs)  # dict
gen_sample.sample_model = sample_model  # string
# set allow_nonseq_value as True to copy non-sequential data (sample_kwargs and sample_model for this example)
batch_sample_list = gen_sample.split(allow_nonseq_value=True)  

# Example for `stack`
data_sample1 = EditDataSample()
data_sample1.set_gt_label(1)
data_sample1.set_tensor_data({'img': torch.randn(3, 4, 5)})
data_sample1.set_data({'mode': 'a'})
data_sample1.set_metainfo({
    'channel_order': 'rgb',
    'color_flag': 'color'
})
data_sample2 = EditDataSample()
data_sample2.set_gt_label(2)
data_sample2.set_tensor_data({'img': torch.randn(3, 4, 5)})
data_sample2.set_data({'mode': 'b'})
data_sample2.set_metainfo({
    'channel_order': 'rgb',
    'color_flag': 'color'
})
data_sample_merged = EditDataSample.stack([data_sample1, data_sample2])
  3. GenDataPreprocessor has been merged into EditDataPreprocessor.

    • No changes are required other than changing the type field in config.
    • Users do not need to define input_view and output_view since we will infer the shape of mean automatically.
    • In the evaluation stage, all tensors will be converted to BGR (for three-channel images) and scaled to [0, 255].
  4. PixelData has been removed.

  5. For BaseGAN/CondGAN models, real images are passed from data_samples.gt_img instead of inputs['img'].
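As the preprocessor item above notes, migrating to the merged class is just a change of the `type` field in the config. A schematic before/after, with type names taken from the release note itself:

```python
# Before 1.0.0rc6: generation models configured a separate preprocessor.
old_cfg = dict(type='GenDataPreprocessor')

# From 1.0.0rc6 on: one unified class; `input_view`/`output_view` can be
# dropped because the shape of `mean` is inferred automatically.
new_cfg = dict(type='EditDataPreprocessor')

print(old_cfg['type'], '->', new_cfg['type'])
```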

New Features & Improvements

  • Refactor FileIO. #1572
  • Refactor registry. #1621
  • Refactor Random degradations. #1583
  • Refactor DataSample, DataPreprocessor, Metric and Loop. #1656
  • Use mmengine.BaseModule instead of nn.Module. #1491
  • Refactor Main Page. #1609
  • Support Gradio gui of Inpainting inference. #1601
  • Support Colorization inferencer. #1588
  • Support Translation models inferencer. #1650
  • Support GAN models inferencer. #1653, #1659
  • Print config tool. #1590
  • Improve type hints. #1604
  • Update Chinese documents of metrics and datasets. #1568, #1638
  • Update Chinese documents of BigGAN and Disco-Diffusion. #1620
  • Update Evaluation and README of Guided-Diffusion. #1547

Bug Fixes

  • Fix the meaning of momentum in EMA. #1581
  • Fix output dtype of RandomNoise. #1585
  • Fix pytorch2onnx tool. #1629
  • Fix API documents. #1641, #1642
  • Fix loading RealESRGAN EMA weights. #1647
  • Fix arg passing bug of dataset_converters scripts. #1648
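The EMA fix above (#1581) concerns which term the momentum coefficient weighs. A minimal NumPy sketch of the common convention, in which momentum weighs the incoming parameter (so a small momentum means a slowly moving average); the function name is ours, not MMEditing's API:

```python
import numpy as np

def ema_update(avg, new, momentum=0.001):
    """One EMA step where `momentum` weighs the incoming value.

    With this convention, momentum=0 freezes the average; swapping the
    two coefficients (the bug class the fix addresses) would instead make
    the average track the newest value almost exactly.
    """
    return (1.0 - momentum) * avg + momentum * new

avg = np.zeros(3)
for _ in range(1000):
    avg = ema_update(avg, np.ones(3), momentum=0.01)
print(np.round(avg, 3))   # converges toward the constant source, ≈ [1, 1, 1]
```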

Contributors

A total of 17 developers contributed to this release. Thanks @plyfager, @LeoXing1996, @Z-Fran, @zengyh1900, @VongolaWu, @liuwenran, @austinmw, @dienachtderwelt, @liangzelong, @i-aki-y, @xiaomile, @Li-Qingyun, @vansin, @Luo-Yihang, @ydengbi, @ruoningYu, @triple-Mu

New Contributors

Full Changelog: https://github.com/open-mmlab/mmediting/compare/v1.0.0rc5...v1.0.0rc6

0.16.1

1 year ago

New Features & Improvements

  • Support FID and KID metrics. #775
  • Support groups parameter in ResidualBlockNoBN. #1510
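The FID metric added above is the Fréchet distance between two Gaussians fitted to InceptionV3 features of real and generated images. A minimal NumPy sketch of the distance computation itself, with feature extraction omitted; the helper names are ours, and the symmetric-similarity trick stands in for a general matrix square root:

```python
import numpy as np

def _sqrtm_psd(mat):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def fid(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + tr(sigma1 + sigma2 - 2 sqrtm(sigma1 sigma2))."""
    s1h = _sqrtm_psd(sigma1)
    # tr sqrtm(sigma1 @ sigma2), computed via a symmetric similar matrix.
    covmean = _sqrtm_psd(s1h @ sigma2 @ s1h)
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

mu, sigma = np.zeros(4), np.eye(4)
print(fid(mu, sigma, mu, sigma))            # identical Gaussians: ~0
print(fid(mu, sigma, mu + 1.0, sigma))      # mean shift of 1 per dim: ~4
```

KID replaces the Gaussian assumption with a polynomial-kernel MMD estimate, but consumes the same feature activations.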

Bug Fixes

  • Fix the bug of TTSR configuration file. #1435
  • Fix RealESRGAN test dataset. #1489
  • Fix dump config in train scripts. #1584
  • Fix dynamic exportable ONNX of pixel-unshuffle. #1637
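The pixel-unshuffle fix above (#1637) makes the op exportable to ONNX with dynamic input shapes by expressing it through plain reshape/transpose operations. The operation itself can be sketched in NumPy as follows (the real implementation works on batched torch tensors):

```python
import numpy as np

def pixel_unshuffle(x, scale):
    """Rearrange (C, H, W) -> (C*scale^2, H/scale, W/scale), the inverse of
    pixel shuffle. Using only reshape/transpose keeps the exported graph
    shape-agnostic, which is what dynamic ONNX export needs."""
    c, h, w = x.shape
    assert h % scale == 0 and w % scale == 0
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    x = x.transpose(0, 2, 4, 1, 3)          # gather the scale dims first
    return x.reshape(c * scale * scale, h // scale, w // scale)

out = pixel_unshuffle(np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4), 2)
print(out.shape)                            # (8, 2, 2)
```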

Contributors

A total of 10 developers contributed to this release. Thanks @LeoXing1996, @Z-Fran, @zengyh1900, @liuky74, @KKIEEK, @zeakey, @Sqhttwl, @yhna940, @gihwan-kim, @vansin

New Contributors

Full Changelog: https://github.com/open-mmlab/mmediting/compare/v0.16.0...0.16.1

v1.0.0rc5

1 year ago

Highlights

We are excited to announce the release of MMEditing 1.0.0rc5. This release supports 49+ models, 180+ configs and 177+ checkpoints in MMGeneration and MMEditing. We highlight the following new features:

  • Support Restormer
  • Support GLIDE
  • Support SwinIR
  • Support Stable Diffusion

New Features & Improvements

  • Disco notebook. (#1507)
  • Revise test requirements and CI. (#1514)
  • Recursively generate summary and docstring. (#1517)
  • Enable projects. (#1526)
  • Support MSCOCO dataset. (#1520)
  • Improve Chinese documents. (#1532)
  • Improve type hints. (#1481)
  • Update download link. (#1554)
  • Update deployment guide. (#1551)

Bug Fixes

  • Fix documentation link checker. (#1522)
  • Fix SSIM first-channel bug. (#1515)
  • Fix Restormer unit test. (#1550)
  • Fix extract_gt_data of RealESRGAN. (#1542)
  • Fix model index. (#1559)
  • Fix config path in Disco-Diffusion. (#1553)
  • Fix text2image inferencer. (#1523)

Contributors

A total of 16 developers contributed to this release. Thanks @plyfager, @LeoXing1996, @Z-Fran, @zengyh1900, @VongolaWu, @liuwenran, @AlexZou14, @lvhan028, @xiaomile, @ldr426, @austin273, @whu-lee, @willaty, @curiosity654, @Zdafeng, @Taited

New Contributors

Full Changelog: https://github.com/open-mmlab/mmediting/compare/v1.0.0rc4...v1.0.0rc5

v1.0.0rc4 (06/12/2022)

1 year ago

Highlights

We are excited to announce the release of MMEditing 1.0.0rc4. This release supports 45+ models, 176+ configs and 175+ checkpoints in MMGeneration and MMEditing. We highlight the following new features:

  • Support High-level APIs.
  • Support diffusion models.
  • Support Text2Image Task.
  • Support 3D-Aware Generation.

New Features & Improvements

  • Refactor high-level APIs. (#1410)
  • Support disco-diffusion text-2-image. (#1234, #1504)
  • Support EG3D. (#1482, #1493, #1494, #1499)
  • Support NAFNet model. (#1369)

Bug Fixes

  • Fix SRGAN train config. (#1441)
  • Fix CAIN config. (#1404)
  • Fix RDN and SRCNN train configs. (#1392)
  • Revise config and pretrained model loading in ESRGAN. (#1407)

Contributors

A total of 14 developers contributed to this release. Thanks @plyfager, @LeoXing1996, @Z-Fran, @zengyh1900, @VongolaWu, @gaoyang07, @ChangjianZhao, @zxczrx123, @jackghosts, @liuwenran, @CCODING04, @RoseZhao929, @shaocongliu, @liangzelong.

New Contributors