# Finetune ModelScope's Text To Video model using Diffusers 🧨
First of all, a note from me: thank you all for your support, feedback, and journey through discovering the nascent, innate potential of video diffusion models.
@damo-vilab (the creators of ModelScope, among others) has released an official repository for finetuning all things video diffusion models, and I recommend their implementation over this repository: https://github.com/damo-vilab/i2vgen-xl
This repository will no longer be updated; instead, it will be archived for researchers and builders who wish to bootstrap their projects. I will leave the issues, pull requests, and everything related in place for posterity.
Thanks again!
## Updates

- Added an alternative to offset noise from https://arxiv.org/abs/2305.08891, enabled via `rescale_schedule` in the config.
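For reference, the schedule rescaling proposed in that paper enforces zero terminal SNR by shifting and scaling the square roots of the cumulative alphas so the last one is exactly zero. A minimal sketch in plain Python, following Algorithm 1 of the paper (the in-repo implementation behind `rescale_schedule` may differ):

```python
import math

def enforce_zero_terminal_snr(betas):
    """Rescale a beta schedule so the final cumulative alpha is exactly zero
    (zero terminal SNR), per https://arxiv.org/abs/2305.08891, Algorithm 1."""
    alphas = [1.0 - b for b in betas]
    # Cumulative product of alphas
    alphas_bar = []
    running = 1.0
    for a in alphas:
        running *= a
        alphas_bar.append(running)
    sqrt_bar = [math.sqrt(ab) for ab in alphas_bar]
    s0, sT = sqrt_bar[0], sqrt_bar[-1]
    # Shift so the last value is zero, then scale so the first is unchanged
    sqrt_bar = [(s - sT) * s0 / (s0 - sT) for s in sqrt_bar]
    alphas_bar = [s * s for s in sqrt_bar]
    # Recover per-step alphas from the rescaled cumulative products
    new_alphas = [alphas_bar[0]] + [
        alphas_bar[i] / alphas_bar[i - 1] for i in range(1, len(alphas_bar))
    ]
    return [1.0 - a for a in new_alphas]
```

The resulting schedule keeps the first beta (nearly) unchanged while forcing the final beta to 1, so the last timestep carries pure noise.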
- Use a default dropout of 0.1 on all temporal convolution layers.
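As an illustration, a temporal convolution block with that default dropout might look like the following sketch (the class name and layer layout here are hypothetical, not the repo's exact module):

```python
import torch
import torch.nn as nn

class TemporalConvBlock(nn.Module):
    """Hypothetical temporal convolution block with the default 0.1 dropout."""

    def __init__(self, channels: int, dropout: float = 0.1):
        super().__init__()
        # Convolve across the frame (time) dimension only: kernel (3, 1, 1)
        self.conv = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1), padding=(1, 0, 0))
        self.act = nn.SiLU()
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, frames, height, width)
        return x + self.dropout(self.act(self.conv(x)))
```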
- Added support for training LoRA models for use with the text2video A1111 extension. Set `lora_version: "stable_lora"` in the config.
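For example, a training config might enable it like this (`lora_version` is the documented key; the other keys are illustrative, so check the repo's config files for exact names):

```yaml
# Train a Stable LoRA compatible with the A1111 text2video extension.
lora_version: "stable_lora"
lora_rank: 16  # hypothetical: low-rank dimension of the LoRA branch
```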
- Added the ability to choose different Accelerate loggers.
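In the training config this is selected with a logger key, for example (the key name here is assumed from typical configs, not confirmed):

```yaml
# Hypothetical config excerpt: pick the Accelerate tracking backend,
# e.g. "tensorboard" or "wandb".
logger_type: "tensorboard"
```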
- Downgraded the pinned Accelerate version to 0.19 to prevent model checkpoint saving issues.
- Multiple contributions to `inference.py` for stability and ease of use. Thanks @bruefire, @JCBrouwer, and @bfasenfest!
- Added the `LoraInjectedConv3d` module. :movie_camera:
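Conceptually, a LoRA-injected 3D convolution adds a trainable low-rank side branch next to a base `Conv3d`. A minimal sketch of the idea (the repo's actual `LoraInjectedConv3d` may differ in signature and details):

```python
import torch
import torch.nn as nn

class LoraInjectedConv3d(nn.Module):
    """Hypothetical sketch: Conv3d with a low-rank (LoRA) side branch."""

    def __init__(self, in_channels, out_channels, kernel_size,
                 padding=0, rank=4, scale=1.0):
        super().__init__()
        # Base convolution (typically frozen during LoRA training)
        self.conv = nn.Conv3d(in_channels, out_channels, kernel_size, padding=padding)
        # Low-rank update: down-project to `rank` channels, then up-project.
        self.lora_down = nn.Conv3d(in_channels, rank, kernel_size,
                                   padding=padding, bias=False)
        self.lora_up = nn.Conv3d(rank, out_channels, 1, bias=False)
        # Zero-init the up-projection so the branch starts as a no-op
        nn.init.zeros_(self.lora_up.weight)
        self.scale = scale

    def forward(self, x):
        return self.conv(x) + self.scale * self.lora_up(self.lora_down(x))
```

Because the up-projection starts at zero, injecting the module leaves the network's output unchanged until the LoRA weights are trained.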