StableDiffusionXLVideo

Transform a pretrained text-to-image model into a text-to-video model.
Train a Video Diffusion Model Using Stable Diffusion XL Image Priors

This project trains a video diffusion model initialized from Stable Diffusion XL image priors.

Getting Started

Installation

git clone https://github.com/motexture/stable-diffusion-xl-video.git
cd stable-diffusion-xl-video

Python Requirements

pip install deepspeed
pip install -r requirements.txt

On some systems, deepspeed requires the CUDA toolkit to be installed first. If you do not have the CUDA toolkit, or deepspeed fails to install, follow NVIDIA's instructions: https://developer.nvidia.com/cuda-downloads

Or, on Linux systems:

sudo apt install build-essential
wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
sudo sh cuda_12.2.0_535.54.03_linux.run

During the installation you only need to install the toolkit, not the drivers or the documentation.
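Before installing deepspeed, you can check whether the CUDA compiler is visible to your shell. This is a minimal stdlib-only sketch (the helper name is ours, not part of the project):

```python
import shutil
import subprocess

def cuda_toolkit_available() -> bool:
    """Return True if the CUDA compiler (nvcc) is on PATH and runs."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return False
    try:
        # `nvcc --version` prints the installed toolkit release.
        subprocess.run([nvcc, "--version"], check=True, capture_output=True)
        return True
    except (OSError, subprocess.CalledProcessError):
        return False

if __name__ == "__main__":
    print("CUDA toolkit found" if cuda_toolkit_available() else "nvcc not on PATH")
```

If this prints "nvcc not on PATH", install the toolkit as described above before retrying `pip install deepspeed`.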

Preparing the config file

Open the training.yaml file and modify the parameters according to your needs.
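As an illustration only, a training config of this kind typically groups model, data, and optimizer settings. The keys below are hypothetical; the actual training.yaml in the repository is authoritative:

```yaml
# Hypothetical layout for illustration; see the repository's training.yaml
# for the actual keys and defaults.
pretrained_model_path: "stabilityai/stable-diffusion-xl-base-1.0"
output_dir: "./outputs"
train_data:
  path: "./data/videos"
  n_sample_frames: 16
  width: 1024
  height: 1024
learning_rate: 1.0e-5
train_batch_size: 1
max_train_steps: 10000
```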

Train

deepspeed train.py --config training.yaml

Running inference

The inference.py script can be used to render videos from trained checkpoints.

Example usage:

python inference.py \
  --model sdxlvid \
  --prompt "a fast moving fancy sports car" \
  --num-frames 16 \
  --width 1024 \
  --height 1024 \
  --sdp
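The flags in the example map onto a standard argparse interface. The sketch below shows how such a CLI can be declared; the defaults and help strings are assumptions, not the script's actual values:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI mirroring the inference flags shown above (defaults are assumptions)."""
    parser = argparse.ArgumentParser(
        description="Render a video from a trained checkpoint.")
    parser.add_argument("--model", required=True,
                        help="Path or name of the trained checkpoint.")
    parser.add_argument("--prompt", required=True,
                        help="Text prompt describing the video.")
    parser.add_argument("--num-frames", type=int, default=16,
                        help="Number of frames to generate.")
    parser.add_argument("--width", type=int, default=1024,
                        help="Frame width in pixels.")
    parser.add_argument("--height", type=int, default=1024,
                        help="Frame height in pixels.")
    parser.add_argument("--sdp", action="store_true",
                        help="Use scaled dot-product attention.")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Note that argparse exposes `--num-frames` as `args.num_frames` after parsing.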

Shoutouts

  • ExponentialML for the original training and inference code
  • lucidrains for the pseudo 3D convolutions and the "make-a-video" implementation
  • xuduo35 for his own "make-a-video" implementation
  • guoyww for the AnimateDiff paper and code
  • Showlab and bryandlee for their Tune-A-Video contribution
License
MIT