[CVPR2024 Highlight] VBench - We Evaluate Video Generation
This repository contains the implementation of the following paper:
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang∗, Yinan He∗, Jiashuo Yu∗, Fan Zhang∗, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, Limin Wang, Dahua Lin+, Yu Qiao+, Ziwei Liu+
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
We propose VBench, a comprehensive benchmark suite for video generative models. We design a comprehensive and hierarchical Evaluation Dimension Suite that decomposes "video generation quality" into multiple well-defined dimensions to facilitate fine-grained and objective evaluation. For each dimension and each content category, we carefully design a Prompt Suite as test cases, and sample Generated Videos from a set of video generation models. For each evaluation dimension, we specifically design an Evaluation Method Suite, which uses a carefully crafted method or designated pipeline for automatic objective evaluation. We also conduct Human Preference Annotation on the generated videos for each dimension, and show that VBench evaluation results are well aligned with human perceptions. VBench can provide valuable insights from multiple perspectives.
We visualize VBench evaluation results of various publicly available video generation models, as well as Gen-2 and Pika, across 16 VBench dimensions. We normalize the results per dimension for clearer comparisons.
See numeric values at our Leaderboard :1st_place_medal::2nd_place_medal::3rd_place_medal:
See model info for video generation models we used for evaluation.
pip install vbench
To evaluate some aspects of video generation ability, you also need to install detectron2:
pip install detectron2@git+https://github.com/facebookresearch/detectron2.git@main
If there is an error during detectron2 installation, see here.
Download VBench_full_info.json to your running directory to read the benchmark prompt suites.
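As a quick check, you can load the prompt suite directly. Below is a minimal sketch, assuming the JSON is a list of entries carrying a "prompt_en" string and a "dimension" list (inspect the file for the exact schema):
import json

# Load the benchmark prompt suite (download VBench_full_info.json first).
with open("VBench_full_info.json") as f:
    full_info = json.load(f)

# Collect the test prompts used for one evaluation dimension.
prompts = [
    entry["prompt_en"]
    for entry in full_info
    if "human_action" in entry.get("dimension", [])
]
print(len(prompts), prompts[:3])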
git clone https://github.com/Vchitect/VBench.git
pip install -r VBench/requirements.txt
pip install VBench
If there is an error during detectron2 installation, see here.
Use VBench to evaluate videos and video generative models.
With --mode=custom_input, you can evaluate your own videos. We support evaluating any video: simply provide the path to the video file, or the path to the folder that contains your videos. There is no requirement on the videos' names.
The following dimensions support custom-input evaluation: 'subject_consistency', 'background_consistency', 'motion_smoothness', 'dynamic_degree', 'aesthetic_quality', 'imaging_quality'
To evaluate videos with custom input prompts, run our script with --mode=custom_input:
python evaluate.py \
--dimension $DIMENSION \
--videos_path /path/to/folder_or_video/ \
--mode=custom_input
Alternatively, you can use our command-line interface:
vbench evaluate \
--dimension $DIMENSION \
--videos_path /path/to/folder_or_video/ \
--mode=custom_input
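The Python API can be used the same way. Here is a minimal sketch, where the mode keyword is an assumption mirroring the CLI's --mode flag (check your installed version's evaluate() signature):
from vbench import VBench
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
my_VBench = VBench(device, "vbench/VBench_full_info.json", "evaluation_results")
# mode="custom_input" is an assumption mirroring the CLI flag above;
# adjust if your installed version's evaluate() signature differs.
my_VBench.evaluate(
    videos_path="/path/to/folder_or_video/",
    name="my_custom_videos",
    dimension_list=["subject_consistency"],
    mode="custom_input",
)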
To evaluate on VBench's standard prompt suite via the command line:
vbench evaluate --videos_path $VIDEO_PATH --dimension $DIMENSION
For example:
vbench evaluate --videos_path "sampled_videos/lavie/human_action" --dimension "human_action"
Alternatively, evaluate via the Python API (device is a torch.device, e.g. torch.device("cuda")):
from vbench import VBench
my_VBench = VBench(device, <path/to/VBench_full_info.json>, <path/to/save/dir>)
my_VBench.evaluate(
videos_path = <video_path>,
name = <name>,
dimension_list = [<dimension>, <dimension>, ...],
)
For example:
from vbench import VBench
import torch

device = torch.device("cuda")
my_VBench = VBench(device, "vbench/VBench_full_info.json", "evaluation_results")
my_VBench.evaluate(
videos_path = "sampled_videos/lavie/human_action",
name = "lavie_human_action",
dimension_list = ["human_action"],
)
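The per-dimension scores are written to the save directory passed to VBench ("evaluation_results" above). A minimal sketch for inspecting them, assuming the results land there as JSON files:
import glob
import json
import os

# Print every result JSON that VBench wrote to the save directory.
for path in sorted(glob.glob(os.path.join("evaluation_results", "*.json"))):
    with open(path) as f:
        print(path, json.load(f))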
vbench evaluate \
--videos_path $VIDEO_PATH \
--dimension $DIMENSION \
--mode=vbench_category \
--category=$CATEGORY
or
python evaluate.py \
--dimension $DIMENSION \
--videos_path /path/to/folder_or_video/ \
--mode=vbench_category \
--category=$CATEGORY
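To sweep several categories in one go, a small driver script can shell out to the CLI. This is a sketch, where the category names and paths are hypothetical (check prompts/ for the actual category list):
import subprocess

# Hypothetical category names and video paths; verify against the
# prompt suites under prompts/ in the repo.
for category in ["animal", "architecture", "food", "human"]:
    subprocess.run(
        [
            "vbench", "evaluate",
            "--videos_path", f"sampled_videos/my_model/{category}",
            "--dimension", "subject_consistency",
            "--mode=vbench_category",
            f"--category={category}",
        ],
        check=True,
    )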
We provide scripts to download VideoCrafter-1.0 samples, along with the corresponding evaluation scripts.
# download sampled videos
sh scripts/download_videocrafter1.sh
# evaluate VideoCrafter-1.0
sh scripts/evaluate_videocrafter1.sh
[Optional] Please download the pre-trained weights according to the guidance in the model_path.txt file for each model in the pretrained folder to ~/.cache/vbench.
We provide the prompt lists at prompts/.
Check out details of prompt suites, and instructions for how to sample videos for evaluation.
To facilitate future research and to ensure full transparency, we release all the videos we sampled and used for VBench evaluation. You can download them on Google Drive.
See detailed explanations of the sampled videos here.
We also provide detailed setting for the models under evaluation here.
To perform evaluation on one dimension, run this:
python evaluate.py --videos_path $VIDEOS_PATH --dimension $DIMENSION
Supported values for $DIMENSION: ['subject_consistency', 'background_consistency', 'temporal_flickering', 'motion_smoothness', 'dynamic_degree', 'aesthetic_quality', 'imaging_quality', 'object_class', 'multiple_objects', 'human_action', 'color', 'spatial_relationship', 'scene', 'temporal_style', 'appearance_style', 'overall_consistency']
Alternatively, you can evaluate multiple models and multiple dimensions using this script:
bash evaluate.sh
The script expects sampled videos to be organized as: vbench_videos/{model}/{dimension}/{prompt}-{index}.mp4/gif
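Before running the script, it can help to verify the layout. Below is a minimal sanity-check sketch (not part of VBench), assuming your videos live under vbench_videos/:
import os
import re

# Verify that sampled videos follow the
# vbench_videos/{model}/{dimension}/{prompt}-{index}.mp4/gif layout.
pattern = re.compile(r".+-\d+\.(mp4|gif)$")
root = "vbench_videos"
for model in os.listdir(root):
    for dimension in os.listdir(os.path.join(root, model)):
        for name in os.listdir(os.path.join(root, model, dimension)):
            if not pattern.match(name):
                print("unexpected filename:", os.path.join(model, dimension, name))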
To filter static videos in the temporal flickering dimension, run this:
# This only filters out static videos whose prompts match those of the temporal_flickering dimension.
python static_filter.py --videos_path $VIDEOS_PATH
You can adjust the filtering scope by:
# 1. Change the filtering scope to consider all files inside videos_path for filtering.
python static_filter.py --videos_path $VIDEOS_PATH --filter_scope all
# 2. Specify the path to a JSON file ($filename) to consider only videos whose prompts match those listed in $filename.
python static_filter.py --videos_path $VIDEOS_PATH --filter_scope $filename
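If you need a custom $filename, one way is to derive it from the full info file. This sketch assumes the filter file follows the same schema as VBench_full_info.json (a list of entries with "prompt_en" and "dimension" fields):
import json

# Build a filter file restricted to the temporal_flickering prompts.
with open("VBench_full_info.json") as f:
    full_info = json.load(f)

subset = [e for e in full_info if "temporal_flickering" in e.get("dimension", [])]
with open("my_filter.json", "w") as f:
    json.dump(subset, f, indent=2)
Then pass my_filter.json as $filename above.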
If you find our repo useful for your research, please consider citing our paper:
@InProceedings{huang2023vbench,
title={{VBench}: Comprehensive Benchmark Suite for Video Generative Models},
author={Huang, Ziqi and He, Yinan and Yu, Jiashuo and Zhang, Fan and Si, Chenyang and Jiang, Yuming and Zhang, Yuanhan and Wu, Tianxing and Jin, Qingyang and Chanpaisit, Nattapol and Wang, Yaohui and Chen, Xinyuan and Wang, Limin and Lin, Dahua and Qiao, Yu and Liu, Ziwei},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2024}
}
Order is based on the time of joining the project:
Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Nattapol Chanpaisit, Xiaojie Xu, Qianli Ma.
This project wouldn't be possible without the following open-sourced repositories: AMT, UMT, RAM, CLIP, RAFT, GRiT, IQA-PyTorch, ViCLIP, and LAION Aesthetic Predictor.