GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
Chinese documentation (中文文档)
This is the official PyTorch implementation of the GeneFace++ paper, which enables 3D talking face generation with accurate lip-sync, high video realism, and high system efficiency. You can visit our Demo Page to watch demo videos and learn more details.
The eye blink control is an experimental feature, and we are currently working on improving its robustness. Thanks for your patience.
We provide a guide for a quick start in GeneFace++.
Step 1: Follow the steps in docs/prepare_env/install_guide.md, create a new python environment named geneface, and download 3DMM files into deep_3drecib/BFM.
Step 2: Download the pre-processed dataset of May (Google Drive or BaiduYun Disk with password 98n4), and place it at data/binary/videos/May/trainval_dataset.npy
Step 3: Download the pre-trained audio-to-motion model audio2motion_vae.zip (Google Drive or BaiduYun Disk with password 9cqp) and the motion-to-video checkpoint motion2video_nerf.zip, which is specific to May (in this Google Drive or in this BaiduYun Disk with password 98n4), and unzip them to ./checkpoints/
After these steps, your directories checkpoints and data should look like this:
> checkpoints
    > audio2motion_vae
    > motion2video_nerf
        > may_head
        > may_torso
> data
    > binary
        > videos
            > May
                trainval_dataset.npy
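Before moving on, it can help to confirm that the downloads landed where the listing above expects them. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the paths are copied from the listing and assumed to be relative to the repository root.

```python
from pathlib import Path

# Paths copied from the directory listing above, relative to the repo root.
EXPECTED_PATHS = [
    "checkpoints/audio2motion_vae",
    "checkpoints/motion2video_nerf/may_head",
    "checkpoints/motion2video_nerf/may_torso",
    "data/binary/videos/May/trainval_dataset.npy",
]


def find_missing(root=".", expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under `root`.

    Hypothetical helper for illustration; it only checks existence,
    not the contents of the checkpoints or the dataset.
    """
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing files or directories:")
        for p in missing:
            print("  " + p)
    else:
        print("Quick-start layout looks complete.")
```

Run it from the repository root; an empty result only means the files are in place, not that the checkpoints load correctly.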
Step 4: Activate the geneface Python environment, and execute:
export PYTHONPATH=./
python inference/genefacepp_infer.py --a2m_ckpt=checkpoints/audio2motion_vae --head_ckpt= --torso_ckpt=checkpoints/motion2video_nerf/may_torso --drv_aud=data/raw/val_wavs/MacronSpeech.wav --out_name=may_demo.mp4
Or you can play with our Gradio WebUI:
export PYTHONPATH=./
python inference/app_genefacepp.py --a2m_ckpt=checkpoints/audio2motion_vae --head_ckpt= --torso_ckpt=checkpoints/motion2video_nerf/may_torso
Or use our provided Google Colab and run all cells in it.
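To drive the May model with several audio clips in a row, the command-line call from the quick start can be wrapped in a short script. This is a sketch under stated assumptions rather than a utility shipped with the repository: `build_infer_cmd` and `run_batch` are hypothetical names, the output naming scheme is made up for illustration, and the flags and default checkpoint paths are the ones used in the quick-start command (including the empty `--head_ckpt=`). It assumes the script is launched from the repository root inside the geneface environment.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_infer_cmd(drv_aud, out_name,
                    a2m_ckpt="checkpoints/audio2motion_vae",
                    head_ckpt="",
                    torso_ckpt="checkpoints/motion2video_nerf/may_torso"):
    """Assemble the argument list for inference/genefacepp_infer.py.

    The flags mirror the quick-start command; head_ckpt is left empty
    by default because the quick-start command leaves it empty too.
    """
    return [
        sys.executable, "inference/genefacepp_infer.py",
        f"--a2m_ckpt={a2m_ckpt}",
        f"--head_ckpt={head_ckpt}",
        f"--torso_ckpt={torso_ckpt}",
        f"--drv_aud={drv_aud}",
        f"--out_name={out_name}",
    ]


def run_batch(wav_dir="data/raw/val_wavs"):
    """Run inference once per .wav file found in `wav_dir`."""
    # Equivalent of `export PYTHONPATH=./` for the child processes only.
    env = dict(os.environ, PYTHONPATH="./")
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        out_name = f"may_{wav.stem}.mp4"  # hypothetical naming scheme
        subprocess.run(build_infer_cmd(wav.as_posix(), out_name),
                       env=env, check=True)
```

Calling `run_batch()` processes the clips one after another and stops at the first failing run because of `check=True`.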
To train GeneFace++ on your own videos, please refer to the details in docs/process_data and docs/train_and_infer.
Below are answers to frequently asked questions when training GeneFace++ on custom videos:
May.mp4). Otherwise, you need to hand-crop your training video (see the linked issue).
If you find this repo helpful to your work, please consider citing us:
@article{ye2023geneface,
title={GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis},
author={Ye, Zhenhui and Jiang, Ziyue and Ren, Yi and Liu, Jinglin and He, Jinzheng and Zhao, Zhou},
journal={arXiv preprint arXiv:2301.13430},
year={2023}
}
@article{ye2023geneface++,
title={GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation},
author={Ye, Zhenhui and He, Jinzheng and Jiang, Ziyue and Huang, Rongjie and Huang, Jiawei and Liu, Jinglin and Ren, Yi and Yin, Xiang and Ma, Zejun and Zhao, Zhou},
journal={arXiv preprint arXiv:2305.00787},
year={2023}
}