DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
This repository is the official PyTorch implementation of our AAAI-2022 paper, in which we propose DiffSinger (for Singing-Voice-Synthesis) and DiffSpeech (for Text-to-Speech).
:tada: :tada: :tada: Updates:
:rocket: News:
PortaSpeech: Portable and High-Quality Generative Text-to-Speech was accepted by NeurIPS-2021.
If you want to use an Anaconda environment:
conda create -n your_env_name python=3.8
source activate your_env_name
pip install -r requirements_2080.txt  # for GPU 2080Ti, CUDA 10.2
pip install -r requirements_3090.txt  # or, instead, for GPU 3090, CUDA 11.4
Or, if you want to use a Python virtual environment:
## Install Python 3.8 first.
python -m venv venv
source venv/bin/activate
# install requirements.
pip install -U pip
pip install Cython numpy==1.19.1
pip install torch==1.9.0
pip install -r requirements.txt
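After either install route, a quick sanity check can confirm that the interpreter and the pinned packages match the commands above. The following is a minimal sketch and not part of the repository: the function name `check_environment` and the `EXPECTED` table are hypothetical, and the version pins are copied only from the `pip install` lines shown here (the `requirements_2080.txt` and `requirements_3090.txt` files may pin different versions).

```python
import sys
from importlib import metadata

# Pins copied from the install commands above (assumption: your
# requirements file does not override them).
EXPECTED = {"numpy": "1.19.1", "torch": "1.9.0"}


def check_environment(expected=EXPECTED, python_version=(3, 8)):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    found_py = tuple(sys.version_info[:2])
    if found_py != tuple(python_version):
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} expected, "
            f"found {found_py[0]}.{found_py[1]}"
        )
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (expected {wanted})")
            continue
        # Drop local build tags such as "+cu102" before comparing.
        if found.split("+")[0] != wanted:
            problems.append(f"{name} {wanted} expected, found {found}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks consistent"]:
        print(line)
```

Run it inside the activated environment. Any line other than "environment looks consistent" names a mismatch to fix before training or inference.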
Mel Pipeline | Dataset | Pitch Input | F0 Prediction | Acceleration Method | Vocoder |
---|---|---|---|---|---|
DiffSpeech (Text->F0, Text+F0->Mel, Mel->Wav) | LJSpeech | None | Explicit | Shallow Diffusion | HiFiGAN |
DiffSinger (Lyric+F0->Mel, Mel->Wav) | PopCS | Ground-Truth F0 | None | Shallow Diffusion | NSF-HiFiGAN |
DiffSinger (Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav) | OpenCpop | MIDI | Explicit | Shallow Diffusion | NSF-HiFiGAN |
FFT-Singer (Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav) | OpenCpop | MIDI | Explicit | Invalid | NSF-HiFiGAN |
DiffSinger (Lyric+MIDI->Mel, Mel->Wav) | OpenCpop | MIDI | Implicit | None | Pitch-Extractor + NSF-HiFiGAN |
DiffSinger+PNDM (Lyric+MIDI->Mel, Mel->Wav) | OpenCpop | MIDI | Implicit | PLMS | Pitch-Extractor + NSF-HiFiGAN |
DiffSpeech+PNDM (Text->Mel, Mel->Wav) | LJSpeech | None | Implicit | PLMS | HiFiGAN |
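
The "Mel Pipeline" column uses a compact notation: stages are separated by commas, `+` joins the inputs of a stage, and `->` points to its output. The sketch below is one reading of that notation, not code from this repository. The helper names `parse_pipeline` and `external_inputs` are hypothetical. It parses a pipeline string and lists the inputs that no earlier stage produces, which is what the user has to supply.

```python
def parse_pipeline(spec):
    """Split 'A+B->C, C->D' into [(('A', 'B'), 'C'), (('C',), 'D')]."""
    stages = []
    for part in spec.split(","):
        inputs, output = part.strip().split("->")
        stages.append((tuple(inputs.split("+")), output))
    return stages


def external_inputs(stages):
    """Inputs not produced by any earlier stage, in order of first use."""
    produced, needed = set(), []
    for inputs, output in stages:
        for name in inputs:
            if name not in produced and name not in needed:
                needed.append(name)
        produced.add(output)
    return needed


if __name__ == "__main__":
    for spec in (
        "Text->F0, Text+F0->Mel, Mel->Wav",
        "Lyric+F0->Mel, Mel->Wav",
        "Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav",
    ):
        print(spec, "needs", external_inputs(parse_pipeline(spec)))
```

Under this reading, the PopCS pipeline needs `Lyric` and `F0` from outside, consistent with its "Ground-Truth F0" pitch input, while the pipelines with explicit F0 prediction produce `F0` in their first stage.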
tensorboard --logdir_spec exp_name
@article{liu2021diffsinger,
title={Diffsinger: Singing voice synthesis via shallow diffusion mechanism},
author={Liu, Jinglin and Li, Chengxi and Ren, Yi and Chen, Feiyang and Liu, Peng and Zhao, Zhou},
journal={arXiv preprint arXiv:2105.02446},
volume={2},
year={2021}}
Special thanks to: