TensorFlow implementation of Chinese/Mandarin TTS (Text-to-Speech) based on the Tacotron-2 model.
Tacotron-2 is Google's deep neural network architecture described in the paper: Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
tacotron-2-mandarin-griffin-lim
|--- datasets
|--- logs-Tacotron
    |--- eval-dir
    |--- plots
    |--- taco_pretrained
    |--- wavs
|--- papers
|--- prepare
|--- tacotron
    |--- models
    |--- utils
|--- tacotron_output
    |--- eval
    |--- logs-eval
        |--- plots
        |--- wavs
|--- training_data
    |--- audio
    |--- linear
    |--- mels
There are some synthesis samples here.
You can get the pretrained model here.
OS: Ubuntu 16.04
Step (0) - Git clone repository
git clone https://github.com/atomicoo/tacotron2-mandarin.git tacotron-2-mandarin-griffin-lim
cd tacotron-2-mandarin-griffin-lim/
Step (1) - Install dependencies
Install Python 3 (tested with Python 3.5.5)
Install TensorFlow (tested with TensorFlow 1.10.0)
Install other dependencies
pip install -r requirements.txt
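Before moving on, it can help to confirm that the interpreter and TensorFlow match the versions listed above. The following is a minimal sketch, not part of the repository: the helper names (version_tuple, check_versions, installed_versions) are hypothetical, and the assumption that only the major and minor numbers matter is mine.

```python
import sys

# Versions the README reports using; other versions are untested here.
README_PYTHON = (3, 5)
README_TENSORFLOW = (1, 10)


def version_tuple(version):
    """Turn '1.10.0' into (1, 10, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_versions(python_version, tensorflow_version):
    """Return warnings for versions whose (major, minor) differ from the README's."""
    warnings = []
    if version_tuple(python_version)[:2] != README_PYTHON:
        warnings.append(
            "Python %s differs from the 3.5.x used in the README" % python_version
        )
    if version_tuple(tensorflow_version)[:2] != README_TENSORFLOW:
        warnings.append(
            "TensorFlow %s differs from the 1.10.x used in the README" % tensorflow_version
        )
    return warnings


def installed_versions():
    """Read the running Python version and, if importable, TensorFlow's."""
    python_version = "%d.%d.%d" % sys.version_info[:3]
    try:
        import tensorflow as tf  # imported lazily so the check works without it
        tensorflow_version = tf.__version__
    except ImportError:
        tensorflow_version = "not installed"
    return python_version, tensorflow_version
```

Calling check_versions(*installed_versions()) returns an empty list when both versions match the ones above, and one warning per mismatch otherwise.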
Step (2) - Prepare dataset
Download the BIAOBEI or THCHS-30 dataset.
After that, your directory tree should look like this:
tacotron-2-mandarin-griffin-lim
|--- ...
|--- BZNSYP
    |--- ProsodyLabeling
        |--- 000001-010000.txt
    |--- Wave
|--- ...
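A quick way to confirm the BIAOBEI download landed in the right place is to check for the paths shown in the tree above. This is a small sketch under that assumption only; the function name missing_dataset_paths and the constant EXPECTED_PATHS are hypothetical and do not come from the repository.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the repository root.
EXPECTED_PATHS = (
    "BZNSYP/ProsodyLabeling/000001-010000.txt",
    "BZNSYP/Wave",
)


def missing_dataset_paths(repo_root="."):
    """Return the expected BIAOBEI paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_dataset_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("BIAOBEI layout looks complete.")
```

Run it from the repository root; an empty result means both the label file and the Wave folder are present.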
Prepare the dataset (default is BIAOBEI):
python prepare_dataset.py
To prepare THCHS-30 instead, pass the parameter --dataset=THCHS-30.
After that, you should have a BIAOBEI folder as follows:
tacotron-2-mandarin-griffin-lim
|--- ...
|--- BIAOBEI
    |--- biaobei_48000
|--- ...
Preprocess the dataset (default is BIAOBEI):
python preprocess.py
To preprocess THCHS-30 instead, pass the parameter --dataset=THCHS-30.
After that, you should have a training_data folder as follows:
tacotron-2-mandarin-griffin-lim
|--- ...
|--- training_data
    |--- audio
    |--- linear
    |--- mels
    |--- train.txt
|--- ...
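To see at a glance whether preprocessing produced output, the folders listed above can be counted. The sketch below only reports file counts and the number of non-empty lines in train.txt; it makes no assumption about the format of train.txt or about how the counts should relate. The function name summarize_training_data is hypothetical.

```python
from pathlib import Path


def summarize_training_data(training_dir="training_data"):
    """Count files in audio/linear/mels and non-empty lines in train.txt.

    A value of None means the folder or file was not found.
    """
    root = Path(training_dir)
    summary = {}
    for name in ("audio", "linear", "mels"):
        folder = root / name
        if folder.is_dir():
            summary[name] = sum(1 for entry in folder.iterdir() if entry.is_file())
        else:
            summary[name] = None
    index = root / "train.txt"
    if index.is_file():
        with index.open(encoding="utf-8") as handle:
            summary["train.txt lines"] = sum(1 for line in handle if line.strip())
    else:
        summary["train.txt lines"] = None
    return summary


if __name__ == "__main__":
    for key, value in summarize_training_data().items():
        print(key, "->", value)
```

Zero or None for any entry suggests that preprocess.py did not finish or wrote to a different location.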
Step (3) - Train tacotron model
python train.py
For more parameters, see train.py.
After that, you should have a logs-Tacotron folder as follows:
tacotron-2-mandarin-griffin-lim
|--- ...
|--- logs-Tacotron
    |--- eval-dir
    |--- plots
    |--- taco_pretrained
    |--- wavs
|--- ...
Step (4) - Synthesize audio
python synthesize.py
For more parameters, see synthesize.py.
After that, you should have a tacotron_output folder as follows:
tacotron-2-mandarin-griffin-lim
|--- ...
|--- tacotron_output
    |--- eval
    |--- logs-eval
        |--- plots
        |--- wavs
|--- ...
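Steps (2) to (4) can be chained from a single script. The sketch below is one possible wrapper, not something shipped with the repository: the names pipeline_commands and run_pipeline are hypothetical, and it assumes that train.py and synthesize.py are run with their defaults, since only prepare_dataset.py and preprocess.py are documented above as taking --dataset.

```python
import subprocess
import sys

# Scripts named in the steps above, in the order they are run.
DATASET_SCRIPTS = ("prepare_dataset.py", "preprocess.py")
OTHER_SCRIPTS = ("train.py", "synthesize.py")


def pipeline_commands(dataset="BIAOBEI"):
    """Build the command lines for all steps; BIAOBEI is the default dataset."""
    commands = []
    for script in DATASET_SCRIPTS:
        command = [sys.executable, script]
        if dataset != "BIAOBEI":
            # Only these two scripts are documented as accepting --dataset.
            command.append("--dataset=" + dataset)
        commands.append(command)
    for script in OTHER_SCRIPTS:
        # Assumption: training and synthesis run with their default parameters.
        commands.append([sys.executable, script])
    return commands


def run_pipeline(dataset="BIAOBEI"):
    """Run each step in order and stop at the first one that fails."""
    for command in pipeline_commands(dataset):
        print("Running:", " ".join(command))
        subprocess.check_call(command)
```

Calling run_pipeline("THCHS-30") from the repository root would prepare and preprocess THCHS-30, then train and synthesize; training can take a long time, so running the steps by hand remains the safer choice for a first attempt.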