PyTorch implementation of Tacotron
A PyTorch implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.
pip install -r requirements.txt
I used the LJSpeech dataset, which consists of pairs of text scripts and wav files. The complete dataset (13,100 pairs) can be downloaded here. I referred to https://github.com/keithito/tacotron for the preprocessing code.
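To make the "pairs of text script and wav files" concrete, here is a minimal sketch of reading an LJSpeech-style metadata file, where each row has the form `id|transcript|normalized transcript` and the audio lives under `wavs/`. The function name `load_metadata` and the sample rows are made up for illustration; this is not the loader used in data.py.

```python
import csv
import io


def load_metadata(lines):
    """Parse rows of the form id|transcript|normalized transcript.

    Returns a list of (wav_path, text) pairs. `lines` is any iterable of
    strings, for example an open metadata.csv file.
    """
    pairs = []
    # QUOTE_NONE keeps quotation marks inside transcripts as plain characters.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        file_id = row[0]
        # Prefer the normalized transcript, fall back to the raw one.
        text = row[2] if len(row) > 2 and row[2] else row[1]
        pairs.append(("wavs/" + file_id + ".wav", text))
    return pairs


# Made-up sample rows in the same layout (not real dataset content).
sample = io.StringIO(
    "LJ001-0001|Chapter 1|Chapter one\n"
    "LJ001-0002|hello world|\n"
)
pairs = load_metadata(sample)
print(pairs)
```

With a real copy of the dataset you would pass an open file instead of the in-memory sample, for example `open(os.path.join(data_path, "metadata.csv"), encoding="utf-8")`, assuming the archive was extracted into `data_path`.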
hyperparams.py includes all the hyperparameters that are needed.
data.py loads the training data and preprocesses text into indices and wav files into spectrograms. The preprocessing code for text is in the text/ directory.
module.py contains all the building blocks, including CBHG, highway, prenet, and so on.
network.py contains the networks, including the encoder, decoder, and post-processing network.
train.py is for training.
synthesis.py is for generating TTS samples.

Set the hyperparameters in hyperparams.py, especially 'data_path', which is the directory where you extracted the files, and the others if necessary.
Run train.py.
Run synthesis.py. Make sure the restore step is set correctly.
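One way to check the restore step before synthesis is to list the checkpoints that actually exist and confirm the requested step is among them. The sketch below is only an illustration: the file naming scheme `checkpoint_<step>.pth.tar` and the helper names `available_steps` and `resolve_restore_step` are assumptions, so adjust the pattern to whatever train.py writes in your setup.

```python
import os
import re
import tempfile

# Assumed naming scheme for saved checkpoints; change it to match your files.
CHECKPOINT_PATTERN = re.compile(r"^checkpoint_(\d+)\.pth\.tar$")


def available_steps(checkpoint_dir):
    """Return the sorted training steps that have a checkpoint file."""
    steps = []
    for name in os.listdir(checkpoint_dir):
        match = CHECKPOINT_PATTERN.match(name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)


def resolve_restore_step(checkpoint_dir, restore_step=None):
    """Validate the requested step, or pick the latest one if none is given."""
    steps = available_steps(checkpoint_dir)
    if not steps:
        raise FileNotFoundError(f"no checkpoints found in {checkpoint_dir!r}")
    if restore_step is None:
        return steps[-1]
    if restore_step not in steps:
        raise ValueError(f"step {restore_step} not found; available: {steps}")
    return restore_step


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for step in (2000, 10000, 4000):
        open(os.path.join(tmp, f"checkpoint_{step}.pth.tar"), "w").close()
    demo_steps = available_steps(tmp)
    demo_latest = resolve_restore_step(tmp)
    demo_chosen = resolve_restore_step(tmp, 4000)

print(demo_steps, demo_latest, demo_chosen)
```

Failing early with the list of available steps is usually easier to debug than a missing-file error raised in the middle of model loading.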