Speaker Recognition Pytorch

Speaker recognition, voiceprint recognition

speaker recognition

PyTorch implementation of the speech embedding network and loss described in https://arxiv.org/pdf/1710.10467.pdf.

Also contains code to create embeddings that can be used as input to the speaker diarization model found at https://github.com/google/uis-rnn

[Figure: training loss]

The model was trained on the TIMIT speech corpus, available at https://catalog.ldc.upenn.edu/LDC93S1 or https://github.com/philipperemy/timit

Dependencies

PyTorch 0.4.1
Python 3.5+
numpy 1.15.4
librosa 0.6.1

The Python WebRTC VAD found at https://github.com/wiseman/py-webrtcvad is required to run dvector_create.py, but not to train the neural network.

Preprocessing

Change the following config.yaml key to a glob pattern that matches all .WAV files in your downloaded TIMIT dataset. The TIMIT .WAV files must be converted to the standard format (RIFF) for the dvector_create.py script, but not for training the neural network.

unprocessed_data: './TIMIT/*/*/*/*.wav'
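
The value above is a shell-style glob. As a minimal sketch, assuming only the Python standard library, the following shows how such a pattern could be expanded and how each match could be checked for a RIFF header. The helper names `find_wavs`, `is_riff_wav` and `demo` are hypothetical and are not part of this repository. Matching is case-sensitive on most Linux filesystems, so a `*.wav` pattern may not match files named `.WAV`.

```python
import glob
import os
import tempfile


def is_riff_wav(path):
    """Return True if the file starts with a RIFF/WAVE header."""
    with open(path, "rb") as f:
        header = f.read(12)
    # A RIFF WAV file begins with b"RIFF", a 4-byte size, then b"WAVE".
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


def find_wavs(pattern):
    """Expand a glob pattern and return the matching paths, sorted."""
    return sorted(glob.glob(pattern))


def demo():
    # Build a tiny fake tree shaped like ./TIMIT/*/*/*/*.wav (illustrative only).
    root = tempfile.mkdtemp()
    utt_dir = os.path.join(root, "TIMIT", "TRAIN", "DR1", "SPK0")
    os.makedirs(utt_dir)
    with open(os.path.join(utt_dir, "a.wav"), "wb") as f:
        f.write(b"RIFF" + b"\x00" * 4 + b"WAVE")
    with open(os.path.join(utt_dir, "b.wav"), "wb") as f:
        f.write(b"NIST_1A\n   1024\n")  # NIST SPHERE style header, not RIFF
    paths = find_wavs(os.path.join(root, "TIMIT", "*", "*", "*", "*.wav"))
    return [(os.path.basename(p), is_riff_wav(p)) for p in paths]
```

Files that fail the check would need to be converted to RIFF before running dvector_create.py.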

Run the preprocessing script:

./data_preprocess.py

Two folders, train_tisv and test_tisv, will be created. They hold .npy files containing numpy ndarrays of speaker utterances, split 90%/10% between training and testing.
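
To illustrate the 90%/10% split, here is a minimal sketch that assumes the split is made per speaker and in list order. The function `split_speakers` and the speaker names are hypothetical; data_preprocess.py may order, shuffle, or group the data differently.

```python
def split_speakers(speaker_dirs, train_fraction=0.9):
    """Split a list of speaker folders into a training and a testing part."""
    n_train = int(len(speaker_dirs) * train_fraction)
    return speaker_dirs[:n_train], speaker_dirs[n_train:]


# Made-up speaker folder names, for illustration only.
speakers = ["speaker%03d" % i for i in range(20)]
train, test = split_speakers(speakers)
print(len(train), len(test))
```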

GE2E-loss model training

To train the speaker verification model, run:

./train_speech_embedder.py

with the following config.yaml key set to true:

training: !!bool "true"

For testing, set the key value to:

training: !!bool "false"

The log file and checkpoint save locations are controlled by the following values:

log_file: './speech_id_checkpoint/Stats'
checkpoint_dir: './speech_id_checkpoint'

Only TI-SV is implemented.
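
For readers unfamiliar with the GE2E loss, the softmax variant from the paper linked above can be sketched in plain Python as follows. This is an illustration of the formula only, not the code in train_speech_embedder.py: the function names are hypothetical, `w` and `b` are fixed here although they are normally learned, and each speaker is assumed to have at least two utterances.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def ge2e_softmax_loss(embeddings, w=10.0, b=-5.0):
    """embeddings[j][i] is the d-vector of utterance i from speaker j.

    Returns the GE2E softmax loss summed over all utterances.
    """
    centroids = [mean_vector(utts) for utts in embeddings]
    total = 0.0
    for j, utts in enumerate(embeddings):
        for i, e in enumerate(utts):
            sims = []
            for k, c in enumerate(centroids):
                if k == j:
                    # Exclude the utterance itself from its own speaker's centroid.
                    c = mean_vector(utts[:i] + utts[i + 1:])
                sims.append(w * cosine(e, c) + b)
            log_sum_exp = math.log(sum(math.exp(s) for s in sims))
            # Pull the utterance toward its own centroid, push it from the others.
            total += log_sum_exp - sims[j]
    return total


# Two speakers, two utterances each (made-up 2-D embeddings).
separated = [[[1, 0], [0.9, 0.1]], [[0, 1], [0.1, 0.9]]]
mixed = [[[1, 0], [0, 1]], [[0.9, 0.1], [0.1, 0.9]]]
print(ge2e_softmax_loss(separated) < ge2e_softmax_loss(mixed))
```

The loss is small when each speaker's utterances cluster around their own centroid and grows when utterances sit closer to another speaker's centroid.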

Performance

EER across 10 epochs: 0.0377
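
The equal error rate (EER) is the operating point where the false acceptance rate equals the false rejection rate. A minimal sketch of one way to approximate it from similarity scores is shown below. The function `equal_error_rate` and the example scores are hypothetical, and the repository's own evaluation may compute and average the value differently.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    genuine_scores: similarity scores for same-speaker trials.
    impostor_scores: similarity scores for different-speaker trials.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        # False rejection: a genuine trial scores below the threshold.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Made-up scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)
```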

D vector embedding creation

After training and testing the model, run dvector.py to create data.pkl.

The file can be loaded and used to train the triplet-loss model.
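
Loading a pickle file needs only the standard library. The sketch below is a round trip with a made-up layout (speaker id mapped to a list of d-vectors). The actual structure of data.pkl is not documented here, so the dictionary shape and the helper name `load_dvectors` are assumptions.

```python
import os
import pickle
import tempfile


def load_dvectors(path):
    """Load whatever object was pickled into the given file."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Illustrative data only; the real data.pkl may be laid out differently.
example = {"speaker_a": [[0.1, 0.2], [0.3, 0.4]], "speaker_b": [[0.9, 0.8]]}
tmp_path = os.path.join(tempfile.mkdtemp(), "data.pkl")
with open(tmp_path, "wb") as f:
    pickle.dump(example, f)

loaded = load_dvectors(tmp_path)
print(sorted(loaded))
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.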

Triplet-loss model training

After creating the d-vectors, we use the triplet loss described in https://arxiv.org/pdf/1705.02304.pdf to train a model. Run train.py.
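
As a rough illustration of the idea, a cosine-similarity triplet loss for a single (anchor, positive, negative) triple can be sketched in plain Python. The function names and the margin value of 0.1 are assumptions chosen for the example; train.py may use a different formulation, margin, or triplet selection strategy.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge loss that wants sim(anchor, positive) to beat
    sim(anchor, negative) by at least `margin`."""
    s_ap = cosine_similarity(anchor, positive)
    s_an = cosine_similarity(anchor, negative)
    return max(0.0, s_an - s_ap + margin)


# Made-up 2-D embeddings: an easy triplet and a hard one.
easy = triplet_loss([1, 0], [0.9, 0.1], [0, 1])
hard = triplet_loss([1, 0], [0, 1], [1, 0])
print(easy, hard)
```

An easy triplet, where the positive is already much closer than the negative, contributes zero loss, so only hard triplets contribute to training.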

Reference

To run inference on speakers, run cli.py.

https://github.com/HarryVolek/PyTorch_Speaker_Verification

https://github.com/philipperemy/deep-speaker

Open Source Agenda is not affiliated with "Speaker Recognition Pytorch" Project. README Source: Aurora11111/speaker-recognition-pytorch

Repository

Aurora11111/speaker-recognition-pytorch

License

BSD-3-Clause
