Tensorflow End2end Speech Recognition

End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

TensorFlow Implementation of End-to-End Speech Recognition

Requirements

TensorFlow >= 1.3.0
tqdm >= 4.14.0
python-Levenshtein >= 0.12.0
setproctitle >= 1.1.10
seaborn >= 0.7.1

Corpus

TIMIT

Phone (39, 48, 61 phones)
Character

LibriSpeech

Phone (under implementation)
Character
Word

CSJ (Corpus of Spontaneous Japanese)

Phone (under implementation)
Japanese kana character (about 150 classes)
Japanese kanji characters (about 3000 classes)

These corpora will be added in the future:

Switchboard
WSJ
AMI

This repository doesn't include pre-processing; it expects features prepared with the pre-processing in this repo. If you want to run the pre-processing yourself, please look at this repo.

Model

Encoder

BLSTM
LSTM
BGRU
GRU
VGG-BLSTM
VGG-LSTM
Multi-task BLSTM
- You can attach an additional CTC layer to an arbitrary layer.
Multi-task LSTM
VGG

Connectionist Temporal Classification (CTC) [Graves+ 2006]

Greedy decoder
Beam Search decoder
Beam Search decoder w/ CharLM (under implementation)
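
As an illustration of what the greedy decoder does, here is a minimal NumPy sketch (not the code used in this repository): take the best class per frame, collapse consecutive repeats, then drop blanks. The function name `ctc_greedy_decode` and the `blank_index` parameter are assumptions for this example; TensorFlow 1.x reserves the last class index for the blank label, which is what the usage below assumes.

```python
import numpy as np


def ctc_greedy_decode(logits, blank_index):
    """Greedy (best-path) CTC decoding for a single utterance.

    logits: array of shape (num_frames, num_classes).
    blank_index: index of the CTC blank label (assumed; depends on the model).
    Returns a list of label indices.
    """
    best_path = np.argmax(logits, axis=1)
    decoded = []
    prev = None
    for label in best_path:
        # Keep a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != blank_index:
            decoded.append(int(label))
        prev = label
    return decoded


# Toy example: 3 classes, the last one (index 2) is the blank.
# Per-frame argmax is [0, 0, 2, 0, 1, 1].
logits = np.array([
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
])
print(ctc_greedy_decode(logits, blank_index=2))
```

The blank between the two runs of class 0 is what lets the decoder emit the same label twice in a row.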

Options

Frame-stacking [Sak+ 2015]
Multi-GPUs training (synchronous)
Splicing
Down sampling (under implementation)
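
For readers unfamiliar with frame-stacking, the sketch below shows the general idea in NumPy: concatenate several consecutive feature frames into one vector and advance by a fixed number of frames, which shortens the input sequence. This is an illustrative sketch, not this repository's implementation; the names `stack_frames`, `num_stack`, and `num_skip`, and the zero-padding of the tail, are assumptions.

```python
import numpy as np


def stack_frames(features, num_stack, num_skip):
    """Stack consecutive frames and subsample in time.

    features: array of shape (num_frames, feature_dim).
    num_stack: number of consecutive frames to concatenate.
    num_skip: number of frames to advance between stacked outputs.
    Returns an array of shape (ceil(num_frames / num_skip),
    feature_dim * num_stack). The tail is zero-padded (an assumption).
    """
    num_frames, dim = features.shape
    pad = np.zeros((num_stack - 1, dim), dtype=features.dtype)
    padded = np.concatenate([features, pad], axis=0)
    starts = range(0, num_frames, num_skip)
    return np.stack([padded[s:s + num_stack].reshape(-1) for s in starts])


# 10 frames of 2-dimensional features, stack 3 frames, advance 3 frames.
features = np.arange(20).reshape(10, 2)
stacked = stack_frames(features, num_stack=3, num_skip=3)
print(stacked.shape)
```

With `num_skip` greater than 1 the encoder sees fewer time steps, which is the main reason the technique speeds up recurrent models.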

Attention Mechanism

Decoder

Greedy decoder
Beam search decoder (under implementation)

Attention type

Bahdanau's content-based attention
Bahdanau's normed content-based attention (under implementation)
location-based attention
Hybrid attention
Luong's dot attention
Luong's scaled dot attention (under implementation)
Luong's general attention
Luong's concat attention
Baidu's attention (under implementation)
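
The attention types above differ mainly in how they score each encoder state against the current decoder state. The NumPy sketch below illustrates three of them (Luong's dot, Luong's general, and Bahdanau's content-based scoring) under simplifying assumptions: a single utterance, no biases, and randomly initialized weights. All function and parameter names are hypothetical and do not come from this repository; location-based and hybrid attention additionally use the previous attention weights and are not shown.

```python
import numpy as np


def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()


def dot_score(decoder_state, encoder_states):
    # Luong's dot: score_t = h_t . s (requires equal dimensions)
    return encoder_states @ decoder_state


def general_score(decoder_state, encoder_states, W):
    # Luong's general: score_t = h_t^T W s
    return encoder_states @ (W @ decoder_state)


def content_based_score(decoder_state, encoder_states, W_enc, W_dec, v):
    # Bahdanau's content-based: score_t = v^T tanh(W_enc h_t + W_dec s)
    return np.tanh(encoder_states @ W_enc.T + W_dec @ decoder_state) @ v


def attend(scores, encoder_states):
    # Normalize scores over time and build the context vector.
    weights = softmax(scores)
    return weights, weights @ encoder_states


rng = np.random.default_rng(0)
encoder_states = rng.standard_normal((5, 4))  # (num_frames, enc_dim)
decoder_state = rng.standard_normal(4)        # (dec_dim,)

weights, context = attend(dot_score(decoder_state, encoder_states),
                          encoder_states)
print(weights.shape, context.shape)
```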

Options

Sharpening
Temperature regularization in the softmax layer (Output posteriors)
Joint CTC-Attention [Kim+ 2016]
Coverage (under implementation)
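
Sharpening and temperature regularization can both be viewed as rescaling logits before a softmax. The following is a minimal NumPy sketch of that idea, not this repository's code; the function name `softmax_with_temperature` and its `temperature` parameter are assumptions for illustration. A temperature below 1 sharpens the distribution, and a temperature above 1 smooths it.

```python
import numpy as np


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over the last axis after dividing the logits by a temperature."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    # Subtract the max for numerical stability.
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


logits = np.array([2.0, 1.0, 0.5])
base = softmax_with_temperature(logits)                     # temperature 1
sharp = softmax_with_temperature(logits, temperature=0.5)   # peakier
smooth = softmax_with_temperature(logits, temperature=2.0)  # flatter
print(base.max(), sharp.max(), smooth.max())
```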

Usage

Please refer to the docs for each corpus:

TIMIT
LibriSpeech
CSJ

License

MIT

Contact

[email protected]

Open Source Agenda is not affiliated with "Tensorflow End2end Speech Recognition" Project. README Source: hirofumi0810/tensorflow_end2end_speech_recognition
