Pytorch Seq2seq Example

Batched Seq2Seq example based on the seq2seq-translation-batched.ipynb from practical-pytorch, with extra features.

This example runs a grammatical error correction task, where the source sequence is a grammatically erroneous English sentence and the target sequence is its grammatically correct counterpart. The corpus and evaluation script can be downloaded at: https://github.com/keisks/jfleg.

Extra features

  • Cleaner codebase
  • Very detailed comments for learners
  • Implement PyTorch-native Dataset and DataLoader for batching
  • Correctly handle the hidden state from the bidirectional encoder and pass it to the decoder as the initial hidden state (see the sketch after this list)
  • Fully batched attention computation (only general attention is implemented, but it is sufficient; also sketched after this list). Note: the original code still uses a for-loop to compute attention, which is very slow.
  • Support LSTM instead of only GRU
  • Shared embeddings (encoder's input embedding and decoder's input embedding)
  • Pretrained GloVe embeddings
  • Fixed embedding
  • Tie embeddings (decoder's input embedding and decoder's output embedding)
  • Tensorboard visualization
  • Load and save checkpoint
  • Replace unknown words by selecting the source token with the highest attention score. (Translation)
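A rough sketch (not the exact code in this repo) of two of the items above: merging the bidirectional encoder's final hidden states into a decoder-sized initial hidden state, and computing general (Luong-style) attention for a whole batch with a single bmm instead of a per-example loop. All shapes and the W_a name are illustrative assumptions.

```python
import torch
import torch.nn as nn

B, T, H, L = 32, 25, 512, 2   # batch size, source length, hidden size, encoder layers (assumed)

# --- Bidirectional encoder -> decoder initial hidden state ---------------------------
encoder = nn.GRU(input_size=256, hidden_size=H, num_layers=L, bidirectional=True)
src_emb = torch.randn(T, B, 256)                  # embedded source batch
enc_outputs, enc_hidden = encoder(src_emb)        # (T, B, 2H), (2L, B, H)

# Separate the direction axis and sum forward/backward states per layer so the result
# matches the (L, B, H) shape a unidirectional decoder expects as its initial hidden state.
dec_init_hidden = enc_hidden.view(L, 2, B, H).sum(dim=1)          # (L, B, H)

# --- Fully batched "general" attention: score(h_t, h_s) = h_t^T W_a h_s --------------
W_a = nn.Linear(2 * H, H, bias=False)             # hypothetical projection name
dec_hidden_t = torch.randn(B, H)                  # decoder hidden state at the current step

keys = W_a(enc_outputs).transpose(0, 1)                           # (B, T, H)
scores = torch.bmm(keys, dec_hidden_t.unsqueeze(2)).squeeze(2)    # (B, T), no Python loop
attn_weights = torch.softmax(scores, dim=1)                       # (B, T)
context = torch.bmm(attn_weights.unsqueeze(1),
                    enc_outputs.transpose(0, 1)).squeeze(1)       # (B, 2H)

# The same attention weights can drive <unk> replacement at decode time: for a
# generated <unk>, copy the source token at position attn_weights.argmax(dim=1).
```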

Cons

Compared to the state-of-the-art seq2seq library OpenNMT-py, there are a few things that aren't optimized in this codebase:

  • Use CuDNN when possible (always on encoder, on decoder when input_feed=0)
  • Always avoid indexing / loops and use torch primitives.
  • When possible, batch softmax operations across time (this is the second most complicated part of the code; a sketch follows this list)
  • Batch inference and beam search for translation (this is the most complicated part of the code)
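For reference, batching the softmax across time typically amounts to folding the time dimension into the batch dimension before the output projection, so one large matmul replaces a per-step loop. A minimal sketch with made-up shapes:

```python
import torch
import torch.nn as nn

# Hypothetical shapes: T target steps, batch B, hidden size H, vocab size V.
T, B, H, V = 20, 32, 512, 10000
decoder_outputs = torch.randn(T, B, H)    # e.g. stacked outputs from a teacher-forced decoder

generator = nn.Linear(H, V)

# Instead of projecting and softmaxing one time step at a time inside a Python loop,
# fold time into the batch dimension and run a single large matmul + log_softmax.
flat = decoder_outputs.view(T * B, H)                                 # (T*B, H)
log_probs = torch.log_softmax(generator(flat), dim=-1).view(T, B, V)  # (T, B, V)
```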

How to speed up RNN training?

Several ways to speed up RNN training:

  • Batching
  • Static padding
  • Dynamic padding (a collate_fn sketch follows this list)
  • Bucketing
  • Truncated BPTT
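A minimal sketch of dynamic padding via a DataLoader collate_fn; the dummy dataset and padding index here are assumptions, not this repo's actual data pipeline:

```python
import torch
from torch.utils.data import DataLoader
from torch.nn.utils.rnn import pad_sequence

# Dummy dataset standing in for the real one: (src_ids, trg_ids) pairs of 1-D LongTensors.
dataset = [(torch.randint(1, 100, (torch.randint(5, 20, (1,)).item(),)),
            torch.randint(1, 100, (torch.randint(5, 20, (1,)).item(),)))
           for _ in range(256)]

def collate_fn(batch):
    # Sort by source length (descending) so the encoder input can later be packed
    # with pack_padded_sequence.
    batch.sort(key=lambda pair: len(pair[0]), reverse=True)
    src, trg = zip(*batch)
    src_lengths = torch.tensor([len(s) for s in src])
    # Dynamic padding: pad only up to the longest sequence in *this* batch,
    # rather than to a global maximum length (static padding).
    src = pad_sequence(src, padding_value=0)    # (max_src_len, B)
    trg = pad_sequence(trg, padding_value=0)    # (max_trg_len, B)
    return src, src_lengths, trg

loader = DataLoader(dataset, batch_size=64, shuffle=True, collate_fn=collate_fn)
```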

See "Sequence Models and the RNN API (TensorFlow Dev Summit 2017)" for understanding those techniques.

You can use torchtext or OpenNMT's data iterator to speed up training. It can be 7x faster! (e.g. 7 hours per epoch -> 1 hour!)
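For example, a bucketing iterator with the legacy torchtext API (torchtext <= 0.8, or torchtext.legacy.data in later releases) might look like the following; the field setup and the train.tsv path are placeholders, not this repo's actual files:

```python
from torchtext.data import Field, TabularDataset, BucketIterator

SRC = Field(tokenize=str.split, init_token='<sos>', eos_token='<eos>', lower=True)
TRG = Field(tokenize=str.split, init_token='<sos>', eos_token='<eos>', lower=True)

# Placeholder TSV with one "source<TAB>target" pair per line.
train = TabularDataset(path='train.tsv', format='tsv', fields=[('src', SRC), ('trg', TRG)])
SRC.build_vocab(train, min_freq=2)
TRG.build_vocab(train, min_freq=2)

# BucketIterator groups examples of similar length into the same batch, which
# minimizes padding and is where most of the speed-up comes from.
train_iter = BucketIterator(train, batch_size=64,
                            sort_key=lambda ex: len(ex.src),
                            sort_within_batch=True)
```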

Acknowledgement

Thanks to @srush, the author of OpenNMT-py, for answering my questions! See https://github.com/OpenNMT/OpenNMT-py/issues/552
