Batched Seq2Seq Example
Based on the seq2seq-translation-batched.ipynb from practical-pytorch, but with more features.
This example runs a grammatical error correction task, where the source sequence is a grammatically erroneous English sentence and the target sequence is a grammatically correct English sentence. The corpus and evaluation script can be downloaded at: https://github.com/keisks/jfleg.
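The task boils down to loading parallel source/target sentences and training on the pairs. Below is a minimal, hypothetical sketch of reading such a corpus into token-list pairs; the file names are illustrative (JFLEG actually ships one source file plus several human reference files), and the example sentences are only meant to show what a source/target pair looks like.

```python
def load_pairs(src_path, tgt_path):
    """Read line-aligned source/target files into (source_tokens, target_tokens) pairs.

    Hypothetical helper for illustration; not part of this repo's code.
    """
    with open(src_path, encoding="utf-8") as f_src, \
         open(tgt_path, encoding="utf-8") as f_tgt:
        return [(s.strip().split(), t.strip().split())
                for s, t in zip(f_src, f_tgt)]

# An illustrative pair for this task:
#   source: "She see Tom is catched by policeman ."   (erroneous)
#   target: "She saw Tom caught by a policeman ."     (corrected)
```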
general attention
(but it's sufficient)

Note: The original code still uses a for-loop for computation, which is very slow.

Compared to the state-of-the-art seq2seq library, OpenNMT-py, a few things aren't optimized in this codebase:

input_feed
(=0)

Several ways to speed up RNN training:
See "Sequence Models and the RNN API (TensorFlow Dev Summit 2017)" for an explanation of these techniques.
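One such speed-up technique is packing padded batches so the RNN kernel skips padded timesteps instead of computing on them. A minimal sketch using PyTorch's `pack_padded_sequence` with toy shapes (not this repo's actual model; shapes and module choice are assumptions for illustration):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Toy batch of 3 sequences with true lengths 4, 3, 1,
# sorted by decreasing length and zero-padded to max_len = 4.
lengths = torch.tensor([4, 3, 1])
batch = torch.zeros(3, 4, 8)  # (batch, max_len, input_size)

rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

# Packing tells the RNN which timesteps are real, so padded
# positions are skipped rather than computed and masked later.
packed = pack_padded_sequence(batch, lengths, batch_first=True)
packed_out, hidden = rnn(packed)
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
# out.shape == (3, 4, 16); out_lengths == [4, 3, 1]
```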
You can use torchtext or OpenNMT's data iterator to speed up training. It can be 7x faster! (e.g., 7 hours per epoch -> 1 hour!)
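Much of that speedup comes from batching sentences of similar length together, so each batch wastes little computation on padding. A self-contained sketch of the idea behind torchtext's bucketing iterator (the function and toy corpus here are illustrative, not this repo's code):

```python
import random

def bucket_batches(pairs, batch_size):
    """Group (source, target) pairs of similar source length into batches,
    so padding within each batch is minimal. This mimics the idea behind
    torchtext's BucketIterator; it is not the library's actual implementation."""
    pairs = sorted(pairs, key=lambda p: len(p[0]))  # sort by source length
    batches = [pairs[i:i + batch_size]
               for i in range(0, len(pairs), batch_size)]
    random.shuffle(batches)  # shuffle batch order; lengths stay homogeneous
    return batches

# Toy corpus: (source tokens, target tokens) of varying lengths.
corpus = [(["w"] * n, ["w"] * n) for n in [9, 2, 7, 3, 8, 1]]
batches = bucket_batches(corpus, batch_size=2)
```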
Thanks to the author of OpenNMT-py, @srush, for answering my questions! See https://github.com/OpenNMT/OpenNMT-py/issues/552