ByteNet for character-level language modelling
This is a TensorFlow implementation of the ByteNet model from DeepMind's paper Neural Machine Translation in Linear Time.
From the abstract:
The ByteNet decoder attains state-of-the-art performance on character-level language modeling and outperforms the previous best results obtained with recurrent neural networks. The ByteNet also achieves a performance on raw character-level machine translation that approaches that of the best neural translation models that run in quadratic time. The implicit structure learnt by the ByteNet mirrors the expected alignments between the sequences.
Image Source - Neural Machine Translation in Linear Time paper
The model applies dilated 1D convolutions to the sequential data, layer by layer, to obtain the source encoding. The decoder then applies masked 1D convolutions to the target sequence (conditioned on the encoder output) to obtain the next character in the target sequence. The character generation model is just the ByteNet decoder, while the machine translation model is the combined encoder and decoder.
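To make the masked, dilated convolution concrete, here is a minimal pure-Python sketch of a single causal dilated 1D convolution over a list of numbers, plus a helper that computes the receptive field of a stack of such layers. It is an illustration only, not the repository's TensorFlow code: the function names, the kernel size of 3, and the dilation schedule 1, 2, 4, 8, 16 are assumptions chosen for the example.

```python
def causal_dilated_conv1d(x, kernel, dilation):
    """Masked (causal) dilated 1D convolution over a list of floats.

    Output position t combines x[t], x[t - dilation], x[t - 2 * dilation], ...
    Positions before the start of the sequence count as zeros, so no output
    depends on a future input.
    """
    k = len(kernel)
    out = []
    for t in range(len(x)):
        total = 0.0
        for j in range(k):
            # kernel[k - 1] multiplies the current step, kernel[0] the oldest one
            idx = t - (k - 1 - j) * dilation
            if idx >= 0:
                total += kernel[j] * x[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input steps one output can see after stacking causal layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(causal_dilated_conv1d(signal, kernel=[1.0, 1.0], dilation=2))
# -> [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(receptive_field(kernel_size=3, dilations=[1, 2, 4, 8, 16]))
# -> 63
```

Doubling the dilation at each layer grows the receptive field exponentially with depth while the cost per layer stays linear in the sequence length. An unmasked encoder layer would differ only in also reading positions to the right of t.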
The character generation model is defined in ByteNet/generator.py, and the translation model is defined in ByteNet/translator.py. ByteNet/ops.py contains the ByteNet residual block, dilated conv1d, and layer normalization.
A sample training text is included at Data/generator_training_data/shakespeare.txt.
Create the following directories:
Data/tb_summaries/translator_model
Data/tb_summaries/generator_model
Data/Models/generation_model
Data/Models/translation_model
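The directories listed above can be created by hand or with a short script. The sketch below assumes Python 3 and that it is run from the repository root; the helper name create_output_dirs is hypothetical and not part of the repository.

```python
import os

# Directory names taken from the list above.
REQUIRED_DIRS = [
    "Data/tb_summaries/translator_model",
    "Data/tb_summaries/generator_model",
    "Data/Models/generation_model",
    "Data/Models/translation_model",
]


def create_output_dirs(root="."):
    """Create every required directory under root and return the paths."""
    created = []
    for rel in REQUIRED_DIRS:
        path = os.path.join(root, *rel.split("/"))
        # exist_ok=True makes the call safe to repeat
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

Calling create_output_dirs() once before training is enough; running it again leaves existing directories untouched.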
Text Generation
Configure the model in model_config.py.
Place the training text files in Data/generator_training_data. A sample shakespeare.txt is included in the repo.
Train the model with:
python train_generator.py --text_dir="Data/generator_training_data"
Run python train_generator.py --help for more options.
Machine Translation
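Configure the model in model_config.py.
Save the source and target sentences in separate files in Data/MachineTranslation. You may download the News Commentary training corpus using this link.
Train the translation model with:
python train_translator.py --source_file=<source sentences file> --target_file=<target sentences file> --bucket_quant=50
Sentence pairs are grouped into buckets by length, controlled by the bucket_quant parameter. The sentences are padded with a special character beyond the actual length.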
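The bucketing and padding step can be sketched as follows. This is one plausible reading, not the repository's code: it assumes bucket lengths are multiples of bucket_quant, that a pair is bucketed by the longer of its two sentences, and that both sides are padded to the same length. The padding character and the function names are hypothetical.

```python
from collections import defaultdict

PAD = "\0"  # hypothetical padding character; the repository may use another


def bucket_size(length, bucket_quant):
    """Round length up to the nearest positive multiple of bucket_quant."""
    n_buckets = max(1, (length + bucket_quant - 1) // bucket_quant)
    return n_buckets * bucket_quant


def make_buckets(pairs, bucket_quant):
    """Group (source, target) pairs by padded length and pad both sides."""
    buckets = defaultdict(list)
    for source, target in pairs:
        size = bucket_size(max(len(source), len(target)), bucket_quant)
        buckets[size].append((source.ljust(size, PAD), target.ljust(size, PAD)))
    return dict(buckets)


pairs = [("Guten Morgen", "Good morning"), ("Ja", "Yes")]
for size, items in sorted(make_buckets(pairs, bucket_quant=5).items()):
    print(size, len(items))
# -> 5 1
# -> 15 1
```

A smaller bucket_quant wastes less computation on padding but produces more distinct sequence lengths, so each bucket holds fewer sentence pairs.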
Run python train_translator.py --help for more options.
Generate new samples with:
python generate.py --seed="SOME_TEXT_TO_START_WITH" --sample_size=<SIZE OF GENERATED SEQUENCE>
Test sample translations with python translate.py.
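Seeded generation of this kind generally follows an autoregressive loop: the text so far is fed to the decoder, one character is sampled from its predicted distribution, and that character is appended before the next step. The sketch below shows the loop with a toy stand-in for the trained decoder. It does not reproduce generate.py: the function names are hypothetical, and treating sample_size as the number of characters added after the seed is an assumption.

```python
import random


def generate(seed, sample_size, next_char_probs, rng=None):
    """Extend seed by sample_size characters, sampling one character at a time.

    next_char_probs(text) must return a dict mapping each candidate next
    character to its probability, given all text generated so far.
    """
    if rng is None:
        rng = random.Random(0)
    text = seed
    for _ in range(sample_size):
        probs = next_char_probs(text)
        chars = list(probs)
        weights = [probs[c] for c in chars]
        text += rng.choices(chars, weights=weights, k=1)[0]
    return text


def toy_next_char_probs(text):
    """Stand-in for a trained decoder: prefers alternating 'a' and 'b'."""
    if text.endswith("a"):
        return {"b": 0.9, "a": 0.1}
    return {"a": 0.9, "b": 0.1}


print(generate("a", 10, toy_next_char_probs))
```

In a real model, next_char_probs would run the masked decoder over the text so far and return the softmax output for the last position.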
Sample generated text:
ANTONIO:
What say you to this part of this to thee?
KING PHILIP:
What say these faith, madam?
First Citizen:
The king of England, the will of the state,
That thou dost speak to me, and the thing that shall
In this the son of this devil to the storm,
That thou dost speak to thee to the world,
That thou dost see the bear that was the foot,