textgenrnn Versions

Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
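As a reminder of the basic workflow, here is a minimal sketch; the file name is a placeholder, and any line-delimited text file works:

```python
from textgenrnn import textgenrnn

textgen = textgenrnn()

# Train on a line-delimited text file ('texts.txt' is a placeholder path).
textgen.train_from_file('texts.txt', num_epochs=5)

# Print generated samples from the trained model.
textgen.generate(5)
```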

v1.1

6 years ago
  • Switched to a fit_generator implementation for generating training sequences, instead of loading all sequences into memory. This allows training on large text files (10MB+) without requiring excessive amounts of RAM. (See the sketch after this list.)
  • Better word_level support:
    • The model keeps only max_words words and discards the rest.
    • The model will not train to predict words outside the vocabulary.
    • Each punctuation mark (including smart quotes) is its own token.
    • When generating, surrounding whitespace is stripped from newlines/tabs. (This is not the case for other punctuation, as there are too many rules around it.)
  • Training on a single text no longer uses meta tokens to indicate the start/end of the text, and they are no longer used when generating, which results in slightly better output.
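A sketch tying these v1.1 changes together: training a word-level model on one large text file with train_from_largetext_file, which treats the file as a single text and streams training sequences rather than holding them all in memory. The file path, model name, and hyperparameter values below are placeholder assumptions, not recommendations:

```python
from textgenrnn import textgenrnn

textgen = textgenrnn(name='word_level_model')

# train_from_largetext_file trains on the file as one continuous text
# (no start/end meta tokens) and generates sequences on the fly during
# training, so a 10MB+ corpus does not need to fit in RAM at once.
textgen.train_from_largetext_file(
    'large_corpus.txt',   # placeholder path
    new_model=True,
    word_level=True,      # tokenize by word instead of by character
    max_words=10000,      # cap the vocabulary at 10,000 words
    num_epochs=5,
    gen_epochs=1)

textgen.generate(3)
```

For line-delimited datasets, the same word_level and max_words options can be passed to train_from_file instead.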

v1.0

6 years ago

First release after the major refactor.