Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Support for TensorFlow 2.1 added! (thanks to #165 by @ZerxXxes!)
In light of this migration, TF 2.1 (which has both CPU and GPU support) is now the minimum version. If you need to use TF 1.X, use an older version.
Two major features:

- Generate text using two (or more!) trained models simultaneously. See this notebook for a demo. The results are messier than usual, so a lower temperature is recommended. It should work on both char-level and word-level models, or a mix of both (however, I do not recommend mixing line-delimited and full-text models!). Please file issues if there are errors!
- Thanks to tqdm, all generate functions show a progress bar! You can override this by passing `progress=False` to the function.
Additionally, the default generate temperature is now `[1.0, 0.5, 0.2, 0.2]`!
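Temperature controls how sharply sampling favors the most likely tokens, which is why a lower temperature is recommended for the messier multi-model output. A minimal stdlib sketch of temperature-scaled sampling (illustrative only, not textgenrnn's internal code):

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=random):
    # Divide logits by the temperature, then softmax and sample.
    # Low temperatures sharpen the distribution toward the top token;
    # high temperatures flatten it toward uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r <= cumulative:
            return i
    return len(probs) - 1

# The default schedule generates one batch at each temperature in turn.
for temp in [1.0, 0.5, 0.2, 0.2]:
    idx = sample_with_temperature([2.0, 1.0, 0.1], temp)
```

At temperature 0.2 the highest-logit token is chosen almost every time, while at 1.0 the lower-probability tokens still appear regularly.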
`prefix` now works correctly in word-level models.

Emergency bug fix to address #57, which occurs in newer Keras versions.
- Renamed `train_on_texts(new_model=True)` to `train_new_model`.
- Fixed a bug where `dropout` could cause issues.
- Added `encode_text_vectors` to encode text using the trained network, and `similarity` to quickly calculate cosine similarity and return the most similar texts. See this notebook for details.
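Under the hood, ranking texts this way boils down to comparing their encoded vectors by cosine similarity. A minimal stdlib sketch of the idea (the `most_similar` helper here is hypothetical, not textgenrnn's actual API):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two encoded text vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vec, vectors, top_n=3):
    # Rank candidate vectors by similarity to the query, highest first.
    ranked = sorted(
        ((i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(vectors)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_n]
```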
- Made `is_csv` actually work downstream.
- Added `validation` to disable validation training for speed.
- `is_csv`: use with `train_from_file` if the source file is a one-column CSV (e.g. an export from BigQuery or Google Sheets) for proper quote/newline escaping.
- Renamed `prop_keep` to `train_size`; the remaining data will be used for validation.
- Added `dropout`, which randomly excludes input tokens each epoch.
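The quote/newline escaping that `is_csv` enables matters whenever a text itself contains a newline: naive line splitting breaks such records apart, while a CSV reader keeps them whole. A small stdlib illustration of the difference (not textgenrnn's internal code):

```python
import csv
import io

# A one-column CSV where the second text contains a quoted newline,
# as a BigQuery or Google Sheets export would produce.
raw = 'texts\n"first text"\n"second text\nwith a newline"\n'

# Naive newline splitting shatters the multi-line record:
naive = raw.strip().split("\n")  # 4 pieces; the second text is cut in two

# A CSV reader honors the quoting and keeps each text whole:
reader = csv.reader(io.StringIO(raw))
next(reader)  # skip the header row
texts = [row[0] for row in reader]
# texts == ['first text', 'second text\nwith a newline']
```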
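`train_size` behaves like a standard train/validation split proportion: that fraction of the texts is trained on, and the rest is held out. A minimal sketch of the idea (the `split_texts` helper is hypothetical, not the library's code):

```python
import random

def split_texts(texts, train_size=0.8, seed=123):
    # Shuffle, then keep train_size of the texts for training;
    # the remainder is held out for validation.
    shuffled = list(texts)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_size)
    return shuffled[:cut], shuffled[cut:]

texts = [f"text {i}" for i in range(10)]
train, val = split_texts(texts, train_size=0.8)
# 8 texts for training, 2 held out for validation
```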