Textgenrnn Versions

Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

v2.0.0

4 years ago

Support for TensorFlow 2.1 added! (thanks to #165 by @ZerxXxes!)

TF 2.1, which has both CPU and GPU support, is now the minimum version in light of this migration. If you need to use TF 1.X, use an older version of textgenrnn.

v1.5.0

5 years ago

Two major features:

Synthesis (beta)

Generate text using two (or more!) trained models simultaneously. See this notebook for a demo.

The results are messier than usual, so a lower temperature is recommended. It should work on both char-level and word-level models, or a mix of both. (However, I do not recommend mixing line-delimited and full-text models!)

Please file issues if there are errors!

Generate Progress Bar

Thanks to tqdm, all generate functions show a progress bar! You can override this by passing progress=False to the function.

Additionally, the default generate temperature is now [1.0, 0.5, 0.2, 0.2]!
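To illustrate what a list of temperatures means, here is a minimal sketch in plain Python. It is not textgenrnn's actual code: it assumes the list is cycled one value per generated token, and the helper names (apply_temperature, sample_with_cycled_temperatures) and the toy distribution are hypothetical.

```python
import math
import random
from itertools import cycle


def apply_temperature(probs, temperature):
    """Rescale a probability distribution; lower temperature sharpens it."""
    logits = [math.log(p) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_with_cycled_temperatures(probs, vocab, temperatures, n_tokens, seed=0):
    """Draw n_tokens from one fixed toy distribution, cycling temperatures.

    A real model would predict a new distribution at every step; a fixed
    one is used here only to keep the sketch self-contained.
    """
    rng = random.Random(seed)
    out = []
    for _, temp in zip(range(n_tokens), cycle(temperatures)):
        scaled = apply_temperature(probs, temp)
        out.append(rng.choices(vocab, weights=scaled, k=1)[0])
    return out


vocab = ["a", "b", "c"]
probs = [0.6, 0.3, 0.1]  # toy next-token probabilities (all must be > 0)
tokens = sample_with_cycled_temperatures(probs, vocab, [1.0, 0.5, 0.2, 0.2], 8)
print(tokens)
```

With this schedule, one token in four is sampled at the unmodified distribution (1.0) and the rest are drawn more conservatively, which is one plausible reading of why a mixed list is a reasonable default.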

v1.4.1

5 years ago

v1.4

5 years ago

Features

  • Interactive mode, which lets you control which text is added. (#52, thanks @Juanets!)
  • Allow backends other than TensorFlow (#44, thanks @torokati44!)
  • Allow periodic weights saving (#37, thanks @IrekRybark!)
  • Multi-GPU support (beta: see #62)

Fixes

  • Handle prefix in word-level models correctly.

v1.3.2

5 years ago

Emergency bug fix to address #57 which occurs in newer Keras versions.

v1.3.1

5 years ago
  • Added ability to cycle temperatures during training (see this notebook for more information).
  • Added utf-8 encoding for vocab export.
  • Added alias for train_on_texts(new_model=True) to train_new_model.
  • Fixed an issue where specifying dropout could cause issues.

v1.3

6 years ago
  • Added encode_text_vectors to encode text using the trained network.
  • Added similarity to quickly calculate cosine similarity and return the most similar texts.

See this notebook for details.
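As a rough illustration of the similarity idea, the sketch below ranks texts by cosine similarity between vectors. It does not call textgenrnn: the function names (cosine_similarity, most_similar) and the 3-dimensional vectors are hypothetical stand-ins for whatever encode_text_vectors would return.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query_vec, text_vecs, top_n=3):
    """Return (text, score) pairs sorted by descending cosine similarity."""
    scored = [(text, cosine_similarity(query_vec, vec))
              for text, vec in text_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical encodings; real ones would come from the trained network.
encoded = {
    "hello world": [1.0, 0.0, 0.0],
    "hello there": [0.9, 0.1, 0.0],
    "goodbye": [0.0, 0.0, 1.0],
}
ranked = most_similar([1.0, 0.0, 0.0], encoded, top_n=2)
print(ranked)
```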

v1.2.2

6 years ago
  • Make is_csv work for real downstream.
  • Description tweaks

v1.2.1

6 years ago
  • Added validation, which can be used to disable validation during training for speed.
  • Added is_csv: Use with train_from_file if the source file is a one-column CSV (e.g. an export from BigQuery or Google Sheets) for proper quote/newline escaping.
  • README tweaks
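The reason is_csv matters can be shown with Python's standard csv module: splitting a one-column CSV on newlines would break rows that contain quoted newlines or escaped quotes. The sketch below is an assumption about the general approach, not textgenrnn's implementation; read_one_column_csv and its has_header flag are hypothetical.

```python
import csv
import io


def read_one_column_csv(handle, has_header=True):
    """Read the first column of a CSV, honoring quotes and embedded newlines."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    return [row[0] for row in reader if row]


# Toy export: one escaped-quote row, one row with a newline inside quotes.
sample = 'text\n"He said ""hi"", then left."\n"line one\nline two"\nplain row\n'
texts = read_one_column_csv(io.StringIO(sample))
print(texts)
```

A naive sample.split("\n") would yield five fragments here instead of three texts.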

v1.2

6 years ago
  • Renamed prop_keep to train_size; the remaining data is now used for validation.
  • Added dropout, which randomly excludes input tokens each epoch.
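The two options above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it assumes train_size is a proportion between 0 and 1 and dropout is a per-token probability of exclusion, and the helpers split_train_validation and drop_tokens are hypothetical names, not textgenrnn functions.

```python
import random


def split_train_validation(texts, train_size=0.8, seed=0):
    """Shuffle texts, keep a train_size share for training, rest for validation."""
    rng = random.Random(seed)
    shuffled = texts[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_size)
    return shuffled[:cut], shuffled[cut:]


def drop_tokens(tokens, dropout=0.2, rng=None):
    """Exclude each token independently with probability `dropout`."""
    rng = rng or random.Random(0)
    return [t for t in tokens if rng.random() >= dropout]


texts = [f"text {i}" for i in range(10)]
train, val = split_train_validation(texts, train_size=0.8)
print(len(train), len(val))

# Calling drop_tokens once per epoch would give a different subset each time.
print(drop_tokens(list("hello world"), dropout=0.2))
```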