gpt-2-simple Versions

Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts

v0.4.2

5 years ago
  • load_gpt2() in a fresh session is much faster and uses much less memory. (For the 117M model, the system stays under 2 GB of RAM, a critical threshold for cloud services.)
  • start_tf_sess() now accepts a threads parameter, which is useful if you know exactly how many threads will be used.
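As a rough illustration of what a threads parameter controls, here is a pure-Python sketch of the kind of session options it would cap. The dict keys mirror TensorFlow's intra-op/inter-op parallelism settings; the function and field names here are illustrative assumptions, not the library's actual internals:

```python
def session_config(threads=-1):
    """Illustrative sketch of what start_tf_sess(threads=...) plausibly
    configures. All names here are assumptions for illustration."""
    config = {"allow_growth": True}  # don't grab all GPU memory up front
    if threads > 0:
        # Cap both intra-op and inter-op parallelism at the given count.
        config["intra_op_parallelism_threads"] = threads
        config["inter_op_parallelism_threads"] = threads
    return config
```

Pinning the thread count is mainly useful on shared or quota-limited machines where TensorFlow's default of "use every core" is undesirable.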

v0.4.1

5 years ago

Fixed a bug where the number of CSV tokens was inadvertently doubled. (#25)

v0.4

5 years ago
  • Support the 345M model (thanks to Neil Shepperd for the gradient checkpointing implementation!)
  • Support model_name in the CLI (to select the 345M model above)
  • Support run_name in the CLI
  • Support .csv files as an input dataset to finetune (the CSV is parsed as if it had been converted via encode_csv()).
  • Fix off-by-one issues. (#21)

v0.3.1

5 years ago
  • Fix off-by-one error where the checkpoint was saved a step early.
  • Fix issue where restore_from='fresh' used the counter from a previously-trained checkpoint.
  • If restore_from='latest', steps now trains for the specified number of additional steps, instead of training until the global step counter reaches that value. (#13, #14)
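The fixed step-counting semantics can be summarized in a small sketch, reconstructed from the bullets above. final_step is a hypothetical helper for illustration, not part of the library:

```python
def final_step(checkpoint_counter, steps, restore_from="latest"):
    """Hypothetical helper showing the fixed meaning of `steps`."""
    if restore_from == "fresh":
        # Fix: a fresh run no longer inherits the counter from a
        # previously-trained checkpoint.
        checkpoint_counter = 0
    # Fix: `steps` means "train this many more steps", not
    # "train until the global counter reaches this value".
    return checkpoint_counter + steps
```

So resuming a run at step 100 with steps=50 now ends at step 150, rather than stopping immediately because the counter already exceeds 50.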

v0.3

5 years ago
  • Added a basic CLI.
  • Added an include_prefix parameter to optionally exclude the input prefix from generated text.
  • Improved regex for truncation.
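For context, the two options above post-process generated text: truncation cuts output at a delimiter, and include_prefix=False strips the prompt. A minimal standalone sketch of that behavior, assuming this ordering and these parameter names mirror the library's (treat the helper itself as hypothetical):

```python
import re

def postprocess(text, prefix=None, include_prefix=True, truncate=None):
    """Sketch of generation post-processing: truncate at a delimiter,
    then optionally drop the input prefix from the output."""
    if truncate:
        # Keep everything before the first (regex-escaped) delimiter.
        m = re.search(re.escape(truncate), text)
        if m:
            text = text[:m.start()]
    if prefix and not include_prefix and text.startswith(prefix):
        text = text[len(prefix):]
    return text
```

Escaping the delimiter with re.escape matters because common markers such as "<|endoftext|>" contain regex metacharacters.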

v0.2

5 years ago
  • is_gpt2_downloaded: Check if the model is downloaded.
  • encode_csv: Convert a CSV to a format suitable for GPT-2.
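The CSV conversion wraps each row's text in explicit start/end tokens so the model learns document boundaries. A self-contained sketch of that output format (the token strings and single-text-column layout are assumptions about the library's conventions):

```python
import csv
import io

def encode_csv_text(csv_text, header=True,
                    start_token="<|startoftext|>",
                    end_token="<|endoftext|>"):
    """Sketch: wrap each CSV row's text in start/end tokens, one
    document per line, roughly what an encode_csv-style conversion
    produces. Assumes a single text column."""
    reader = csv.reader(io.StringIO(csv_text))
    if header:
        next(reader, None)  # skip the header row
    return "\n".join(f"{start_token}{row[0]}{end_token}"
                     for row in reader if row)
```

Using the csv module rather than naive line splitting keeps quoted fields containing commas or newlines intact.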

v0.1

5 years ago