Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
Thanks to https://github.com/YaleDHLab via https://github.com/minimaxir/gpt-2-simple/pull/275, gpt-2-simple now supports TensorFlow 2 by default, and the minimum TensorFlow version is now 2.5.1! The Colab Notebook has also been updated to no longer use TensorFlow 1.X.
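Since the version bump can break older environments, it may help to verify the installed TensorFlow before importing the package. The sketch below is not part of gpt-2-simple; `parse_version` and `meets_minimum` are hypothetical helpers that compare plain dotted version strings by tuple comparison:

```python
# Hypothetical helpers (not in gpt-2-simple): compare simple dotted
# version strings such as "2.5.1" by converting them to integer tuples.

def parse_version(version: str) -> tuple:
    """Turn '2.5.1' into (2, 5, 1); assumes plain numeric components."""
    return tuple(int(part) for part in version.split(".")[:3])

def meets_minimum(installed: str, minimum: str = "2.5.1") -> bool:
    """True if the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)

# In practice, check the real install:
#   import tensorflow as tf
#   assert meets_minimum(tf.__version__), "gpt-2-simple needs TF >= 2.5.1"
```

Alternatively, `pip install -U gpt-2-simple` will resolve the requirement for you.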
Note: Development on gpt-2-simple has mostly been superseded by aitextgen, which has similar AI text generation capabilities with more efficient training time and resource usage. If you do not require using TensorFlow, I recommend using aitextgen instead. Checkpoints trained using gpt-2-simple can be loaded using aitextgen as well.
- Some have successfully finetuned 774M/1558M, so the assert has been removed.
- `model_name` can now be passed to `gpt2.load_gpt2()` and `gpt2.generate()` (this will work with 774M).
- Added `sgd` as an `optimizer` parameter to `finetune` (default: `adam`).

Merged a few PRs:
- Fixed `generate` cmd run name: #78
- Resolved most deprecation warnings: #83
- Optional model parameters: #90
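To illustrate what the new `optimizer` parameter chooses between, here is a toy NumPy sketch of name-based dispatch between SGD and Adam update rules on a 1-D quadratic loss. It is illustrative only; gpt-2-simple maps the string to a TensorFlow optimizer, and none of these helper functions exist in the package:

```python
# Toy sketch of dispatching an optimizer by name, as finetune(optimizer=...)
# does conceptually. Loss is f(w) = (w - 3)^2, so the optimum is w = 3.
import numpy as np

def sgd_step(w, grad, state, lr=0.1):
    """Plain gradient descent: no per-parameter state."""
    return w - lr * grad, state

def adam_step(w, grad, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: keeps running first/second moment estimates (m, v)."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), (m, v, t)

OPTIMIZERS = {"sgd": sgd_step, "adam": adam_step}

def minimize(optimizer="adam", steps=200):
    step_fn = OPTIMIZERS[optimizer]    # the dispatch the parameter implies
    w, state = 0.0, (0.0, 0.0, 0)
    for _ in range(steps):
        grad = 2 * (w - 3.0)           # d/dw of (w - 3)^2
        w, state = step_fn(w, grad, state)
    return w
```

SGD keeps no per-parameter state, so it uses less optimizer memory than Adam, which is the usual reason to prefer `optimizer="sgd"` when finetuning larger models.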
This does not make the package fully TF 2.0 compatible, but it's a big step!
The assertion was triggering false positives, so it has been removed.
Minor fix to prevent issue hit with gpt-2-cloud-run.
A goal of the release was to allow a graph reset without resetting the parameters; that did not seem to work, so holding off on that release.
Merged PRs, including a fix for the prefix issue. (See the commits for more info.)
- Added `top_p` (nucleus sampling) when generating text, which results in surprisingly different output (setting `top_p=0.9` works well). Supersedes `top_k` when used. (#51)
- Added an `encode_dataset()` function to preencode and compress a large dataset before loading it for finetuning. (#19, #54)
- Added an `overwrite` argument for `finetune`: with `restore_from="latest"`, this continues model training without creating a duplicate copy of the model, and is therefore good for transfer learning using multiple datasets. (#20)
- You can now `finetune` a model without having the original GPT-2 model present.
- Checkpoints are now packaged into a `.tar` file when copying to Google Drive, and when copying from Google Drive, the `.tar` file is automatically unpackaged into the correct checkpoint format. (You can pass `copy_folder=True` to the `copy_checkpoint` function to revert to the old behavior.) (#37: thanks @woctezuma!)
- `copy_checkpoint_to_gdrive` and `copy_checkpoint_from_gdrive` now take a `run_name` argument instead of a `checkpoint_folder` argument.
- `top_k`, `top_p`, `overwrite`.
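As a sketch of what `top_p` does: nucleus sampling keeps only the smallest set of highest-probability tokens whose total mass reaches `top_p`, then renormalizes, instead of `top_k`'s fixed-size cutoff. The NumPy version below is illustrative only; gpt-2-simple applies this filtering inside its TensorFlow sampling graph, and `top_p_filter` is not a package function:

```python
# Toy nucleus (top-p) sampling filter over a small vocabulary distribution.
import numpy as np

def top_p_filter(probs, top_p=0.9):
    """Zero out tokens outside the nucleus and renormalize."""
    order = np.argsort(probs)[::-1]          # token ids, most likely first
    cumulative = np.cumsum(probs[order])
    # Keep every token up to and including the one that crosses top_p.
    cutoff = np.searchsorted(cumulative, top_p) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

vocab_probs = np.array([0.42, 0.3, 0.17, 0.07, 0.04])
nucleus = top_p_filter(vocab_probs, top_p=0.9)
# The three most likely tokens only cover 0.89 of the mass, so a
# fourth token is kept; sampling then proceeds from `nucleus`.
```

Because the kept set grows and shrinks with how peaked the distribution is, `top_p` adapts per step, which is why it supersedes `top_k` when both are set.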