Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
Thanks to https://github.com/YaleDHLab via https://github.com/minimaxir/gpt-2-simple/pull/275, gpt-2-simple now supports TensorFlow 2 by default, and the minimum TensorFlow version is now 2.5.1! The Colab Notebook has also been updated to no longer use TensorFlow 1.X.
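Since the version bump can break older environments, it may help to verify the installed TensorFlow before importing the package. The sketch below is not part of gpt-2-simple; `parse_version` and `meets_minimum` are hypothetical helpers that compare plain dotted version strings by tuple comparison:

```python
# Hypothetical helpers (not in gpt-2-simple): compare simple dotted
# version strings such as "2.5.1" by converting them to integer tuples.

def parse_version(version: str) -> tuple:
    """Turn '2.5.1' into (2, 5, 1); assumes plain numeric components."""
    return tuple(int(part) for part in version.split(".")[:3])

def meets_minimum(installed: str, minimum: str = "2.5.1") -> bool:
    """True if the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)

# In practice, check the real install:
#   import tensorflow as tf
#   assert meets_minimum(tf.__version__), "gpt-2-simple needs TF >= 2.5.1"
```

Alternatively, `pip install -U gpt-2-simple` will resolve the requirement for you.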
Note: Development on gpt-2-simple has mostly been superseded by aitextgen, which has similar AI text generation capabilities with more efficient training time and resource usage. If you do not require using TensorFlow, I recommend using aitextgen instead. Checkpoints trained using gpt-2-simple can be loaded using aitextgen as well.
- Some have successfully finetuned 774M/1558M, so the assert has been removed.
- `model_name` can now be passed to `gpt2.load_gpt2()` and `gpt2.generate()` (this will work with 774M).
- Added `sgd` as an `optimizer` parameter to `finetune` (default: `adam`).

Merged a few PRs:
- Fixed `generate` cmd run name: #78
- Resolved most deprecation warnings: #83
- Optional model parameters: #90
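To illustrate what the new `optimizer` parameter chooses between, here is a toy NumPy sketch of name-based dispatch between SGD and Adam update rules on a 1-D quadratic loss. It is illustrative only; gpt-2-simple maps the string to a TensorFlow optimizer, and none of these helper functions exist in the package:

```python
# Toy sketch of dispatching an optimizer by name, as finetune(optimizer=...)
# does conceptually. Loss is f(w) = (w - 3)^2, so the optimum is w = 3.
import numpy as np

def sgd_step(w, grad, state, lr=0.1):
    """Plain gradient descent: no per-parameter state."""
    return w - lr * grad, state

def adam_step(w, grad, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: keeps running first/second moment estimates (m, v)."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), (m, v, t)

OPTIMIZERS = {"sgd": sgd_step, "adam": adam_step}

def minimize(optimizer="adam", steps=200):
    step_fn = OPTIMIZERS[optimizer]    # the dispatch the parameter implies
    w, state = 0.0, (0.0, 0.0, 0)
    for _ in range(steps):
        grad = 2 * (w - 3.0)           # d/dw of (w - 3)^2
        w, state = step_fn(w, grad, state)
    return w
```

SGD keeps no per-parameter state, so it uses less optimizer memory than Adam, which is the usual reason to prefer `optimizer="sgd"` when finetuning larger models.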
This does not make the package fully TF 2.0 compatible, but it's a big step!
The assertion was triggering false positives, so it has been removed.
Minor fix to prevent issue hit with gpt-2-cloud-run.
A goal of the release was to allow a graph reset without resetting the parameters; that did not seem to work, so holding off on that release.
Merged PRs, including a fix for the prefix issue. (See the commits for more info.)
- Added `top_p` (nucleus sampling) when generating text, which results in surprisingly different output (setting `top_p=0.9` works well). Supersedes `top_k` when used. (#51)
- Added an `encode_dataset()` function to preencode and compress a large dataset before loading it for finetuning. (#19, #54)
- Added an `overwrite` argument for `finetune`: with `restore_from="latest"`, this continues model training without creating a duplicate copy of the model, and is therefore good for transfer learning using multiple datasets. (#20)
- You can now `finetune` a model without having the original GPT-2 model present.
- Checkpoints are now packaged into a `.tar` file when copying to Google Drive, and when copying from Google Drive, the `.tar` file is automatically unpackaged into the correct checkpoint format. (You can pass `copy_folder=True` to the `copy_checkpoint` function to revert to the old behavior.) (#37: thanks @woctezuma!)
- `copy_checkpoint_to_gdrive` and `copy_checkpoint_from_gdrive` now take a `run_name` argument instead of a `checkpoint_folder` argument.
- `top_k`, `top_p`, `overwrite`.
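As a sketch of what `top_p` does: nucleus sampling keeps only the smallest set of highest-probability tokens whose total mass reaches `top_p`, then renormalizes, instead of `top_k`'s fixed-size cutoff. The NumPy version below is illustrative only; gpt-2-simple applies this filtering inside its TensorFlow sampling graph, and `top_p_filter` is not a package function:

```python
# Toy nucleus (top-p) sampling filter over a small vocabulary distribution.
import numpy as np

def top_p_filter(probs, top_p=0.9):
    """Zero out tokens outside the nucleus and renormalize."""
    order = np.argsort(probs)[::-1]          # token ids, most likely first
    cumulative = np.cumsum(probs[order])
    # Keep every token up to and including the one that crosses top_p.
    cutoff = np.searchsorted(cumulative, top_p) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

vocab_probs = np.array([0.42, 0.3, 0.17, 0.07, 0.04])
nucleus = top_p_filter(vocab_probs, top_p=0.9)
# The three most likely tokens only cover 0.89 of the mass, so a
# fourth token is kept; sampling then proceeds from `nucleus`.
```

Because the kept set grows and shrinks with how peaked the distribution is, `top_p` adapts per step, which is why it supersedes `top_k` when both are set.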