Textaugment Versions

TextAugment: Text Augmentation Library

2.0.0

6 months ago

Now supports gensim >= 4.
Now supports fastText models.
Lets the user select the top n words (synonyms or most similar words).
Added punctuation insertion.

v1.3.4

3 years ago

Fixed minor issues

v1.3.3

3 years ago

Added support for fastText augmentation
Added example notebook for fastText augmentation

1.3.2

3 years ago

Minor updates

1.3.1

3 years ago

Fixed minor issues

1.3

3 years ago

Added mixup augmentation algorithm for NLP
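
As a rough illustration of the idea behind mixup (not the library's own implementation), the sketch below linearly interpolates two feature vectors and their one-hot labels with a mixing weight drawn from a Beta(alpha, alpha) distribution. The names `mixup_pair`, `sample_lambda`, and `alpha` are assumptions made for this example only.

```python
import random


def mixup_pair(x1, y1, x2, y2, lam):
    """Blend two (features, one-hot label) examples with weight lam in [0, 1]."""
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return mixed_x, mixed_y


def sample_lambda(alpha=1.0, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha) (hypothetical helper)."""
    return rng.betavariate(alpha, alpha)
```

In text settings the interpolation would typically be applied to sentence or word embeddings rather than raw tokens, for example `mixup_pair(emb_a, label_a, emb_b, label_b, sample_lambda(0.4))`.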

1.2

3 years ago

Added support for EDA algorithm
Added examples using Jupyter notebook
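
EDA (Easy Data Augmentation) describes four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The following is a minimal, dependency-free sketch of the last two, written for illustration; the function names and parameters here are assumptions and may differ from the library's actual interface.

```python
import random


def random_swap(words, n=1, rng=random):
    """Return a copy of words with n random pairs of positions swapped."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p=0.1, rng=random):
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]
```

Both functions take a list of tokens, so a sentence would first be split, for example `random_swap("the quick brown fox".split(), n=2)`.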

1.1

4 years ago

Updated the README and added icons for the licence, release, wheel, and Python version.

Added pre-print paper citation.

1.0

4 years ago

TextAugment is a Python 3 library for augmenting text for natural language processing applications. TextAugment stands on the giant shoulders of NLTK, Gensim, and TextBlob and plays nicely with them.

Requirements

Python 3

The following software packages are dependencies and will be installed automatically.

$ pip install numpy nltk gensim textblob googletrans

The following code downloads the WordNet corpus, the Punkt tokenizer, and the part-of-speech tagger model.

import nltk

nltk.download('wordnet')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

Install from pip [Recommended]

$ pip install textaugment

How to use

>>> from textaugment import Word2vec
>>> t = Word2vec(model='path/to/gensim/model')  # or pass an already-loaded gensim model
>>> t.augment('The stories are good')
The films are good
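
To make the example above more concrete, here is a toy sketch of similarity-based word replacement. It is not how TextAugment is implemented: the `TOY_NEIGHBOURS` table is a hypothetical stand-in for an embedding model's nearest-neighbour lookup, and `replace_similar` is a name invented for this illustration.

```python
import random

# Hypothetical neighbour table standing in for a trained embedding model.
TOY_NEIGHBOURS = {
    "stories": ["films", "tales"],
    "good": ["great", "fine"],
}


def replace_similar(sentence, neighbours, n_replace=1, rng=random):
    """Replace up to n_replace words that have an entry in the neighbour table."""
    words = sentence.split()
    # Positions of words for which a similar word is known.
    candidates = [i for i, w in enumerate(words) if w.lower() in neighbours]
    for i in rng.sample(candidates, min(n_replace, len(candidates))):
        words[i] = rng.choice(neighbours[words[i].lower()])
    return " ".join(words)
```

Calling `replace_similar("The stories are good", TOY_NEIGHBOURS)` swaps one of the known words for a listed neighbour; which word and which neighbour are chosen is random.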

Citation

@article{marivate2019improving,
  title={Improving short text classification through global augmentation methods},
  author={Marivate, Vukosi and Sefara, Tshephisho},
  journal={arXiv preprint arXiv:1907.03752},
  year={2019}
}
