Nlpaug Versions

Data augmentation for NLP

0.0.10

4 years ago
  • Add aug_max to control the maximum number of augmented items
  • Fix ContextualWordEmbsAug (for BERT) error when input is longer than max sequence length
  • Add RandomWordAug Substitute action
  • Fix ContextualWordEmbsAug error when no augmented data
  • Support multi-thread processing (CPU only) to speed up augmentation
  • Fix KeyboardAug error #55
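
As a rough illustration of what a cap such as aug_max does, the sketch below derives the number of tokens to augment from a percentage and clamps the result. The helper name augment_count and the aug_p and aug_min parameters are assumptions made for this example; this is not nlpaug's actual implementation.

```python
def augment_count(num_tokens, aug_p=0.3, aug_min=1, aug_max=10):
    """Hypothetical helper: how many tokens to augment, clamped to [aug_min, aug_max]."""
    if num_tokens <= 0:
        return 0
    # Start from a fraction of the input length.
    count = int(num_tokens * aug_p)
    # Apply the lower bound, then the upper bound (the role aug_max plays here).
    count = max(count, aug_min)
    count = min(count, aug_max)
    # Never ask for more augmented tokens than the input contains.
    return min(count, num_tokens)


if __name__ == "__main__":
    for length in (2, 10, 100):
        print(length, augment_count(length, aug_p=0.5))
```

With a cap in place, long inputs are changed in at most aug_max positions, however large the percentage-based count would otherwise be.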

0.0.9

4 years ago
  • Added Swap Mode (adjacent, middle and random) for RandomAug (character level)
  • Added SynonymAug (WordNet/ PPDB) and AntonymAug (WordNet)
  • WordNetAug is deprecated. Use SynonymAug instead
  • Introduce parameter n to return more than one augmented result. The output format changes from text (or numpy) to a list of text (or numpy) if n > 1
  • Introduce parameter temperature in ContextualWordEmbsAug and ContextualWordEmbsForSentenceAug to control the randomness
  • aug_n parameter is deprecated and will be replaced by the top_k parameter
  • Fixed tokenization issue #48
  • Upgraded transformers dependency (or pytorch_transformer) to 2.0.0
  • Upgraded PyTorch dependency to 1.2.0
  • Added SplitAug
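
The n and temperature parameters listed above can be pictured with a small standalone sketch. It assumes substitute candidates arrive as a word-to-score mapping and applies a temperature-scaled softmax before sampling. The function name sample_substitutes and its signature are hypothetical and are not taken from nlpaug.

```python
import math
import random


def sample_substitutes(scores, n=1, temperature=1.0, seed=None):
    """Hypothetical helper: draw n candidate words from a {word: score} dict."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = random.Random(seed)
    words = list(scores)
    # Divide scores by the temperature before the softmax: low values sharpen
    # the distribution, high values flatten it toward uniform.
    scaled = [scores[w] / temperature for w in words]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    picks = rng.choices(words, weights=weights, k=n)
    # Mirror the output format described above: one item for n == 1,
    # a list of items otherwise.
    return picks[0] if n == 1 else picks


if __name__ == "__main__":
    candidates = {"good": 5.0, "fine": 1.0, "bad": 0.0}
    print(sample_substitutes(candidates, n=1, seed=0))
    print(sample_substitutes(candidates, n=3, temperature=2.0, seed=0))
```

Callers that handled a single string need to branch on the return type, or index into the list, once they pass n > 1.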

0.0.8

4 years ago
  • BertAug is replaced by ContextualWordEmbsAug
  • Support GPU (for ContextualWordEmbsAug only) #26
  • Upgraded pytorch_transformer to 1.1.0 version #33
  • ContextualWordEmbsAug supports both BERT and XLNet models
  • Removed librosa dependency
  • Add ContextualWordEmbsForSentenceAug for generating next sentence
  • Fix sampling issue #38

0.0.7

4 years ago
  • Add new augmenters (CropAug, LoudnessAug, MaskAug)
  • QwertyAug is deprecated. It will be replaced by KeyboardAug
  • Remove StopWordsAug. It will be replaced by RandomWordAug
  • Code refactoring
  • Added model download function for word2vec, GloVe and fasttext

0.0.6

4 years ago

0.0.6 release