Data augmentation for NLP
Fix #142
* ContextualWordEmbsAug supports bert-base-multilingual-uncased (for non-English inputs)
* Fix missing library dependency #74
* Fix single token error when using RandomWordAug #76
* Fix error when replacing characters in RandomCharAug #77
* Enhance word augmenters to support regular expression stopwords #81
* Enhance character augmenters to support regular expression stopwords #86
* KeyboardAug supports Thai language #92
* Fix word casing issue #82
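The regular expression stopwords entries above can be illustrated with a minimal, standard-library-only sketch. This is not the library's implementation: the function name `delete_random_word` and its parameters are hypothetical, chosen only to show the idea that tokens matching a pattern are excluded from augmentation and that single-token input is left alone.

```python
import random
import re


def delete_random_word(text, stopwords_regex=None, seed=None):
    """Delete one random token, skipping tokens that match stopwords_regex.

    Hypothetical helper for illustration only; not part of any library.
    """
    rng = random.Random(seed)
    tokens = text.split()
    pattern = re.compile(stopwords_regex) if stopwords_regex else None

    # Only tokens that do NOT fully match the pattern may be augmented.
    candidates = [
        i for i, tok in enumerate(tokens)
        if pattern is None or not pattern.fullmatch(tok)
    ]

    # Single-token input, or no eligible token: return the text unchanged.
    if len(tokens) < 2 or not candidates:
        return text

    drop = rng.choice(candidates)
    return " ".join(tok for i, tok in enumerate(tokens) if i != drop)
```

With `stopwords_regex=r"the|fox"`, the tokens `the` and `fox` in `"the quick brown fox"` are protected, so only `quick` or `brown` can be removed.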
* Support color noise (pink, blue, red and violet noise) in audio's NoiseAug
* Support a given background noise in audio's NoiseAug
* Support injecting noise into only a portion of the audio in audio's NoiseAug
* Introduce zone and coverage to all audio augmenters
* Support augmenting only a portion of the audio input
* Add VTLP augmentation method (audio augmenter)
* Adopt the latest transformers interface #59
* Support RoBERTa (including DistilRoBERTa) and DistilBERT (ContextualWordEmbsAug)
* Support DistilGPT2 (ContextualWordEmbsForSentenceAug)
* Fix librosa hard dependency #62
* Introduce optimize attribute to ContextualWordEmbsForSentenceAug #63
* Optimize word selection for ContextualWordEmbsAug and ContextualWordEmbsForSentenceAug (around 30% speed-up)
* Add retry mechanism to ContextualWordEmbsAug insert action #68
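One way to picture the zone and coverage entries above is the sketch below. It assumes that zone is a pair of fractions bounding the region of the signal that may be changed, and that coverage is the fraction of that region actually augmented. The helper names `pick_augment_range` and `add_noise_to_range` are hypothetical, the noise is plain Gaussian noise from the standard library, and the library's own logic may differ.

```python
import random


def pick_augment_range(n_samples, zone=(0.2, 0.8), coverage=1.0, seed=None):
    """Return (start, end) sample indices to augment (illustrative only)."""
    if not 0.0 <= zone[0] <= zone[1] <= 1.0:
        raise ValueError("zone must satisfy 0 <= start <= end <= 1")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")

    rng = random.Random(seed)
    zone_start = int(n_samples * zone[0])
    zone_end = int(n_samples * zone[1])
    target_size = int((zone_end - zone_start) * coverage)

    # Slide a window of target_size anywhere that still fits inside the zone.
    latest_start = zone_end - target_size
    start = rng.randint(zone_start, latest_start)
    return start, start + target_size


def add_noise_to_range(samples, start, end, scale=0.01, seed=None):
    """Add Gaussian noise to samples[start:end]; the rest is left untouched."""
    rng = random.Random(seed)
    out = list(samples)
    for i in range(start, end):
        out[i] += rng.gauss(0.0, scale)
    return out
```

Under these assumptions, `zone=(0.2, 0.8)` with `coverage=1.0` on a 1000-sample signal selects samples 200 to 800, while `coverage=0.5` selects a 300-sample window somewhere inside that zone.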