TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
- Bug fixing
- Matching documentation in Readme.md and the files in /doc folder
- Add checklist
- Add multilingual USE
- Add gradient-based word importance ranking
- Update to a more complete API documentation
- Add cola constraint
- Add the lazy loader

- print_step bug with alzantot recipe (#195, thanks @heytitle for reporting!)

Version 0.1.0 is our biggest release yet! Here's a summary of the changes:
Backwards compatibility note: python -m textattack <args> is renamed to python -m textattack attack <args>. Or, better yet, textattack attack <args>!
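
The rename moves the attack entry point under a subcommand. As a rough illustration of that layout, here is a minimal argparse sketch. It is not TextAttack's actual parser: the function names build_parser and parse_command are hypothetical, and only the subcommand names come from these notes.

```python
import argparse


def build_parser():
    """Build a toy parser with one subcommand per task (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="textattack")
    # dest="command" stores the chosen subcommand name on the parsed namespace.
    subparsers = parser.add_subparsers(dest="command", required=True)
    # Subcommand names are taken from the release notes; their options are not modeled.
    for name in ("attack", "augment", "eval", "list"):
        subparsers.add_parser(name)
    return parser


def parse_command(argv):
    """Return the subcommand selected by a list of command-line tokens."""
    return build_parser().parse_args(argv).command
```

With this layout, each task gets its own name after the program name, which is the same shape as textattack attack <args>.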
- textattack command (#132)
- textattack augment, textattack eval, textattack attack, textattack list (#132)
- textattack train, textattack peek-dataset, and lots of infrastructure for training models (#139)
- nlp format; temporarily remove non-NLP datasets (AGNews, English->German translation) (#134)
- MaxLengthModification constraint that prevents modifications beyond tokenizer max_length (#143)
- pytest tests and code formatting with black; run tests on Python 3.6, 3.7, 3.8 with Travis CI (#127, #136)
- BERTScore constraint based on "BERTScore: Evaluating Text Generation with BERT" (Zhang et al., 2019) (#146)
- Checkpoint class; track attack results in a worklist; attack resume fixes (#128, #141)
- TA_CACHE_DIR (#150)

big changes:
- transformers models from the command-line using the --model-from-huggingface option
- nlp datasets from the command-line using the --dataset-from-nlp option
- --attack-from-file, --model-from-file, --dataset-from-file
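
As a hedged illustration of the file-based options: assuming the file passed to --dataset-from-file is an ordinary Python module that exposes a list of (input, output) pairs under the name dataset (an assumption based on later TextAttack documentation, not something these notes confirm), such a file might look like the sketch below. The file name, sentences, and labels are made up for the example.

```python
# my_dataset.py (hypothetical file name)
# Assumption: the loader looks for a module-level variable named `dataset`
# holding (input_text, label) pairs.
dataset = [
    ("The movie was a delight from start to finish.", 1),
    ("I would not recommend this film to anyone.", 0),
]
```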
small changes:
- Checkpoint class
- textattack.shared.utils.get_logger() -> textattack.shared.logger, textattack.shared.utils.get_device() -> textattack.shared.utils.device
- TokenizedText memory usage

- PreTransformationConstraints: constraints now can be applied before the transformation to prevent word modifications at certain indices. This abstraction allowed us to remove the notion of modified_indices from search methods, which paves the way for us to introduce attacks that insert or delete words and phrases, as opposed to simply swapping words.
- Attack and SearchMethod: search methods are now a parameter to the attack instead of different subclasses of Attack. This syntax fits better with our framework and enforces a clearer sense of separation between the responsibilities of the attack and those of the search method.
- check_compatibility method
- UntargetedClassification and NonOverlappingOutput now return scores between 0 and 1.
- python -m textattack supports new checkpoint-related arguments: --checkpoint-interval and --checkpoint-dir
- --enable-wandb flag
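
The PreTransformationConstraints and Attack/SearchMethod changes described above can be pictured with a small, self-contained sketch. None of the names below (StopwordConstraint, GreedySearch, allowed_indices, demo, and so on) are TextAttack's API; they are hypothetical stand-ins, and the scoring function is a toy replacement for a real goal function.

```python
class StopwordConstraint:
    """Hypothetical pre-transformation constraint: stopwords may not be modified."""

    def __init__(self, stopwords):
        self.stopwords = set(stopwords)

    def allowed_indices(self, words):
        # Runs before any transformation and returns the indices that may change.
        return {i for i, w in enumerate(words) if w.lower() not in self.stopwords}


class GreedySearch:
    """Hypothetical search method: accept any single-word swap that raises the score."""

    def search(self, words, candidates_at, score):
        best = list(words)
        for i in range(len(words)):
            for candidate in candidates_at(best, i):
                if score(candidate) > score(best):
                    best = candidate
        return best


class Attack:
    """Hypothetical attack that receives its search method as a parameter."""

    def __init__(self, score, synonyms, constraint, search_method):
        self.score = score
        self.synonyms = synonyms
        self.constraint = constraint
        self.search_method = search_method

    def _candidates_at(self, words, i):
        # The constraint decides which indices are open; the search method never
        # needs to track modified indices itself.
        if i not in self.constraint.allowed_indices(words):
            return []
        swaps = self.synonyms.get(words[i].lower(), [])
        return [words[:i] + [s] + words[i + 1:] for s in swaps]

    def run(self, text):
        words = text.split()
        return " ".join(self.search_method.search(words, self._candidates_at, self.score))


def demo():
    synonyms = {"good": ["fine", "decent"], "the": ["a"]}

    def toy_score(words):
        # Toy stand-in for a goal function: counts "preferred" replacement words.
        return sum(w in ("decent", "a") for w in words)

    attack = Attack(toy_score, synonyms, StopwordConstraint({"the"}), GreedySearch())
    return attack.run("the food was good")
```

In this sketch, swapping GreedySearch for another object with a search method changes the search strategy without subclassing Attack, and the stopword "the" is never replaced because the constraint filters its index out before any swap is proposed.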