Implementations of papers for the text classification task on SST-1/SST-2
Here is a new boy :bow: who wants to become an NLPer, and this is his repository for Text Classification. Besides TextCNN and TextAttnBiLSTM, more models will be added in the near future.
Thanks for your Star :star:, Fork, and Watch!
...
GoogleNews-vectors-negative300.bin
glove.840B.300d.txt
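Both files hold pretrained word vectors: the GoogleNews file is in binary word2vec format (loadable with gensim's `KeyedVectors.load_word2vec_format(..., binary=True)`), while the GloVe file is plain text with one `word v1 ... v300` line per word. A minimal sketch of parsing the GloVe text format (the tiny inline sample below stands in for the real 300-dimensional file; the function name is illustrative, not the repository's code):

```python
import io

def load_glove(stream, dim):
    """Parse GloVe's text format: one 'token v1 ... vdim' line per word."""
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

# Tiny 3-dimensional stand-in for glove.840B.300d.txt
sample = io.StringIO("movie 0.1 0.2 0.3\nfilm 0.0 0.5 -0.1\n")
vecs = load_glove(sample, dim=3)
print(len(vecs["movie"]))  # 3
```

In practice you would pass an open file handle for the real vector file and `dim=300`.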
TextCNN
models/TextCNN.py
TextAttnBiLSTM
models/TextAttnBiLSTM.py
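TextCNN slides fixed-width convolution filters over the word-embedding sequence, so each filter acts like an n-gram feature detector followed by max-over-time pooling. A pure-Python sketch of that core idea (illustrative only, not the code in `models/TextCNN.py`):

```python
def conv1d_max_pool(embeddings, kernel, width):
    """Slide an n-gram-width filter over the sequence, then max-over-time pool."""
    outputs = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # dot product of the flattened window with the flattened kernel
        act = sum(w[d] * kernel[i][d]
                  for i, w in enumerate(window)
                  for d in range(len(w)))
        outputs.append(act)
    return max(outputs)  # one pooled feature per filter

# 4 words of 2-d embeddings; one bigram (width-2) filter
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
K = [[1.0, 0.0], [0.0, 1.0]]
print(conv1d_max_pool(E, K, width=2))  # 2.0
```

The real model runs many filters of several widths in parallel and concatenates the pooled features before the classifier.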
Results reported in the original paper:

model | SST-1 | SST-2 |
---|---|---|
CNN-rand | 45.0 | 82.7 |
CNN-static | 45.5 | 86.8 |
CNN-non-static | 48.0 | 87.2 |
CNN-multichannel | 47.4 | 88.1 |
Results reproduced by this repository:

model | SST-1 | SST-2 |
---|---|---|
CNN-rand | 34.841 | 74.500 |
CNN-static | 45.056 | 84.125 |
CNN-non-static | 46.974 | 85.886 |
CNN-multichannel | 45.129 | 85.993 |
Attention + BiLSTM | 47.015 | 85.632 |
Attention + BiGRU | 47.854 | 85.102 |
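The Attention + BiLSTM/BiGRU rows score each time step's hidden state against a learned context vector, normalize the scores with softmax, and pool the states into a single sentence vector. A pure-Python sketch of that pooling step (names and shapes are illustrative, not the repository's code):

```python
import math

def attention_pool(hidden_states, context):
    """Weight each hidden state by softmax(score) and sum into one vector."""
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context))
              for h in hidden_states]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]   # attention distribution over time steps
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights

# Three time steps of 2-d "BiLSTM outputs" and a learned context vector
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, w = attention_pool(H, context=[1.0, 1.0])
print(round(sum(w), 6))  # 1.0 -- the weights form a distribution
```

In the real models the context vector is a trained parameter and the pooled vector feeds the final classification layer.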
Please install the following library requirements first.

```
pandas==0.24.2
torch==1.1.0
fire==0.1.3
numpy==1.16.2
gensim==3.7.3
```
```
│  .gitignore
│  config.py       # Global configuration
│  datasets.py     # Create DataLoader
│  main.py
│  preprocess.py
│  README.md
│  requirements.txt
│  utils.py
│
├─checkpoints      # Saved checkpoints and best models
│
├─data             # Pretrained word vectors and datasets
│  │  glove.6B.300d.txt
│  │  GoogleNews-vectors-negative300.bin
│  └─stanfordSentimentTreebank  # Datasets folder
│
├─models
│      TextAttnBiLSTM.py
│      TextCNN.py
│      __init__.py
│
└─output_data      # Preprocessed data, vocabulary, etc.
```
Set global configuration parameters in `config.py`.

Preprocess the datasets:

```shell
python preprocess.py
```
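Preprocessing builds the vocabulary stored in `output_data` and converts sentences into fixed-length id sequences. A minimal sketch of the idea (function and variable names here are illustrative, not the repository's actual code), reserving index 0 for padding as the models expect:

```python
from collections import Counter

def build_vocab(sentences, min_freq=1):
    """Count tokens and assign ids; index 0 is reserved for <pad>."""
    counts = Counter(tok for sent in sentences for tok in sent.split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab

def encode(sentence, vocab, max_len=10):
    """Map tokens to ids, then pad/truncate to a fixed length."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in sentence.split()]
    return (ids + [0] * max_len)[:max_len]

sents = ["a great movie", "a dull movie"]
v = build_vocab(sents)
print(encode("a great film", v))  # unseen word "film" maps to <unk>
```

Reserving id 0 for padding matches the `padding_idx=0` used in the models' embedding layers.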
Run the model:

```shell
python main.py run
```
You can override the parameters defined in `config.py`, `models/TextCNN.py`, or `models/TextAttnBiLSTM.py` on the command line:

```shell
python main.py run [--option=VALUE]
```
For example:

```shell
python main.py run --status='train' --use_model="TextAttnBiLSTM"
python main.py run --status='test' --best_model="checkpoints/BEST_checkpoint_SST-2_TextCNN.pth"
```
The TextCNN model extracts features with n-gram-like convolution kernels, while the TextAttnBiLSTM model uses a BiLSTM to capture semantics and long-range dependencies, combined with an attention mechanism for classification. Both models use `padding_idx=0` in the embedding layer.

[1] Convolutional Neural Networks for Sentence Classification

[3] Attention-Based Bidirectional LSTM for Text Classification