Implementing NLP papers relevant to classification with PyTorch and gluonnlp.
The papers were implemented using Korean corpora.
pyenv virtualenv 3.7.7 nlp
pyenv activate nlp
pip install -r requirements.txt
python build_dataset.py
python build_vocab.py
python train.py # default training parameters
python evaluate.py # default evaluation parameters
nsmc
conf/model/{type}.json (e.g. type = ["sencnn", "charcnn", ...])
conf/dataset/nsmc.json
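A minimal sketch of how the two configuration files above could be read, assuming they are plain JSON objects. The helper name `load_config` and its arguments are hypothetical and not taken from this repository; the keys inside each file are not shown here, so the sketch returns the parsed dictionary as-is.

```python
import json
from pathlib import Path


def load_config(conf_dir, kind, name):
    """Read {conf_dir}/{kind}/{name}.json and return its contents as a dict.

    Hypothetical helper: `kind` is "model" or "dataset", and `name` is the
    file stem, e.g. "sencnn" or "nsmc".
    """
    path = Path(conf_dir) / kind / f"{name}.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)


# Example calls, assuming the layout conf/model/{type}.json and
# conf/dataset/nsmc.json described above:
#   model_config = load_config("conf", "model", "sencnn")
#   dataset_config = load_config("conf", "dataset", "nsmc")
```

Keeping the model type as a plain string argument means the same call works for any `{type}` that has a matching JSON file.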
# example: Convolutional_Neural_Networks_for_Sentence_Classification
├── build_dataset.py
├── build_vocab.py
├── conf
│   ├── dataset
│   │   └── nsmc.json
│   └── model
│       └── sencnn.json
├── evaluate.py
├── experiments
│   └── sencnn
│       └── epochs_5_batch_size_256_learning_rate_0.001
├── model
│   ├── data.py
│   ├── __init__.py
│   ├── metric.py
│   ├── net.py
│   ├── ops.py
│   ├── split.py
│   └── utils.py
├── nsmc
│   ├── ratings_test.txt
│   ├── ratings_train.txt
│   ├── test.txt
│   ├── train.txt
│   ├── validation.txt
│   └── vocab.pkl
├── train.py
└── utils.py
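The run directory under `experiments` encodes the training hyperparameters in its name. A minimal sketch of how such a path could be built is shown here; the function `experiment_dir` is hypothetical, and how the training script actually composes the name is an assumption based only on the directory name in the tree.

```python
from pathlib import Path


def experiment_dir(model_type, epochs, batch_size, learning_rate, root="experiments"):
    """Build a run directory path from hyperparameters (hypothetical helper).

    The naming pattern mirrors the directory shown in the tree:
    experiments/{type}/epochs_{E}_batch_size_{B}_learning_rate_{LR}
    """
    run_name = f"epochs_{epochs}_batch_size_{batch_size}_learning_rate_{learning_rate}"
    return Path(root) / model_type / run_name


# Example: experiment_dir("sencnn", 5, 256, 0.001)
```

Note that Python formats very small floats in scientific notation (for example `1e-05`), so a name built this way changes shape for learning rates below `0.0001`.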
Model \ Accuracy | Train (120,000) | Validation (30,000) | Test (50,000) | Date |
---|---|---|---|---|
SenCNN | 91.95% | 86.54% | 85.84% | 20/05/30 |
CharCNN | 86.29% | 81.69% | 81.38% | 20/05/30 |
ConvRec | 86.23% | 82.93% | 82.43% | 20/05/30 |
VDCNN | 86.59% | 84.29% | 84.10% | 20/05/30 |
SAN | 90.71% | 86.70% | 86.37% | 20/05/30 |
ETRIBERT | 91.12% | 89.24% | 88.98% | 20/05/30 |
SKTBERT | 92.20% | 89.08% | 88.96% | 20/05/30 |
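For reference, the accuracy figures in the table can be read as the fraction of examples whose predicted label equals the gold label. A minimal sketch of that computation follows; it is an illustration under that assumption, not the code in `model/metric.py`, and the function name `accuracy` is hypothetical.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    # Count positions where the predicted class equals the gold class.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Example: three of four predictions are correct.
# accuracy([1, 0, 1, 1], [1, 0, 0, 1]) == 0.75
```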
conf/model/{type}.json (e.g. type = ["siam", "san", ...])
conf/dataset/qpair.json
# example: Siamese_recurrent_architectures_for_learning_sentence_similarity
├── build_dataset.py
├── build_vocab.py
├── conf
│   ├── dataset
│   │   └── qpair.json
│   └── model
│       └── siam.json
├── evaluate.py
├── experiments
│   └── siam
│       └── epochs_5_batch_size_64_learning_rate_0.001
├── model
│   ├── data.py
│   ├── __init__.py
│   ├── metric.py
│   ├── net.py
│   ├── ops.py
│   ├── split.py
│   └── utils.py
├── qpair
│   ├── kor_pair_test.csv
│   ├── kor_pair_train.csv
│   ├── test.txt
│   ├── train.txt
│   ├── validation.txt
│   └── vocab.pkl
├── train.py
└── utils.py
Model \ Accuracy | Train (6,136) | Validation (682) | Test (758) | Date |
---|---|---|---|---|
Siam | 93.00% | 83.13% | 83.64% | 20/05/30 |
SAN | 89.47% | 82.11% | 81.53% | 20/05/30 |
Stochastic | 89.26% | 82.69% | 80.07% | 20/05/30 |
ETRIBERT | 95.07% | 94.42% | 94.06% | 20/05/30 |
SKTBERT | 95.43% | 92.52% | 93.93% | 20/05/30 |
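The split sizes in the two table headers imply a validation fraction for each dataset. A small sketch of that arithmetic, assuming the validation set was held out from a combined train-plus-validation pool (an assumption; the splitting code itself is not shown here). The function name `validation_fraction` is hypothetical.

```python
def validation_fraction(train_size, validation_size):
    """Share of the train+validation pool that was held out for validation."""
    return validation_size / (train_size + validation_size)


# Sizes taken from the table headers above.
nsmc_fraction = validation_fraction(120_000, 30_000)   # 0.2
qpair_fraction = validation_fraction(6_136, 682)       # roughly 0.1
```

Under this reading, about 20% of the nsmc pool and about 10% of the qpair pool were used for validation.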