Revised Version of SAT Model in "Improved Word Representation Learning with Sememes"
This is the revised version of the SAT (Sememe Attention over Target) model, which is presented in the ACL 2017 paper Improved Word Representation Learning with Sememes. For more details about the model, please read the paper or visit the original project website.
Compared with the original release, we removed the sense entries `中国""Taiwan台湾` and `中国""Japan日本` (台湾 "Taiwan" and 日本 "Japan" annotated with the sememe 中国 "China") from `SememeFile`, and revised the corresponding lines in `Word_Sense_Sememe_File`. These entries are not used in the training process.

To train the model, run:

```shell
bash run_SAT.sh
```
To change the training file, simply replace `data/train_sample.txt` in `run_SAT.sh` with the name of your own training file.
The results below are obtained with the 21 GB Sogou-T corpus as the training file, which can be downloaded from here (password: f2ul). The hyper-parameters for all models are the same as those in `run_SAT.sh`. You can download the trained word embeddings from here.
Word similarity results:

| Model | Wordsim-240 | Wordsim-297 |
|---|---|---|
| CBOW | 56.05 | 62.58 |
| Skip-gram | 56.72 | 61.99 |
| GloVe | 55.83 | 58.44 |
| SAT | 62.11 | 62.74 |
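The similarity numbers above are Spearman rank correlations (scaled by 100) between the embeddings' cosine similarities and human ratings. Below is a minimal sketch of this kind of evaluation — the helper names are ours, the toy vectors and ratings are made up, and the loader assumes a word2vec-style text format for the downloaded embeddings:

```python
import math

def load_embeddings(path):
    """Load word2vec-style text embeddings: one 'word v1 v2 ...' line per word."""
    vecs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) < 3:  # skip the optional "vocab_size dim" header line
                continue
            vecs[parts[0]] = [float(x) for x in parts[1:]]
    return vecs

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def spearman(xs, ys):
    """Spearman correlation as Pearson correlation of ranks (ties not handled)."""
    def ranks(a):
        order = sorted(range(len(a)), key=lambda i: a[i])
        r = [0.0] * len(a)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((x - mx) * (y - my) for x, y in zip(rx, ry))
    den = math.sqrt(sum((x - mx) ** 2 for x in rx) * sum((y - my) ** 2 for y in ry))
    return num / den

# Toy demonstration (made-up vectors and gold ratings, not real data):
emb = {"king": [1.0, 0.9], "queen": [0.9, 1.0], "apple": [-1.0, 0.1]}
pairs = [("king", "queen", 9.0), ("king", "apple", 1.0), ("queen", "apple", 1.5)]
model = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
gold = [p[2] for p in pairs]
print(spearman(model, gold))  # 1.0: the model ordering matches the human ordering
```

Multiplying the resulting correlation by 100 gives scores on the same scale as the table.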
Word analogy results (acc = accuracy %, rank = mean rank of the correct answer):

| Model | city-acc | city-rank | family-acc | family-rank | capital-acc | capital-rank | total-acc | total-rank |
|---|---|---|---|---|---|---|---|---|
| Skip-gram | 84.14 | 1.50 | 86.67 | 1.21 | 61.30 | 8.31 | 70.70 | 5.66 |
| SAT | 98.85 | 1.01 | 77.20 | 5.27 | 80.06 | 10.10 | 82.29 | 7.52 |
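For the analogy questions a : b :: c : ?, acc counts the questions where the correct answer is ranked first, and rank averages the correct answer's position (lower is better). A sketch of how such a rank can be computed with toy vectors — the 3CosAdd scoring and helper names are our illustration, not necessarily the paper's exact protocol:

```python
import math

def analogy_rank(emb, a, b, c, gold):
    """Rank of the gold answer for a : b :: c : ?, scored with the common
    3CosAdd objective (similarity to b - a + c over normalized vectors).
    The three query words are excluded from the candidate set; rank 1
    means the model's top prediction is correct."""
    def unit(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]
    norm = {w: unit(v) for w, v in emb.items()}
    target = [nb - na + nc for na, nb, nc in zip(norm[a], norm[b], norm[c])]
    scores = {w: sum(t * x for t, x in zip(target, v))
              for w, v in norm.items() if w not in (a, b, c)}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(gold) + 1

# Toy capital-country vectors (made up for illustration):
emb = {
    "beijing": [1.0, 0.1], "china": [1.0, 1.0],
    "tokyo":   [0.1, 0.2], "japan": [0.1, 1.1],
    "apple":   [-1.0, -1.0],
}
print(analogy_rank(emb, "beijing", "china", "tokyo", "japan"))  # 1
```

Averaging `rank == 1` over all questions gives the acc columns; averaging the raw ranks gives the rank columns.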