This repository contains code for the following paper:
Project website: https://homes.cs.washington.edu/~eunsol/_site/open_entity.html
You have to set three paths in ./resources/constant.py
FILE_ROOT=where you store our dataset.
GLOVE_VEC=the path where you can find pretrained glove vectors.
EXP_ROOT=where you save models.
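As a rough illustration, the three settings might look like the sketch below. Only the names FILE_ROOT, GLOVE_VEC and EXP_ROOT come from this README; the path values are placeholders, and the `missing_paths` helper is hypothetical (it is not part of the repository) and is included only as a quick sanity check before training.

```python
# Sketch of ./resources/constant.py. The values are placeholders, not real paths.
import os

FILE_ROOT = "/path/to/dataset/"      # placeholder: where the dataset is stored
GLOVE_VEC = "/path/to/glove/"        # placeholder: where the pretrained GloVe vectors live
EXP_ROOT = "/path/to/experiments/"   # placeholder: where trained models are saved


def missing_paths(paths):
    """Hypothetical helper: return the names whose path does not exist on disk."""
    return [name for name, path in paths.items() if not os.path.exists(path)]


# Example check (prints the names of any settings that still point nowhere):
# print(missing_paths({"FILE_ROOT": FILE_ROOT, "GLOVE_VEC": GLOVE_VEC, "EXP_ROOT": EXP_ROOT}))
```

Running the check before launching main.py can catch a mistyped path early, instead of partway through data loading.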
(2), (3), (4) can be downloaded from here http://nlp.cs.washington.edu/entity_type/data/ultrafine_acl18.tar.gz
Gigaword is a licensed dataset from LDC, so it is not released with the code.
Without it, however, the model can still reach reasonable performance (29.8 F1 instead of the reported 31.7 F1).
Alternatively, you can email the first author to get the processed version after verifying your LDC license.
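The public archive linked above can be fetched and unpacked by hand, or with a short script along the lines of the sketch below. The URL is the one given in this README; the function names and the destination directory are assumptions made for illustration, and the layout inside the archive is not described here, so the script only lists what it extracted.

```python
# Sketch: download and unpack the released data archive (helper names are hypothetical).
import os
import tarfile
import urllib.request

DATA_URL = "http://nlp.cs.washington.edu/entity_type/data/ultrafine_acl18.tar.gz"


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the top-level entries."""
    os.makedirs(dest_dir, exist_ok=True)
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Refuse members that would be written outside the destination directory.
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe path in archive: " + member.name)
        tar.extractall(dest_root)
    return sorted(os.listdir(dest_root))


def download_and_extract(dest_dir, url=DATA_URL):
    """Download the archive into dest_dir, then extract it in place."""
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive_path)
    return extract_archive(archive_path, dest_dir)


# Example (downloads a large file; "./data" is a placeholder directory):
# print(download_and_extract("./data"))
```

After extraction, point FILE_ROOT (and the other paths, as appropriate) at the unpacked directories.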
python3 main.py MODEL_ID -lstm_type single -enhanced_mention -data_setup joint -add_crowd -multitask
To train the model on the OntoNotes dataset: python3 main.py onto -lstm_type single -goal onto -enhanced_mention
To run predictions with a pre-trained model: python3 main.py MODEL_ID -lstm_type single -enhanced_mention -data_setup joint -add_crowd -multitask -mode test -reload_model_name MODEL_NAME_TIMESTAMP -eval_data crowd/test.json -load
python3 scorer.py OUTPUT_FILENAME
Contact: Eunsol Choi -- [email protected]
Credit: