CGExpan

The source code for the paper "Empower Entity Set Expansion via Language Model Probing", published at ACL 2020.

Data

You can download the Wiki and APR datasets from the following link:

https://www.dropbox.com/sh/7aejy7t1bi9cjdj/AABIK71EcGtI2YAU-IoikK0xa?dl=0

After downloading the datasets, put them under the folder "./data/".

Run

For the Wiki dataset, run:

python src/main.py -dataset data/wiki/ -m 2 -gen_thres 3

For the APR dataset, run:

python src/main.py -dataset data/apr/ -m 2 -gen_thres 1

Results for each query are saved under "./data/[DATA]/results".
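If you prefer to launch the runs from Python, the two commands above can be wrapped as in the sketch below. This is only an illustration: `GEN_THRES`, `build_command`, and `run_dataset` are hypothetical helper names that are not part of this repository, and the flag values are simply copied from the commands above.

```python
import subprocess

# -gen_thres values copied from the two commands above; -m is 2 in both.
GEN_THRES = {"wiki": 3, "apr": 1}


def build_command(name):
    """Build the argument list for one of the two datasets named above."""
    return [
        "python", "src/main.py",
        "-dataset", f"data/{name}/",
        "-m", "2",
        "-gen_thres", str(GEN_THRES[name]),
    ]


def run_dataset(name):
    """Run the command from the repository root; raises if it exits non-zero."""
    subprocess.run(build_command(name), check=True)
```

Calling `run_dataset("wiki")` from the repository root should be equivalent to typing the Wiki command by hand.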

Pretrained Embedding

To get pretrained embeddings for your own dataset, you need to provide "entity2id.txt" and "sentences.json". Please refer to SetExpan and HiExpan for the preprocessing code.

After putting the required files in your dataset folder "./data/[DATA]", run the following command to get the pretrained embeddings:

python src/PretrainedEmb.py -dataset data/[DATA]
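Before starting the embedding run, it can help to confirm that both required files are in place. The sketch below checks only that the files exist; it does not validate their contents, since their formats are not described here. `REQUIRED_FILES` and `missing_files` are hypothetical names and not part of this repository.

```python
from pathlib import Path

# File names taken from this README; their formats are not checked.
REQUIRED_FILES = ("entity2id.txt", "sentences.json")


def missing_files(dataset_dir):
    """Return the required file names that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

For example, `missing_files("data/[DATA]")` (with your own folder name substituted) returns an empty list when both files are present.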

Citations

If you find our work useful for your research, please cite the following paper:

@inproceedings{Zhang2020CGExpan,
  title={Empower Entity Set Expansion via Language Model Probing},
  author={Zhang, Yunyi and Shen, Jiaming and Shang, Jingbo and Han, Jiawei},
  booktitle={ACL},
  year={2020}
}

Open Source Agenda is not affiliated with "CGExpan" Project. README Source: yzhan238/CGExpan
