Official code for the semantic parsing model "HSP" from the paper "Complex Question Decomposition for Semantic Parsing" (ACL'19).
Download the ComplexWebQ data and prepare the environment and libraries.
In order to run preprocessing, put the following files in the DATA_PATH directory (DATA_PATH is defined in the script): `train.json`, `dev.json`, and `test.json` from the `complex_questions` directory. You can also generate them yourself with these steps:

1. `cd WebAsKB`.
2. Edit `WebAsKB/config.py` to select the train, dev, and test splits in turn.
3. Run `python webaskb_run.py gen_golden_sup` three times (once per split).
4. Put the resulting `train.json`, `dev.json`, and `test.json` in DATA_PATH.

In order to run the POS annotation process, you need a Stanford CoreNLP server running on localhost:9003. Download https://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip, unzip it, `cd` into the unzipped directory, and run `java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9003 -timeout 15000`.
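Once the server is up, POS annotation is a plain HTTP call against the CoreNLP server API. A minimal Python sketch (the helper names and example usage are ours, not part of this repo):

```python
# Sketch: query a locally running Stanford CoreNLP server for POS tags.
# Assumes the server started above is listening on localhost:9003; the
# function names here are illustrative, not from the repo.
import json
import urllib.parse
import urllib.request

def build_annotate_url(base="http://localhost:9003"):
    """Build the CoreNLP HTTP endpoint URL requesting POS annotations."""
    props = {"annotators": "tokenize,ssplit,pos", "outputFormat": "json"}
    return base + "/?properties=" + urllib.parse.quote(json.dumps(props))

def pos_tag(sentence):
    """Return (token, POS) pairs for one sentence via the CoreNLP server."""
    req = urllib.request.Request(build_annotate_url(),
                                 data=sentence.encode("utf-8"))
    with urllib.request.urlopen(req) as resp:
        doc = json.loads(resp.read().decode("utf-8"))
    return [(tok["word"], tok["pos"])
            for sent in doc["sentences"]
            for tok in sent["tokens"]]
```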
We provide a template script `scripts/run.sh` for users. At a minimum, you need to change the following directory settings to run it.
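As an illustration only (the actual variable names in `scripts/run.sh` may differ), these settings are plain shell variables pointing at your local paths:

```shell
# Illustrative sketch: point the script's directory variables at your setup.
# Only DATA_PATH is named in this README; treat anything else as an assumption.
DATA_PATH=/path/to/data   # must contain train.json, dev.json, test.json
```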
Now run `scripts/run.sh preprocess`; this command converts the data into our model's input format and annotates POS labels.
Prepare the GloVe pretrained embedding file `glove.6B.300d.txt` and put it in `DATA_PATH/embed/`.
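The GloVe file is plain text: one token per line followed by its vector components. A minimal loader sketch (the function name is ours; the repo's own loading code may differ):

```python
# Sketch: load GloVe vectors (e.g. glove.6B.300d.txt) into a dict.
# The function name is illustrative; the repo's own loader may differ.
def load_glove(path, dim=300):
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:   # skip malformed lines
                continue
            embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings
```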
`scripts/run.sh prepare` shuffles the dataset and builds the vocabulary file.
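Conceptually, a prepare step like this shuffles the examples and counts tokens into a vocabulary. A rough sketch of that logic (the special tokens, size cap, and function name are conventional assumptions, not the repo's actual code):

```python
# Rough sketch of a shuffle-and-build-vocabulary step; the special tokens
# and vocabulary-size cap are conventional assumptions, not the repo's.
import random
from collections import Counter

def prepare(examples, vocab_size=30000, seed=42):
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)          # reproducible shuffle
    counts = Counter(tok for ex in shuffled for tok in ex.split())
    words = ["<pad>", "<unk>"] + [w for w, _ in counts.most_common(vocab_size)]
    return shuffled, {w: i for i, w in enumerate(words)}
```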
To train our decomposition model, use `scripts/run.sh train`. To train our semantic parsing model, use `scripts/run.sh train_lf`.
Run `scripts/run.sh test` to generate decomposed queries from an input file and print BLEU-4 and ROUGE-L scores against the references.
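For reference, BLEU-4 combines modified 1- to 4-gram precisions with a brevity penalty. A toy single-reference sketch with simple +1 smoothing (illustrative only; a real evaluation should use a standard implementation):

```python
# Toy BLEU-4: uniform n-gram weights, one reference, +1 smoothing so a
# missing n-gram order does not zero out the whole score. Illustrative
# only; not the scorer the repo actually uses.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(candidate, reference):
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, 5):
        c, r = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((c & r).values())        # clipped n-gram matches
        total = max(sum(c.values()), 1)
        log_prec += math.log((overlap + 1) / (total + 1)) / 4
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))  # brevity penalty
    return bp * math.exp(log_prec)
```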
Run `scripts/run.sh test_lf` to generate logical forms from an input file and print the exact-match (EM) score against the references.
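Exact match here means the fraction of predicted logical forms identical to their references. A quick sketch (the whitespace normalization is our assumption; the repo's comparison may differ):

```python
# Toy exact-match scorer: fraction of predictions equal to the reference
# after whitespace normalization (an assumption; the repo may differ).
def exact_match(predictions, references):
    norm = lambda s: " ".join(s.split())
    hits = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return hits / max(len(references), 1)
```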
If you use this code in your research, please cite our paper with the following BibTeX.
@inproceedings{Zhang2019HSP,
  title = {Complex Question Decomposition for Semantic Parsing},
  author = {Zhang, Haoyu and Cai, Jingjing and Xu, Jianjun and Wang, Ji},
  booktitle = {Conference of the Association for Computational Linguistics (ACL)},
  year = {2019}
}