Code and Checkpoints for "Generate rather than Retrieve: Large Language Models are Strong Context Generators" in ICLR 2023.
This is the official implementation of our paper "Generate rather than Retrieve: Large Language Models are Strong Context Generators", published at ICLR 2023 [OpenReview] [arXiv].
Create an environment and install the openai package via pip install openai.
Add your OpenAI API key at openai.api_key (line 12) in inference.py.
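As an alternative to hard-coding the key, it can be read from an environment variable. The sketch below is only an illustration: the helper name load_api_key and the variable name OPENAI_API_KEY are assumptions and are not part of this repository.

```python
import os


def load_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key stored in an environment variable.

    `load_api_key` and `var_name` are hypothetical names used for
    illustration; the repository itself sets the key in inference.py.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# In inference.py one could then write, instead of pasting the key:
#   openai.api_key = load_api_key()
```

This keeps the key out of version control, at the cost of one extra setup step per shell session.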
Download the datasets either from their official websites: [NQ/TriviaQA/WebQ] / [FM2] / [FEVER/Wizard]
or from Google Drive, where we unified the formats of the above datasets: [link]
Please put them into the indataset folder, which currently contains webq and fm2.
Step1: generate background document.
python mainfunc.py --dataset {dataset} --task step1 --split test
Note: we use text-davinci-002 in our experiments. In the zero-shot setting we use greedy search to ensure that our experiments are reproducible.
Note: if you have limited access to the OpenAI API, you can use our outputs directly instead of paying to reproduce our experiments. [zero-shot: step1]
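To make the Step1 setting concrete, here is a minimal sketch of how a zero-shot, greedy request for one question might be assembled. The prompt wording, the function name build_step1_request, and the max_tokens value are assumptions for illustration; the actual prompt and parameters live in inference.py and may differ.

```python
# Hypothetical prompt wording; check inference.py for the one actually used.
PROMPT_TEMPLATE = (
    "Generate a background document to answer the given question.\n\n"
    "{question}\n\n"
)


def build_step1_request(question, model="text-davinci-002", max_tokens=300):
    """Assemble completion parameters for one question.

    temperature=0.0 stands in for the greedy search mentioned above;
    max_tokens=300 is an assumed value, not taken from the repository.
    """
    return {
        "model": model,
        "prompt": PROMPT_TEMPLATE.format(question=question.strip()),
        "temperature": 0.0,
        "max_tokens": max_tokens,
        "n": 1,
    }


request = build_step1_request("who wrote hamlet?")
print(request["prompt"])
```

The returned dictionary only describes the request; nothing is sent to the API in this sketch.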
Step2: infer answer from document.
python mainfunc.py --dataset {dataset} --task step2 --split test
Trick: we remove the \n characters from the generated documents.
Note: if you have limited access to the OpenAI API, you can use our outputs directly instead of paying to reproduce our experiments. [zero-shot: step2]
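The newline-removal trick mentioned above can be sketched as follows. Whether the repository replaces each \n with a space or simply deletes it is not stated here, so joining the non-empty lines with a single space is an assumption, and flatten_document is a hypothetical helper name.

```python
def flatten_document(text):
    """Collapse a multi-line generated document into a single line.

    Assumption: non-empty lines are joined with one space; the
    repository may instead delete the newline characters outright.
    """
    parts = (line.strip() for line in text.splitlines())
    return " ".join(part for part in parts if part)


doc = "Hamlet is a tragedy.\n\nIt was written by William Shakespeare.\n"
print(flatten_document(doc))
```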
Method1: use sampling to generate multiple documents.
python mainfunc.py --dataset {dataset} --task step1 --split test --num_sequence 10 --temperature 0.95
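The role of the temperature flag can be illustrated with a toy example. The sketch below is generic temperature-scaled softmax over made-up logits, not code from this repository: a temperature close to 0 puts almost all probability on the top token (close to greedy search), while 0.95 leaves room for different samples.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
near_greedy = softmax_with_temperature(toy_logits, 0.05)
sampled = softmax_with_temperature(toy_logits, 0.95)
print(near_greedy)
print(sampled)
```

With the flatter distribution, the 10 sequences requested via --num_sequence can differ from one another, which is the point of this method.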
Method2: use clustering to generate diverse documents.
python clusterfunc.py --dataset {dataset} --task step1 --split {split} --num_sequence 1 --temperature 0.95 --clustering
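One way to picture the clustering idea is the sketch below: embed question-document pairs, group them with k-means, and draw demonstrations from each cluster so that different prompts lead to different documents. This is an assumption about the general approach, not a description of clusterfunc.py; the function names, the toy 2-D embeddings, and the plain k-means implementation are all illustrative.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns the cluster index assigned to each point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        assign = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assign


def pick_demonstrations(pairs, embeddings, k, per_cluster=1, seed=0):
    """Sample `per_cluster` question-document pairs from every cluster."""
    assign = kmeans(embeddings, k, seed=seed)
    rng = random.Random(seed)
    groups = {}
    for c in range(k):
        members = [pair for pair, a in zip(pairs, assign) if a == c]
        if members:
            groups[c] = rng.sample(members, min(per_cluster, len(members)))
    return groups


# Toy data: four pairs with made-up 2-D embeddings forming two groups.
pairs = ["pair_a", "pair_b", "pair_c", "pair_d"]
embeddings = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
print(pick_demonstrations(pairs, embeddings, k=2))
```

In practice the embeddings would come from an embedding model and k would be chosen to match the number of documents wanted per question.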
Fusion-in-Decoder: train a reader model to infer the answer from the documents
We use the FiD code from its official GitHub repository [link].
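To feed generated documents to the reader, they need to be converted into the input format expected by FiD. The sketch below assumes the entry layout documented in the FiD repository (id, question, target, answers, and a ctxs list of title/text passages); to_fid_example is a hypothetical helper, and leaving the title empty for generated documents is an assumption.

```python
import json


def to_fid_example(idx, question, answers, documents):
    """Wrap one question and its generated documents as a FiD-style entry.

    Assumptions: generated documents have no title, and the first
    reference answer serves as the training target.
    """
    return {
        "id": str(idx),
        "question": question,
        "target": answers[0] if answers else "",
        "answers": list(answers),
        "ctxs": [{"title": "", "text": doc} for doc in documents],
    }


example = to_fid_example(
    0,
    "who wrote hamlet?",
    ["William Shakespeare"],
    ["Hamlet is a tragedy written by William Shakespeare."],
)
print(json.dumps(example, indent=2))
```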
Download our trained FiD checkpoints from the Hugging Face Hub.
git lfs install
git clone https://huggingface.co/wyu1/GenRead-3B-NQ
git lfs install
git clone https://huggingface.co/wyu1/GenRead-3B-TQA
If you need checkpoints on other settings, please email [email protected]
@inproceedings{yu2023generate,
title={Generate rather than retrieve: Large language models are strong context generators},
author={Yu, Wenhao and Iter, Dan and Wang, Shuohang and Xu, Yichong and Ju, Mingxuan and Sanyal, Soumya and Zhu, Chenguang and Zeng, Michael and Jiang, Meng},
booktitle={International Conference on Learning Representations (ICLR)},
year={2023}
}
Please cite our paper if you find the paper and the code helpful.