Survey on Machine Reading Comprehension
Year | Title | Model | Datasets | Misc | Paper, Source Code |
---|---|---|---|---|---|
2019 | XLNet: Generalized Autoregressive Pretraining for Language Understanding | XLNet | RACE, SQuAD 1.1, SQuAD 2.0 | pretrained LM | paper, code |
2019 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | BERT | GLUE, SQuAD 1.1, SQuAD 2.0, SWAG | pretrained LM | paper, code |
2018 | S-NET: From Answer Extraction to Answer Generation for Machine Reading Comprehension | S-NET | MS-MARCO | multiple passages | paper, [code] |
2018 | QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension | QANet | SQuAD 1.1 | | paper, code |
2017 | ReasoNet: Learning to Stop Reading in Machine Comprehension | ReasoNet | CNN and Daily Mail, SQuAD 1.1 | | paper, [code] |
2017 | Reading Wikipedia to Answer Open-Domain Questions | DrQA | Wikipedia, SQuAD 1.1, CuratedTREC, WebQuestions, WikiMovies | OPQA, Multi-Passage MRC | paper, code |
2017 | R-Net: Machine Reading Comprehension with Self-Matching Networks | R-Net | SQuAD 1.1, MS-MARCO | | paper, code |
2017 | Machine Comprehension Using Match-LSTM and Answer Pointer | Match-LSTM + Pointer Network | SQuAD 1.1 | | paper, code |
2017 | Gated-Attention Readers for Text Comprehension | Gated-attention Reader | CNN and Daily Mail, Children’s Book Test, Who Did What | | paper, code |
2017 | Gated Self-Matching Networks for Reading Comprehension and Question Answering | Gated Self-Matching | SQuAD 1.1 | | paper, [code] |
2017 | Dynamic CoAttention Networks for Question Answering | Dynamic coattention networks | SQuAD 1.1 | | paper, code |
2017 | DCN+: Mixed Objective and Deep Residual CoAttention for Question Answering | DCN+ | SQuAD 1.1 | | paper, code |
2017 | Bi-directional Attention Flow for Machine Comprehension | BiDAF | SQuAD 1.1 | | paper, code |
2017 | Attention-over-Attention Neural Networks for Reading Comprehension | Attention-over-Attention Reader | Children’s Book Test, CNN and Daily Mail | | paper, code |
2016 | Text Understanding with the Attention Sum Reader Network | Attention Sum Reader | Children’s Book Test, CNN and Daily Mail | | paper, code |
2016 | Multi-Perspective Context Matching for Machine Comprehension | Multi-Perspective Context Matching | SQuAD 1.1 | | paper, [code] |
2016 | Key-Value Memory Networks for Directly Reading Documents | Key-Value Memory Networks | WikiMovies, WikiQA | | paper, code |
2016 | Iterative Alternating Neural Attention for Machine Reading | Iterative Attention Reader | Children’s Book Test, CNN and Daily Mail | | paper, [code] |
2016 | A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task | | CNN and Daily Mail | | paper, [code] |
2015 | Teaching Machines to Read and Comprehend | Attentive Reader | CNN and Daily Mail | | paper, code |
Following Danqi Chen, we distinguish four answer types: cloze test, multiple choice, span extraction, and free answering.
Details: https://github.com/xanhho/Reading-Comprehension-Question-Answering-Papers/wiki/MRC-Datasets
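As a rough illustration of how the four answer types used in the "Answer type" column below differ, here is a minimal Python sketch. The class names, field names, the `@placeholder` blank marker, and the toy passage are all assumptions made for this example; real datasets each define their own schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClozeExample:
    """Cloze test: the question contains a blank to be filled in."""
    passage: str
    question: str  # contains one blank marker (hypothetical: "@placeholder")
    answer: str    # the word or entity that fills the blank


@dataclass
class MultipleChoiceExample:
    """Multiple choice: the answer is one of a fixed set of options."""
    passage: str
    question: str
    options: List[str]
    answer_index: int  # index into `options`


@dataclass
class SpanExample:
    """Span extraction: the answer is a contiguous span of the passage."""
    passage: str
    question: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive

    @property
    def answer(self) -> str:
        return self.passage[self.start:self.end]


@dataclass
class FreeAnswerExample:
    """Free answering: the answer is free text and need not appear verbatim."""
    passage: str
    question: str
    answer: str


# Toy data, invented for illustration only.
passage = "Tom fed the cat before school."
start = passage.index("the cat")

cloze = ClozeExample(passage, "Tom fed the @placeholder before school.", "cat")
mc = MultipleChoiceExample(
    passage, "What did Tom feed?", ["the dog", "the cat", "the fish"], 1
)
span = SpanExample(passage, "What did Tom feed?", start, start + len("the cat"))
free = FreeAnswerExample(passage, "What did Tom do before school?", "He fed his cat.")

print(span.answer)
print(mc.options[mc.answer_index])
```

Only the span-extraction case ties the answer to offsets in the passage. The free-answering example deliberately uses an answer string that does not occur verbatim in the passage.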
Year | Dataset | Task | Size | Source | Web/Paper | Answer type | Misc | Similar datasets |
---|---|---|---|---|---|---|---|---|
2019 | ROPES | RC | 14k | Wikipedia + science textbooks | web, paper | Span extraction | background passage + situation | ShARC |
2019 | RC-QED | RC | 12k | Wikipedia | web, paper | Multiple choice | multi-passage | HotpotQA |
2019 | QUOREF | RC | 24k+ | Wikipedia | web, paper | Span extraction | coreference resolution | |
2019 | COSMOS QA | | 35,600 | narrative | web, paper | Multiple choice | | |
2019 | DROP | RC | 96k | Wikipedia | web, paper | Span extraction + numerical reasoning | multi-span answers | |
2019 | Natural Questions | RC | 323k | Wikipedia | paper | Span extraction | | |
2018 | SQuAD 2.0 | RC | 150k | Wikipedia | paper | Span extraction | no answer: 50k | NewsQA |
2018 | MultiRC | RC | 6k+ questions | various articles | web, paper | Multiple choice | multiple sentence reasoning | MCTest |
2018 | CSQA | QA | 200k dialogs, 1.6M turns | | paper | | | |
2018 | QuAC | RC | 100k | Wikipedia | web, paper | Span extraction | conversational questions | CoQA |
2018 | QAngaroo (Wikihop + Medhop) | RC | | Wikipedia + Medline | web, paper | Multiple choice | multi-passage | HotpotQA |
2018 | HotpotQA | RC | 113k | Wikipedia | web, paper | Span extraction | multi-passage | QAngaroo |
2018 | CoQA | RC | 127k | various articles | paper | Free answering | conversational questions | QuAC |
2018 | ComplexWebQuestions | RC | 34,689 | WebQuestionsSP | web, paper | Span extraction? | multi-passage | |
2018 | SWAG | QA | 113k | video caption | | Multiple choice | situational commonsense reasoning | |
2018 | RecipeQA | RC | 36k | various | | | multimodal comprehension | |
2018 | ProPara | RC | 2k | procedural text | | | | bAbI, SCoNE |
2018 | OpenBookQA | QA | 6k | science facts | | Multiple choice | external knowledge | ARC |
2018 | FEVER | | | | | | | |
2018 | DuReader | | | | | Free answering | | |
2018 | DuoRC | RC | 186k | movie plot | | Span extraction | | NarrativeQA |
2018 | CLOTH | RC | 99k | English exams | | Cloze test | | RACE |
2018 | CliCR | RC | 100k | clinical case text | | Cloze test | | |
2018 | ARC | RC | 8k | science exam | | | easy 5197, challenge 2590 | |
2017 | WikiSuggest | | | | paper | | | |
2017 | TriviaQA | RC | 96k question-answer pairs | Web + Wikipedia | web, paper | Span extraction | | SQuAD |
2017 | SQA | | | | paper | | | |
2017 | SearchQA | | | | paper | Free answering | | |
2017 | RACE | | | | paper | Multiple choice | | |
2017 | NarrativeQA | | | | paper | Free answering | | |
2016 | Who-did-What | | | | paper | Cloze test | | |
2016 | SQuAD 1.1 | RC | 87k training + 10k development | Wikipedia | paper | Span extraction | | NewsQA |
2016 | NewsQA | | | | paper | Span extraction | | |
2016 | MS MARCO | | | | web | Free answering | | |
2016 | LAMBADA | | | | paper | Cloze test | | |
2016 | WikiMovies | QA | | | | | | |
2015 | CuratedTREC | QA | | | | | | |
2015 | CNN and Daily Mail | RC | 93k + 220k articles | CNN + Daily Mail | paper, web | Cloze test | | |
2015 | Children's Book Test | RC | | 108 children's books | web | Cloze test | | |
2015 | bAbI | RC | | classic text adventure game | web | Free answering | 20 tasks | |
2013 | WebQuestions | QA | | | | | | |
2013 | QA4MRE | RC | | various articles | paper | Multiple choice | | |
2013 | MCTest | RC | 500 stories + 2k questions | fictional stories | paper | Multiple choice | open-domain | |
1999 | DeepRead | RC | 60 development and 60 test? | news stories | paper | Free answering | | |