Open IE Papers

Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.


Table of Contents

  1. General
  2. Literature Reviews
  3. Papers - Neural Networks
  4. Papers - Parse-based and statistical
  5. Papers - Older papers and legacy systems
  6. Training and Testing Data

General

This README contains OpenIE and ORE papers and resources. Summaries are by @jbecke and @TheodoreChristakis, written to the best of our abilities after reading each paper or testing the system (when available). We welcome pull requests with additional resources, papers, or data.

Literature Reviews

Papers - Neural Networks

  • Learning Open Information Extraction of Implicit Relations from Reading Comprehension Datasets: extracts more implied ("common sense") relations.

Papers - Parse-based and statistical

  • Graphene: generates n-ary extractions with semantic linking labels such as "TEMPORAL" and "CAUSE", as well as open relations.
  • Stanford Open IE: produces maximally shortened tuples. It seems to often report a confidence of 1.0 for its tuples. Available under the GPL or a proprietary license as part of Stanford CoreNLP.
  • OpenIE-X (v4, v5, Allen Institute version): works well with simple statements (see examples in this dataset). Outputs context for extractions and gives good confidence predictions that can be used to balance precision and recall. Note the restrictive license (research purposes only).
  • Open Relation Extraction and Grounding: Extracts argument pairs of relation tuples and forms weighted dependency trees between two arguments. It shows promising results in determining relative importance of each argument in the tree.
  • Unsupervised Open Relation Extraction: Performs unsupervised relation extraction from free text using pretrained word embeddings, with each sentence's dependency parse tree as the foundation.

Papers - Older papers and legacy systems

  • From University of Washington
    • TextRunner - One of the earliest papers addressing open information extraction
    • ReVerb - Improved the extraction to better form the tuple of (argument, relation, argument)
    • OLLIE - Addressed the issue of misleading propositions and non-verb mediated relations
  • CSD-IE - Generation of nested contractions which is especially effective in sentences using subordinating clauses
  • PropS: Syntax Based Proposition Extraction
  • ClausIE - Formed a strong relation between grammatical clauses, propositions, and OIE extractions by defining seven grammatical patterns
  • ReNoun - Used predominantly for noun-mediated relations.

Training and Testing Data

  • 35M sentence-tuple pairs: from the paper Neural Open Information Extraction. The data was generated by OpenIE-4, with any tuples below 0.9 confidence removed. Because there is no sample data, I've copied a bit below. As you can see, the data is somewhat noisy. It might be useful as extra training data, but not as a gold dataset.
* moving and handling '' ' - a comprehensive course that covers safe handling and transport of casualties .
<arg1> '' ' - a comprehensive course </arg1> <rel> covers </rel> <arg2> safe handling and transport of casualties </arg2>

this word , adjectival magavan meaning `` possessing maga - '' , was once the premise that avestan maga - and median magu - were co-eval .
<arg1> - '' , was once the premise that avestan maga - and median magu - </arg1> <rel> were </rel> <arg2> co-eval </arg2>

melora walters as candy ' - a hooker who works for the motel where john person is staying , as a complimentary service to the guests .
<arg1> ' - a hooker </arg1> <rel> works </rel> <arg2> for the motel </arg2>

- - a hunter who uses bows and arrows instead of guns .
<arg1> - - a hunter </arg1> <rel> uses </rel> <arg2> bows and arrows instead of guns </arg2>
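
For readers who want to load this data, here is a minimal Python sketch that parses one tagged line into a triple. It assumes the `<arg1>`, `<rel>`, `<arg2>` layout shown in the sample above holds for the whole file. The names `Triple`, `parse_tagged`, and `looks_noisy` are hypothetical, and the noise heuristic is our own illustration, not something from the paper.

```python
import re
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    arg1: str
    rel: str
    arg2: str


# Matches the tagged layout seen in the sample; surrounding spaces are dropped.
TAGGED = re.compile(
    r"<arg1>\s*(.*?)\s*</arg1>\s*<rel>\s*(.*?)\s*</rel>\s*<arg2>\s*(.*?)\s*</arg2>"
)


def parse_tagged(line: str) -> Optional[Triple]:
    """Return the triple found in a tagged line, or None if the tags are absent."""
    match = TAGGED.search(line)
    return Triple(*match.groups()) if match else None


def looks_noisy(triple: Triple) -> bool:
    """Assumed heuristic: flag triples where any part is empty or starts with punctuation."""
    return any(not part[:1].isalnum() for part in triple)
```

A filter like `looks_noisy` would flag all four sample tuples above, since each `arg1` begins with stray punctuation. That is consistent with treating this data as extra training material rather than a gold set.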
  • TupleInf Open IE Dataset: OpenIE-4 extractions of 8th grade and 4th grade questions. By inspection, these tend to be cleaner than the above dataset because of the simplicity of the language. Confidence values are retained so you can make your own tradeoff between precision and recall. Not suitable for a gold dataset.
01 April 1969 The ATM would be a manned solar observatory making measurements of the Sun by telescopes and instruments above 
0.96 (The ATM; would be; a manned solar observatory making measurements of the Sun by telescopes and instruments)
0.93 (a manned solar observatory; making; measurements of the Sun)

01 April 1969 The ATM would be a manned solar observatory making measurements of the Sun by telescopes and instruments above the Earth's atmosphere.
0.96 (The ATM; would be; a manned solar observatory making measurements of the Sun by telescopes and instruments above the Earth's atmosphere)
0.93 (a manned solar observatory; making; measurements of the Sun)

01 - Compare the physical properties of ice, liquid, water, and vapor.

01 Earthly Seasons PURPOSE: To show that the seasons are the consequence of the tilt of earth.

0.1% water can lower the melting temperature of peridotite by 100 C.
0.91 (0.1% water; can lower; the melting temperature of peridotite)

( 020 ) Celsius &#176;C The international temperature scale where water freezes at 0 (degrees) and boils at 100 (degrees).
0.89 (water; freezes; at 0 (degrees)
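
Because the confidence values are retained, a threshold filter is one way to make the precision/recall tradeoff. The sketch below assumes the layout shown in the sample: a sentence line followed by lines of the form `confidence (arg1; rel; arg2)`. The names `Extraction`, `parse_extraction`, and `filter_by_confidence` are hypothetical. Folding any extra `; `-separated fields into `arg2` is also an assumption.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional


class Extraction(NamedTuple):
    confidence: float
    arg1: str
    rel: str
    arg2: str


# An extraction line starts with a decimal confidence, then a parenthesised body.
# Sentence lines such as "0.1% water can lower ..." do not match.
LINE = re.compile(r"^(\d\.\d+) \((.*)\)\s*$")


def parse_extraction(line: str) -> Optional[Extraction]:
    """Parse one extraction line; return None for sentence lines or malformed input."""
    match = LINE.match(line)
    if match is None:
        return None
    parts = match.group(2).split("; ")
    if len(parts) < 3:
        return None
    # Assumption: anything past the second field belongs to the second argument.
    return Extraction(float(match.group(1)), parts[0], parts[1], "; ".join(parts[2:]))


def filter_by_confidence(lines: Iterable[str], threshold: float) -> Iterator[Extraction]:
    """Yield only extractions whose confidence is at least `threshold`."""
    for line in lines:
        extraction = parse_extraction(line)
        if extraction is not None and extraction.confidence >= threshold:
            yield extraction
```

Raising `threshold` keeps fewer, more reliable tuples (higher precision), while lowering it keeps more tuples (higher recall).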
  • Squadie (not yet published, expect changes): this is our dataset derived from SQuAD. It uses a JSON format similar to SQuAD's and contains 50,000 tuples. Each tuple can then be matched with the corresponding sentence in the training corpus. Not suitable as a gold corpus. Squadie is useful for extracting implied relations. We have also converted Maluuba NewsQA.
                        {
                            "question": "Which film did Beyoncé star in 2001 with Mekhi Phifer?",
                            "id": "56d4831f2ccc5a1400d83155",
                            "answer": "Carmen: A Hip Hopera",
                            "tuple": "<Which film\tdid Beyoncé star with Mekhi Phifer\tCarmen: A Hip Hopera>"
                        },
                        {
                            "question": "What was the name of Destiny Child's third album?",
                            "id": "56d4831f2ccc5a1400d83156",
                            "answer": "Survivor",
                            "tuple": "<Survivor\tthe name of\tDestiny Child 's third album>"
                        },
                        {
                            "question": "Who filed a lawsuit over Survivor?",
                            "id": "56d4831f2ccc5a1400d83157",
                            "answer": "Luckett and Roberson",
                            "tuple": "<Luckett and Roberson\tfiled a lawsuit over\tSurvivor>"
                        },
                        {
                            "question": "When did Destiny's Child announce their hiatus?",
                            "id": "56d4831f2ccc5a1400d83158",
                            "answer": "October 2001",
                            "tuple": "<Destiny 's Child\tannounce their hiatus\tOctober 2001>"
                        }
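
The `tuple` field in these records is a single string: angle brackets around three tab-separated parts. As a rough sketch, the field could be unpacked as below. It assumes every record follows the layout of the four samples above. The dataset is unpublished and may change, so treat the helper name `split_tuple` and the strict three-part check as illustrative only.

```python
import json
from typing import Tuple


def split_tuple(field: str) -> Tuple[str, str, str]:
    """Split a '<arg1 TAB rel TAB arg2>' string into its three parts."""
    if not (field.startswith("<") and field.endswith(">")):
        raise ValueError(f"expected angle brackets around tuple: {field!r}")
    parts = field[1:-1].split("\t")
    if len(parts) != 3:
        raise ValueError(f"expected 3 tab-separated parts, got {len(parts)}")
    return parts[0], parts[1], parts[2]


# One record copied from the sample above. The raw string keeps "\t" as a JSON
# escape, which json.loads then decodes into a real tab character.
SAMPLE = r"""
{
    "question": "Who filed a lawsuit over Survivor?",
    "id": "56d4831f2ccc5a1400d83157",
    "answer": "Luckett and Roberson",
    "tuple": "<Luckett and Roberson\tfiled a lawsuit over\tSurvivor>"
}
"""

record = json.loads(SAMPLE)
arg1, rel, arg2 = split_tuple(record["tuple"])
```

Note that the answer does not always land in the same slot: in the samples it appears as the first argument for "Who" questions and as the last for "Which" and "When" questions, so downstream code should not assume a fixed position.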
README Source: NPCai/Open-IE-Papers
License
MIT