
Topic 07: Open Information Extraction

Sherry Lin edited this page Oct 9, 2020 · 1 revision

Slides, Tutorials and Surveys

  1. Brief Introduction and Review of Open Information Extraction System [Slides]
  2. A Survey on Open Information Extraction [Paper]
  3. Open Information Extraction on Scientific Text: An Evaluation [Paper]
  4. Open Information Extraction (OIE) Resources Summary [Paper]

OpenIE Tools

  1. Open Information Extraction from the Web [TextRunner, IJCAI 2007]
  • Incoherent Extractions
  • Uninformative Extractions
  1. MinIE: Minimizing Facts in Open Information Extraction [MinIE, EMNLP 2017] [Code (java)] [Code (python)]
  • Represent information about polarity, modality, attribution and quantities with semantic annotations (instead of in the extraction itself)
  • Identify and remove parts that are considered overly specific
  1. Facts that Matter [SALIE, EMNLP 2018] [Code]
  • Extract salient facts, which fulfil two requirements: (1) relevance and (2) diversity
  1. Identifying Relations for Open Information Extraction [ReVerb, EMNLP 2011] [Paper][Code][Homepage]
  • Use syntactic constraints to specify relation phrases (3 simple patterns). Find longest phrase matching one of the syntactic constraints.
  • Find the nearest noun phrases to the left and right of the relation phrase; each argument must not be a relative pronoun, a WHO-adverb, or an existential "there".
  • To avoid "over-specified" relation phrases, a relation phrase must occur with many distinct arguments in a large corpus
  1. ClausIE: Clause-Based Open Information Extraction [ClausIE, WWW 2013] [Paper][Code (Python)][Code (Java)]
  • Map the dependency relations of an input sentence to clause constituents.
  • A set of coherent clauses presenting a simple linguistic structure is derived from the input
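
The ReVerb notes above (match a syntactic pattern for the relation phrase, keep the longest match, then attach the nearest noun phrases on each side) can be sketched in a few lines. This is a simplified illustration, not the ReVerb implementation. The names `tag_class`, `noun_run` and `extract` are hypothetical, the input is assumed to be already POS-tagged with Penn Treebank tags, the pattern is reduced to `V+ (W* P)?`, a "noun phrase" is approximated as a run of noun tokens, and the lexical constraint on distinct arguments is omitted.

```python
import re

def tag_class(tag):
    """Collapse a Penn Treebank tag into a one-letter class (simplified)."""
    if tag.startswith("VB"):
        return "V"                      # verbs
    if tag in ("IN", "TO", "RP"):
        return "P"                      # preposition, infinitive marker, particle
    if tag.startswith(("NN", "JJ", "RB")) or tag in ("PRP", "PRP$", "DT"):
        return "W"                      # noun, adjective, adverb, pronoun, determiner
    return "O"                          # everything else (WDT, WP, EX, punctuation, ...)

# Simplified relation pattern: verbs, optionally followed by words ending in a preposition.
# The regex is greedy, so each match is the longest one at its start position.
RELATION = re.compile(r"V+(?:W*P)?")

def noun_run(tagged, start, step):
    """Walk from `start` in direction `step`; return the first run of noun tokens."""
    i, n = start, len(tagged)
    while 0 <= i < n and not tagged[i][1].startswith("NN"):
        i += step                       # skips relative pronouns, existential "there", etc.
    words = []
    while 0 <= i < n and tagged[i][1].startswith("NN"):
        words.append(tagged[i][0])
        i += step
    if step < 0:
        words.reverse()
    return " ".join(words)

def extract(tagged):
    """Return (arg1, relation phrase, arg2) triples from a list of (word, tag) pairs."""
    classes = "".join(tag_class(tag) for _, tag in tagged)
    triples = []
    for m in RELATION.finditer(classes):
        rel = " ".join(word for word, _ in tagged[m.start():m.end()])
        left = noun_run(tagged, m.start() - 1, -1)
        right = noun_run(tagged, m.end(), +1)
        if left and right:
            triples.append((left, rel, right))
    return triples

sentence = [("Hudson", "NNP"), ("was", "VBD"), ("born", "VBN"),
            ("in", "IN"), ("Hampstead", "NNP")]
print(extract(sentence))
```

Mapping each tag to a single character lets one ordinary regular expression stand in for the part-of-speech pattern. A real system would use a chunker for the noun phrases.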

Canonicalization of Open Knowledge Bases, OpenIE Triple Clustering

  1. Query-Driven On-The-Fly Knowledge Base Construction [QKBfly, VLDB2017] relation 🌟
  2. CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information [CESI, WWW2018] Code triple
  3. Canonicalizing Open Knowledge Bases [CIKM 2014] triple 🌟
  4. Towards Practical Open Knowledge Base Canonicalization [FAC, CIKM 2018] triple 🌟
  5. Identifying Relations for Open Information Extraction [ReVerb, EMNLP 2011] [Paper][Code][Homepage] relation
  • Morphological Normalization
  1. Open Information Extraction to KBP Relations in 3 Hours [TAC. 2013] [Paper]
  • Main idea: mapping relation phrases to the KB ontology
  • Manually define a set of rules for each relation, to conduct the mapping
  • The motivation and error analysis are well written
  1. ClusType: Effective Entity Recognition and Typing by Relation Phrase-Based Clustering [ClusType, KDD2015] 🌟
  • Relation clustering: two relation phrases tend to have similar cluster memberships if they have similar (1) strings, (2) context words, and (3) left and right argument type indicators
  1. Unsupervised Methods for Determining Object and Relation Synonyms on the Web [Resolver, JAIR 2009] relation
  2. Relation Extraction with Matrix Factorization and Universal Schemas [NAACL-HLT 2013] [Paper]
  • Close to relation clustering
  • Create a universal schema by unioning surface form predicates from Open IE and relations in the schemas of pre-existing databases
  1. Canonicalization of Open Knowledge Bases with Side Information from the Source Text [PDF] (ICDE 2018) 🌟
  2. Canonicalizing Open Knowledge Bases with Multi-Layered Meta-Graph Neural Network [Paper]
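
The universal-schema note above (unioning Open IE surface-form predicates with relations from existing database schemas) amounts to building one table whose rows are entity pairs and whose columns come from both sources. The sketch below shows only that construction step, under stated assumptions. The function name `build_universal_schema`, the `openie:` / `kb:` prefixes and the toy facts are all made up for illustration, and the matrix factorization that the paper applies on top is not shown.

```python
def build_universal_schema(openie_facts, kb_facts):
    """Union Open IE predicates and KB relations into one column set.

    Each fact is a (subject, predicate, object) tuple. Returns the sorted
    column list and a dict mapping each entity pair to its observed columns.
    """
    rows = {}
    for source, facts in (("openie", openie_facts), ("kb", kb_facts)):
        for subj, pred, obj in facts:
            # Prefix the predicate so surface forms and KB relations stay distinct.
            rows.setdefault((subj, obj), set()).add(f"{source}:{pred}")
    columns = sorted(set().union(*rows.values())) if rows else []
    return columns, rows

# Toy, made-up facts purely for illustration.
openie_facts = [("A", "was born in", "X"), ("B", "is a native of", "Y")]
kb_facts = [("A", "place_of_birth", "X")]

columns, rows = build_universal_schema(openie_facts, kb_facts)
print(columns)
print(rows[("A", "X")])
```

In this toy table the pair `("A", "X")` is observed under both a surface pattern and a KB relation. Co-occurrences of this kind are what a factorization model can exploit to relate the two kinds of column.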

Relation Phrases Clustering (finding synonymous phrases and hypernyms)

  1. HARPY: Hypernyms and Alignment of Relational Paraphrases [HARPY, COLING 2014] [Paper][Data]
  2. POLY: Mining Relational Paraphrases from Multilingual Sentences [POLY, EMNLP 2016] [Paper][Data]
  • Make use of another language
  1. RELLY: Inferring Hypernym Relationships Between Relational Phrases [RELLY, EMNLP 2015] [Paper][Data]
  2. PATTY: A Taxonomy of Relational Patterns with Semantic Types [PATTY, EMNLP 2012] [Paper][Data]
  3. Discovering and Exploring Relations on the Web [PATTY demo, VLDB 2012] [Paper] 🌟
  4. Ensemble Semantics for Large-Scale Unsupervised Relation Extraction [WEBRE, EMNLP-CoNLL 2012] relation
  5. Relation Schema Induction using Tensor Factorization with Side Information [SICTF, EMNLP 2016] relation schema induction (for building domain-specific kb from unstructured text) Code: https://github.com/malllabiisc/sictf
  6. Constrained Information-Theoretic Tripartite Graph Clustering to Identify Semantically Similar Relations [IJCAI 2015]

Others

  1. Integrating Local Context and Global Cohesiveness for Open Information Extraction [ReMine, WSDM 2019]
  • Solving a joint optimization problem to unify (1) segmenting entity/relation phrases in individual sentences based on local context; and (2) measuring the quality of tuples extracted from individual sentences with a translating-based objective.
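
To make "translating-based objective" concrete, here is a generic TransE-style sketch in which a tuple (head, relation, tail) is plausible when head + relation lands close to tail in embedding space. This is an assumption about the general family of objectives, not ReMine's exact formulation. The function names, the margin value and the toy two-dimensional vectors are all illustrative.

```python
import math

def translation_distance(head, rel, tail):
    """Euclidean distance between (head + rel) and tail; smaller means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, rel, tail)))

def margin_loss(positive, negative, margin=1.0):
    """Hinge loss: the positive tuple should beat the negative one by at least `margin`."""
    return max(0.0, margin + translation_distance(*positive) - translation_distance(*negative))

# Toy embeddings, chosen by hand for illustration.
head, rel, tail = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
corrupted_tail = [3.0, 1.0]

good = (head, rel, tail)
bad = (head, rel, corrupted_tail)
print(translation_distance(*good), translation_distance(*bad))
print(margin_loss(good, bad))
```

A score of this kind can rank candidate tuples extracted from individual sentences, which is the role the note above assigns to the global-cohesiveness term.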