Article · Jan 6, 2007 · Closed access

Open information extraction from the web

University of Washington

Abstract

Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time of seminars from a set of announcements). Shifting to a new domain requires the user to name the target relations and to manually create new extraction rules or hand-tag new training examples. This manual labor scales linearly with the number of target relations. This paper introduces Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input. The paper also introduces TEXTRUNNER, a fully implemented, highly…

Citation impact

Total citations: 1,323
FWCI: 118.61
Percentile: 100%
References: 94

Authors

5

Topics & keywords

Keywords
  • Tuple
  • Computer science
  • Information extraction
  • Relationship extraction
  • Scalability
  • Set (abstract data type)
  • Information retrieval
  • Reduction (mathematics)
UN Sustainable Development Goals
  • Decent work and economic growth