Article · Jun 1, 2007 · Closed access

Large-Scale Named Entity Disambiguation Based on Wikipedia Data

Microsoft Research (United Kingdom)

Abstract

This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and Web search results. It describes in detail the disambiguation paradigm employed and the information extraction process from Wikipedia. Through a process of maximizing the agreement between the contextual information extracted from Wikipedia and the context of a document, as well as the agreement among the category tags associated with the candidate entities, the implemented system shows high disambiguation accuracy on both news stories and Wikipedia articles.
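The agreement-maximization idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function name `disambiguate`, the data layout, the toy entities, and the unweighted additive score are all assumptions made for illustration. Each candidate entity is scored by how many of its Wikipedia-derived context terms occur in the document, plus how often its category tags are shared with candidates of the other mentions.

```python
from collections import Counter


def disambiguate(doc_tokens, mentions):
    """Pick one entity per surface form (illustrative scoring, not the paper's).

    mentions maps surface form -> {entity: {"contexts": set, "categories": set}}.
    """
    doc = set(doc_tokens)

    # Count each category tag over every candidate of every mention.
    all_cats = Counter()
    for candidates in mentions.values():
        for info in candidates.values():
            all_cats.update(info["categories"])

    result = {}
    for surface, candidates in mentions.items():
        # Tags contributed by this mention's own candidates are subtracted,
        # so a candidate cannot agree with itself or its direct rivals.
        own_cats = Counter()
        for info in candidates.values():
            own_cats.update(info["categories"])

        def score(entity):
            info = candidates[entity]
            context_agreement = len(info["contexts"] & doc)
            category_agreement = sum(
                all_cats[c] - own_cats[c] for c in info["categories"]
            )
            return context_agreement + category_agreement

        result[surface] = max(candidates, key=score)
    return result


# Hypothetical toy data: two ambiguous surface forms in a space-related text.
doc_tokens = ["columbia", "discovery", "nasa", "shuttle", "launch"]
mentions = {
    "Columbia": {
        "Space Shuttle Columbia": {
            "contexts": {"nasa", "shuttle", "launch"},
            "categories": {"space shuttles"},
        },
        "Columbia University": {
            "contexts": {"university", "york"},
            "categories": {"universities"},
        },
    },
    "Discovery": {
        "Space Shuttle Discovery": {
            "contexts": {"nasa", "shuttle"},
            "categories": {"space shuttles"},
        },
        "Discovery Channel": {
            "contexts": {"television", "cable"},
            "categories": {"television channels"},
        },
    },
}
result = disambiguate(doc_tokens, mentions)
print(result)
```

In this toy input both shuttle readings win because their context terms appear in the document and they share a category tag with each other. A real system would need weighting between the two agreement terms and a far larger context and category inventory.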

Citation impact

Total citations: 1,076
FWCI: 70.69
Percentile: 100%
References: 20

Authors

1

Topics & keywords

Keywords
  • Computer science
  • Information retrieval
  • Context (archaeology)
  • Process (computing)
  • Information extraction
  • Scale (ratio)
  • Natural language processing
  • Entity linking
UN Sustainable Development Goals
  • Quality Education