Article | Jul 27, 2011 | Closed access

Named Entity Recognition in Tweets: An Experimental Study

University of Washington

Abstract

People tweet more than 100 million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by rebuilding the NLP pipeline, beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms co-training, increasing F1 by 25% over ten…
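
The distant-supervision idea in the abstract can be illustrated with a minimal sketch. This is not the T-NER implementation: the toy `TYPE_DICTIONARIES` are hypothetical stand-ins for Freebase type lists, and simple context-word counting is swapped in for LabeledLDA. It only shows how dictionaries can restrict an entity's candidate types and how pooling context across repeated mentions can help pick among them.

```python
from collections import Counter, defaultdict

# Hypothetical toy dictionaries standing in for Freebase type lists (assumption).
TYPE_DICTIONARIES = {
    "PERSON": {"obama", "jordan"},
    "LOCATION": {"seattle", "jordan"},
    "COMPANY": {"amazon"},
}


def candidate_types(entity):
    """Return the sorted types whose dictionary contains the entity string."""
    key = entity.lower()
    return sorted(t for t, names in TYPE_DICTIONARIES.items() if key in names)


def type_context_counts(mentions):
    """Count context words per type, using only unambiguous entities."""
    counts = defaultdict(Counter)
    for entity, context in mentions:
        types = candidate_types(entity)
        if len(types) == 1:
            counts[types[0]].update(context)
    return counts


def resolve(entity, mentions, counts):
    """Pick the candidate type whose context words best overlap the
    contexts pooled over every mention of the entity."""
    types = candidate_types(entity)
    if not types:
        return None  # entity absent from all dictionaries
    pooled = Counter()
    for other, context in mentions:
        if other.lower() == entity.lower():
            pooled.update(context)

    def score(entity_type):
        return sum(min(n, counts[entity_type][w]) for w, n in pooled.items())

    return max(types, key=score)


# Each mention is (entity string, surrounding context words); invented data.
mentions = [
    ("Obama", ["president", "said"]),
    ("Seattle", ["in", "visiting"]),
    ("Jordan", ["visiting", "in"]),
    ("Jordan", ["in", "amman"]),
]
counts = type_context_counts(mentions)
print(resolve("Jordan", mentions, counts))  # LOCATION
```

Here "Jordan" appears in both the person and location dictionaries, so the dictionaries alone cannot decide; the contexts pooled from its two mentions overlap only with words seen around the unambiguous location "Seattle".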

Citation impact

Total citations: 1,203
FWCI: 117.10
Percentile: 100%
References: 39

Authors

3

Topics & keywords

Keywords
  • Computer science
  • Named-entity recognition
  • Exploit
  • Pipeline (software)
  • Artificial intelligence
  • Natural language processing
  • Chunking (psychology)
  • Redundancy (engineering)
UN Sustainable Development Goals
  • Quality Education