Named Entity Recognition in Tweets: An Experimental Study

Ritter, Alan; Clark, Sam; Etzioni, Oren

Article · Jul 27, 2011 · Closed access

Abstract

People tweet more than 100 million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by rebuilding the NLP pipeline, beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms co-training, increasing F1 by 25% over ten…

Citation impact

Total citations: 1,203

FWCI: 117.10
Percentile: 100%
References: 39

Authors: 3

Topics & keywords

Keywords

Computer science
Named-entity recognition
Exploit
Pipeline (software)
Artificial intelligence
Natural language processing
Chunking (psychology)
Redundancy (engineering)

UN Sustainable Development Goals

Quality Education
