Article | Jul 27, 2011 | Closed access
Named Entity Recognition in Tweets: An Experimental Study
Abstract
People tweet more than 100 million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms cotraining, increasing F1 by 25% over ten…
Citation impact
- Total citations: 1,203
- FWCI: 117.10
- Percentile: 100%
- References: 39
Authors
3
Topics & keywords
Topics
Keywords
- Computer science
- Named-entity recognition
- Exploit
- Pipeline (software)
- Artificial intelligence
- Natural language processing
- Chunking (psychology)
- Redundancy (engineering)
UN Sustainable Development Goals
- Quality Education