Word Representations: A Simple and General Method for Semi-Supervised Learning

Turian, Joseph; Ratinov, Lev-Arie; Bengio, Yoshua

Article · Jul 11, 2010 · Closed access

Université de Montréal · University of Illinois Urbana-Champaign

Abstract

If we take an existing supervised NLP system, a simple and general way to improve accuracy is to use unsupervised word representations as extra word features. We evaluate Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking. We use near state-of-the-art supervised baselines, and find that each of the three word representations improves the accuracy of these baselines. We find further improvements by combining different word representations. You can download our word features, for off-the-shelf use in existing NLP systems, as well as our code, here:
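To make the abstract's core idea concrete, the following is a minimal sketch of how unsupervised word representations might be appended to a token's existing features before a supervised tagger sees them. It is not the authors' released code. The lookup tables `BROWN_PATHS` and `EMBEDDINGS`, their values, the prefix lengths, and the function names are all hypothetical and chosen only for illustration; real tables would be induced from a large unlabeled corpus.

```python
from typing import Dict, List

# Hypothetical toy resources (assumption): in practice these would be
# induced from unlabeled text, not written by hand.
BROWN_PATHS: Dict[str, str] = {
    "london": "110100",
    "paris": "110101",
    "visited": "0111",
}
EMBEDDINGS: Dict[str, List[float]] = {
    "london": [0.12, -0.40],
    "paris": [0.10, -0.35],
    "visited": [-0.50, 0.22],
}
# Assumed prefix lengths for the toy bit strings above.
PREFIX_LENGTHS = (2, 4)


def token_features(word: str) -> Dict[str, float]:
    """Baseline features plus extra features from word representations."""
    key = word.lower()
    # Baseline supervised features (a deliberately tiny stand-in).
    feats: Dict[str, float] = {
        f"word={key}": 1.0,
        "is_capitalized": float(word[:1].isupper()),
    }
    # Brown cluster features: prefixes of the cluster bit string, so that
    # words in nearby clusters share the coarser features.
    path = BROWN_PATHS.get(key)
    if path is not None:
        for n in PREFIX_LENGTHS:
            feats[f"brown_prefix{n}={path[:n]}"] = 1.0
    # Embedding features: one real-valued feature per dimension.
    vec = EMBEDDINGS.get(key)
    if vec is not None:
        for i, value in enumerate(vec):
            feats[f"emb_{i}"] = value
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, float]]:
    """Feature dictionaries for every token in a sentence."""
    return [token_features(t) for t in tokens]
```

In this sketch, "London" and "Paris" receive the same `brown_prefix4=1101` feature even though their word-identity features differ, which is the kind of sharing across rare words that extra representation features are meant to provide. Words missing from both tables simply fall back to the baseline features.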

Citation impact

Total citations: 1,946

FWCI: 68.10
Percentile: 100%
References: 47

Topics & keywords

Keywords

Chunking (psychology)
Computer science
Word (group theory)
Natural language processing
Artificial intelligence
Simple (philosophy)
Word embedding
Speech recognition