Linguistic Regularities in Continuous Space Word Representations
Brno University of Technology · Microsoft (United States)
Abstract
Continuous space language models have recently demonstrated outstanding results across a variety of tasks. In this paper, we examine the vector-space word representations that are implicitly learned by the input-layer weights. We find that these representations are surprisingly good at capturing syntactic and semantic regularities in language, and that each relationship is characterized by a relation-specific vector offset. This allows vector-oriented reasoning based on the offsets between words. For example, the male/female relationship is automatically learned, and with the induced vector representations, “King - Man + Woman” results in a vector very close to “Queen.” We demonstrate that the word vectors…
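The offset reasoning described in the abstract can be illustrated with a minimal sketch. The three-dimensional vectors below are invented toy values, not embeddings from the paper. The helper names `cosine` and `solve_analogy` are hypothetical, and both the use of cosine similarity to rank candidates and the exclusion of the query words are assumptions made for this illustration.

```python
import numpy as np

# Toy 3-dimensional embeddings, invented purely for illustration.
# Real word vectors are learned by a language model and have far more dimensions.
toy_vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.5, 0.5, 0.5]),
}


def cosine(u, v):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def solve_analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' using the offset b - a + c.

    The three query words are excluded from the candidates (an assumption
    of this sketch), and the remaining word most similar to the target
    vector is returned.
    """
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# King - Man + Woman, phrased as "man is to king as woman is to ?"
answer = solve_analogy("man", "king", "woman", toy_vectors)
print(answer)
```

With these toy values the target vector lies much closer to the "queen" vector than to the unrelated "apple" vector, which mirrors the King/Queen example in the abstract.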
Citation impact
- FWCI: 383.51
- Percentile: 100%
- References: 22
Authors
3
Topics & keywords
- Computer science
- Natural language processing
- Word (group theory)
- Artificial intelligence
- Offset (computer science)
- Vector space
- SemEval
- Analogy
- Gender equality