Efficient Estimation of Word Representations in Vector Space
Brno University of Technology · Beijing University of Posts and Telecommunications · +1 more institution
Abstract
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.
Citation impact
- References
- 23
Authors
4
Topics & keywords
- Word (group theory)
- Computer science
- Similarity (geometry)
- Set (abstract data type)
- Artificial intelligence
- Vector space
- Natural language processing
- Task (project management)