Neural Word Embedding as Implicit Matrix Factorization

Levy, Omer; Goldberg, Yoav

Article · Neural Information Processing Systems · Dec 8, 2014 · Closed access

Abstract

We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant. We find that another embedding method, NCE, is implicitly factorizing a similar matrix, where each cell is the (shifted) log conditional probability of a word given its context. We show that using a sparse Shifted Positive PMI word-context matrix to represent words improves results on two word similarity tasks and one of two analogy tasks. When dense low-dimensional vectors are preferred, exact factorization with SVD…
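The construction the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `sppmi_matrix` and `svd_embeddings` and the parameters `counts`, `k`, and `dim` are hypothetical. The shift is taken to be log k, with k the number of negative samples, which comes from the full paper rather than from the abstract excerpt above. The symmetric square-root weighting of the singular values is one possible choice for turning the SVD into word vectors.

```python
import numpy as np


def sppmi_matrix(counts, k=1):
    """Shifted Positive PMI: max(PMI(w, c) - log k, 0) for every cell.

    counts: 2-D array of word-context co-occurrence counts
            (rows are words, columns are contexts).
    k:      assumed number of negative samples; the shift is log k.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    word_totals = counts.sum(axis=1, keepdims=True)
    context_totals = counts.sum(axis=0, keepdims=True)

    # PMI(w, c) = log( #(w, c) * N / (#(w) * #(c)) ).
    # Zero counts give -inf and empty rows or columns give nan;
    # both are handled after clipping.
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (word_totals * context_totals))

    sppmi = np.maximum(pmi - np.log(k), 0.0)
    sppmi[~np.isfinite(sppmi)] = 0.0  # nan cells from empty rows/columns
    return sppmi


def svd_embeddings(matrix, dim):
    """Dense word vectors from a truncated SVD of the matrix.

    Uses W = U_d * sqrt(S_d), a symmetric split of the singular values
    (an assumption made for this sketch).
    """
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])


if __name__ == "__main__":
    # Toy counts: two words, two contexts, no cross co-occurrence.
    toy_counts = [[10, 0], [0, 10]]
    m = sppmi_matrix(toy_counts, k=1)
    print(m)
    print(svd_embeddings(m, dim=1))
```

With a larger k the shift grows, so more cells are clipped to zero and the matrix becomes sparser. In the toy example, k=2 removes every positive cell because each PMI value equals log 2.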

Citation impact

Total citations: 1,590

FWCI: 103.43
Percentile: 100%
References: 29

Topics & keywords

Keywords

Word (group theory)
Word embedding
Computer science
Context (archaeology)
Matrix decomposition
Factorization
Embedding
Artificial intelligence
