Preprint · May 12, 2024 · Gold OA

Is Cosine-Similarity of Embeddings Really About Similarity?

Netflix (United States) · Cornell University

Indexed in: arXiv, Crossref

Abstract

Cosine-similarity is the cosine of the angle between two vectors, or equivalently the dot product between their normalizations. A popular application is to quantify semantic similarity between high-dimensional objects by applying cosine-similarity to a learned low-dimensional feature embedding. This can work better but sometimes also worse than the unnormalized dot-product between embedded vectors in practice. To gain insight into this empirical observation, we study embeddings derived from regularized linear models, where closed-form solutions facilitate analytical insights. We derive analytically how cosine-similarity can yield arbitrary and therefore meaningless 'similarities.' For some linear models the…
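The abstract's opening definition can be checked numerically. The following is a minimal sketch in plain Python, not the authors' code: the helper names (`dot`, `norm`, `normalize`, `cosine_similarity`, `cosine_via_normalization`) and the example vectors are illustrative assumptions. It shows that the cosine of the angle equals the dot product of the normalized vectors, and that cosine-similarity ignores vector length while the unnormalized dot product does not.

```python
import math


def dot(u, v):
    """Unnormalized dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))


def normalize(u):
    """Scale a vector to unit length (hypothetical helper for this sketch)."""
    n = norm(u)
    if n == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [a / n for a in u]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v: <u, v> / (|u| |v|)."""
    denom = norm(u) * norm(v)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(u, v) / denom


def cosine_via_normalization(u, v):
    """Same quantity, computed as the dot product of the normalized vectors."""
    return dot(normalize(u), normalize(v))


# Illustrative vectors only; they are not taken from the paper.
u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.5]
w = [10.0 * a for a in v]  # same direction as v, ten times longer

# The two definitions agree up to floating-point error.
assert math.isclose(cosine_similarity(u, v), cosine_via_normalization(u, v))

# Cosine-similarity is unchanged by rescaling a vector ...
assert math.isclose(cosine_similarity(u, v), cosine_similarity(u, w))

# ... whereas the unnormalized dot product scales with vector length.
assert math.isclose(dot(u, w), 10.0 * dot(u, v))
```

Because normalization discards each vector's length, the two measures can rank the same pairs of embeddings differently, which is the contrast the abstract draws between cosine-similarity and the unnormalized dot product.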

Citation impact

  • Total citations: 116
  • FWCI: 36.50
  • Percentile: 100%
  • References: 3

Authors

3

Topics & keywords

Keywords
  • Similarity (geometry)
  • Cosine similarity
  • Computer science
  • Trigonometric functions
  • Artificial intelligence
  • Mathematics
  • Pattern recognition (psychology)