articleJan 1, 2002Closed access

Similarity estimation techniques from rounding algorithms

Princeton University

Indexed incrossref

Abstract

(MATH) A locality sensitive hashing scheme is a distribution on a family $\F$ of hash functions operating on a collection of objects, such that for two objects x,y, PrhεF[h(x) = h(y)] = sim(x,y), where sim(x,y) ε [0,1] is some similarity function defined on the collection of objects. Such a scheme leads to a compact representation of objects so that similarity of objects can be estimated from their compact sketches, and also leads to efficient algorithms for approximate nearest neighbor search and clustering. Min-wise independent permutations provide an elegant construction of such a locality sensitive hashing scheme for a collection of subsets with the set similarity measure sim(A,B) = \frac{|A ∩ B|}{|A ∪…

Citation impact

664
total citations
FWCI
1.40
Percentile
100%
References
0
Citations per year

Authors

1

Topics & keywords

Keywords
  • Locality-sensitive hashing
  • Hash function
  • Rounding
  • Similarity (geometry)
  • K-independent hashing
  • Nearest neighbor search
  • Cluster analysis
  • Locality
No related works found for this paper.