articleMay 19, 2002Closed access

Similarity estimation techniques from rounding algorithms

Princeton University

Indexed incrossref

Abstract

A locality sensitive hashing scheme is a distribution on a family F of hash functions operating on a collection of objects, such that for two objects x, y, Prh∈F[h(x) = h(y)] = sim(x,y), where sim(x,y) ∈ [0, 1] is some similarity function defined on the collection of objects. Such a scheme leads to a compact representation of objects so that similarity of objects can be estimated from their compact sketches, and also leads to efficient algorithms for approximate nearest neighbor search and clustering. Min-wise independent permutations provide an elegant construction of such a locality sensitive hashing scheme for a collection of subsets with the set similarity measure sim(A, B) = |A∩B| |A∪B |. We show that…

Citation impact

2,234
total citations
FWCI
4.42
Percentile
100%
References
48
Citations per year

Authors

1

Topics & keywords

Keywords
  • Rounding
  • Computer science
  • Similarity (geometry)
  • Algorithm
  • Estimation
  • Artificial intelligence
  • Pattern recognition (psychology)
  • Data mining
No related works found for this paper.