articleMay 19, 2002Closed access
Similarity estimation techniques from rounding algorithms
Indexed incrossref
Abstract
A locality sensitive hashing scheme is a distribution on a family F of hash functions operating on a collection of objects, such that for two objects x, y, Prh∈F[h(x) = h(y)] = sim(x,y), where sim(x,y) ∈ [0, 1] is some similarity function defined on the collection of objects. Such a scheme leads to a compact representation of objects so that similarity of objects can be estimated from their compact sketches, and also leads to efficient algorithms for approximate nearest neighbor search and clustering. Min-wise independent permutations provide an elegant construction of such a locality sensitive hashing scheme for a collection of subsets with the set similarity measure sim(A, B) = |A∩B| |A∪B |. We show that…
Citation impact
2,234
total citations
- FWCI
- 4.42
- Percentile
- 100%
- References
- 48
Citations per year
Authors
1Topics & keywords
Topics
Keywords
- Rounding
- Computer science
- Similarity (geometry)
- Algorithm
- Estimation
- Artificial intelligence
- Pattern recognition (psychology)
- Data mining
No related works found for this paper.