Abstract
Semantic hashing[1] seeks compact binary codes of data-points so that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel data-point. Taken together, both learning the code and applying it…
Citation impact
2,090
total citations
- FWCI
- 46.13
- Percentile
- 100%
- References
- 17
Citations per year
Authors
3Topics & keywords
Topics
Keywords
- Binary code
- Laplacian matrix
- Eigenvalues and eigenvectors
- Computer science
- Hamming distance
- Hash function
- Laplace operator
- Graph
No related works found for this paper.