Abstract

Semantic hashing[1] seeks compact binary codes of data-points so that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel data-point. Taken together, both learning the code and applying it…

Citation impact

2,090
total citations
FWCI
46.13
Percentile
100%
References
17
Citations per year

Authors

3

Topics & keywords

Keywords
  • Binary code
  • Laplacian matrix
  • Eigenvalues and eigenvectors
  • Computer science
  • Hamming distance
  • Hash function
  • Laplace operator
  • Graph
No related works found for this paper.