Abstract
Words and phrases acquire meaning from the way they are used in society, from their relative semantics to other words and phrases. For computers, the equivalent of "society" is "database," and the equivalent of "use" is "a way to search the database". We present a new theory of similarity between words and phrases based on information distance and Kolmogorov complexity. To fix thoughts, we use the World Wide Web (WWW) as the database, and Google as the search engine. The method is also applicable to other search engines and databases. This theory is then applied to construct a method to automatically extract similarity, the Google similarity distance, of words and phrases from the WWW using Google page counts.…
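As a rough illustration of how a similarity distance can be computed from page counts alone, the sketch below implements the commonly cited normalized Google distance formula, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). The formula does not appear in the abstract itself, so treat it as an assumption about the method. The function name, parameter names, and the counts in the usage example are illustrative and are not live search results.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Sketch of a page-count similarity distance (assumed NGD formula).

    fx, fy -- number of pages containing term x, term y (hypothetical inputs)
    fxy    -- number of pages containing both terms
    n      -- assumed total number of indexed pages
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # No occurrences or no co-occurrence: treat the terms as maximally distant.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    # The ratio is independent of the logarithm base.
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Illustrative counts only; real values would come from a search engine.
example = normalized_google_distance(46_700_000, 12_200_000, 2_630_000, 8_058_044_651)
print(round(example, 3))
```

Smaller values indicate terms that tend to appear on the same pages. Two terms that always co-occur give a distance of 0, and terms that never co-occur are mapped to infinity in this sketch.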
Citation impact
- Total citations: 1,763
- FWCI: 122.16
- Percentile: 100%
- References: 44
Authors: 2
Topics & keywords
- WordNet
- Computer science
- Information retrieval
- Semantics (computer science)
- Similarity (geometry)
- Semantic similarity
- Context (archaeology)
- Cluster analysis
UN Sustainable Development Goals
- Quality Education