SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation
University of Cambridge · Technion – Israel Institute of Technology
Abstract
We present SimLex-999, a gold standard resource for evaluating distributional semantic models that improves on existing resources in several important ways. First, in contrast to gold standards such as WordSim-353 and MEN, it explicitly quantifies similarity rather than association or relatedness so that pairs of entities that are associated but not actually similar (Freud, psychology) have a low rating. We show that, via this focus on similarity, SimLex-999 incentivizes the development of models with a different, and arguably wider, range of applications than those which reflect conceptual association. Second, SimLex-999 contains a range of concrete and abstract adjective, noun, and verb pairs, together with…
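To make concrete what evaluating a model against a similarity gold standard can look like, here is a minimal sketch. It assumes the model is scored by Spearman rank correlation between its cosine similarities and the human ratings, a common choice for resources of this kind; the visible part of the abstract does not state the metric. The function names (`cosine`, `average_ranks`, `spearman`, `evaluate`), the toy vectors, and the toy ratings are all hypothetical and are not taken from SimLex-999.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def evaluate(vectors, gold):
    """Correlate model similarities with gold ratings over covered pairs."""
    model_scores, human_scores = [], []
    for (w1, w2), rating in gold.items():
        if w1 in vectors and w2 in vectors:  # skip out-of-vocabulary pairs
            model_scores.append(cosine(vectors[w1], vectors[w2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Hypothetical 2-d "embeddings" and made-up ratings, for illustration only.
toy_vectors = {
    "coast": [1.0, 0.0], "shore": [0.9, 0.1],
    "cup": [0.0, 1.0], "mug": [0.3, 0.9],
    "freud": [1.0, 1.0], "psychology": [1.0, -0.5],
}
toy_gold = {
    ("coast", "shore"): 9.0,
    ("cup", "mug"): 7.5,
    ("freud", "psychology"): 2.0,  # associated but not similar: low rating
}

rho = evaluate(toy_vectors, toy_gold)
```

Rank correlation is used in this sketch because it compares only the ordering of pairs, so the model's similarity scores do not need to be on the same scale as the human ratings.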
Citation impact
- FWCI: 198.11
- Percentile: 100%
- References: 77
Authors: 3
Topics & keywords
- Concreteness
- Computer science
- Natural language processing
- Noun
- Artificial intelligence
- Adjective
- Semantic similarity
- Similarity (geometry)