Article · Jun 14, 2009 · Closed access
Information theoretic measures for clusterings comparison
Data61 · UNSW Sydney · +1 more institution
Indexed in Crossref
Abstract
Information-theoretic measures form a fundamental class of similarity measures for comparing clusterings, alongside the pair-counting and set-matching classes. In this paper, we discuss the necessity of correction for chance for information-theoretic measures for clusterings comparison. We observe that the baseline for such measures, i.e. the average value between random partitions of a data set, does not take on a constant value, and tends to have larger variation when the ratio between the number of data points and the number of clusters is small. A similar effect occurs in some non-information-theoretic measures, such as the well-known Rand Index. Assuming a…
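The non-constant baseline described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method: it assumes, purely for illustration, that each random partition is drawn by assigning every point a cluster label independently and uniformly at random, and the helper names (`mutual_information`, `random_labels`, `baseline_mi`) are hypothetical. It estimates the average mutual information between pairs of such random partitions.

```python
import math
import random
from collections import Counter


def mutual_information(u, v):
    """Mutual information (in nats) between two labelings of the same points."""
    n = len(u)
    joint = Counter(zip(u, v))   # contingency table counts n_ab
    count_u = Counter(u)         # cluster sizes in the first clustering
    count_v = Counter(v)         # cluster sizes in the second clustering
    mi = 0.0
    for (a, b), n_ab in joint.items():
        mi += (n_ab / n) * math.log(n_ab * n / (count_u[a] * count_v[b]))
    return mi


def random_labels(n_points, n_clusters, rng):
    """Assumed random model: independent, uniform label per point."""
    return [rng.randrange(n_clusters) for _ in range(n_points)]


def baseline_mi(n_points, n_clusters, trials=200, seed=0):
    """Average MI between pairs of independent random partitions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        u = random_labels(n_points, n_clusters, rng)
        v = random_labels(n_points, n_clusters, rng)
        total += mutual_information(u, v)
    return total / trials


if __name__ == "__main__":
    # Same number of points, increasing number of clusters:
    # the chance-level MI is not a fixed constant.
    for k in (2, 5, 10, 20):
        print(k, round(baseline_mi(100, k), 4))
```

Under this assumed model, the estimated baseline grows as the number of clusters increases for a fixed number of points, which is why an uncorrected score can look high for unrelated clusterings when points per cluster are few.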
Citation impact
926 total citations
- FWCI: 35.76
- Percentile: 100%
- References: 16
Authors
3
Topics & keywords
Keywords
- Randomness
- Mutual information
- Class (philosophy)
- Mathematics
- Hypergeometric distribution
- Set (abstract data type)
- Matching (statistics)
- Similarity (geometry)