Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance
UNSW Sydney · The University of Melbourne
Abstract
Information theoretic measures form a fundamental class of measures for comparing clusterings, and have recently received increasing interest. Nevertheless, a number of questions concerning their properties and inter-relationships remain unresolved. In this paper, we perform an organized study of information theoretic measures for clustering comparison, including several existing popular measures in the literature, as well as some newly proposed ones. We discuss and prove their important properties, such as the metric property and the normalization property. We then highlight to the clustering community the importance of correcting information theoretic measures for chance, especially when the data size is…
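As a rough illustration of the kind of measure the abstract refers to, the following minimal sketch computes the mutual information between two clusterings and one possible normalization of it. The function names (`entropy`, `mutual_information`, `nmi_max`) and the choice of normalizing by the larger of the two entropies are assumptions made for this example, not necessarily the variants the paper recommends. The sketch also applies no correction for chance.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """Mutual information (in nats) between two clusterings of the same items."""
    n = len(u)
    count_u, count_v = Counter(u), Counter(v)
    joint = Counter(zip(u, v))  # contingency table, nonzero cells only
    return sum(
        (n_ij / n) * log(n * n_ij / (count_u[a] * count_v[b]))
        for (a, b), n_ij in joint.items()
    )


def nmi_max(u, v):
    """MI divided by max(H(U), H(V)); an illustrative normalization choice.

    By convention here, two trivial one-cluster clusterings score 1.0.
    """
    denom = max(entropy(u), entropy(v))
    return mutual_information(u, v) / denom if denom > 0 else 1.0


# Hypothetical label vectors: v is u with the cluster names permuted.
u = [0, 0, 1, 1, 2, 2]
v = [1, 1, 0, 0, 2, 2]
print(nmi_max(u, v))
```

Because the two example clusterings differ only in how the clusters are named, the normalized score should come out at approximately 1. Two clusterings whose contingency table factorizes exactly, such as `[0, 0, 1, 1]` and `[0, 1, 0, 1]`, give a mutual information of 0.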
Citation impact
- FWCI: 44.73
- Percentile: 100%
- References: 32
Authors: 3
Topics & keywords
- Normalization (sociology)
- Cluster analysis
- Metric (unit)
- Mathematics
- Property (philosophy)
- Measure (data warehouse)
- Information theory
- Class (philosophy)