Integrating constraints and metric learning in semi-supervised clustering
The University of Texas at Austin
Abstract
Semi-supervised clustering employs a small amount of labeled data to aid unsupervised learning. Previous work in the area has utilized supervised data in one of two approaches: 1) constraint-based methods that guide the clustering algorithm towards a better grouping of the data, and 2) distance-function learning methods that adapt the underlying similarity metric used by the clustering algorithm. This paper provides new methods for both approaches and presents a new semi-supervised clustering algorithm that integrates the two techniques in a uniform, principled framework. Experimental results demonstrate that the unified approach produces better clusters than both individual approaches as well…
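To make the two ingredients named in the abstract concrete, the sketch below scores a cluster assignment with a generic objective that combines them: a per-feature weighted distance (a simple stand-in for a learned metric) and penalties for violated pairwise constraints. This is an illustration only and not the paper's algorithm. The function names, the diagonal weight vector `w`, the must-link/cannot-link pair lists, and the constant `penalty` are all assumptions introduced here.

```python
import numpy as np

def weighted_sq_dist(x, y, w):
    """Squared Euclidean distance with per-feature weights w (a diagonal metric)."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sum(np.asarray(w, dtype=float) * d * d))

def constrained_objective(X, labels, centroids, w,
                          must_link, cannot_link, penalty=1.0):
    """Weighted distortion plus a fixed penalty per violated constraint.

    must_link / cannot_link are lists of index pairs (i, j); this constant
    penalty is a hypothetical simplification chosen for the sketch.
    """
    # Distance of every point to its assigned centroid under the weighted metric.
    cost = sum(weighted_sq_dist(X[i], centroids[labels[i]], w)
               for i in range(len(X)))
    # A must-link pair is violated when its points land in different clusters.
    cost += penalty * sum(1 for i, j in must_link if labels[i] != labels[j])
    # A cannot-link pair is violated when its points share a cluster.
    cost += penalty * sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return cost

# Toy data: two well-separated groups along the first feature.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])
centroids = np.array([[0.0, 0.5], [4.0, 0.5]])
must_link = [(0, 1)]
cannot_link = [(1, 2)]

good = constrained_objective(X, [0, 0, 1, 1], centroids, [1.0, 1.0],
                             must_link, cannot_link)
bad = constrained_objective(X, [0, 1, 1, 1], centroids, [1.0, 1.0],
                            must_link, cannot_link)
print(good, bad)
```

The assignment that respects both constraints scores lower than the one that violates them. Changing `w` changes how much each feature contributes to the distortion term, which is the role a learned distance function plays in this simplified picture.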
Citation impact
- FWCI: 39.28
- Percentile: 100%
- References: 16
Authors: 3
Topics & keywords
- Cluster analysis
- Constrained clustering
- Computer science
- Correlation clustering
- Canopy clustering algorithm
- Artificial intelligence
- Semi-supervised learning
- Metric (unit)