Article · Jan 7, 2007 · Closed access

k-means++: the advantages of careful seeding

Stanford University

Abstract

The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a simple, randomized seeding technique, we obtain an algorithm that is O(log k)-competitive with the optimal clustering. Experiments show our augmentation improves both the speed and the accuracy of k-means, often quite dramatically.
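The abstract does not spell out the seeding rule, so the following is only a minimal sketch of one way a "simple, randomized seeding technique" can look. It assumes the D²-weighted rule commonly associated with k-means++: the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The names `squared_distance`, `seed_centers`, and `potential` are hypothetical helpers introduced for illustration, not taken from the paper.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def seed_centers(points, k, rng=None):
    """Pick k initial centers from `points` (assumed D^2-weighted rule)."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    # First center: uniform at random over the data.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center so far.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Assumption: sample proportionally to squared distance.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Only the new center can shrink a point's nearest-center distance.
        d2 = [min(old, squared_distance(p, nxt)) for old, p in zip(d2, points)]
    return centers


def potential(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)
```

The returned centers would then be handed to an ordinary k-means loop as its starting configuration. `potential` computes the sum of squared distances, which differs from the average squared distance in the abstract only by a factor of the number of points, so the two rank any set of seedings identically.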

Citation impact

Total citations: 6,298
FWCI: 103.61
Percentile: 100%
References: 21

Authors

2

Topics & keywords

Keywords
  • Seeding
  • Cluster analysis
  • Computer science
  • Simplicity
  • Simple (philosophy)
  • k-means clustering
  • Algorithm
  • Cluster (spacecraft)