The k-means Algorithm: A Comprehensive Survey and Performance Evaluation
Edith Cowan University · McGill University · +1 more institution
Abstract
The k-means clustering algorithm is considered one of the most powerful and popular data mining algorithms in the research community. Despite its popularity, the algorithm has several limitations. Random initialization of the centroids can lead to unexpected convergence. The number of clusters must also be defined beforehand, a choice that affects the resulting cluster shapes and the influence of outliers. A further fundamental problem of the k-means algorithm is its inability to handle various data types. This paper provides a structured and synoptic overview of research conducted on the k-means algorithm to overcome such shortcomings.…
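The initialization problem named in the abstract can be illustrated with a minimal sketch of the standard Lloyd iteration in plain Python. The function names (`sq_dist`, `lloyd_kmeans`) and the four-point toy dataset are assumptions made for this illustration and do not come from the paper. The same data is clustered from two different starting centroid sets, and each run converges to a different fixed point.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, centroids, max_iter=100):
    """Run Lloyd's iteration from the given initial centroids.

    Returns the final centroids and the sum of squared errors (SSE).
    Hypothetical helper written for illustration only.
    """
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # no centroid moved, so the iteration has converged
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Toy data (assumed): corners of a 4 x 1 rectangle, clustered with k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start A: one centroid near each short side of the rectangle.
centroids_a, sse_a = lloyd_kmeans(points, [(0.0, 0.5), (4.0, 0.5)])
# Start B: both centroids on the vertical midline.
centroids_b, sse_b = lloyd_kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print(sse_a)  # 1.0
print(sse_b)  # 16.0
```

Both runs stop at a stable configuration, but start B groups the points into a bottom and a top cluster with a much larger SSE than the left/right split found from start A. This is the kind of initialization-dependent local optimum the abstract refers to.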
Citation impact
- FWCI: 62.70
- Percentile: 100%
- References: 88
Authors: 3
Topics & keywords
- Cluster analysis
- Computer science
- Data mining
- Algorithm
- Convergence (economics)
- Initialization
- Centroid
- Outlier