Article | Jan 1, 2004 | Closed access

K-means clustering via principal component analysis

Lawrence Berkeley National Laboratory

Indexed in Crossref

Abstract

Principal component analysis (PCA) is a widely used statistical technique for unsupervised dimension reduction. K-means clustering is a commonly used data clustering method for unsupervised learning tasks. Here we prove that principal components are the continuous solutions to the discrete cluster membership indicators for K-means clustering. New lower bounds for the K-means objective function are derived, given by the total variance minus the eigenvalues of the data covariance matrix. These results indicate that unsupervised dimension reduction is closely related to unsupervised learning. Several implications are discussed. On dimension reduction, the result provides new insights into the observed effectiveness…
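The two claims in the abstract can be checked numerically on toy data. The sketch below is not taken from the paper. It assumes the bound is stated with the unnormalized scatter matrix of the centered data, so that "total variance" means the total sum of squares and the subtracted terms are the top K-1 eigenvalues. The helper names (`kmeans_objective`, `pca_lower_bound`, `pc1_split`) and the synthetic two-blob dataset are hypothetical, chosen only for illustration.

```python
import numpy as np


def kmeans_objective(X, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return total


def pca_lower_bound(X, k):
    """Total sum of squares minus the top k-1 eigenvalues of the scatter matrix.

    Assumption: eigenvalues are taken from the unnormalized scatter matrix
    Y^T Y of the centered data Y, matching the unnormalized objective above.
    """
    Y = X - X.mean(axis=0)
    eigvals = np.linalg.eigvalsh(Y.T @ Y)  # returned in ascending order
    top = eigvals[::-1][: k - 1]           # the k-1 largest eigenvalues
    return (Y ** 2).sum() - top.sum()


def pc1_split(X):
    """Two-way split by the sign of the first principal component score."""
    Y = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Y, full_matrices=False)
    scores = Y @ vt[0]                     # projection onto the first PC
    return (scores > 0).astype(int)


# Hypothetical toy data: two well-separated Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-5.0, 1.0, size=(50, 2)),
    rng.normal(5.0, 1.0, size=(50, 2)),
])

labels = pc1_split(X)
J = kmeans_objective(X, labels)
bound = pca_lower_bound(X, k=2)
print(f"objective of PC1 split: {J:.2f}, PCA lower bound: {bound:.2f}")
```

Because the bound holds for the optimal K-means objective, it also holds for the objective of any particular partition, including the one obtained from the sign of the first principal component. On well-separated data such as this, that sign split would be expected to coincide with the two blobs.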

Citation impact

Total citations: 1,393
FWCI: 6.87
Percentile: 100%
References: 23
Authors

2

Topics & keywords

Keywords
  • Principal component analysis
  • Cluster analysis
  • Dimensionality reduction
  • Unsupervised learning
  • Pattern recognition (psychology)
  • Sparse PCA
  • Singular value decomposition
  • Artificial intelligence

Funding