On Consistency and Sparsity for Principal Components Analysis in High Dimensions
Stanford University · Stanford Health Care
Abstract
Principal components analysis (PCA) is a classic method for reducing the dimensionality of data in the form of n observations (or cases) of a vector with p variables. Contemporary datasets often have p comparable with, or even much larger than, n. Our main assertions, in such settings, are (a) that some initial reduction in dimensionality is desirable before applying any PCA-type search for principal modes, and (b) that this initial reduction is best achieved by working in a basis in which the signals have a sparse representation. We describe a simple asymptotic model in which the estimate of the leading principal component vector via standard PCA is consistent if and only if p(n)/n→0. We provide…
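The consistency claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure. It assumes a single-component model x_i = v_i ρ + σ z_i with a unit vector ρ that is already sparse in the standard basis, and it uses an oracle rule that keeps the k highest-variance coordinates before running PCA. The function names, the parameter values, and the choice of k are all hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(n, p, k=4, signal=2.0, sigma=1.0, seed=0):
    """Draw n observations x_i = v_i * rho + sigma * z_i in R^p.

    Assumed model: rho is a unit vector supported on the first k
    coordinates, v_i ~ N(0, signal^2), and z_i ~ N(0, I_p).
    """
    rng = np.random.default_rng(seed)
    rho = np.zeros(p)
    rho[:k] = 1.0 / np.sqrt(k)
    v = rng.normal(scale=signal, size=(n, 1))
    x = v * rho + sigma * rng.normal(size=(n, p))
    return x, rho


def leading_pc(x):
    """Leading eigenvector of the sample covariance, via SVD of centred data."""
    xc = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return vt[0]


def select_then_pca(x, k):
    """Keep the k coordinates with the largest sample variance, then run PCA.

    Using a known k is an oracle simplification made for this sketch.
    """
    keep = np.argsort(x.var(axis=0))[-k:]
    est = np.zeros(x.shape[1])
    est[keep] = leading_pc(x[:, keep])
    return est


def overlap(a, b):
    """Absolute cosine of the angle between two vectors (1 = same direction)."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


def compare(n, p, k=4):
    """Return (overlap of standard PCA, overlap of select-then-PCA) with rho."""
    x, rho = simulate(n, p, k=k)
    return overlap(leading_pc(x), rho), overlap(select_then_pca(x, k), rho)
```

Calling `compare(200, 20)` and then `compare(200, 8000)` contrasts a regime where p/n is small with one where p is much larger than n. Under these assumptions, the standard PCA estimate should align closely with ρ in the first case and poorly in the second, while the estimate computed after coordinate selection should remain well aligned.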
Citation impact
- FWCI: 29.63
- Percentile: 100%
- References: 45
Authors (2)
Topics & keywords
- Principal component analysis
- Dimensionality reduction
- Curse of dimensionality
- Sparse PCA
- Consistency (statistics)
- Mathematics