CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data
Victor Chang Cardiac Research Institute · UNSW Sydney · +1 more institution
Abstract
Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts by heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet very simple implicit imputation approach to alleviate the impact of dropouts in scRNA-seq data in a principled manner. Using a range of simulated and real data, we show that CIDR improves the standard principal component analysis and outperforms the state-of-the-art methods, namely t-SNE, ZIFA, and RaceID, in terms of clustering accuracy. CIDR typically completes within seconds when processing a data set of…
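The implicit imputation idea mentioned in the abstract can be illustrated with a small sketch. This is not the CIDR implementation: the logistic `dropout_weight` function, its `midpoint` and `steepness` parameters, and the zero `threshold` are all hypothetical choices made for illustration. The sketch assumes that when one cell shows a zero for a gene and the other does not, the zero is treated as a dropout candidate and pulled toward the other cell's value, weighted by an assumed dropout probability, before the pairwise distance is computed.

```python
import math


def dropout_weight(expr, midpoint=2.0, steepness=1.5):
    """Hypothetical decreasing logistic curve: the assumed probability
    that a gene expressed at level `expr` is observed as a dropout."""
    return 1.0 / (1.0 + math.exp(steepness * (expr - midpoint)))


def euclidean(cell_a, cell_b):
    """Plain Euclidean distance between two expression vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cell_a, cell_b)))


def imputed_distance(cell_a, cell_b, threshold=0.0):
    """Distance in which a dropout candidate (value <= threshold) in one
    cell is shifted toward the other cell's value for that gene.
    Illustrative assumption only, not the published algorithm."""
    total = 0.0
    for a, b in zip(cell_a, cell_b):
        a_drop = a <= threshold
        b_drop = b <= threshold
        if a_drop and not b_drop:
            w = dropout_weight(b)
            a = w * b + (1.0 - w) * a
        elif b_drop and not a_drop:
            w = dropout_weight(a)
            b = w * a + (1.0 - w) * b
        # If both or neither are dropout candidates, values are used as-is.
        total += (a - b) ** 2
    return math.sqrt(total)


# Two toy cells that differ only by a zero in a lowly expressed gene.
cell_a = [5.0, 1.0, 3.0]
cell_b = [5.0, 0.0, 3.0]
plain = euclidean(cell_a, cell_b)
adjusted = imputed_distance(cell_a, cell_b)
```

In this toy case the adjusted distance is smaller than the plain Euclidean distance, because the zero in a lowly expressed gene is treated as a likely dropout. A dissimilarity matrix built this way could then be passed to a dimensionality reduction step and a clustering step, which are not shown here.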
Citation impact
- FWCI: 25.12
- Percentile: 100%
- References: 30
Authors: 3
Topics & keywords
- Cluster analysis
- Imputation (statistics)
- Dimensionality reduction
- Computer science
- Principal component analysis
- Data mining
- Curse of dimensionality
- Data set