Automated optimized parameters for T-distributed stochastic neighbor embedding improve visualization and analysis of large datasets
Boston University · Kansas State University · +2 more institutions
Abstract
Accurate and comprehensive extraction of information from high-dimensional single cell datasets necessitates faithful visualizations to assess biological populations. A state-of-the-art algorithm for non-linear dimension reduction, t-SNE, requires multiple heuristics and fails to produce clear representations of datasets when millions of cells are projected. We develop opt-SNE, an automated toolkit for t-SNE parameter selection that utilizes Kullback-Leibler divergence evaluation in real time to tailor the early exaggeration and overall number of gradient descent iterations in a dataset-specific manner. The precise calibration of early exaggeration together with opt-SNE adjustment of gradient descent…
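As a rough illustration of what evaluating Kullback-Leibler divergence "in real time" could look like, the sketch below computes KL(P || Q) for two discrete distributions and applies a simple stopping check to a running history of KL values. It is not the opt-SNE implementation: the function names `kl_divergence` and `should_stop`, the `rel_tol` threshold, and the stopping rule itself are assumptions made for this example only.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) for two discrete distributions given as arrays that each sum to 1."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p == 0 contribute nothing to the sum
    # Clamp q from below so a zero entry does not produce a division by zero.
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

def should_stop(kl_history, rel_tol=1e-4):
    """Hypothetical rule (an assumption, not the published criterion):
    stop once the latest KL improvement is smaller than rel_tol times
    the previous KL value."""
    if len(kl_history) < 2:
        return False
    prev, curr = kl_history[-2], kl_history[-1]
    return (prev - curr) < rel_tol * abs(prev)
```

In a gradient descent loop, one would append the current KL value to `kl_history` after each iteration and end the current phase (for example, early exaggeration) when `should_stop` returns `True`. The threshold would need tuning per dataset.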
Citation impact
- FWCI: 17.77
- Percentile: 100%
- References: 45
Authors: 6
Topics & keywords
- Computer science
- Visualization
- Heuristics
- Dimensionality reduction
- Divergence (linguistics)
- Data mining
- Stochastic gradient descent
- Kullback–Leibler divergence