Article · Nature Communications · Nov 28, 2019 · Gold OA

Automated optimized parameters for T-distributed stochastic neighbor embedding improve visualization and analysis of large datasets

Boston University · Kansas State University · +2 more institutions

Indexed in: Crossref, DOAJ

Abstract

Accurate and comprehensive extraction of information from high-dimensional single cell datasets necessitates faithful visualizations to assess biological populations. A state-of-the-art algorithm for non-linear dimension reduction, t-SNE, requires multiple heuristics and fails to produce clear representations of datasets when millions of cells are projected. We develop opt-SNE, an automated toolkit for t-SNE parameter selection that utilizes Kullback-Leibler divergence evaluation in real time to tailor the early exaggeration and overall number of gradient descent iterations in a dataset-specific manner. The precise calibration of early exaggeration together with opt-SNE adjustment of gradient descent…
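The idea of evaluating Kullback-Leibler divergence during gradient descent can be illustrated with a minimal Python sketch. It computes the KL divergence between high-dimensional affinities P and the Student-t affinities Q of an embedding, then checks whether the relative change between successive evaluations has become small. The function names and the `rel_tol` threshold are hypothetical, chosen for illustration only; this is not the opt-SNE implementation or its actual stopping rule.

```python
import numpy as np


def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) for two joint-probability matrices that each sum to 1."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    mask = P > 0  # entries with P == 0 contribute nothing to the sum
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))


def student_t_affinities(Y):
    """Low-dimensional affinities Q from an embedding Y (Student-t kernel, as in t-SNE)."""
    Y = np.asarray(Y, dtype=float)
    sq_dists = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    W = 1.0 / (1.0 + sq_dists)
    np.fill_diagonal(W, 0.0)  # a point is not its own neighbor
    return W / W.sum()


def relative_change(kl_history):
    """Relative decrease in KL between the last two recorded values."""
    if len(kl_history) < 2:
        return float("inf")  # not enough evaluations to judge yet
    prev, curr = kl_history[-2], kl_history[-1]
    return (prev - curr) / abs(prev) if prev != 0 else 0.0


def should_stop(kl_history, rel_tol=1e-4):
    """Hypothetical rule: stop once KL changes by less than rel_tol (assumed value)."""
    return abs(relative_change(kl_history)) < rel_tol
```

In an optimization loop, one would append `kl_divergence(P, student_t_affinities(Y))` to a history list after each iteration (or every few iterations) and consult `should_stop` to decide when to end a phase, rather than fixing the iteration count in advance.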

Citation impact

  • Total citations: 615
  • FWCI: 17.77
  • Percentile: 100%
  • References: 45

Authors

6

Topics & keywords

Keywords
  • Computer science
  • Visualization
  • Heuristics
  • Dimensionality reduction
  • Divergence (linguistics)
  • Data mining
  • Stochastic gradient descent
  • Kullback–Leibler divergence

Funding