Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression
Abstract
Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from "regularized negative binomial regression," where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an…
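The core idea in the abstract, Pearson residuals from a per-gene count regression with sequencing depth as a covariate, can be illustrated with a minimal sketch. Everything beyond that idea is an assumption made for illustration and is not taken from the text above: the log10 depth covariate, the Poisson fit for the mean (via iteratively reweighted least squares), the fixed dispersion `theta`, the clipping at the square root of the cell count, and the helper names `fit_poisson_glm` and `pearson_residuals`. The regularization step named in the abstract is deliberately left out, so this is not the authors' full method. The sketch also assumes every gene has at least one nonzero count.

```python
import numpy as np

def fit_poisson_glm(y, x, n_iter=25):
    """Fit log(mu) = b0 + b1 * x with iteratively reweighted least squares."""
    X = np.column_stack([np.ones_like(x), x])
    # Start from an intercept-only model (slope 0).
    beta = np.array([np.log(y.mean() + 1e-8), 0.0])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu          # working response for the log link
        XtW = X.T * mu                   # Poisson working weights equal mu
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

def pearson_residuals(counts, theta=100.0):
    """counts: genes x cells matrix. theta is a fixed, hypothetical dispersion."""
    counts = np.asarray(counts, dtype=float)
    log_depth = np.log10(counts.sum(axis=0))   # per-cell sequencing depth
    clip = np.sqrt(counts.shape[1])            # assumed clipping bound
    res = np.empty_like(counts)
    for g in range(counts.shape[0]):
        b0, b1 = fit_poisson_glm(counts[g], log_depth)
        mu = np.exp(b0 + b1 * log_depth)
        var = mu + mu**2 / theta               # negative binomial variance
        res[g] = (counts[g] - mu) / np.sqrt(var)
    return np.clip(res, -clip, clip)

# Simulated data: 20 genes, 300 cells, depth varying about tenfold across cells.
rng = np.random.default_rng(0)
size_factor = rng.uniform(0.5, 5.0, size=300)
gene_rate = np.linspace(1.0, 20.0, 20)
counts = rng.poisson(np.outer(gene_rate, size_factor))

residuals = pearson_residuals(counts)
log_depth = np.log10(counts.sum(axis=0))
# Correlation with depth, per gene, before and after the transformation.
raw_corr = np.array([np.corrcoef(g, log_depth)[0, 1] for g in counts])
res_corr = np.array([np.corrcoef(r, log_depth)[0, 1] for r in residuals])
```

Comparing `raw_corr` with `res_corr` shows the intended effect on this toy data: raw counts track sequencing depth, while the residuals should be close to uncorrelated with it. Real data would need the parameter regularization that the abstract describes.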
Citation impact
- FWCI: 148.16
- Percentile: 100%
- References: 39
Authors
- Christoph Hafemeister (corresponding author), New York Genome Center
- Rahul Satija, New York Genome Center
Topics & keywords
- Overfitting
- Covariate
- Normalization
- Pooling
- Negative binomial distribution
- Count data
- Biology
- Regression