Statistics or biology: the zero-inflation controversy about scRNA-seq data
University of California, Los Angeles
Indexed in: Crossref, DOAJ, PubMed
Abstract
Researchers view vast zeros in single-cell RNA-seq data differently: some regard zeros as biological signals representing no or low gene expression, while others regard zeros as missing data to be corrected. To help address the controversy, here we discuss the sources of biological and non-biological zeros; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types: observed counts, imputed counts, and binarized counts; discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.
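As a rough illustration of two ideas named in the abstract, binarized counts and the addition of non-biological zeros to a count matrix, the following is a minimal sketch on toy data. The function names, the Poisson toy matrix, and the uniform random masking rule are all assumptions made for illustration. The masking rule in particular is a generic stand-in and is not claimed to be one of the five mechanisms the paper introduces.

```python
import numpy as np


def zero_fraction(counts):
    """Proportion of entries in the count matrix that are exactly zero."""
    return float(np.mean(counts == 0))


def binarize(counts):
    """Binarized counts: 1 where a gene is detected (count > 0), else 0."""
    return (counts > 0).astype(np.int8)


def mask_at_random(counts, rate, rng):
    """Hypothetical masking rule (an assumption, not the paper's method):
    set each entry to zero independently with probability `rate`."""
    keep = rng.random(counts.shape) >= rate
    return counts * keep


rng = np.random.default_rng(0)
# Toy genes-by-cells matrix; real scRNA-seq counts are far sparser and larger.
observed = rng.poisson(lam=2.0, size=(5, 8))
masked = mask_at_random(observed, rate=0.3, rng=rng)
binary = binarize(masked)

print("zero fraction before masking:", zero_fraction(observed))
print("zero fraction after masking: ", zero_fraction(masked))
```

Masking can only turn nonzero entries into zeros, so the zero fraction after masking is never lower than before. In a benchmark, this is what lets the added zeros be treated as known non-biological zeros.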
Citation impact
- Total citations: 640
- FWCI: 52.93
- Percentile: 100%
- References: 128
Authors: 4
Topics & keywords
- Biology
- Biological data
- Benchmarking
- Benchmark (surveying)
- Computational biology
- Statistics
- Econometrics
- Mathematics
Funding
- National Science Foundation (Awards: 2113754, 1846216, DMS-2113754)
- Alfred P. Sloan Foundation
- W. M. Keck Foundation
- National Institutes of Health (Awards: R01GM120507, R35GM140888)
- Directorate for Biological Sciences (Award: 1846216)
- Directorate for Mathematical and Physical Sciences (Award: 2113754)
- Johnson and Johnson (Award: WiSTEM2D Award)
- National Institute of General Medical Sciences (Awards: R35GM140888, R01GM120507)
- David Geffen School of Medicine, University of California, Los Angeles