CellFM: a large-scale foundation model pre-trained on transcriptomics of 100 million human cells
Sun Yat-sen University · Chongqing University · +3 more institutions
Abstract
Single-cell sequencing provides transcriptomic profiling at single-cell resolution, uncovering cellular heterogeneity with unprecedented precision. Yet current single-cell data analysis suffers from inherent noise, batch effects, and sparsity, highlighting the need for a unified model to represent cellular states. To address this problem, many recent efforts have focused on training single-cell foundation models on large datasets. However, current human foundation models are still limited by the size of their training data and their number of parameters. Here, we have collected a diverse dataset of 100 million human cells, on which we train a single-cell foundation model (CellFM) containing 800 million…
Citation impact
- FWCI: 28.77
- Percentile: 100%
- References: 62
Authors
16
Topics & keywords
- Foundation (evidence)
- Scale (ratio)
- Transcriptome
- Computational biology
- Computer science
- Data science
- Bioinformatics
- Biology