Article · Nature Communications · May 20, 2025 · Gold OA

CellFM: a large-scale foundation model pre-trained on transcriptomics of 100 million human cells

Sun Yat-sen University · Chongqing University · +3 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Single-cell sequencing provides transcriptomic profiling at single-cell resolution, uncovering cellular heterogeneity with unprecedented precision. Yet current single-cell data analysis suffers from inherent noise, batch effects, and sparsity, highlighting the need for a unified model to represent cellular states. To address this problem, many recent efforts have focused on training single-cell foundation models on large datasets. However, current human foundation models are still limited by the size of their training data and the number of model parameters. Here, we have collected a diverse dataset of 100 million human cells, on which we train a single-cell foundation model (CellFM) containing 800 million…

Citation impact

Total citations: 46
FWCI: 28.77
Percentile: 100%
References: 62

Authors

16 authors

Topics & keywords

Keywords
  • Foundation (evidence)
  • Scale (ratio)
  • Transcriptome
  • Computational biology
  • Computer science
  • Data science
  • Bioinformatics
  • Biology

Funding