Article · Nature Methods · Nov 28, 2024 · Hybrid OA

Nucleotide Transformer: building and evaluating robust foundation models for human genomics

Nvidia (United States) · Technical University of Munich · +1 more institution

Indexed in: Crossref, PubMed

Abstract

The prediction of molecular phenotypes from DNA sequences remains a longstanding challenge in genomics, often driven by limited annotated data and the inability to transfer learnings between tasks. Here, we present an extensive study of foundation models pre-trained on DNA sequences, named Nucleotide Transformer, ranging from 50 million up to 2.5 billion parameters and integrating information from 3,202 human genomes and 850 genomes from diverse species. These transformer models yield context-specific representations of nucleotide sequences, which allow for accurate predictions even in low-data settings. We show that the developed models can be fine-tuned at low cost to solve a variety of genomics…

Citation impact

  • Total citations: 347
  • FWCI: 72.58
  • Percentile: 100%
  • References: 49

Authors

15 authors

Topics & keywords

Keywords
  • Genomics
  • Computational biology
  • Computer science
  • Genome
  • DNA sequencing
  • Transformer
  • Context (archaeology)
  • Prioritization