Nucleotide Transformer: building and evaluating robust foundation models for human genomics
Nvidia (United States) · Technical University of Munich · +1 more institution
Abstract
The prediction of molecular phenotypes from DNA sequences remains a longstanding challenge in genomics, often driven by limited annotated data and the inability to transfer learnings between tasks. Here, we present an extensive study of foundation models pre-trained on DNA sequences, named Nucleotide Transformer, ranging from 50 million up to 2.5 billion parameters and integrating information from 3,202 human genomes and 850 genomes from diverse species. These transformer models yield context-specific representations of nucleotide sequences, which allow for accurate predictions even in low-data settings. We show that the developed models can be fine-tuned at low cost to solve a variety of genomics…
Citation impact
- FWCI: 72.58
- Percentile: 100%
- References: 49
Authors
- 15
Topics & keywords
- Genomics
- Computational biology
- Computer science
- Genome
- DNA sequencing
- Transformer
- Context (archaeology)
- Prioritization