Evolutionary-scale prediction of atomic-level protein structure with a language model
The Metropolitan Opera (United States) · New York University · +3 more institutions
Abstract
Recent advances in machine learning have leveraged evolutionary information in multiple sequence alignments to predict protein structure. We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model. As language models of protein sequences are scaled up to 15 billion parameters, an atomic-resolution picture of protein structure emerges in the learned representations. This results in an order-of-magnitude acceleration of high-resolution structure prediction, which enables large-scale structural characterization of metagenomic proteins. We apply this capability to construct the ESM Metagenomic Atlas by predicting structures for >617 million metagenomic…
Citation impact
- FWCI: 684.07
- Percentile: 100%
- References: 64
Authors (15)
- Zeming Lin (corresponding author)
The Metropolitan Opera (United States), New York University
- Halil Akin (corresponding author)
The Metropolitan Opera (United States)
- Roshan Rao (corresponding author)
The Metropolitan Opera (United States)
- Brian Hie (corresponding author)
Palo Alto University, The Metropolitan Opera (United States), Stanford University
- Zhongkai Zhu
The Metropolitan Opera (United States)
Topics & keywords
- Metagenomics
- Computer science
- Inference
- Protein structure prediction
- Construct (python library)
- Sequence (biology)
- Protein structure
- Scale (ratio)
- Quality Education