OpenFold: retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
Harvard University Press · Columbia University · +16 more institutions
Abstract
AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (1) tackle new tasks, like protein-ligand complex structure prediction, (2) investigate the process by which the model learns and (3) assess the model's capacity to generalize to unseen regions of fold space. Here we report OpenFold, a fast, memory-efficient and trainable implementation of AlphaFold2. We train OpenFold from scratch, matching the accuracy of AlphaFold2. Having established parity, we find that OpenFold is remarkably robust at generalizing even when the size and…
Citation impact
- FWCI: 53.40
- Percentile: 100%
- References: 58
Authors
34
Topics & keywords
- Generalization
- Retraining
- Computer science
- Computational biology
- Chemistry
- Artificial intelligence
- Biology
- Mathematics
Funding
- National Science Foundation. Awards: 2134157, DE-AC02-05CH11231, OAC-2112606, OAC-2106661
- U.S. Department of Energy. Awards: DE-AC02-05CH11231, DE-SC0022199
- Nvidia
- University of Texas at Austin
- Flatiron Health
- Defense Advanced Research Projects Agency. Awards: W911NF2010021, DE-AC02-05CH11231, DE-SC0022199
- Office of Science. Awards: DE-SC0022199, DE-AC02-05CH11231
- National Cancer Institute. Award: U54-CA225088
- National Institute of General Medical Sciences. Award: R35GM150546