Data Generation for Machine Learning Interatomic Potentials and Beyond
Los Alamos National Laboratory · Statistical Service · +2 more institutions
Abstract
The field of data-driven chemistry is undergoing an evolution, driven by innovations in machine learning models for predicting molecular properties and behavior. Recent strides in machine learning interatomic potentials (MLIPs) have paved the way for accurate modeling of diverse chemical and structural properties at the atomic level. The key determinant of MLIP reliability remains the quality of the training data. A paramount challenge lies in constructing training sets that capture specific domains in the vast chemical and structural space. This Review navigates the intricate landscape of essential components and integrity of training data that ensure the extensibility and transferability of the resulting models. We…
Citation impact
- FWCI: 13.24
- Percentile: 100%
- References: 345
Authors
- 12
Topics & keywords
- Chemical space
- Transferability
- Reliability (semiconductor)
- Field (mathematics)
- Space (punctuation)
- Computer science
- Data acquisition
- Data science
Funding
- U.S. Department of Energy, Award: 89233218CNA000001
- Center for Integrated Nanotechnologies, Award: 89233218CNA000001
- Office of Science, Award: 89233218CNA000001
- National Nuclear Security Administration, Award: 89233218CNA000001
- Basic Energy Sciences, Awards: LANLE8B3, 89233218CNA000001
- Laboratory Directed Research and Development, Award: 89233218CNA000001
- Chemical Sciences, Geosciences, and Biosciences Division, Award: 89233218CNA000001
- Los Alamos National Laboratory, Award: 89233218CNA000001