GEOM, energy-annotated molecular conformations for property prediction and molecular generation
Harvard University · Massachusetts Institute of Technology
Abstract
Machine learning (ML) outperforms traditional approaches in many molecular design tasks. ML models usually predict molecular properties from a 2D chemical graph or a single 3D structure, but neither of these representations accounts for the ensemble of 3D conformers that are accessible to a molecule. Property prediction could be improved by using conformer ensembles as input, but there is no large-scale dataset that contains graphs annotated with accurate conformers and experimental data. Here we use advanced sampling and semi-empirical density functional theory (DFT) to generate 37 million molecular conformations for over 450,000 molecules. The Geometric Ensemble Of Molecules (GEOM) dataset contains…
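As a minimal sketch of what "using conformer ensembles as input" can mean in practice, the snippet below turns a set of conformer energies into Boltzmann statistical weights and uses them to average a per-conformer property. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the energy unit (kcal/mol), the temperature, and the example numbers are all hypothetical, and conformer degeneracy is ignored.

```python
import math

# Gas constant in kcal/(mol*K); assumes energies are given in kcal/mol.
R_KCAL = 0.0019872041


def boltzmann_weights(energies_kcal, temperature=298.15):
    """Return normalized Boltzmann weights for a list of conformer energies."""
    if not energies_kcal:
        raise ValueError("need at least one conformer energy")
    # Shift by the minimum energy so the exponentials stay numerically stable.
    e_min = min(energies_kcal)
    factors = [
        math.exp(-(e - e_min) / (R_KCAL * temperature)) for e in energies_kcal
    ]
    total = sum(factors)
    return [f / total for f in factors]


def ensemble_average(values, weights):
    """Weighted average of a per-conformer property."""
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical three-conformer molecule (illustrative numbers only).
energies = [0.0, 0.5, 2.0]   # relative energies, kcal/mol
prop = [1.2, 1.8, 3.0]       # some per-conformer property
weights = boltzmann_weights(energies)
avg = ensemble_average(prop, weights)
```

The lowest-energy conformer receives the largest weight, so the ensemble average is pulled toward its property value while still reflecting the higher-energy conformers.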
Citation impact
- FWCI: 31.97
- Percentile: 100%
- References: 127
Authors (2)
Topics & keywords
- Conformational isomerism
- Molecule
- Density functional theory
- Computer science
- Computational chemistry
- Graph
- Sampling (signal processing)
- Molecular graph