Open Molecular Crystals 2025 (OMC25) dataset and models
Corporación Universitaria del Meta · Tata Institute of Fundamental Research · +3 more institutions
Abstract
The development of accurate and efficient machine learning models for predicting the structure and properties of molecular crystals has been hindered by the scarcity of publicly available datasets with property labels. To address this challenge, we introduce the Open Molecular Crystals 2025 (OMC25) dataset, a collection of over 27 million molecular crystal structures containing 12 elements and up to 300 atoms in the unit cell. The dataset was created by relaxing over 230,000 randomly constructed molecular crystal structures, representing approximately 50,000 organic molecules, using dispersion-inclusive density functional theory (DFT) with the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional combined…
Citation impact
- FWCI: 37.37
- Percentile: 100%
- References: 81
Authors: 19
Topics & keywords
- Crystal (programming language)
- Intermolecular force
- Crystal structure prediction
- Density functional theory
- Range (aeronautics)
- Property (philosophy)
- Dispersion (optics)
- Quality (philosophy)