Article · Scientific Data · Feb 4, 2026 · Gold OA

Open Molecular Crystals 2025 (OMC25) dataset and models

Corporación Universitaria del Meta · Tata Institute of Fundamental Research · +3 more institutions

Indexed in: Crossref · DOAJ · PubMed

Abstract

The development of accurate and efficient machine learning models for predicting the structure and properties of molecular crystals has been hindered by the scarcity of publicly available datasets with property labels. To address this challenge, we introduce the Open Molecular Crystals 2025 (OMC25) dataset, a collection of over 27 million molecular crystal structures containing 12 elements and up to 300 atoms in the unit cell. The dataset was created by relaxing over 230,000 randomly constructed molecular crystal structures, representing approximately 50,000 organic molecules, using dispersion-inclusive density functional theory (DFT) with the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional combined…

Citation impact

Total citations: 6
FWCI: 37.37
Percentile: 100%
References: 81

Authors

19

Topics & keywords

Keywords
  • Crystal (programming language)
  • Intermolecular force
  • Crystal structure prediction
  • Density functional theory
  • Range (aeronautics)
  • Property (philosophy)
  • Dispersion (optics)
  • Quality (philosophy)

Funding