A review of molecular representation in the age of machine learning

University of Cambridge

Indexed in Crossref, DataCite

Abstract

Research in chemistry increasingly requires interdisciplinary work prompted by, among other things, advances in computing, machine learning, and artificial intelligence. Everyone working with molecules, whether chemist or not, needs an understanding of the representation of molecules in a machine‐readable format, as this is central to computational chemistry. Four classes of representations are introduced: string, connection table, feature‐based, and computer‐learned representations. Three of the most significant representations are simplified molecular‐input line‐entry system (SMILES), International Chemical Identifier (InChI), and the MDL molfile, of which SMILES was the first to successfully be…
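As a rough illustration of the representation classes named in the abstract, the sketch below writes ethanol as a SMILES string, as a standard InChI string, and as a toy connection table, then derives a simple feature from the table. The table layout and the helper names (`ATOMS`, `BONDS`, `heavy_atom_counts`, `neighbors`) are assumptions made for this example; they are not the MDL molfile format and are not taken from the paper.

```python
from collections import Counter

# Ethanol in two string representations.
ETHANOL_SMILES = "CCO"
ETHANOL_INCHI = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

# Toy connection table (hypothetical layout, NOT the MDL molfile format):
# heavy atoms by index, and bonds as (atom index, atom index, bond order).
ATOMS = ["C", "C", "O"]
BONDS = [(0, 1, 1), (1, 2, 1)]


def heavy_atom_counts(atoms):
    """Very simple feature-based view: count of each heavy-atom element."""
    return dict(Counter(atoms))


def neighbors(bonds, index):
    """Return the sorted indices of atoms bonded to the atom at `index`."""
    found = []
    for a, b, _order in bonds:
        if a == index:
            found.append(b)
        elif b == index:
            found.append(a)
    return sorted(found)


if __name__ == "__main__":
    print(ETHANOL_SMILES)
    print(ETHANOL_INCHI)
    print(heavy_atom_counts(ATOMS))
    print(neighbors(BONDS, 1))
```

The strings are compact and easy to store or search, while the connection table makes atom connectivity explicit. Real toolkits handle hydrogens, charges, stereochemistry, and canonicalization, none of which this sketch attempts.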

Citation impact

  • Total citations: 346
  • FWCI: 47.48
  • Percentile: 100%
  • References: 85

Authors

3

Topics & keywords

Keywords
  • Cheminformatics
  • Computer science
  • Representation (politics)
  • Chemical space
  • Artificial intelligence
  • Identifier
  • Table (database)
  • Data science

Funding