Review · Patterns · Oct 1, 2022 · Gold OA

SELFIES and the future of molecular string representations

Max Planck Institute for the Science of Light · Fordham University · +20 more institutions

Indexed in: Crossref, DataCite, DOAJ, PubMed

Abstract

Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings: most pertinently, most combinations of symbols lead to invalid results with no valid…

Citation impact

  • Total citations: 261
  • FWCI: 17.78
  • Percentile: 100%
  • References: 198

Authors

31

Topics & keywords

Keywords
  • Cheminformatics
  • Computer science
  • String (physics)
  • Artificial intelligence
  • Interpretability
  • Context (archaeology)
  • Popularity
  • Representation (politics)
UN Sustainable Development Goals
  • Quality Education

Funding