Article · Journal of Chemical Information and Modeling · Dec 22, 2017 · Green OA

Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition

BioMed X Institute

Indexed in Crossref

Abstract

Inspired by natural language processing techniques, we introduce Mol2vec, an unsupervised machine learning approach that learns vector representations of molecular substructures. As in Word2vec models, where vectors of closely related words lie close together in the vector space, Mol2vec learns substructure vectors that point in similar directions for chemically related substructures. Compounds can then be encoded as vectors by summing the vectors of their individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an…
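The encoding step described in the abstract (summing substructure vectors to obtain a compound vector, and comparing vectors by direction) can be illustrated with a minimal sketch. This is not the authors' implementation: the identifiers `sub_a`, `sub_b`, and `sub_c`, the three-dimensional toy vectors, and the helper names `compound_vector` and `cosine_similarity` are all hypothetical, and skipping unknown substructures is a simplification chosen for this example.

```python
import math

# Hypothetical toy embeddings: substructure identifier -> vector.
# Real embeddings would be learned from a corpus; these numbers are made up.
TOY_EMBEDDINGS = {
    "sub_a": [0.9, 0.1, 0.0],
    "sub_b": [0.8, 0.2, 0.1],  # points in a direction similar to sub_a
    "sub_c": [0.0, 0.1, 0.9],  # points in a different direction
}


def compound_vector(substructures, embeddings, dim=3):
    """Sum the vectors of a compound's substructures.

    Unknown identifiers are skipped (an assumption of this sketch).
    """
    total = [0.0] * dim
    for sub in substructures:
        vec = embeddings.get(sub)
        if vec is None:
            continue
        total = [t + v for t, v in zip(total, vec)]
    return total


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # A compound is represented here as a list of substructure identifiers.
    vec = compound_vector(["sub_a", "sub_b"], TOY_EMBEDDINGS)
    print(vec)
    print(cosine_similarity(TOY_EMBEDDINGS["sub_a"], TOY_EMBEDDINGS["sub_b"]))
    print(cosine_similarity(TOY_EMBEDDINGS["sub_a"], TOY_EMBEDDINGS["sub_c"]))
```

In this toy setup, `sub_a` and `sub_b` have a higher cosine similarity than `sub_a` and `sub_c`, mirroring the idea that related substructures point in similar directions. The resulting compound vector is a fixed-length feature vector that could be passed to any supervised model.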

Citation impact

  • Total citations: 738
  • FWCI: 27.03
  • Percentile: 100%
  • References: 43

Authors: 3

Topics & keywords

Keywords
  • Word2vec
  • Artificial intelligence
  • Computer science
  • Feature vector
  • Unsupervised learning
  • Support vector machine
  • Vector space
  • Chemical space
UN Sustainable Development Goals
  • Quality Education