Article | Journal of Chemical Information and Computer Sciences | Sep 14, 2002 | Closed access

Reoptimization of MDL Keys for Use in Drug Discovery

Information Systems Laboratories (United States)

Indexed in: Crossref, PubMed

Abstract

For a number of years, MDL products have exposed both 166-bit and 960-bit keysets based on 2D descriptors. These keysets were originally constructed and optimized for substructure searching. We report improvements in the performance of MDL keysets that have been reoptimized for use in molecular similarity. Classification performance on a test data set of 957 compounds increased from 0.65 for the 166-bit keyset and 0.67 for the 960-bit keyset to 0.71 for a surprisal S/N pruned keyset containing 208 bits and 0.71 for a genetic-algorithm-optimized keyset containing 548 bits. We present an overview of the underlying technology supporting the definition of descriptors and the encoding of these descriptors into…
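As a rough illustration of how bit keysets can be used for molecular similarity, the sketch below compares two keysets with the Tanimoto coefficient. This is an assumption for illustration only: the abstract does not state which similarity measure the authors used, and the helper names (`keys_to_bits`, `tanimoto`) and the key indices are hypothetical, not taken from the MDL key definitions.

```python
def keys_to_bits(on_keys):
    """Pack a list of set key indices into a single Python int (one bit per key)."""
    bits = 0
    for k in on_keys:
        bits |= 1 << k
    return bits


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: shared set bits divided by bits set in either keyset."""
    common = bin(a & b).count("1")
    total = bin(a | b).count("1")
    # Two empty keysets share nothing; return 0.0 rather than divide by zero.
    return common / total if total else 0.0


# Hypothetical key indices for two molecules in a 166-bit keyset.
mol_a = keys_to_bits([3, 17, 42, 101, 165])
mol_b = keys_to_bits([3, 17, 58, 101])

similarity = tanimoto(mol_a, mol_b)
print(f"Tanimoto similarity: {similarity:.2f}")
```

Under this measure, pruning or reselecting keys changes which bits contribute to the shared and total counts, which is one way a reoptimized keyset could shift similarity-based classification.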

Citation impact

Total citations: 1,900
FWCI: 8.40
Percentile: 100%
References: 34

Authors

4 authors

Topics & keywords

Keywords
  • Pruning
  • Basis (linear algebra)
  • Set (abstract data type)
  • Minimum description length
  • Reduction (mathematics)
  • Similarity (geometry)
  • Computer science
  • Selection (genetic algorithm)
UN Sustainable Development Goals
  • Good health and well-being