Article · Journal of Chemical Information and Modeling · Nov 23, 2016 · Closed access

Extreme Gradient Boosting as a Method for Quantitative Structure–Activity Relationships

Merck & Co., Inc., Rahway, NJ, USA

Indexed in: Crossref, PubMed

Abstract

In the pharmaceutical industry it is common to generate many QSAR models from training sets containing a large number of molecules and a large number of descriptors. The best QSAR methods are those that can generate the most accurate predictions but that are not overly expensive computationally. In this paper we compare eXtreme Gradient Boosting (XGBoost) to random forest and single-task deep neural nets on 30 in-house data sets. While XGBoost has many adjustable parameters, we can define a set of standard parameters at which XGBoost makes predictions, on the average, better than those of random forest and almost as good as those of deep neural nets. The biggest strength of XGBoost is its speed. Whereas…

Citation impact

  • Total citations: 525
  • FWCI: 21.56
  • Percentile: 100%
  • References: 18

Authors

5 authors

Topics & keywords

Keywords
  • Random forest
  • Artificial neural network
  • Gradient boosting
  • Computer science
  • Boosting (machine learning)
  • Artificial intelligence
  • Set (abstract data type)
  • Machine learning