Hyperparameters and tuning strategies for random forest

Philipp Probst, Marvin N. Wright, Anne‐Laure Boulesteix

Ludwig-Maximilians-Universität München · Leibniz Institute for Prevention Research and Epidemiology - BIPS

Indexed in: arXiv, Crossref

Abstract

The random forest (RF) algorithm has several hyperparameters that have to be set by the user, for example, the number of observations drawn randomly for each tree and whether they are drawn with or without replacement, the number of variables drawn randomly for each split, the splitting rule, the minimum number of samples that a node must contain, and the number of trees. In this paper, we first provide a literature review on the parameters' influence on the prediction performance and on variable importance measures. It is well known that in most cases RF works reasonably well with the default values of the hyperparameters specified in software packages. Nevertheless, tuning the hyperparameters can improve the…
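The hyperparameters listed in the abstract can be viewed as a search space. The following is a minimal sketch, assuming plain random search as the tuning strategy; it is not the method proposed in the paper. The dictionary keys, the value ranges, and the `evaluate` callback are illustrative assumptions. In practice `evaluate` would train a forest with the given configuration and return a performance estimate, for example from cross-validation.

```python
import random


def sample_config(n_features, rng):
    """Draw one configuration from a hypothetical random forest search space.

    Keys and ranges are assumptions chosen for illustration only.
    """
    return {
        # number of variables drawn randomly for each split
        "mtry": rng.randint(1, n_features),
        # fraction of observations drawn randomly for each tree
        "sample_fraction": rng.uniform(0.2, 1.0),
        # whether observations are drawn with or without replacement
        "replace": rng.choice([True, False]),
        # minimum number of samples that a node must contain
        "min_node_size": rng.randint(1, 20),
        # splitting rule
        "splitrule": rng.choice(["gini", "extratrees"]),
        # number of trees, held fixed here rather than tuned
        "num_trees": 500,
    }


def random_search(evaluate, n_features, n_iter=20, seed=0):
    """Return the best configuration and score found over n_iter random draws.

    `evaluate` maps a configuration dict to a score where higher is better.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_iter):
        config = sample_config(n_features, rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

A call such as `random_search(my_cv_score, n_features=10)` would return the best of 20 sampled configurations, where `my_cv_score` is a user-supplied, hypothetical scoring function.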

Citation impact

  • Total citations: 1,450
  • FWCI: 55.13
  • Percentile: 100%
  • References: 50

Authors

  • Philipp Probst (corresponding author)
    Ludwig-Maximilians-Universität München

  • Marvin N. Wright
    Leibniz Institute for Prevention Research and Epidemiology - BIPS

  • Anne‐Laure Boulesteix
    Ludwig-Maximilians-Universität München

Topics & keywords

Keywords
  • Hyperparameter
  • Random forest
  • Benchmark (surveying)
  • Set (abstract data type)
  • Tree (set theory)
  • Implementation
  • Node (physics)
  • Data set

Funding