Article · Journal of Statistical Software · Jan 1, 2017 · Diamond OA

ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R

Marvin N. Wright, Andreas Ziegler

Birkbeck, University of London · University of KwaZulu-Natal

Indexed in: arXiv, Crossref, DOAJ

Abstract

We introduce the C++ application and R package ranger. The software is a fast implementation of random forests for high dimensional data. Ensembles of classification, regression and survival trees are supported. We describe the implementation, provide examples, validate the package with a reference implementation, and compare runtime and memory usage with other implementations. The new software proves to scale best with the number of features, samples, trees, and features tried for splitting. Finally, we show that ranger is the fastest and most memory efficient implementation of random forests to analyze data on the scale of a genome-wide association study.

Citation impact

Total citations: 3,215
FWCI: 110.60
Percentile: 100%
References: 21

Authors

  • Marvin N. Wright (corresponding author)
    Birkbeck, University of London
  • Andreas Ziegler
    University of KwaZulu-Natal

Topics & keywords

Keywords
  • R package
  • Random forest
  • Scale (ratio)
  • Software
  • High dimensional
  • Software package
  • Clustering high-dimensional data