Article | Aug 8, 2016 | Gold OA

XGBoost

Tianqi Chen, Carlos Guestrin

University of Washington

Indexed in: arXiv, Crossref

Abstract

Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
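The sparsity-aware idea named in the abstract can be illustrated with a small sketch. It assumes the usual second-order gain for gradient tree boosting (sums of gradients G and Hessians H per side, with L2 penalty `lam` and per-split cost `gamma`) and learns a default direction for missing entries by trying both sides at every candidate threshold. The function names and the use of `None` for missing values are illustrative assumptions, not XGBoost's actual API or implementation.

```python
def split_gain(gl, hl, gr, hr, lam=1.0, gamma=0.0):
    """Gain of a split from left/right gradient and Hessian sums (assumed form)."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(gl, hl) + score(gr, hr) - score(gl + gr, hl + hr)) - gamma


def best_split_with_default(values, grads, hess, lam=1.0, gamma=0.0):
    """Scan one feature column; None marks a missing entry (hypothetical encoding).

    Returns (gain, threshold, default_direction), where default_direction is
    the side ("left" or "right") that missing entries are routed to.
    Returns (0.0, None, None) if no split has positive gain.
    """
    g_total, h_total = sum(grads), sum(hess)

    # Only present entries are sorted and scanned, so the work is
    # proportional to the number of non-missing values.
    present = sorted(
        ((v, g, h) for v, g, h in zip(values, grads, hess) if v is not None),
        key=lambda t: t[0],
    )
    g_present = sum(g for _, g, _ in present)
    h_present = sum(h for _, _, h in present)
    g_missing = g_total - g_present
    h_missing = h_total - h_present

    best = (0.0, None, None)
    gl = hl = 0.0
    for i in range(len(present) - 1):
        v, g, h = present[i]
        gl += g
        hl += h
        nxt = present[i + 1][0]
        if nxt == v:
            continue  # no threshold fits between equal values
        threshold = (v + nxt) / 2.0

        # Option 1: missing entries go right.
        gain_right = split_gain(gl, hl, g_total - gl, h_total - hl, lam, gamma)
        # Option 2: missing entries go left.
        gain_left = split_gain(
            gl + g_missing, hl + h_missing,
            g_present - gl, h_present - hl,
            lam, gamma,
        )
        for gain, direction in ((gain_right, "right"), (gain_left, "left")):
            if gain > best[0]:
                best = (gain, threshold, direction)
    return best
```

For example, with `values = [1.0, 2.0, None, 3.0, None]`, `grads = [-1.0, -1.0, 1.0, 1.0, 1.0]` and unit Hessians, the missing rows share the gradient sign of the row with value 3.0, so the sketch picks the threshold 2.5 and routes missing entries to the right.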

Citation impact

Total citations: 47,462
FWCI: 1200.15
Percentile: 100%
References: 21

Authors

  • Tianqi Chen (corresponding author), University of Washington
  • Carlos Guestrin, University of Washington

Topics & keywords

Keywords
  • Boosting (machine learning)
  • Scalability
  • Sketch
  • Tree (set theory)
  • Decision tree
  • Gradient boosting
  • Cache