Article | Journal of Big Data | Nov 4, 2020 | Gold OA

CatBoost for big data: an interdisciplinary review

Florida Atlantic University

Indexed in: Crossref, DOAJ, PubMed

Abstract

Gradient Boosted Decision Trees (GBDTs) are a powerful tool for classification and regression tasks in Big Data. Researchers should be familiar with the strengths and weaknesses of current GBDT implementations in order to use them effectively and make successful contributions. CatBoost is a member of the family of GBDT machine learning ensemble techniques. Since its debut in late 2018, researchers have successfully used CatBoost for machine learning studies involving Big Data. We take this opportunity to review recent research on CatBoost as it relates to Big Data, and learn best practices from studies that cast CatBoost in a positive light, as well as studies where CatBoost does not outshine other…

Citation impact

Total citations: 1,619
FWCI: 53.73
Percentile: 100%
References: 116

Authors

2

Topics & keywords

Keywords
  • Computer science
  • Machine learning
  • Artificial intelligence
  • Decision tree
  • Categorical variable
  • Big data
  • Data science
  • Data mining
UN Sustainable Development Goals
  • Peace, Justice and Strong Institutions

Funding