CatBoost for big data: an interdisciplinary review
Indexed in Crossref, DOAJ, PubMed
Abstract
Gradient Boosted Decision Trees (GBDTs) are a powerful tool for classification and regression tasks in Big Data. Researchers should be familiar with the strengths and weaknesses of current GBDT implementations in order to use them effectively and make successful contributions. CatBoost is a member of the family of GBDT machine learning ensemble techniques. Since its debut in late 2018, researchers have successfully used CatBoost for machine learning studies involving Big Data. We take this opportunity to review recent research on CatBoost as it relates to Big Data, and learn best practices from studies that cast CatBoost in a positive light, as well as studies where CatBoost does not outshine other…
Citation impact
- Total citations: 1,619
- FWCI: 53.73
- Percentile: 100%
- References: 116
Authors: 2
Topics & keywords
Topics
Keywords
- Computer science
- Machine learning
- Artificial intelligence
- Decision tree
- Categorical variable
- Big data
- Data science
- Data mining
UN Sustainable Development Goals
- Peace, Justice and Strong Institutions