Article · Jul 20, 2016 · Gold OA

Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science

University of Pennsylvania · University of Chicago · +1 more institution

Indexed in Crossref

Abstract

As the field of data science continues to grow, there will be an ever-increasing demand for tools that make machine learning accessible to non-experts. In this paper, we introduce the concept of tree-based pipeline optimization for automating one of the most tedious parts of machine learning: pipeline design. We implement an open-source Tree-based Pipeline Optimization Tool (TPOT) in Python and demonstrate its effectiveness on a series of simulated and real-world benchmark data sets. In particular, we show that TPOT can design machine learning pipelines that provide a significant improvement over a basic machine learning analysis while requiring little to no input or prior knowledge from the user. We also…

Citation impact

  • Total citations: 564
  • FWCI: 33.16
  • Percentile: 100%
  • References: 29

Authors

4

Topics & keywords

Keywords
  • Computer science
  • Pipeline transport
  • Pipeline (software)
  • Python (programming language)
  • Machine learning
  • Tree (set theory)
  • Artificial intelligence
  • Benchmark (surveying)

Funding