Evaluation metrics and statistical tests for machine learning

Rainio, Oona; Teuho, Jarmo; Klén, Riku

doi:10.1038/s41598-024-56706-x

Article · Scientific Reports · Mar 13, 2024 · Gold OA

University of Turku · Turku University Hospital · +1 more institution

Indexed in: Crossref, DOAJ, PubMed

Abstract

Research on machine learning (ML) has become incredibly popular during the past few decades. However, researchers who are not familiar with statistics may find it difficult to understand how to evaluate the performance of ML models and compare them with each other. Here, we introduce the most common evaluation metrics used for the typical supervised ML tasks, including binary, multi-class, and multi-label classification, regression, image segmentation, object detection, and information retrieval. We explain how to choose a suitable statistical test for comparing models, how to obtain enough values of the metric for testing, and how to perform the test and interpret its results. We also present a few…
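As a rough illustration of the kind of workflow the abstract describes, the sketch below computes a few common binary classification metrics and then compares two models with a paired test. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names `binary_metrics` and `sign_test_p_value`, the example labels, and the per-fold scores are all hypothetical, and the exact two-sided sign test is chosen here only because it needs nothing beyond the Python standard library. The abstract does not say which test the paper recommends.

```python
from math import comb


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is empty (a convention, not a rule).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def sign_test_p_value(scores_a, scores_b):
    """Exact two-sided sign test on paired scores; tied pairs are dropped."""
    wins_a = sum(1 for a, b in zip(scores_a, scores_b) if a > b)
    wins_b = sum(1 for a, b in zip(scores_a, scores_b) if b > a)
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # every pair tied: no evidence of a difference
    k = min(wins_a, wins_b)
    # Probability of k or fewer wins under a fair coin, doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical labels and predictions for one model.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)

# Hypothetical per-fold metric values for two models on the same folds.
scores_a = [0.81, 0.79, 0.84, 0.80, 0.83]
scores_b = [0.78, 0.77, 0.80, 0.79, 0.80]
p_value = sign_test_p_value(scores_a, scores_b)
```

With only five paired values, the smallest p-value this sign test can produce is 0.0625, which illustrates why the abstract stresses obtaining enough values of the metric before testing.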

Citation impact

Total citations: 970

FWCI: 364.25
Percentile: 100%
References: 60

Authors: 3

Keywords

Computer science
Artificial intelligence
Metric (unit)
Machine learning
Binary classification
Convolutional neural network
Statistical hypothesis testing
Pattern recognition (psychology)