Article · Patterns · May 31, 2024 · Gold OA

The receiver operating characteristic curve accurately assesses imbalanced datasets

La Jolla Institute for Immunology · Fundação Oswaldo Cruz · +2 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Many problems in biology require looking for a "needle in a haystack," corresponding to a binary classification in which a few positives sit within a much larger set of negatives, a situation referred to as class imbalance. The receiver operating characteristic (ROC) curve and the associated area under the curve (AUC) have been reported as ill-suited to evaluating prediction performance on imbalanced problems where performance on the positive minority class is of greater interest, with the precision-recall (PR) curve considered preferable. We show via simulation and a real case study that this is a misinterpretation of the difference between the ROC and PR spaces, showing that the ROC curve is robust to class…
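The contrast between the ROC and PR spaces described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation. It assumes a hypothetical classifier whose scores are Gaussian with unit variance, with mean 1 for positives and mean 0 for negatives, and it holds that classifier fixed while only the number of negatives changes. The helper names (`roc_auc`, `average_precision`, `simulate`) and all sample sizes are illustrative choices.

```python
import numpy as np

def roc_auc(y_true, scores):
    """ROC AUC via the Mann-Whitney U statistic (assumes continuous scores, no ties)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)  # rank 1 = lowest score
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    u = ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

def average_precision(y_true, scores):
    """Mean of precision@k over the ranks k at which a positive is retrieved."""
    order = np.argsort(-scores)  # highest score first
    hits = y_true[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return precision_at_k[hits == 1].mean()

def simulate(n_pos, n_neg, rng):
    """Hypothetical scorer: positives ~ N(1, 1), negatives ~ N(0, 1) (an assumption)."""
    scores = np.concatenate([rng.normal(1.0, 1.0, n_pos),
                             rng.normal(0.0, 1.0, n_neg)])
    y_true = np.concatenate([np.ones(n_pos, dtype=int),
                             np.zeros(n_neg, dtype=int)])
    return y_true, scores

rng = np.random.default_rng(0)
results = {}
# Same score distributions each time; only the number of negatives changes.
for n_neg in (2_000, 20_000, 200_000):
    y, s = simulate(2_000, n_neg, rng)
    results[n_neg] = (roc_auc(y, s), average_precision(y, s))
    print(f"negatives per positive: {n_neg // 2_000:>3}  "
          f"ROC AUC: {results[n_neg][0]:.3f}  "
          f"average precision: {results[n_neg][1]:.3f}")
```

Under these assumptions the ROC AUC should stay close to its theoretical value of about 0.76 at every imbalance ratio, because the true-positive and false-positive rates are each computed within a single class. Average precision, a common summary of the PR curve, should fall as negatives are added, since precision depends directly on class prevalence.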

Funding