Learning and evaluating classifiers under sample selection bias

Zadrozny, Bianca

doi:10.1145/1015330.1015425

Article | Jan 1, 2004 | Closed access

Bianca Zadrozny

IBM Research - Thomas J. Watson Research Center

Indexed in Crossref

Abstract

Classifier learning methods commonly assume that the training data consist of randomly drawn examples from the same distribution as the test examples about which the learned model is expected to make predictions. In many practical situations, however, this assumption is violated, in a problem known in econometrics as sample selection bias. In this paper, we formalize the sample selection bias problem in machine learning terms and study analytically and experimentally how a number of well-known classifier learning methods are affected by it. We also present a bias correction method that is particularly useful for classifier evaluation under sample selection bias.
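The abstract does not spell out the correction method, so the following is only a minimal sketch of one generic way to evaluate a classifier on a biased sample: inverse-probability weighting, in which each example is weighted by the reciprocal of its selection probability. It assumes that selection depends only on the features and that the selection probabilities are known or have been estimated separately. The function name `weighted_accuracy` and the toy data are hypothetical and are not taken from the paper.

```python
def weighted_accuracy(y_true, y_pred, selection_prob):
    """Estimate accuracy on the full population from a biased sample.

    selection_prob[i] is the assumed probability that example i was
    selected into the sample given its features. Each example is
    weighted by 1 / selection_prob[i], and the weights are normalized
    to sum to one (self-normalized inverse-probability weighting).
    """
    if not (len(y_true) == len(y_pred) == len(selection_prob)):
        raise ValueError("all inputs must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    if any(p <= 0.0 or p > 1.0 for p in selection_prob):
        raise ValueError("selection probabilities must lie in (0, 1]")

    weights = [1.0 / p for p in selection_prob]
    # Sum the weights of the correctly classified examples only.
    correct = sum(w for w, t, y in zip(weights, y_true, y_pred) if t == y)
    return correct / sum(weights)


# Toy biased sample (made-up numbers): four examples come from a region
# that is over-represented (selection probability 0.8), and one from a
# region that is under-represented (selection probability 0.2).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0]
selection_prob = [0.8, 0.8, 0.8, 0.8, 0.2]

naive = sum(t == y for t, y in zip(y_true, y_pred)) / len(y_true)
corrected = weighted_accuracy(y_true, y_pred, selection_prob)
print(f"naive={naive:.2f} corrected={corrected:.2f}")
```

In this toy sample the unweighted accuracy is 0.8, while the weighted estimate is about 0.5, because the single misclassified example stands in for a region of the population that the sample under-represents.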

Citation impact

Total citations: 813

FWCI: 15.71
Percentile: 100%
References: 17

Authors

Bianca Zadrozny (corresponding author)
IBM Research - Thomas J. Watson Research Center

Topics & keywords

Keywords

Selection bias
Classifier (UML)
Artificial intelligence
Computer science
Machine learning
Sampling bias
Sample size determination
Sample (material)

UN Sustainable Development Goals

Quality Education
