Correcting Sample Selection Bias by Unlabeled Data

Huang, Jiayuan; Smola, Alexander J.; Gretton, Arthur; Borgwardt, Karsten; Schölkopf, Bernhard

doi:10.7551/mitpress/7503.003.0080

Book chapter · The MIT Press eBooks · Sep 7, 2007 · Closed access

Jiayuan Huang, Alexander J. Smola, Arthur Gretton, Karsten Borgwardt, Bernhard Schölkopf

Max Planck Society · Max Planck Institute for Biological Cybernetics

Indexed in Crossref

Abstract

We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces resampling weights without distribution estimation. Our method works by matching distributions between training and testing sets in feature space. Experimental results demonstrate that our method works well in practice.
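To make the idea in the abstract concrete, here is a minimal sketch of producing resampling weights by matching the weighted training mean to the test mean in a kernel feature space. The abstract gives no formulas, so every detail below is an assumption: a Gaussian kernel, the objective 1/2 βᵀKβ − κᵀβ with K_ij = k(x_i, x_j) and κ_i = (n/m) Σ_j k(x_i, x'_j), a box constraint 0 ≤ β_i ≤ bound, and plain projected gradient descent as the solver. The names `gaussian_kernel` and `matching_weights` are hypothetical, and this is not presented as the authors' implementation.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (sequences of floats)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def matching_weights(train, test, sigma=1.0, bound=10.0, steps=300):
    """Return one weight per training point.

    Minimizes 1/2 * beta^T K beta - kappa^T beta subject to 0 <= beta_i <= bound,
    which (up to a constant) is the squared feature-space distance between the
    weighted training mean and the test mean. Assumption: only the box
    constraint is enforced; no constraint on the sum of the weights.
    """
    n, m = len(train), len(test)
    # Kernel matrix among training points.
    K = [[gaussian_kernel(a, b, sigma) for b in train] for a in train]
    # kappa_i: rescaled total similarity of training point i to the test sample.
    kappa = [n / m * sum(gaussian_kernel(a, t, sigma) for t in test) for a in train]
    # The max row sum upper-bounds the largest eigenvalue of K (entries are >= 0),
    # so 1 / max_row_sum is a safe gradient step size.
    step = 1.0 / max(sum(row) for row in K)
    beta = [1.0] * n
    for _ in range(steps):
        grad = [sum(K[i][j] * beta[j] for j in range(n)) - kappa[i] for i in range(n)]
        # Gradient step followed by projection back onto the box [0, bound].
        beta = [min(bound, max(0.0, b - step * g)) for b, g in zip(beta, grad)]
    return beta


# Toy 1-D data (illustrative only): training points cover [-2, 2],
# test points cover only [0, 2].
train = [[-2.0 + 4.0 * i / 19] for i in range(20)]
test = [[2.0 * i / 19] for i in range(20)]
weights = matching_weights(train, test, sigma=0.5)
```

On this toy data the weights should come out larger for training points inside the test range than for those far to its left, which is the qualitative behavior the abstract describes. A more careful version would likely use a quadratic programming solver and add a constraint keeping the average weight close to 1.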

Citation impact

Total citations: 1,556

FWCI: 26.97
Percentile: 100%
References: 26

Authors: 5

Topics & keywords

Keywords

Selection bias
Selection (genetic algorithm)
Sample (material)
Sampling bias
Computer science
Artificial intelligence
Statistics
Pattern recognition (psychology)