A kernel two-sample test
Max Planck Institute for Intelligent Systems · Max Planck Institute for Biology · +3 more institutions
Abstract
We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear-time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics…
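As an illustration of the quadratic-time computation mentioned in the abstract, the following is a minimal sketch of an unbiased estimate of the squared MMD between two samples. It is not taken from the paper's text as shown here: the Gaussian kernel, the bandwidth parameter `sigma`, and the function names `gaussian_kernel` and `mmd2_unbiased` are assumptions chosen for the example, and the sketch returns only the statistic, not a test threshold.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b.

    The kernel choice and the bandwidth `sigma` are assumptions for this sketch.
    """
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased quadratic-time estimate of the squared MMD between samples x and y.

    x has shape (m, d) and y has shape (n, d). The estimate can be slightly
    negative when the two samples come from the same distribution.
    """
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    # Within-sample terms average over pairs i != j, so the diagonal is removed.
    term_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    # Cross term averages over all m * n pairs.
    term_xy = kxy.mean()
    return term_xx + term_yy - 2.0 * term_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(200, 2))
    y_same = rng.normal(0.0, 1.0, size=(200, 2))
    y_shifted = rng.normal(3.0, 1.0, size=(200, 2))
    print("same distribution:   ", mmd2_unbiased(x, y_same))
    print("shifted distribution:", mmd2_unbiased(x, y_shifted))
```

The estimate should be close to zero when both samples come from the same distribution and clearly positive when they differ. Turning it into a test requires a rejection threshold, for example from the bounds or the asymptotic distribution the abstract refers to, which this sketch does not implement.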
Citation impact
- FWCI: 83.10
- Percentile: 100%
- References: 94
Authors (5)
- Arthur Gretton (corresponding author)
  Max Planck Institute for Intelligent Systems
- Karsten Borgwardt
  Max Planck Institute for Biology
- Malte J. Rasch
  Beijing Normal University
- Bernhard Schölkopf
  Max Planck Institute for Intelligent Systems
- Alexander J. Smola
  Australian National University, Yahoo (United States)
Topics & keywords
- Reproducing kernel Hilbert space
- Mathematics
- Test statistic
- Kernel (algebra)
- Statistic
- Matching (statistics)
- Applied mathematics
- Statistics
- Gender equality