Keywords: Kernel methods, Maximum mean discrepancy, Likelihood-free inference, Hypothesis testing, Minimax statistics
Abstract: Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to \emph{one} of the two classes.
Special cases of this problem are well-known: with complete
knowledge of class distributions ($n=\infty$) the
problem is solved optimally by the likelihood-ratio test; when
$m=1$ it corresponds to binary classification; and when $m\approx n$ it is equivalent to two-sample testing. The intermediate settings occur in the field of likelihood-free inference, where labeled samples are obtained by running forward simulations and the unlabeled sample is collected experimentally. In recent work it was discovered that there is a fundamental trade-off
between $m$ and $n$: increasing the size $m$ of the unlabeled sample reduces the amount $n$ of training/simulation
data needed. In this work we (a) introduce a generalization where unlabeled samples
come from a mixture of the two classes -- a case often encountered in practice; (b) study the minimax sample complexity for non-parametric classes of densities under \textit{maximum mean
discrepancy} (MMD) separation; and (c) investigate the empirical performance of kernels parameterized by neural networks on two tasks: detection
of the Higgs boson and detection of DDPM-generated images planted amidst
CIFAR-10 images. For both problems we confirm the existence of the theoretically predicted asymmetric $m$ vs $n$ trade-off.
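For reference, the maximum mean discrepancy invoked as the separation measure is commonly defined, for a kernel $k$ and class distributions $P$ and $Q$, by the standard textbook formula below; the paper's exact separation condition and its neural-network kernel parameterization may differ from this generic form:
$$\mathrm{MMD}^2(P, Q) \;=\; \mathbb{E}_{x, x' \sim P}\big[k(x, x')\big] \;-\; 2\,\mathbb{E}_{x \sim P,\, y \sim Q}\big[k(x, y)\big] \;+\; \mathbb{E}_{y, y' \sim Q}\big[k(y, y')\big].$$
For a characteristic kernel, $\mathrm{MMD}(P, Q) = 0$ if and only if $P = Q$, so a separation assumption of the form $\mathrm{MMD}(P, Q) \ge \epsilon$ quantifies how distinguishable the two class distributions are.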
Supplementary Material: pdf
Submission Number: 12815