E-Valuating Classifier Two-Sample Tests

Published: 25 Apr 2024, Last Modified: 25 Apr 2024. Accepted by TMLR.
Abstract: We introduce E-C2ST, a powerful deep classifier two-sample test for high-dimensional data based on E-values. Our test combines ideas from existing work on split likelihood ratio tests and predictive independence tests. The resulting E-values are suitable for anytime-valid sequential two-sample tests, which allows for more effective use of data in constructing test statistics. Through simulations and real data applications, we empirically demonstrate that E-C2ST achieves enhanced statistical power by partitioning datasets into multiple batches, going beyond the conventional two-split (training and testing) approach of standard classifier two-sample tests, while keeping the type I error well below the desired significance level.
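The batch-wise construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the deep classifier for a small NumPy logistic regression, and it assumes that under the null each label is an independent fair coin flip, so the null label probability is fixed at 0.5. The names `fit_logistic`, `predict_proba`, and `sequential_e_c2st` are hypothetical. Each batch is scored by a classifier trained only on earlier batches, the per-batch likelihood ratios are multiplied into a running E-value, and the test rejects as soon as that product reaches 1/alpha.

```python
import numpy as np

def fit_logistic(X, y, steps=200, lr=0.1):
    """Gradient-descent logistic regression (stand-in for a deep classifier)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    """Predicted probability of label 1, clipped away from 0 and 1."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    return np.clip(p, 1e-6, 1 - 1e-6)

def sequential_e_c2st(X, y, n_batches=5, alpha=0.05):
    """Hypothetical sketch: accumulate a log E-value batch by batch.

    Assumption: under the null, P(y = 1) = 0.5 independently of X.
    Returns (rejected, log_e_value, index_of_last_batch_evaluated).
    """
    batches = np.array_split(np.arange(len(y)), n_batches)
    log_e = 0.0
    for k in range(1, n_batches):
        # Train only on batches seen so far, then score the next batch.
        train = np.concatenate(batches[:k])
        w = fit_logistic(X[train], y[train])
        test = batches[k]
        p1 = predict_proba(w, X[test])
        p_label = np.where(y[test] == 1, p1, 1.0 - p1)
        # Log likelihood ratio: classifier prediction vs. null probability 0.5.
        log_e += np.sum(np.log(p_label) - np.log(0.5))
        if log_e >= np.log(1.0 / alpha):
            return True, log_e, k  # stop early: E-value reached 1/alpha
    return False, log_e, n_batches - 1

# Usage: two Gaussian samples with a mean shift, pooled and shuffled.
rng = np.random.default_rng(0)
n = 500
X = np.vstack([rng.normal(0.0, 1.0, (n, 10)), rng.normal(1.0, 1.0, (n, 10))])
y = np.concatenate([np.zeros(n), np.ones(n)])
perm = rng.permutation(2 * n)
rejected, log_e, last_batch = sequential_e_c2st(X[perm], y[perm])
print(rejected, last_batch)
```

Because every batch after the first serves as test data for the current classifier and then as training data for the next one, this layout uses more of the sample for testing than a single train/test split would.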
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Matthew_Blaschko1
Submission Number: 1914