Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator

Makoto Yamada; Denny Wu; Yao-Hung Hubert Tsai; Hirofumi Ohta; Ruslan Salakhutdinov; Ichiro Takeuchi; Kenji Fukumizu

Published: 21 Dec 2018, Last Modified: 12 Oct 2025 | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: Measuring the divergence between two distributions is essential in machine learning and statistics, with applications including binary classification, change point detection, and two-sample testing. In the era of big data, it is increasingly important to design divergence measures that are interpretable and can handle high-dimensional, complex data. In this paper, we propose a post selection inference (PSI) framework for divergence measures, which selects a set of statistically significant features that discriminate between two distributions. Specifically, we employ an additive variant of the maximum mean discrepancy (MMD) over features and introduce a general hypothesis test for PSI. We also propose and theoretically analyze a novel MMD estimator based on incomplete U-statistics, which is asymptotically normal under mild assumptions and gives high detection power in PSI. Through synthetic and real-world feature selection experiments, we show that the proposed framework can successfully detect statistically significant features. Finally, we propose a sample selection framework for analyzing different members of the Generative Adversarial Network (GAN) family.
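To make the idea of an MMD estimator built on incomplete U-statistics concrete, here is a minimal sketch. It is not the authors' implementation. It assumes paired samples of equal size, a Gaussian kernel with a fixed bandwidth, and index pairs drawn uniformly at random with replacement. The names `gaussian_kernel`, `incomplete_mmd`, `num_pairs`, and `bandwidth` are illustrative choices, not taken from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Row-wise Gaussian kernel k(a_i, b_i) for two arrays of shape (m, d)."""
    sq_dist = np.sum((a - b) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def incomplete_mmd(x, y, num_pairs, bandwidth=1.0, rng=None):
    """Estimate squared MMD by averaging the U-statistic kernel
    h((x_i, y_i), (x_j, y_j)) over a random subset of index pairs (i, j), i != j,
    instead of over all n * (n - 1) pairs.

    Assumption (illustrative): pairs are sampled uniformly with replacement.
    """
    rng = np.random.default_rng(rng)
    n = x.shape[0]
    if y.shape[0] != n:
        raise ValueError("x and y must contain the same number of samples")
    if n < 2:
        raise ValueError("at least two samples are required")

    i = rng.integers(0, n, size=num_pairs)
    # Adding an offset in [1, n - 1] modulo n guarantees j != i.
    j = (i + rng.integers(1, n, size=num_pairs)) % n

    h = (
        gaussian_kernel(x[i], x[j], bandwidth)
        + gaussian_kernel(y[i], y[j], bandwidth)
        - gaussian_kernel(x[i], y[j], bandwidth)
        - gaussian_kernel(x[j], y[i], bandwidth)
    )
    return float(h.mean())
```

With `num_pairs` much smaller than n(n - 1), the cost is linear in the number of sampled pairs rather than quadratic in the sample size. The estimate should be close to zero when both samples come from the same distribution and clearly positive when they differ.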

Keywords: Maximum Mean Discrepancy, Selective Inference, Feature Selection, GAN

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/post-selection-inference-with-incomplete/code)
