Conditional Kernel Quantile Embeddings: A Nonparametric Framework for Conditional Two-Sample Testing

TMLR Paper5564 Authors

06 Aug 2025 (modified: 15 Aug 2025), Under review for TMLR, CC BY 4.0
Abstract: Comparing conditional probability distributions, P(Y∣X) and Q(Y∣X), is a fundamental problem in machine learning, crucial for tasks such as causal inference, dataset-shift detection, and model validation. The predominant approach, based on Conditional Kernel Mean Embeddings (KCMEs), suffers from significant drawbacks: it requires strong and often unverifiable kernel assumptions for the resulting distance to be a metric, incurs high computational costs, and may exhibit reduced sensitivity to higher-order distributional differences. We introduce Conditional Kernel Quantile Embeddings (CKQEs), a novel and robust framework for representing conditional distributions in a Reproducing Kernel Hilbert Space (RKHS). Throughout, we assume P_X = Q_X for conditional comparisons, and we require only that the output-space kernel be quantile-characteristic. From CKQEs, we construct the Conditional Kernel Quantile Discrepancy (CKQD), a new family of probability metrics. We prove that CKQD: (1) is a metric under substantially weaker and more practical kernel conditions than KCME-based distances, requiring only a quantile-characteristic kernel; (2) possesses a clear geometric interpretation, recovering a conditional version of the Sliced Wasserstein distance in a special case; and (3) admits a computationally efficient, statistically consistent nonparametric estimator with proven finite-sample convergence rates. By addressing the core weaknesses of the KCME framework, the CKQE framework provides a more versatile and theoretically sound foundation for conditional two-sample testing.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Corrected numeric typos in Tables 5-7; regenerated those tables from the latest runs (same protocols/seeds as the original submission). The corresponding figures already matched the correct values; no figures changed. We also updated a few in-text sentences and captions that quoted those table entries so that text and tables now agree. No other changes. All claims and conclusions remain unchanged.
Assigned Action Editor: ~Krikamol_Muandet1
Submission Number: 5564