Positive Distribution Shift as a Framework for Understanding Tractable Learning

ICLR 2026 Conference Submission 20014 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: positive distribution shift, tractable learning
TL;DR: We show that distribution shift can be beneficial: training on a carefully chosen distribution different from the target can significantly reduce computational cost.
Abstract: We study a setting where the goal is to learn a target function f(x) with respect to a target distribution D(x), but training is done on i.i.d. samples from a different training distribution D'(x), labeled by the true target f(x). Such a distribution shift (here in the form of covariate shift) is usually viewed negatively, as something that hurts learning or makes it harder, and the traditional distribution shift literature is mostly concerned with limiting or avoiding this negative effect. In contrast, we argue that such a distribution shift, i.e. training using D' instead of D, can often be positive and make learning easier, and that such a positive distribution shift (PDS) is central to contemporary machine learning, where much of the innovation in practice lies in finding good training distributions D' rather than in changing the training algorithm. We further argue that the benefit is often computational rather than statistical, and that PDS allows computationally hard problems to become tractable even using standard gradient-based training. We formalize different variants of PDS, show how certain hard classes are easily learnable under PDS, and make connections with membership query learning.
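To make the setup concrete, below is a minimal sketch (not taken from the paper) of the covariate-shift protocol described in the abstract: samples are drawn from a training distribution D', labeled by the true target f, a model is fit with plain gradient-based training, and performance is measured with respect to the target distribution D. The specific target f, the distributions D and D', and the random-feature model are hypothetical placeholders chosen only to illustrate the protocol, not the paper's constructions.

```python
# Illustrative sketch of training on D' while evaluating on D.
# All concrete choices (f, D, D', the model) are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # True target function; labels are always given by f (no label noise).
    return np.sign(x[:, 0] * x[:, 1])

def sample_D(n):
    # Target distribution D: uniform on [-1, 1]^2.
    return rng.uniform(-1.0, 1.0, size=(n, 2))

def sample_D_prime(n):
    # Training distribution D': same support, but points are pushed away from
    # the axes (a hypothetical example of a "nicer" training distribution).
    x = rng.uniform(-1.0, 1.0, size=(n, 2))
    return np.where(np.abs(x) < 0.3, np.sign(x) * 0.3, x)

def train_random_features(x, y, width=64, steps=3000, lr=0.05):
    # Fit the readout layer of a random-feature model by gradient descent on
    # the squared loss, as a stand-in for "standard gradient-based training".
    w_in = rng.normal(size=(x.shape[1], width))
    b = rng.uniform(0.0, 2.0 * np.pi, size=width)
    phi = lambda z: np.cos(z @ w_in + b)
    feats = phi(x)
    w_out = np.zeros(width)
    for _ in range(steps):
        grad = feats.T @ (feats @ w_out - y) / len(y)
        w_out -= lr * grad
    return lambda z: phi(z) @ w_out

# Train on i.i.d. samples from D' (covariate shift), labeled by the true f...
x_train = sample_D_prime(2000)
model = train_random_features(x_train, f(x_train))

# ...but evaluate accuracy with respect to the target distribution D.
x_test = sample_D(5000)
acc = np.mean(np.sign(model(x_test)) == f(x_test))
print(f"accuracy on D after training on D': {acc:.3f}")
```

Swapping sample_D_prime for sample_D in the training step recovers the standard (shift-free) setting, which is the baseline against which the paper's notion of a positive distribution shift is compared.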
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 20014