Keywords: domain adaptation, label shift, PU learning, deep learning, open set domain adaptation, identifiability
TL;DR: We introduce the Open Set Label Shift (OSLS) problem, a coherent instantiation of Open Set Domain Adaptation (OSDA). We propose a simple, practical solution for OSLS that significantly improves over OSDA methods.
Abstract: We introduce the problem of domain adaptation under Open Set Label Shift (OSLS), where the label distribution can change arbitrarily and a new class may arrive during deployment, but the class-conditional distributions $p(x|y)$ are domain-invariant. The learner's goals here are two-fold: (a) estimate the target label distribution, including the novel class; and (b) learn a target classifier. First, we establish necessary and sufficient conditions for identifying these quantities. Second, we propose practical methods for both tasks. Unlike typical Open Set Domain Adaptation (OSDA) problems, which tend to be ill-posed and amenable only to heuristics, OSLS offers a well-posed problem amenable to more principled machinery. Experiments across numerous semi-synthetic benchmarks on vision, language, and medical datasets demonstrate that our methods consistently outperform OSDA baselines, achieving $10$--$25\%$ improvements in target domain accuracy. Finally, we analyze the proposed methods, establishing finite-sample convergence to the true label marginal and convergence to the optimal classifier for linear models in a Gaussian setup.
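As a sketch of the setting described above, written in notation that is assumed here rather than taken from the abstract: let $p_s$ and $p_t$ denote the source and target distributions, let the source classes be $1, \dots, k$, and let $k+1$ denote the novel class. The assumption that $p(x|y)$ is domain-invariant then says that the target input distribution is a mixture of the source class-conditionals and one unseen component:

$$p_t(x) = \sum_{j=1}^{k} p_t(y=j)\, p_s(x|y=j) + p_t(y=k+1)\, p_t(x|y=k+1), \qquad \sum_{j=1}^{k+1} p_t(y=j) = 1.$$

Under this reading, goal (a) amounts to recovering the mixture weights $p_t(y=j)$ from labeled source data and unlabeled target data, and goal (b) to classifying target points into the $k+1$ classes.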