Keywords: Non-Uniform Learning, Bandit Feedback, Multiclass Learning
Abstract: We study the problem of multiclass learning with bandit feedback in both the i.i.d. batch and adversarial online models. In the *uniform* learning framework, it is well known that no hypothesis class $\mathcal{H}$ is learnable in either model when the effective number of labels is unbounded. In contrast, within the *universal* learning framework, recent works of Hanneke et al. (2025b) and Hanneke et al. (2025a) established surprising exact equivalences between learnability under bandit feedback and learnability under full supervision, in the i.i.d. batch and adversarial online models, respectively. This raises a natural question: What happens in the *non-uniform* learning framework, which lies between the uniform and universal frameworks? Our contributions are twofold: (1) We provide a combinatorial characterization of the hypothesis classes that are non-uniformly learnable in both models, in both the realizable and agnostic settings. Notably, this characterization covers elementary and natural hypothesis classes, such as a countably infinite collection of constant functions over some domain, which is learnable in both models. (2) We construct a hypothesis class that is non-uniformly learnable under full supervision in the adversarial online model (and thus also in the i.i.d. batch model), but not non-uniformly learnable under bandit feedback in the i.i.d. batch model (and thus also not in the adversarial online model). This construction is our main technical contribution, and it reveals a fundamental distinction between the non-uniform and universal learning frameworks.
Supplementary Material: zip
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 25313