Keywords: Non-Uniform Learning, Bandit Feedback, Multiclass Learning
Abstract: We study the problem of multiclass learning with bandit feedback in both the i.i.d. batch and adversarial online models. In the *uniform* learning framework, it is well known that no hypothesis class $\mathcal{H}$ is learnable in either model when the effective number of labels is unbounded. In contrast, within the *universal* learning framework, recent works of Hanneke et al. (2025b) and Hanneke et al. (2025a) established surprising exact equivalences between learnability under bandit feedback and learnability under full supervision, in the i.i.d. batch and adversarial online models, respectively. This raises a natural question: What happens in the *non-uniform* learning framework, which lies between the uniform and universal frameworks? Our contributions are twofold: (1) We provide a combinatorial characterization of the hypothesis classes that are non-uniformly learnable in both models, in both the realizable and agnostic settings. Notably, this characterization covers elementary and natural hypothesis classes, such as a countably infinite collection of constant functions over some domain, which is learnable in both models. (2) We construct a hypothesis class that is non-uniformly learnable under full supervision in the adversarial online model (and thus also in the i.i.d. batch model), but not non-uniformly learnable under bandit feedback in the i.i.d. batch model (and thus also not in the adversarial online model). This construction is our main technical contribution, and it reveals a fundamental distinction between the non-uniform and universal learning frameworks.
Supplementary Material: zip
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 25313