Abstract: We study the problem of multiclass classification when the number of labels can be unbounded within the PAC learning framework. Our main contribution is a theory that demonstrates a *simple* and *elegant* agnostic to realizable reduction for this framework. This resolves an open problem raised by the recent work of (Hopkins et al., 2022). Notably, our result is the first *representation preserving* multiclass agnostic to realizable reduction, in contrast with the compression based approach of the work of (David et al., 2017). Furthermore, our main theorem is stated in an abstract framework, called ``Unified PAC Learning'', which encompasses a range of frameworks, including multiclass PAC learning, list PAC learning, and multilabel PAC learning. In addition, we explore representation preserving reductions to the realizable setting for two noise models, namely Massart noise and Tsybakov noise, in the multiclass PAC learning framework. We believe our technique may find other applications in ensuing studies of theoretical machine learning.
Lay Summary: We theoretically study the problem of multiclass classification when the number of labels can be unbounded. This setting is well-motivated both theoretically and empirically. For instance, from a practical standpoint, many critical machine learning tasks require classification into very large label spaces. Our main contribution is a theory that demonstrates a *simple* and *elegant* reduction, showing that a provable guarantee in a setup with a distributional assumption implies a provable guarantee in a setup without that assumption. This resolves an open problem raised by the recent work of Hopkins et al. (2022). Notably, our result is the first *representation preserving* reduction, meaning that if the theoretical result in the more restrictive setup has properties such as geometric margin, the theoretical result in the general setup retains them as well.
Primary Area: Theory->Learning Theory
Keywords: PAC Learning, Agnostic to Realizable Reduction, Unified PAC Framework, Multiclass Learning, List Learning, Multilabel Learning, Noise Models, Partial Concept Classes
Submission Number: 14444
Loading