Refining the Sample Complexity of Comparative Learning

Published: 18 Dec 2024, Last Modified: 14 Feb 2025, ALT 2025, CC BY 4.0
Abstract: Comparative learning, a recently introduced variation of the PAC (Probably Approximately Correct) framework, interpolates between the two standard extreme settings of realizable and agnostic PAC learning. In comparative learning, the labeling is assumed to come from one hypothesis class (the source class), while the learner's performance is measured against another hypothesis class (the benchmark class). This setup allows more specific prior knowledge to be incorporated into PAC-type learning bounds, which are known to be otherwise overly pessimistic. It also naturally represents model distillation tasks, where a predictor with specific requirements (such as being bounded in size or being interpretable) is trained on the labels from another model. A first sample complexity analysis of comparative learning established upper and lower bounds for general comparative learning. In this work, we propose a more fine-grained view of this setting, distinguishing between proper learning and general (improper) learning. We derive novel upper and lower sample complexity bounds for both settings. In particular, we identify conditions for each of the regimes, and thereby exhibit how the rate depends on the relatedness of the two classes in perhaps unexpected ways.
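To make the setup concrete, here is a minimal toy sketch of the comparative-learning objective the abstract describes: labels are produced by a hypothesis from a source class, while the learner's excess error is measured against the best hypothesis in a benchmark class. All names (threshold classes, the `comparative_regret` helper, the toy domain) are illustrative assumptions, not constructions from the paper.

```python
# Toy illustration of the comparative-learning objective: the sample is
# labeled by some f in the source class H_S (realizable w.r.t. H_S),
# while performance is compared to the best hypothesis in the benchmark
# class H_B. Hypothetical example; not the paper's construction.

def error(h, sample):
    """Fraction of labeled points that h misclassifies."""
    return sum(h(x) != y for x, y in sample) / len(sample)

def comparative_regret(h, sample, benchmark_class):
    """Excess error of h over the best hypothesis in the benchmark class."""
    best = min(error(b, sample) for b in benchmark_class)
    return error(h, sample) - best

def threshold(t):
    """Threshold classifier on integers: predicts 1 iff x >= t."""
    return lambda x: int(x >= t)

# Source class: all thresholds on {0,...,4}; benchmark class: only the
# two constant-like extremes, i.e. a strictly coarser class.
H_S = [threshold(t) for t in range(5)]
H_B = [threshold(0), threshold(4)]

f = threshold(2)                          # true labeler, drawn from H_S
sample = [(x, f(x)) for x in range(4)]    # realizable sample w.r.t. H_S

proper_h = threshold(0)    # proper learner: output must lie in H_B
improper_h = threshold(2)  # improper learner: may use any hypothesis

print(comparative_regret(proper_h, sample, H_B))    # → 0.0
print(comparative_regret(improper_h, sample, H_B))  # → -0.5
```

The negative regret of the improper learner illustrates why the proper/improper distinction the paper draws matters: an improper learner may strictly outperform every hypothesis in the benchmark class, while a proper learner is confined to it.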
Submission Number: 131