Agnostic Boosting

COLT/EuroCOLT 2001
Abstract: We extend the boosting paradigm to the realistic setting of agnostic learning, that is, to a setting where the training sample is generated by an arbitrary (unknown) probability distribution over examples and labels. We define a β-weak agnostic learner with respect to a hypothesis class F as follows: given a distribution P, it outputs some hypothesis h ∈ F whose error is at most er_P(F) + β, where er_P(F) is the minimal error of a hypothesis from F under the distribution P (note that for some distributions this bound may exceed one half). We show a boosting algorithm that, using the β-weak agnostic learner, computes a hypothesis whose error is at most max{c₁(β) · er_P(F)^{c₂(β)}, ε}, in time polynomial in 1/ε. While this generalization guarantee is significantly weaker than the one resulting from the known PAC boosting algorithms, one should note that the assumption required of a β-weak agnostic learner is much weaker. In fact, an important virtue of the notion of weak agnostic learning is that in many cases such learning is achieved by efficient algorithms.
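To make the interface concrete, here is a minimal Python sketch of a weak learner being called inside a boosting loop. All names (weighted_error, boost, and the type aliases) are invented for illustration, and the loop shown is a standard AdaBoost-style multiplicative-weights skeleton, not the paper's algorithm: this classical update assumes weighted error strictly below one half, whereas the paper's contribution is an analysis that tolerates the weaker β-weak agnostic guarantee, under which the error may exceed one half.

```python
import math
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]          # (features, label in {-1, +1})
Hypothesis = Callable[[Sequence[float]], int]  # maps features to {-1, +1}

def weighted_error(h: Hypothesis, sample: List[Example],
                   weights: List[float]) -> float:
    """Error of h under the distribution that `weights` induces on `sample`."""
    total = sum(weights)
    return sum(w for (x, y), w in zip(sample, weights) if h(x) != y) / total

def boost(weak_learner: Callable[[List[Example], List[float]], Hypothesis],
          sample: List[Example], rounds: int) -> Hypothesis:
    """Generic multiplicative-weights boosting skeleton (illustrative only):
    call the weak learner on the reweighted sample, upweight the examples it
    misclassifies, and return a weighted majority vote over all rounds."""
    n = len(sample)
    weights = [1.0 / n] * n
    committee: List[Tuple[float, Hypothesis]] = []
    for _ in range(rounds):
        h = weak_learner(sample, weights)
        err = weighted_error(h, sample, weights)
        err = min(max(err, 1e-9), 1 - 1e-9)       # clamp away from 0 and 1
        alpha = 0.5 * math.log((1 - err) / err)   # AdaBoost-style vote weight
        committee.append((alpha, h))
        # Multiplicative update: misclassified examples gain weight.
        weights = [w * math.exp(-alpha * y * h(x))
                   for (x, y), w in zip(sample, weights)]

    def majority(x: Sequence[float]) -> int:
        score = sum(a * h(x) for a, h in committee)
        return 1 if score >= 0 else -1

    return majority
```

A β-weak agnostic learner plugs in as the `weak_learner` argument: on each reweighted sample it need only return some h ∈ F with error at most er_P(F) + β, a contract much easier to meet in practice than the PAC weak-learning requirement of error bounded below one half.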