Theoretical Analysis of Weak-to-Strong Generalization

Published: 25 Sept 2024, Last Modified: 06 Nov 2024NeurIPS 2024 posterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Weak supervision, weak-to-strong generalization, self-supervised learning, semi-supervised learning
TL;DR: We give a theoretical account of strong student models learning to improve over their weaker teachers.
Abstract: Strong student models can learn from weaker teachers: when trained on the predictions of a weaker model, a strong pretrained student can learn to correct the weak model’s errors and generalize to examples where the teacher is not confident, even when these examples are excluded from training. This enables learning from cheap, incomplete, and possibly incorrect label information, such as coarse logical rules or the generations of a language model. We show that existing weak supervision theory results fail to account for both of these effects, which we call pseudolabel correction and coverage expansion, respectively. We give a new bound based on expansion properties of the data distribution and student hypothesis class that directly accounts for pseudolabel correction and coverage expansion. Our bound generalizes results from the co-training and self-training literature and captures the intuition that weak-to-strong generalization occurs when the mistakes of the weak model are hard for the strong model to fit without incurring additional error. We show that these expansion properties can be checked from finite data and give empirical evidence that they hold in practice.
Supplementary Material: zip
Primary Area: Natural language processing
Submission Number: 1368
Loading