Information-Theoretic Generalization Bounds for Iterative Semi-Supervised Learning

Haiyun He; Hanshu YAN; Vincent Tan

Published: 28 Jan 2022, Last Modified: 13 Feb 2023. ICLR 2022 Submitted. Readers: Everyone

Keywords: Generalization error, Information theory, Semi-supervised learning

Abstract: We consider iterative semi-supervised learning (SSL) algorithms that repeatedly generate pseudo-labels for a large amount of unlabelled data to progressively refine the model parameters. In particular, we seek to understand the behaviour of the generalization error of iterative SSL algorithms using information-theoretic principles. To obtain bounds that are amenable to numerical evaluation, we first work with a simple model, namely the binary Gaussian mixture model. Our theoretical results suggest that when the class-conditional variances are not too large, the upper bound on the generalization error decreases monotonically with the number of iterations, but quickly saturates. The theoretical results on the simple model are corroborated by extensive experiments on several benchmark datasets, such as the MNIST and CIFAR datasets, in which we observe that the generalization error improves after several pseudo-labelling iterations, but saturates afterwards.
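
To make the setting concrete, the following is a minimal sketch of iterative pseudo-labelling on a binary Gaussian mixture model. It is an illustration under stated assumptions, not necessarily the authors' exact procedure: the helper names (`sample_bgmm`, `iterative_pseudo_label`, `error_rate`), the class means of plus or minus `mu`, the shared standard deviation `sigma`, the sample sizes, and the mean-based update rule are all hypothetical choices made here. The sketch also tracks only the test error of each iterate as a rough proxy; it does not compute the generalization error or the information-theoretic bound studied in the paper.

```python
import numpy as np

def sample_bgmm(n, mu, sigma, rng):
    """Draw n points from an assumed binary GMM: y in {-1, +1}, x ~ N(y * mu, sigma^2 I)."""
    y = rng.choice([-1, 1], size=n)
    x = y[:, None] * mu + sigma * rng.standard_normal((n, mu.size))
    return x, y

def iterative_pseudo_label(x_lab, y_lab, unlabelled_batches):
    """Return the list of parameter estimates, one per iteration (including the initial one)."""
    # Initial estimate from the labelled data only (assumed mean-based estimator).
    theta = (y_lab[:, None] * x_lab).mean(axis=0)
    history = [theta]
    for x_u in unlabelled_batches:
        # Pseudo-label the unlabelled batch with the current linear classifier.
        pseudo = np.where(x_u @ theta >= 0, 1, -1)
        # Refine the estimate using the pseudo-labelled batch.
        theta = (pseudo[:, None] * x_u).mean(axis=0)
        history.append(theta)
    return history

def error_rate(theta, x, y):
    """Fraction of points misclassified by sign(theta . x)."""
    pred = np.where(x @ theta >= 0, 1, -1)
    return float(np.mean(pred != y))

rng = np.random.default_rng(0)
mu = np.ones(2)   # assumed class mean (classes sit at +mu and -mu)
sigma = 0.6       # assumed class-conditional standard deviation

x_lab, y_lab = sample_bgmm(10, mu, sigma, rng)                      # small labelled set
batches = [sample_bgmm(1000, mu, sigma, rng)[0] for _ in range(5)]  # unlabelled batches
x_test, y_test = sample_bgmm(5000, mu, sigma, rng)

history = iterative_pseudo_label(x_lab, y_lab, batches)
test_errors = [error_rate(theta, x_test, y_test) for theta in history]
```

Printing `test_errors` for several values of `sigma` gives an informal way to see how the class-conditional variance affects the behaviour across iterations.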

One-sentence Summary: We study the behaviour of the generalization error of iterative semi-supervised learning algorithms using information-theoretic principles.

Supplementary Material: zip
