Class-wise Generalization Error: An Information-Theoretic Analysis

TMLR Paper4351 Authors

25 Feb 2025 (modified: 11 Jun 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: Existing generalization theories for supervised learning typically take a holistic approach and provide bounds on the expected generalization over the whole data distribution, which implicitly assumes that the model generalizes similarly across all classes. In practice, however, there are significant variations in generalization performance among different classes, which cannot be captured by existing generalization bounds. In this work, we tackle this problem by theoretically studying the class-generalization error, which quantifies the generalization performance of the model for each individual class. We derive a novel information-theoretic bound for the class-generalization error using the KL divergence, and we further obtain several tighter bounds using recent advances in conditional mutual information bounds, which enables practical evaluation. We empirically validate our proposed bounds on various neural networks and show that they accurately capture the complex class-generalization behavior. Moreover, we demonstrate that the theoretical tools developed in this work can be applied to several other problems.
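For concreteness, the display below sketches what the class-generalization error for a class y is presumed to quantify, assuming a standard setup with a hypothesis W learned from a training sample S; the notation (loss \ell, class-conditional distribution P_{X|Y=y}, class-y subsample S_y) is an assumption for illustration and may differ from the paper's exact definitions.

% Hedged sketch (assumed notation): class-generalization error for class y,
% i.e., the expected gap between the population risk on class y and the
% empirical risk on the class-y training samples.
\[
  \overline{\mathrm{gen}}_y
  \;=\;
  \mathbb{E}_{W,S}\!\left[
    \mathbb{E}_{X \sim P_{X \mid Y = y}}\bigl[\ell(W, X, y)\bigr]
    \;-\;
    \frac{1}{\lvert S_y \rvert} \sum_{(x_i, y_i) \in S_y} \ell(W, x_i, y_i)
  \right]
\]

Under this reading, the paper's KL-divergence and conditional-mutual-information bounds would upper-bound this per-class quantity rather than the usual class-averaged generalization gap.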
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Gintare_Karolina_Dziugaite1
Submission Number: 4351