Class-wise Generalization Error: An Information-Theoretic Analysis

TMLR Paper4351 Authors

25 Feb 2025 (modified: 11 Jun 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: Existing generalization theories for supervised learning typically take a holistic approach and provide bounds on the expected generalization over the whole data distribution, which implicitly assumes that the model generalizes similarly across all classes. In practice, however, there are significant variations in generalization performance among different classes, which cannot be captured by existing generalization bounds. In this work, we tackle this problem by theoretically studying the class-generalization error, which quantifies the generalization performance of the model for each individual class. We derive a novel information-theoretic bound for the class-generalization error using the KL divergence, and we further obtain several tighter bounds using recent advances in conditional mutual information bounds, which enables practical evaluation. We empirically validate our proposed bounds on various neural networks and show that they accurately capture the complex class-generalization behavior. Moreover, we demonstrate that the theoretical tools developed in this work can be applied to several other problems.
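For concreteness, the display below sketches what the class-generalization error for a class y is presumed to quantify, assuming a standard setup with a hypothesis W learned from a training sample S; the notation (loss \ell, class-conditional distribution P_{X|Y=y}, class-y subsample S_y) is an assumption for illustration and may differ from the paper's exact definitions.

% Hedged sketch (assumed notation): class-generalization error for class y,
% i.e., the expected gap between the population risk on class y and the
% empirical risk on the class-y training samples.
\[
  \overline{\mathrm{gen}}_y
  \;=\;
  \mathbb{E}_{W,S}\!\left[
    \mathbb{E}_{X \sim P_{X \mid Y = y}}\bigl[\ell(W, X, y)\bigr]
    \;-\;
    \frac{1}{\lvert S_y \rvert} \sum_{(x_i, y_i) \in S_y} \ell(W, x_i, y_i)
  \right]
\]

Under this reading, the paper's KL-divergence and conditional-mutual-information bounds would upper-bound this per-class quantity rather than the usual class-averaged generalization gap.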
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Gintare_Karolina_Dziugaite1
Submission Number: 4351