Distributional Robustness Bounds Generalization Errors

TMLR Paper 4014 Authors

20 Jan 2025 (modified: 15 Apr 2025) · Withdrawn by Authors · CC BY 4.0
Abstract: Bayesian methods, distributionally robust optimization methods, and regularization methods are three pillars of machine learning under distributional uncertainty, e.g., the discrepancy between an empirical distribution and the true underlying distribution. This paper investigates the connections among the three frameworks and, in particular, explains why they tend to have smaller generalization errors. First, we propose a quantitative definition of "distributional robustness", introduce the concept of a "robustness measure", and formalize several philosophical concepts in distributionally robust optimization. Second, we show that Bayesian methods are distributionally robust in the probably approximately correct (PAC) sense; moreover, by constructing a Dirichlet-process-like prior in Bayesian nonparametrics, we prove that any regularized empirical risk minimization method is equivalent to a Bayesian method. Third, we show that the generalization error of a machine learning model can be characterized by the distributional uncertainty of the nominal distribution together with the model's robustness measure; this explains, in a unified manner, why distributionally robust optimization models, Bayesian models, and regularization models tend to have smaller generalization errors.
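
To make the abstract's third claim concrete, here is a minimal LaTeX sketch of the standard distributionally robust optimization template and the kind of generalization-gap decomposition it suggests; the notation ($\hat{P}_n$, $P^*$, $B_\epsilon$, $\ell$) is assumed for illustration only and is not taken from the paper's exact definitions.

```latex
% Schematic illustration (not the paper's exact statements): the generic
% DRO problem and a robustness-based bound on the generalization gap.
% Assumed notation: \hat{P}_n = empirical (nominal) distribution,
% P^* = true distribution, B_\epsilon = distributional ball, \ell = loss.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
A distributionally robust learner hedges against every distribution
near the nominal distribution $\hat{P}_n$:
\[
  \hat{\theta} \in \arg\min_{\theta}\;
  \sup_{Q \in B_\epsilon(\hat{P}_n)}
  \mathbb{E}_{Z \sim Q}\bigl[\ell(\theta; Z)\bigr].
\]
If the distributional uncertainty is small enough that
$P^* \in B_\epsilon(\hat{P}_n)$, then
$\mathbb{E}_{P^*}[\ell] \le \sup_{Q \in B_\epsilon(\hat{P}_n)} \mathbb{E}_{Q}[\ell]$,
so the generalization gap is controlled by the excess of the worst-case
risk over the empirical risk (a ``robustness measure''-type quantity):
\[
  \mathbb{E}_{P^*}\bigl[\ell(\hat{\theta}; Z)\bigr]
  - \mathbb{E}_{\hat{P}_n}\bigl[\ell(\hat{\theta}; Z)\bigr]
  \;\le\;
  \sup_{Q \in B_\epsilon(\hat{P}_n)}
  \mathbb{E}_{Q}\bigl[\ell(\hat{\theta}; Z)\bigr]
  - \mathbb{E}_{\hat{P}_n}\bigl[\ell(\hat{\theta}; Z)\bigr].
\]
\end{document}
```
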
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Fred_Roosta1
Submission Number: 4014
