Abstract: This paper explores the generalization characteristics of iterative learning algorithms with bounded updates for non-convex loss functions, employing information-theoretic techniques. Our key contribution is a novel bound on the generalization error of such algorithms. Our approach introduces two main novelties: 1) we reformulate the mutual information as the uncertainty of the updates, providing a new perspective, and 2) instead of using the chaining rule of mutual information, we employ a variance decomposition technique to decompose information across iterations, allowing for a simpler surrogate process. We analyze our generalization bound under various settings and demonstrate improved bounds. Ultimately, our work takes a further step toward developing practical generalization theories.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Stefan_Magureanu1
Submission Number: 3535