Learning Guarantees for Non-convex Pairwise SGD with Heavy Tails

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Stability, generalization bound, stochastic gradient descent, pairwise learning, heavy tail
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: In recent years, a growing number of works have studied the generalization properties of pairwise stochastic gradient descent (SGD) from the perspective of algorithmic stability. However, few of them jointly address generalization and optimization in the non-convex setting, especially under heavy-tailed gradient noise. This paper establishes stability-based learning guarantees for non-convex, heavy-tailed pairwise SGD by investigating its generalization and optimization jointly. Firstly, we bound the generalization error of pairwise SGD in the general non-convex setting, after establishing a quantitative relationship between $\ell_1$ on-average model stability and the generalization error. Secondly, a refined generalization bound is derived for non-convex pairwise SGD by introducing heavy-tailed gradient noise, which removes the bounded gradient assumption. Finally, sharper error bounds for generalization and optimization are provided under the gradient dominance condition. In addition, we extend our analysis to the corresponding pairwise minibatch SGD and derive the first stability-based near-optimal generalization and optimization bounds, which are consistent with many empirical observations. These theoretical results fill a gap in the learning theory of non-convex pairwise SGD with heavy tails.
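For readers unfamiliar with the algorithmic setup, the following is a minimal sketch of pairwise SGD and its minibatch variant as typically defined in this literature. It is an illustrative assumption, not code from the submission: `pair_grad(w, z_i, z_j)` stands in for the gradient of a generic pairwise loss $\ell(w; z, z')$ (e.g., from AUC maximization or metric learning), and the uniform sampling of index pairs follows the standard pairwise-learning formulation.

```python
# Hypothetical sketch of pairwise SGD, assuming a generic pairwise loss
# gradient pair_grad(w, z_i, z_j); the concrete loss is not specified here.
import numpy as np

def pairwise_sgd(pair_grad, data, w0, step_sizes, rng=None):
    """At each step t, draw a pair (z_i, z_j) with i != j uniformly at
    random and take a gradient step on the pairwise loss."""
    rng = np.random.default_rng(rng)
    n = len(data)
    w = np.asarray(w0, dtype=float).copy()
    for eta in step_sizes:
        i, j = rng.choice(n, size=2, replace=False)
        w -= eta * pair_grad(w, data[i], data[j])
    return w

def minibatch_pairwise_sgd(pair_grad, data, w0, step_sizes, b, rng=None):
    """Minibatch variant: average the pairwise gradient over b
    independently drawn index pairs per step."""
    rng = np.random.default_rng(rng)
    n = len(data)
    w = np.asarray(w0, dtype=float).copy()
    for eta in step_sizes:
        g = np.zeros_like(w)
        for _ in range(b):
            i, j = rng.choice(n, size=2, replace=False)
            g += pair_grad(w, data[i], data[j])
        w -= eta * g / b
    return w
```

The step-size schedule `step_sizes` is left abstract; stability-based analyses of the kind described in the abstract typically impose decaying schedules such as $\eta_t \propto 1/t$.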
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4323