Keywords: stochastic gradient descent, random reshuffling, federated learning, convergence analysis
TL;DR: A unified convergence analysis for permutation-based Stochastic Gradient Descent (SGD) with arbitrary permutations of examples.
Abstract: We aim to provide a unified convergence analysis for permutation-based Stochastic Gradient Descent (SGD), where data examples are permuted before each epoch. By examining how the permutations of different epochs relate to one another, we classify existing permutation-based SGD algorithms into three categories: Arbitrary Permutations, Independent Permutations (including Random Reshuffling and FlipFlop (Rajput et al., 2022)), and Dependent Permutations (including GraBs (Lu et al., 2022a; Cooper et al., 2023)). Existing unified analyses fail to cover the Dependent Permutations category because the permutations there depend on one another across epochs. In this work, we propose a generalized assumption that explicitly characterizes this inter-epoch dependence of permutations. Building on this assumption, we develop a unified framework for permutation-based SGD with arbitrary permutations of examples that encompasses all existing permutation-based SGD algorithms. Furthermore, we adapt our framework to Federated Learning (FL), developing a unified framework for FL with regularized client participation and arbitrary permutations of clients.
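To make the setting concrete, below is a minimal Python sketch of permutation-based SGD under an assumed generic interface; the names (`permutation_sgd`, `random_reshuffling`, `dependent_rule`) and the toy dependence rule are illustrative assumptions, not the paper's algorithms. The only structural point it shows is the one the abstract relies on: each epoch's ordering is produced by a rule that may ignore earlier epochs (Arbitrary), draw fresh shuffles (Independent, e.g. Random Reshuffling), or depend on the previous epoch's permutation (Dependent).

```python
import numpy as np

def permutation_sgd(grad, x0, n, epochs, lr, permute):
    """Sketch of permutation-based SGD (assumed interface).

    grad(x, i): gradient of the i-th example's loss at x.
    permute(epoch, prev): returns this epoch's ordering of the n
    examples; it may depend on the previous permutation or not.
    """
    x, perm = x0, None
    for epoch in range(epochs):
        perm = permute(epoch, perm)   # choose this epoch's ordering
        for i in perm:                # one full pass over all n examples
            x = x - lr * grad(x, i)
    return x

# Toy problem: f(x) = (1/n) * sum_i (x - a_i)^2 / 2, minimized at a.mean().
rng = np.random.default_rng(0)
n = 32
a = rng.normal(size=n)
grad = lambda x, i: x - a[i]

# Independent Permutations (e.g. Random Reshuffling): a fresh shuffle
# each epoch, independent of earlier epochs.
random_reshuffling = lambda epoch, prev: rng.permutation(n)

# Dependent Permutations: the next ordering depends on the previous one.
# (Toy dependence for illustration; GraB-style rules instead reorder
# examples using gradient information gathered during the epoch.)
dependent_rule = lambda epoch, prev: (rng.permutation(n) if prev is None
                                      else prev[::-1].copy())

x = permutation_sgd(grad, x0=0.0, n=n, epochs=50, lr=0.05,
                    permute=random_reshuffling)
print(x, a.mean())  # x approaches the minimizer a.mean()
```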
Supplementary Material: zip
Primary Area: Optimization (e.g., convex and non-convex, stochastic, robust)
Submission Number: 13901