Improving the convergence of SGD through adaptive batch sizes
Keywords: stochastic gradient descent, batch size
Abstract: Mini-batch stochastic gradient descent (SGD) and its variants
approximate the objective function's gradient with a small number of
training examples, known as the batch size. Small batch sizes require
little computation per model update but can yield high-variance gradient
estimates, which poses challenges for optimization. Conversely, large
batch sizes require more computation but can yield higher-precision
gradient estimates. This work presents a method to adapt the batch size to
the model's training loss. For various function classes, we show that our
method requires the same order of model updates as gradient descent while
requiring the same order of gradient computations as SGD. The method
requires evaluating the model's loss on the entire dataset at every model
update; however, the required computation is greatly reduced by
approximating the training loss. We provide experiments illustrating that
our method requires fewer model updates without increasing the total
amount of computation.
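The abstract's core idea, growing the batch size as the training loss falls, can be sketched on a toy problem. The schedule below (batch size proportional to the reciprocal of the current loss, with the constant `c` chosen arbitrarily) is a hypothetical illustration, not the paper's actual rule, and the loss is computed exactly here rather than approximated as the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: minimize f(w) = (1/2n) * ||X w - y||^2.
n, d = 512, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true  # noiseless targets, so the optimum achieves zero loss

def full_loss(w):
    """Training loss over the entire dataset."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)

w = np.zeros(d)
lr = 0.1
c = 8.0  # hypothetical proportionality constant for the schedule
for step in range(200):
    # Hypothetical adaptive rule: batch size ~ c / loss, capped at the
    # dataset size. High loss -> small, cheap, noisy batches; low loss ->
    # large, precise batches. (In the paper, the full-dataset loss
    # evaluation is replaced by a cheaper approximation.)
    loss = full_loss(w)
    b = int(min(n, max(1, np.ceil(c / max(loss, 1e-8)))))
    idx = rng.choice(n, size=b, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / b  # mini-batch gradient
    w -= lr * grad
```

With this schedule, early updates use only a handful of examples while late updates approach full-batch gradient descent, which is the SGD-like cost / GD-like iteration count trade-off the abstract claims.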
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9427