Using the Polyak Step Size in training Convolutional Neural Networks

Published: 19 Mar 2024 · Last Modified: 01 Jul 2024 · Tiny Papers @ ICLR 2024 · CC BY 4.0
Keywords: Deep Learning, Gradient Descent, Polyak Step Size, Optimisation, Convolutional Neural Networks, AlexNet, ResNet, CIFAR-10
TL;DR: This paper introduces upper bounds on the Polyak Step Size and applies it to training CNNs in order to investigate its utility for Deep Learning.
Abstract: The Polyak Step Size (PSS) is an adaptive learning rate that has yet to see much prominence in Deep Learning (DL). This paper investigates using the PSS to train Convolutional Neural Networks (CNNs) for Image Classification (IC). We show that by introducing two upper bounds for the PSS, we can train accurate CNNs without the need to calculate a learning rate a priori. Additionally, we compare the upper-bounded PSS rates against other adaptive learning rate methods (AdaGrad and AdaDelta), showing that they achieve competitive performance.
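For orientation, the classical Polyak Step Size sets the learning rate at step t to (f(w_t) − f*) / ||∇f(w_t)||², which can blow up when gradients are small. The sketch below is a minimal PyTorch illustration of capping this rate, assuming f* ≈ 0 (as is common for overparameterised networks) and a single hypothetical cap `eta_max`; the paper's two specific upper bounds are not reproduced here.

```python
import torch

def capped_polyak_step(model, loss, f_star=0.0, eta_max=1.0):
    """One SGD update with a capped Polyak step size.

    Illustrative sketch only, not the paper's exact method:
    `f_star` (the optimal loss value, assumed ~0) and the single
    cap `eta_max` are assumptions standing in for the paper's
    two proposed upper bounds.
    """
    model.zero_grad()
    loss.backward()
    # Squared gradient norm over all parameters.
    grad_sq = sum(p.grad.pow(2).sum()
                  for p in model.parameters() if p.grad is not None)
    # Polyak step size (f(w) - f*) / ||grad||^2, clipped at eta_max;
    # the small epsilon guards against division by zero.
    eta = min((loss.item() - f_star) / (grad_sq.item() + 1e-12), eta_max)
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p -= eta * p.grad
    return eta
```

In this sketch the cap plays the role the paper's upper bounds serve: it keeps the step size finite near a minimum, where the denominator ||∇f(w_t)||² vanishes faster than the numerator.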
Submission Number: 43