Enhanced Stochastic Gradient Descent with Backward Queried Data for Online Learning
Abstract: Stochastic gradient descent (SGD) is one of the preferred optimization algorithms for online learning. However, one of its major drawbacks is its tendency to forget previously learned data when optimizing over a data stream, a phenomenon known as catastrophic interference. In this paper, we attempt to mitigate this drawback by proposing a new low-cost approach that incorporates backward queried data into SGD during online training. Under this approach, for every new training sample arriving from the data stream, the neural network is additionally optimized on the corresponding backward queried data derived from the initial dataset. Compared against standard online SGD, our algorithm yields substantial improvements in network performance on two MNIST-variant datasets (Fashion-MNIST and Kuzushiji-MNIST), demonstrating that it can be a viable alternative to online SGD.
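A minimal sketch of the online update described above, assuming a PyTorch model and a precomputed cache `backward_queried` that maps each class label to a backward queried input; the pairing of streamed samples with backward queried data by class label, and all names here, are illustrative assumptions rather than the authors' exact implementation:

```python
import torch
import torch.nn.functional as F

def online_step(model, optimizer, x_new, y_new, backward_queried):
    """One online update on a streamed sample (x_new, y_new), augmented with
    the backward queried input cached for the same label (assumed pairing)."""
    model.train()
    optimizer.zero_grad()

    # Loss on the incoming streamed sample.
    loss = F.cross_entropy(model(x_new.unsqueeze(0)), y_new.unsqueeze(0))

    # Add the loss on the backward queried input for the same class,
    # intended to counteract forgetting of previously learned data.
    x_bq = backward_queried[int(y_new)]
    loss = loss + F.cross_entropy(model(x_bq.unsqueeze(0)), y_new.unsqueeze(0))

    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the two per-sample losses are simply summed before a single SGD step; weighting the backward queried term differently would be a natural variation, but is not specified by the abstract.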