Patch Gradient Descent: Training Neural Networks on Very Large Images

Published: 28 Oct 2023, Last Modified: 28 Oct 2023WANT@NeurIPS 2023 PosterEveryoneRevisionsBibTeX
Keywords: patch gradient descent, memory limit, large images
TL;DR: A novel method that allows to train CNNs for very large images on very small GPUs
Abstract: Current deep learning models falter when faced with large-scale images, largely due to prohibitive computing and memory demands. Enter Patch Gradient Descent (PatchGD), a groundbreaking learning technique that seamlessly trains deep learning models on expansive images. This innovation takes inspiration from the standard feedforward-backpropagation paradigm. However, instead of processing an entire image simultaneously, PatchGD smartly segments and updates a core information-gathering element using portions of the image before the final evaluation. This ensures wide coverage across iterations, bringing in notable memory and computational efficiencies. When tested on the high-resolution PANDA and UltraMNIST datasets using ResNet50 and MobileNetV2 models, PatchGD clearly outstrips traditional gradient descent techniques, particularly under memory constraints. The future of handling vast image datasets effectively lies with PatchGD.
Submission Number: 43