InterTrain: Accelerating DNN Training using Input Interpolation

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: Efficient DNN Training
Abstract: Training Deep Neural Networks (DNNs) places immense compute requirements on the underlying hardware platforms, expending large amounts of time and energy. An important factor contributing to long training times is the increasing dataset complexity required to reach state-of-the-art performance in real-world applications. To this end, we propose to reduce training runtimes by combining a subset of inputs in the training dataset via an interpolation operation. The goal is for training on the interpolated input to achieve a similar effect to training separately on each of the constituent inputs that it represents. This reduces the number of inputs (or mini-batches) processed in each epoch. However, we find that naively interpolating inputs leads to a considerable drop in learning performance and model accuracy, because the efficacy of learning on interpolated inputs is reduced by interference between the forward/backward propagation of their constituent inputs. We propose two strategies to address this challenge and realize training speedups with minimal impact on accuracy. First, we reduce the impact of interference by exploiting the spatial separation between the features of the constituent inputs in the network's intermediate representations, and we adaptively vary the weights of the constituent inputs based on their losses in previous epochs. Second, we propose loss-based metrics to automatically identify the subset of the training dataset that is subject to interpolation in each epoch. For ResNets of varying depth and MobileNetV2, we obtain up to 1.6x and 1.8x speedups in training for the ImageNet and CIFAR-10 datasets, respectively, on an NVIDIA RTX 2080Ti GPU, with negligible loss in classification accuracy.
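The core idea of the abstract, combining two training inputs into a single interpolated input so each training step covers more of the dataset, can be illustrated with a minimal mixup-style convex combination. This is a hedged sketch only: the function name `interpolate_inputs` and the mixing weight `lam` are illustrative, and the paper's actual method additionally exploits spatial separation of intermediate features and loss-based selection of which inputs to interpolate, neither of which is reproduced here.

```python
import numpy as np

def interpolate_inputs(x1, x2, lam=0.5):
    """Combine two input batches via convex interpolation (mixup-style sketch).

    lam is the weight of x1; the abstract describes adaptively varying such
    weights based on each input's loss in previous epochs (not shown here).
    """
    return lam * x1 + (1.0 - lam) * x2

# Two hypothetical image batches of shape (N, C, H, W).
rng = np.random.default_rng(0)
a = rng.random((8, 3, 32, 32))
b = rng.random((8, 3, 32, 32))

# One interpolated batch stands in for two, halving the inputs per epoch.
mixed = interpolate_inputs(a, b, lam=0.7)
print(mixed.shape)  # (8, 3, 32, 32)
```

Training then proceeds on `mixed` (with labels typically combined using the same weights), which is how the reduced per-epoch input count translates into wall-clock speedup.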