Abstract: The exponential rise in size and complexity of deep learning models and datasets has resulted
in a considerable demand for computational resources. Coreset selection is one of the
methods to alleviate this rising demand. The goal is to select a subset from a large dataset
to train a model that performs nearly on par with one trained on the full dataset
while reducing computational time and resource requirements. Existing approaches either
attempt to identify remarkable samples that stand out from the rest (e.g., Forgetting,
Adversarial Deepfool, EL2N) or solve complex optimization problems (e.g., submodular
maximization, OMP) to compose the coresets. This paper proposes a novel and intuitive approach
to efficiently select a coreset based on the similarity of loss gradients. Our method
works on the hypothesis that gradients of samples belonging to a given class will point
in similar directions during the early training phase. Samples with the most neighbours that
produce similar gradient directions, in other words, that produce noise-free gradients, will
represent that class. Through extensive experiments, we show that our approach
outperforms state-of-the-art coreset selection algorithms
on a range of benchmark datasets from CIFAR-10 to ImageNet with architectures of varied
complexity (ResNet-18, ResNet-50, VGG-16, ViT). We have also demonstrated the effectiveness
of our approach in Generative Modelling by implementing coreset selection to reduce
training time for various GAN models (DCGAN, MSGAN, SAGAN, SNGAN) for different
datasets (CIFAR-10, CIFAR-100, Tiny ImageNet) while not impacting the performance
metrics significantly. Source code is provided at URL.
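The selection idea described in the abstract can be illustrated with a minimal sketch. It assumes that per-sample loss gradients from an early training epoch are already available as rows of an array. The function name `select_coreset`, the cosine-similarity threshold `tau`, and the per-class `fraction` are hypothetical choices made for illustration and are not taken from the paper, whose actual procedure may differ.

```python
import numpy as np

def select_coreset(grads, labels, fraction=0.1, tau=0.5):
    """Illustrative sketch (not the authors' implementation).

    For each class, keep the samples whose gradient has the most
    same-class neighbours with cosine similarity >= tau.
    `tau` and `fraction` are assumed hyperparameters.
    """
    grads = np.asarray(grads, dtype=np.float64)
    labels = np.asarray(labels)
    selected = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        g = grads[idx]
        # Normalise rows so that dot products become cosine similarities.
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        g = g / np.maximum(norms, 1e-12)
        sim = g @ g.T
        np.fill_diagonal(sim, -np.inf)  # a sample is not its own neighbour
        counts = (sim >= tau).sum(axis=1)
        # Keep at least one sample per class.
        k = max(1, int(round(fraction * len(idx))))
        order = np.argsort(-counts, kind="stable")
        selected.extend(idx[order[:k]].tolist())
    return sorted(selected)

# Toy data: each class has four aligned gradients and one opposing outlier.
toy_grads = [[1, 0], [1, 0.1], [1, -0.1], [0.9, 0.05], [-1, 0],
             [0, 1], [0.1, 1], [-0.1, 1], [0.05, 0.9], [0, -1]]
toy_labels = [0] * 5 + [1] * 5
toy_selected = select_coreset(toy_grads, toy_labels, fraction=0.4, tau=0.9)
```

In this toy example the outliers (indices 4 and 9) have no similar neighbours, so they are never chosen, while the samples whose gradients agree with the rest of their class are kept.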
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: 1. Camera-ready version. Replaced the anonymised code base with the public GitHub repo link.
Video: https://drive.google.com/file/d/1h8XuoCw1BtndH2Tu8lGdmsQ3dkgK9l1q/view?usp=drive_link
Code: https://github.com/ai23resch04001/Noise_free_gradient
Supplementary Material: pdf
Assigned Action Editor: ~Pavel_Izmailov1
Submission Number: 3327