Abstract: Coreset selection is among the most effective ways to reduce the training time of CNNs; however, little is known about how the resultant models behave under variations of the coreset size and the choice of datasets and models. Moreover, given the recent paradigm shift towards transformer-based models, it is still an open question how coreset selection would impact their performance. Several such intriguing questions need to be answered for coreset selection methods to gain wide acceptance, and this paper attempts to answer some of them. We present a systematic benchmarking setup and perform a rigorous comparison of different coreset selection methods on CNNs and transformers. Our investigation reveals that under certain circumstances, random selection of subsets is more robust and stable than the SOTA selection methods. We demonstrate that the conventional concept of uniform subset sampling across the various classes of the data is not the appropriate choice; rather, samples should be chosen adaptively based on the complexity of the data distribution for each class. Transformers are generally pretrained on large datasets, and we show that for certain target datasets, this pretraining helps keep their performance stable even at very small coreset sizes. We further show that when no pretraining is done, or when pretrained transformer models are used with non-natural images (e.g., medical data), CNNs tend to generalize better than transformers even at very small coreset sizes. Lastly, we demonstrate that in the absence of the right pretraining, CNNs are better at learning the semantic coherence between spatially distant objects within an image, and they tend to outperform transformers at almost all choices of the coreset size.
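To make the class-adaptive sampling idea from the abstract concrete, here is a minimal, hypothetical sketch: each class receives a coreset quota proportional to a simple complexity proxy (the total feature variance within that class), and samples are then drawn at random within each class. The complexity proxy and the function itself are illustrative assumptions, not the paper's actual methods (e.g. GLISTER):

```python
import random
from collections import defaultdict
from statistics import pvariance

def adaptive_coreset_indices(features, labels, budget, seed=0):
    """Illustrative sketch of class-adaptive coreset sampling.

    Per-class quotas grow with a crude 'complexity' proxy (sum of
    per-dimension feature variances within the class); sampling inside
    each class is uniform at random. Hypothetical helper, not the
    paper's method.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    def complexity(idx):
        # Sum of per-dimension population variances of this class's features.
        dims = len(features[0])
        return sum(pvariance([features[i][d] for i in idx]) for d in range(dims)) or 1e-9

    comp = {c: complexity(idx) for c, idx in by_class.items()}
    total = sum(comp.values())

    chosen = []
    for c, idx in by_class.items():
        # At least one sample per class; otherwise proportional to complexity.
        quota = max(1, round(budget * comp[c] / total))
        chosen.extend(rng.sample(idx, min(quota, len(idx))))
    return chosen
```

Note that rounding means the returned subset may deviate slightly from `budget`; a real implementation would reconcile the quotas, but the sketch suffices to show how non-uniform, complexity-driven class allocation differs from uniform per-class sampling.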
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Dear Action Editor and Reviewers,
We hope everyone is doing well. As promised, we have submitted a revised version of the paper. The revision contains the following modifications based on the reviewers' remarks.
1. Experimental results on the Oxford-IIIT Pet dataset. We have added results with all four networks, five EDPE values, four coreset methods, and a comparison of pretrained and randomly initialized models.
2. We have also added standard deviation results for the Oxford-IIIT Pet dataset.
3. We have added results for the remaining coreset methods (i.e., GLISTER and Random) on the CIFAR10 dataset.
4. We have corrected the description of the coreset methods and their mathematical equations.
5. We have also fixed the minor issues highlighted by all the reviewers.
We want to thank the Action Editor and all the reviewers for giving us more time to conduct experiments. With the additional time, we were able to conduct thorough experiments on the Oxford-IIIT Pet dataset. Due to resource and time constraints, we were unable to run experiments on the additional medical dataset. We believe that the findings from the Oxford-IIIT Pet dataset significantly strengthen the paper, potentially mitigating the need for further experiments on the medical dataset.
We are open to any further comments that can improve the manuscript's quality.
Regards,
Authors
Assigned Action Editor: ~Nicolas_THOME2
Submission Number: 911