Learning the essential in less than 2k additional weights - a simple approach to improve image classification stability under corruptions

Published: 21 Jun 2024, Last Modified: 21 Jun 2024. Accepted by TMLR.
Abstract: The performance of image classification on well-known benchmarks such as ImageNet is remarkable, but in safety-critical situations the accuracy often drops significantly under adverse conditions. To counteract these performance drops, we propose a very simple modification to the models: we prepend a single, dimension-preserving convolutional layer with a large linear kernel whose purpose is to extract the information that is essential for image classification. We show that this simple modification can significantly increase robustness against common corruptions, especially corruptions of high severity. We demonstrate the impact of our channel-specific layers on the ImageNet-100 and ImageNette classification tasks and show an increase of up to 30% in top-1 accuracy on corrupted data. Further, we conduct a set of designed experiments to qualify the conditions under which our findings hold. Our main result is that a data- and network-dependent linear subspace carries the most important classification information (the essential), which our proposed pre-processing layer approximately identifies for most corruptions, and at very low cost.
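As a rough illustration of the layer the abstract describes, the NumPy sketch below applies one large linear kernel per color channel with zero padding, so the output has the same shape as the input. The kernel size of 25, the absence of a bias term, the identity (delta) initialization, and the 32x32 test image are all assumptions made for this example; the abstract only states that the layer is dimension preserving, channel-specific, and adds fewer than 2k weights.

```python
import numpy as np


def depthwise_conv_same(image, kernels):
    """Apply one k x k linear kernel per channel with zero padding.

    image:   array of shape (channels, height, width)
    kernels: array of shape (channels, k, k), k odd
    Returns an array with the same shape as `image`.
    """
    channels, height, width = image.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(image, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros(image.shape, dtype=float)
    for c in range(channels):
        # Cross-correlation, as used by deep learning frameworks.
        for dy in range(k):
            for dx in range(k):
                out[c] += kernels[c, dy, dx] * padded[c, dy:dy + height, dx:dx + width]
    return out


# Assumed values, not taken from the paper.
CHANNELS = 3
KERNEL_SIZE = 25  # largest size for which 3 * k * k stays below 2000

# Assumed identity initialization: a delta at the kernel center,
# so the layer initially passes the image through unchanged.
kernels = np.zeros((CHANNELS, KERNEL_SIZE, KERNEL_SIZE))
kernels[:, KERNEL_SIZE // 2, KERNEL_SIZE // 2] = 1.0

num_weights = kernels.size  # 3 * 25 * 25 weights, no bias

rng = np.random.default_rng(0)
image = rng.random((CHANNELS, 32, 32))
filtered = depthwise_conv_same(image, kernels)
```

In a real model the kernel weights would be trained, and `filtered` would be fed to the unchanged classifier in place of the raw image.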
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
### List of Changes
* We updated the ResNet50 results in Table 4.
* We added the comparison of different robustness methods in Table 7.
* We added a description of "comparison of different robustness methods" in Section 4.3 (Comparison with Augmentation and Joint Trainable Large Kernel and Augmentation).
* We added an acknowledgements section.
Supplementary Material: pdf
Assigned Action Editor: ~Vincent_Dumoulin1
Submission Number: 2285