Top-GAP: Integrating Size Priors in CNNs for more Robustness, Interpretability, and Bias Mitigation

TMLR Paper 2539 Authors

17 Apr 2024 (modified: 18 Apr 2024) · Under review for TMLR · CC BY-SA 4.0
Abstract: In computer vision, convolutional neural networks (CNNs) have shown remarkable capabilities across tasks ranging from image classification to semantic segmentation. However, their vulnerability to adversarial attacks remains a pressing issue that limits their use in safety-critical domains. In this paper, we present Top-GAP, a method that increases the native robustness of CNNs by restricting the spatial size of feature representations. Unlike common adversarial training, our method degrades neither clean accuracy nor training speed. On CIFAR-10 with PGD at $\epsilon=8/255$ and $20$ iterations, we achieve over 50\% robust accuracy while retaining the original clean accuracy. Moreover, our size constraint yields sparser and less noisy class activation maps, which significantly improves object localization and mitigates potential biases. We demonstrate on a variety of datasets and architectures that our method matches the clean accuracy of conventionally trained models while improving localization and robustness. In addition, our method makes it possible to incorporate prior human knowledge about object sizes into the network, which is particularly beneficial in biological and medical domains, where the variance in object sizes is not dominated by perspective projection.
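The abstract does not spell out the mechanism, but the name Top-GAP and the stated spatial size constraint suggest a pooling layer that aggregates only the $k$ strongest spatial activations per channel rather than averaging over the entire feature map, with $k$ encoding the prior on object size. Below is a minimal PyTorch sketch of that reading; the class name `TopKGlobalAveragePooling` and the specific choice of `k` are illustrative assumptions, not the paper's API.

```python
import torch
import torch.nn as nn


class TopKGlobalAveragePooling(nn.Module):
    """Average only the k largest spatial activations per channel.

    Hypothetical sketch of the Top-GAP idea: k acts as a prior on the
    object's size in feature-map cells; k = H*W recovers ordinary GAP.
    """

    def __init__(self, k: int):
        super().__init__()
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W) feature maps from the CNN backbone
        b, c, h, w = x.shape
        flat = x.flatten(2)                    # (batch, channels, H*W)
        k = min(self.k, h * w)                 # guard against small feature maps
        topk = flat.topk(k, dim=2).values      # k largest activations per channel
        return topk.mean(dim=2)                # (batch, channels) pooled descriptor


if __name__ == "__main__":
    pool = TopKGlobalAveragePooling(k=16)      # size prior: object spans ~16 cells
    features = torch.randn(2, 512, 7, 7)       # e.g. a ResNet-18 final feature map
    print(pool(features).shape)                # torch.Size([2, 512])
```

Under this reading, gradients flow only through the $k$ selected locations, pushing the network to concentrate class evidence in a region whose extent matches the prior, which would be consistent with the sparser class activation maps and improved localization the abstract reports.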
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Pin-Yu_Chen1
Submission Number: 2539