Top-GAP: Integrating Size Priors in CNNs for more Robustness, Interpretability, and Bias Mitigation

TMLR Paper 2539 Authors

17 Apr 2024 (modified: 17 Sept 2024) · Rejected by TMLR · CC BY 4.0
Abstract: In computer vision, convolutional neural networks (CNNs) have shown remarkable capabilities, excelling at tasks ranging from image classification to semantic segmentation. However, their vulnerability to adversarial attacks remains a pressing issue that limits their use in safety-critical domains. In this paper, we present Top-GAP -- a method that aims to increase the robustness of CNNs against simple PGD, FGSM, and Square Attack, as well as against distribution shifts. The advantage of our approach is that it neither slows down training nor decreases clean accuracy, whereas adversarial training is resource-intensive, which makes it hard to use in real-world applications. On CIFAR-10 under PGD with $\epsilon=\frac{8}{255}$ and $20$ iterations, we achieve over 50\% robust accuracy while retaining the original clean accuracy. Furthermore, we observe accuracy gains of up to 6\% under distribution shifts. Finally, our method makes it possible to incorporate prior human knowledge about object sizes into the network, which is particularly beneficial in biological and medical domains, where the variance in object sizes is not dominated by perspective projections. Evaluations of the effective receptive field show that Top-GAP networks are able to focus their attention on class-relevant parts of the image.
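For concreteness, below is a minimal PyTorch sketch of the robust-accuracy evaluation the abstract describes (CIFAR-10, $\ell_\infty$ PGD with $\epsilon=\frac{8}{255}$ and $20$ iterations). The step size `alpha = 2/255`, the random start, and the $[0,1]$ pixel range are common PGD conventions rather than values taken from the paper, and `model` and `loader` are hypothetical placeholders for a trained Top-GAP network and a CIFAR-10 test loader.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=20):
    """L_inf PGD with a random start: `steps` signed-gradient ascent
    steps, each projected back into the eps-ball around the clean input."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        # Project onto the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()

def robust_accuracy(model, loader, device="cuda"):
    """Fraction of test examples still classified correctly after PGD-20."""
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)
        with torch.no_grad():
            pred = model(x_adv).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total
```

Under this setup, the abstract's claim corresponds to `robust_accuracy(...)` exceeding 0.5 while standard (non-attacked) evaluation of the same model matches the baseline's clean accuracy.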
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Pin-Yu_Chen1
Submission Number: 2539