Improving Vision Model Robustness against Misclassification and Uncertainty Attacks via Underconfidence Adversarial Training

Published: 05 Nov 2025, Last Modified: 05 Nov 2025 | NLDL 2026 Poster | CC BY 4.0
Keywords: adversarial training, uncertainty attacks, adversarial attacks, robustness, confidence manipulation, underconfidence attack, miscalibration, AI security
TL;DR: This work extends adversarial robustness to underconfidence attacks, proposing two novel attacks and a defense that improves robustness while using half the steps of standard adversarial training.
Abstract: Adversarial robustness research has focused on defending against misclassification attacks. However, adversarially trained models remain vulnerable to underconfidence adversarial attacks, which reduce the model’s confidence without changing the predicted class. Decreased confidence can result in unnecessary interventions, delayed diagnoses, and weakened trust in automated systems. In this work, we introduce two novel underconfidence attacks: one that induces ambiguity between a class pair, and ConfSmooth, which spreads uncertainty across all classes. For defense, we propose Underconfidence Adversarial Training (UAT), which embeds our underconfidence attacks in an adversarial training framework. We extensively benchmark our underconfidence attacks and defense strategies across six model architectures (both CNN and ViT-based) and seven datasets (MNIST, CIFAR, ImageNet, MSTAR, and medical imaging). In 14 of the 15 data-architecture combinations, our attack outperforms the state of the art, often substantially. Our UAT defense maintains the highest robustness against all underconfidence attacks on CIFAR-10, and achieves robustness against misclassification attacks that is comparable to or better than that of adversarial training while taking half as many gradient steps. By broadening the scope of adversarial robustness to include uncertainty-aware threats and defenses, UAT enables more robust computer vision systems. The code will be made publicly available.
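To make the threat model concrete, the following is a minimal sketch of a generic underconfidence attack, not the authors' ConfSmooth or class-pair attack, whose details are not given in the abstract. It assumes a toy linear softmax classifier, an L-infinity budget `eps`, and a cross-entropy-to-uniform objective; the names `predict` and `underconfidence_attack` and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict(W, b, x):
    """Class probabilities of a toy linear softmax classifier (assumption)."""
    z = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(z)

def underconfidence_attack(W, b, x, eps=0.3, step=0.05, iters=40):
    """PGD-style sketch: lower the top-class confidence, keep the prediction."""
    k = len(W)
    p = predict(W, b, x)
    label = p.index(max(p))
    x_adv, best_x, best_conf = list(x), list(x), max(p)
    for _ in range(iters):
        p = predict(W, b, x_adv)
        # Gradient of cross-entropy(uniform, p) w.r.t. the logits is p - 1/k.
        g_z = [pj - 1.0 / k for pj in p]
        # Chain rule through the linear layer: g_x = W^T g_z.
        g_x = [sum(W[j][i] * g_z[j] for j in range(k)) for i in range(len(x))]
        # Signed descent step, then projection onto the L-infinity ball around x.
        cand = [xa - step * ((g > 0) - (g < 0)) for xa, g in zip(x_adv, g_x)]
        cand = [min(max(c, xi - eps), xi + eps) for c, xi in zip(cand, x)]
        pc = predict(W, b, cand)
        if pc.index(max(pc)) != label:
            break  # stop before a step that would flip the predicted class
        x_adv = cand
        if max(pc) < best_conf:
            best_x, best_conf = cand, max(pc)
    return best_x

# Hypothetical 3-class, 2-feature example.
W = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x = [1.5, 0.2]
x_adv = underconfidence_attack(W, b, x)
p_clean, p_adv = predict(W, b, x), predict(W, b, x_adv)
print("clean confidence:", max(p_clean), "attacked confidence:", max(p_adv))
```

The sketch returns the lowest-confidence iterate that still keeps the original prediction, which mirrors the abstract's definition of an underconfidence attack. In an adversarial training loop of the kind the abstract describes, such perturbed inputs would presumably be fed back with their correct labels, but the exact UAT procedure is not specified here.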
Serve As Reviewer: ~Josué_Martínez-Martínez1
Submission Number: 35