Calibration Attention: Instance-wise Temperature Scaling for Vision Transformers

15 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Probabilistic Prediction, Calibration Attention, Model Calibration
Abstract: Calibration is essential for deploying Vision Transformers (ViTs) in risk-sensitive settings. While post-hoc temperature scaling fits a single global scalar on a validation split, it can degrade under distribution shift because it ignores input-dependent uncertainty. We introduce Calibration Attention (CalAttn), a lightweight plug-in head that learns an adaptive, per-instance temperature directly from the ViT’s CLS token. On CIFAR-10/100, MNIST, Tiny-ImageNet, and ImageNet-1K with ViT/DeiT/Swin backbones, CalAttn reduces ECE by 2.02 pp (57.2\%) pre-TS and 1.18 pp (56.6\%) post-TS on average, while adding $<$0.1\% parameters. Learned temperatures concentrate near 1.0 on in-distribution data, limiting distortion when the model is already calibrated, yet adapt on harder examples. Extensive experiments confirm robustness across datasets, and comparisons highlight CalAttn’s efficiency over Dirichlet heads (3$\times$ the parameters) and logit-temperature scaling baselines. Calibration Attention thus offers a simple, efficient way to produce trustworthy predictions with state-of-the-art Vision Transformers.
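To make the mechanism concrete, below is a minimal sketch (not the authors' code) of instance-wise temperature scaling driven by the CLS token, as the abstract describes. The module name, MLP width, and softplus parameterization are illustrative assumptions; only the overall idea of mapping the CLS embedding to a positive per-example temperature comes from the abstract.

```python
# Hypothetical sketch of an instance-wise temperature head for a ViT.
# Assumptions (not from the paper): the head is a small MLP on the CLS token,
# and positivity is enforced with softplus.
import torch
import torch.nn as nn


class InstanceTemperatureHead(nn.Module):
    """Predicts a per-instance temperature t_i > 0 from the CLS embedding."""

    def __init__(self, embed_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, cls_token: torch.Tensor) -> torch.Tensor:
        # softplus keeps the temperature strictly positive; the small constant
        # guards against division by values near zero.
        raw = self.mlp(cls_token)                     # shape (B, 1)
        return nn.functional.softplus(raw) + 1e-3     # shape (B, 1)


def calibrated_logits(logits: torch.Tensor,
                      cls_token: torch.Tensor,
                      temp_head: InstanceTemperatureHead) -> torch.Tensor:
    """Divide each example's logits by its predicted temperature before softmax."""
    t = temp_head(cls_token)      # (B, 1)
    return logits / t             # broadcast over the class dimension
```

In this sketch the head would be trained with the usual NLL objective (on a held-out split or jointly with the backbone), analogous to global temperature scaling but with a temperature that depends on the input; a regularizer or initialization that keeps temperatures near 1.0 on in-distribution data would match the behavior reported in the abstract.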
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 5300