LookSharp: Attention Entropy Minimization for Test-Time Adaptation

Published: 01 Mar 2026 · Last Modified: 05 Apr 2026 · TTU at ICLR 2026 (Main) · CC BY 4.0
Abstract: Test-time adaptation (TTA) updates models during inference to reduce error under distribution shift. Building on the established test-time loss of entropy minimization over model predictions, we propose a new test-time loss: attention entropy minimization over the distributions computed by self-attention in the model. We propose $\textit{LookSharp}$, which minimizes the entropy of the CLS-to-patch attention in the final layer of a ViT to maintain focused attention on shifted data. We show that attention entropy minimization improves robustness and is complementary to output entropy minimization on ImageNet-C and ImageNet-R.
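The core loss described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, tensor shapes, and the assumption that final-layer CLS-to-patch attention weights are already extracted and softmax-normalized are all hypothetical.

```python
import torch

def attention_entropy_loss(attn: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Mean Shannon entropy of CLS-to-patch attention distributions.

    attn: (batch, heads, num_patches) attention weights from the CLS query
    to the patch keys, assumed to already sum to 1 over the last dimension.
    """
    # H(p) = -sum_i p_i log p_i, computed per head, then averaged;
    # eps guards against log(0) for near-one-hot attention.
    entropy = -(attn * (attn + eps).log()).sum(dim=-1)  # (batch, heads)
    return entropy.mean()

# Toy usage: random logits stand in for final-layer CLS-to-patch attention
# of a ViT-B/16 (12 heads, 14x14 = 196 patches); this loss would be
# minimized at test time to sharpen attention on shifted inputs.
logits = torch.randn(2, 12, 196)
attn = logits.softmax(dim=-1)
loss = attention_entropy_loss(attn)
```

Minimizing this quantity drives each head's CLS attention toward a peaked (low-entropy) distribution over patches, and it can simply be added to an output-entropy TTA loss, consistent with the complementarity the abstract reports.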
Submission Number: 30