Highlights
• We design an anchor-based ViT, which generates attention using anchors and tokens.
• To differentiably learn the pivotal regions, the anchors are represented by neurons.
• Inspired by the Markov process, the global attention can be computed via anchors.
• By rearranging the multiplication order, the attention requires only linear complexity.
• AnchorFormer improves classification accuracy by 9.0% and reduces FLOPs by 46.7%.
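The last two highlights rest on a standard trick: chaining token-to-anchor and anchor-to-token attention (a two-step Markov transition) and multiplying right-to-left so the cost is linear in the token count. A minimal sketch of that idea follows; the function name, shapes, and the use of plain softmax kernels are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def anchor_attention(Q, K, V, anchors):
    """Approximate global attention through m anchors (m << N).

    Full self-attention costs O(N^2 * d). Routing through anchors gives
    an (N, m) token->anchor map and an (m, N) anchor->token map; their
    product approximates the (N, N) attention matrix. Evaluating
    P_ta @ (P_at @ V) instead of (P_ta @ P_at) @ V avoids ever forming
    the N x N matrix, so the cost is O(N * m * d).
    """
    P_ta = softmax(Q @ anchors.T)   # (N, m): token -> anchor transition
    P_at = softmax(anchors @ K.T)   # (m, N): anchor -> token transition
    return P_ta @ (P_at @ V)        # (N, d), computed in linear time

# Illustrative shapes: 128 tokens, 8 anchors, dimension 16
rng = np.random.default_rng(0)
Q = rng.standard_normal((128, 16))
K = rng.standard_normal((128, 16))
V = rng.standard_normal((128, 16))
anchors = rng.standard_normal((8, 16))
out = anchor_attention(Q, K, V, anchors)  # shape (128, 16)
```

By associativity, the rearranged product equals the quadratic-cost order exactly; only the evaluation order, and hence the complexity, changes.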
DOI: 10.1016/j.patrec.2025.07.016