Reproducibility Study of “Vision Transformers Need Registers”

TMLR Paper4353 Authors

25 Feb 2025 (modified: 15 Apr 2025) · Under review for TMLR · CC BY 4.0
Abstract: Vision Transformers (ViTs) have achieved state-of-the-art (SOTA) performance on numerous tasks. However, the emergence of high-norm artifact tokens in supervised and self-supervised ViTs hinders the interpretability of these models' attention maps. This study reproduces and validates previous work (5) that addresses this issue with register tokens, learnable placeholder tokens appended to the input sequence, which mitigate artifacts and yield smoother feature maps. We evaluate the presence of artifacts in several ViT models, namely DeiT-III and DINOv2 architectures, and investigate the impact of fine-tuning pre-trained ViTs with register tokens and an additional regularization term. Through experiments on pre-trained and fine-tuned models, we confirm that register tokens eliminate artifacts and improve the interpretability of attention maps.
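For concreteness, the mechanism can be illustrated with a minimal PyTorch sketch: register tokens are simply extra learnable tokens concatenated to the patch sequence and discarded at the output. The class name, dimensions, and the plain transformer encoder below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ViTWithRegisters(nn.Module):
    """Illustrative sketch: patch tokens + [CLS] + learnable register tokens.

    Hypothetical module, not the paper's code. Registers attend like any
    other token but are dropped from the output, so they can absorb the
    global computation that otherwise produces high-norm artifact patches.
    """

    def __init__(self, embed_dim=384, num_registers=4, depth=12, num_heads=6):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        # The only change vs. a vanilla ViT: extra learnable tokens.
        self.registers = nn.Parameter(torch.zeros(1, num_registers, embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.num_registers = num_registers

    def forward(self, patch_tokens):  # (B, N, D) embedded patches
        B = patch_tokens.shape[0]
        cls = self.cls_token.expand(B, -1, -1)
        reg = self.registers.expand(B, -1, -1)
        x = torch.cat([cls, reg, patch_tokens], dim=1)
        x = self.encoder(x)
        # Discard register outputs; keep [CLS] and patch tokens.
        return x[:, 0], x[:, 1 + self.num_registers:]
```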
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: /forum?id=w9pgM58H05
Changes Since Last Submission:
  • Added semantic segmentation experiments on ADE20K using DINOv2-L and DeiT-III-S to evaluate the impact of register tokens.
  • Included downstream results for DeiT-III-Small with 0, 1, 2, and 4 register tokens; clarified that DINOv2 results use a fixed 4-register-token pretrained backbone.
  • Added histogram plots (Figure 9) showing the reduction of artifact tokens via fine-tuning, based on the L2-norm distribution of tokens (see the first sketch after this list).
  • Expanded Section 3.5 with details on the computational requirements for the experiments.
  • In Section 4.1.6, added an explanation of how local information was measured using linear classifiers trained to predict token spatial positions (see the second sketch after this list).
  • Made minor improvements to clarity and figures.
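The histogram analysis behind Figure 9 amounts to plotting the L2 norms of output patch tokens and counting the high-norm tail. The fixed threshold and the function name below are assumptions for illustration; in practice the cutoff is read off the gap between the distribution's main mode and its heavy high-norm tail.

```python
import torch
import matplotlib.pyplot as plt

@torch.no_grad()
def token_norm_histogram(patch_tokens, threshold=100.0, bins=100):
    """Plot the L2-norm distribution of patch tokens and count artifacts.

    `patch_tokens` is a (B, N, D) tensor of output patch embeddings from
    any ViT backbone; `threshold` is a hypothetical cutoff separating
    normal tokens from high-norm artifact tokens.
    """
    norms = patch_tokens.norm(dim=-1).flatten()  # (B * N,) L2 norms
    artifact_frac = (norms > threshold).float().mean().item()
    plt.hist(norms.cpu().numpy(), bins=bins, log=True)
    plt.axvline(threshold, linestyle="--", color="red")
    plt.xlabel("token L2 norm")
    plt.ylabel("count (log scale)")
    plt.title(f"artifact tokens above threshold: {artifact_frac:.2%}")
    plt.show()
```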
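Similarly, the local-information probe described for Section 4.1.6 is a linear classifier over frozen token embeddings that predicts each token's grid position; high probe accuracy means tokens retain local positional information. The 14x14 grid, function name, and training loop below are illustrative assumptions, not the authors' exact setup.

```python
import torch
import torch.nn as nn

def train_position_probe(patch_tokens, grid_size=14, epochs=10, lr=1e-3):
    """Linear probe: predict each patch token's spatial index from its embedding.

    `patch_tokens` is a frozen (B, N, D) tensor with N = grid_size ** 2.
    Artifact tokens, which discard local patch information, lower accuracy.
    """
    B, N, D = patch_tokens.shape
    assert N == grid_size ** 2
    probe = nn.Linear(D, N)              # one class per grid cell
    targets = torch.arange(N).repeat(B)  # token i -> position i
    feats = patch_tokens.reshape(B * N, D).detach()  # freeze the backbone
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    for _ in range(epochs):
        loss = nn.functional.cross_entropy(probe(feats), targets)
        opt.zero_grad()
        loss.backward()
        opt.step()
    acc = (probe(feats).argmax(-1) == targets).float().mean().item()
    return probe, acc
```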
Assigned Action Editor: Lu Jiang
Submission Number: 4353
