Emergence of Segmentation with Minimalistic White-Box Transformers

Published: 27 Oct 2023, Last Modified: 21 Nov 2023, NeurIPS XAIA 2023
TL;DR: The white-box transformer leads to the emergence of segmentation properties in the network's self-attention maps, solely through a minimalistic supervised training recipe.
Abstract: Transformer-like models for vision tasks have recently proven effective for a wide range of downstream applications such as segmentation and detection. Previous works have shown that segmentation properties emerge in vision transformers (ViTs) trained using self-supervised methods such as DINO, but not in those trained on supervised classification tasks. In this study, we probe whether segmentation emerges in transformer-based models solely as a result of intricate self-supervised learning mechanisms, or if the same emergence can be achieved under much broader conditions through proper design of the model architecture. Through extensive experimental results, we demonstrate that when employing a white-box transformer-like architecture known as CRATE, whose design explicitly models and pursues low-dimensional structures in the data distribution, segmentation properties, at both the whole-object and part levels, already emerge with a minimalistic supervised training recipe. A finer-grained layer-wise analysis reveals that the emergent properties strongly corroborate the designed mathematical functions of the white-box network. Our results suggest a path to designing white-box foundation models that are simultaneously highly performant and mathematically fully interpretable.
Submission Track: Full Paper Track
Application Domain: Computer Vision
Survey Question 1: We investigate the properties of a mathematically explainable transformer-like model, which enables us to analyze and explain (i) the role of depth, (ii) the effect of model design, and (iii) the semantic properties of attention heads. We qualitatively evaluate the emergence of segmentation in the model by visualizing its attention maps, which is a widely used way to interpret transformer-like architectures.
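The visualization described above can be sketched in code. The snippet below is a minimal, hypothetical illustration (not the authors' implementation): it computes a single-head self-attention map from ViT-style patch tokens and reshapes the [CLS]-to-patch attention row into a spatial heatmap, which is the kind of map typically inspected for emergent segmentation. All shapes, weights, and data are illustrative placeholders.

```python
# Hypothetical sketch of extracting a ViT-style self-attention heatmap.
# Token layout, dimensions, and random weights are assumptions for
# illustration; they do not come from the CRATE codebase.
import numpy as np

def attention_map(tokens, Wq, Wk, grid=14):
    """tokens: (1 + grid*grid, d); row 0 is the [CLS] token."""
    q = tokens @ Wq                                # queries, (N, d_head)
    k = tokens @ Wk                                # keys,    (N, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])        # scaled dot-product
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)       # softmax over keys
    # Attention from [CLS] to each patch, reshaped onto the patch grid.
    return attn[0, 1:].reshape(grid, grid)

rng = np.random.default_rng(0)
d, grid = 64, 14
tokens = rng.standard_normal((1 + grid * grid, d))
Wq = rng.standard_normal((d, d)) / np.sqrt(d)
Wk = rng.standard_normal((d, d)) / np.sqrt(d)
heat = attention_map(tokens, Wq, Wk, grid)         # 14x14 heatmap
```

In practice such a heatmap is upsampled to the image resolution and overlaid on the input, one map per attention head.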
Survey Question 2: We discover that when employing a white-box transformer-like architecture known as CRATE, segmentation properties emerge with a minimalistic supervised training recipe. Such properties do not appear in the supervised black-box vision transformers reported in the literature. Attention maps are important internal representations of the transformer and can be used to understand its internal mechanisms.
Survey Question 3: White-box design of transformers; Mathematical interpretation; Visualization of attention maps in white-box transformers; Visualization of PCA components in white-box transformers
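The PCA visualization mentioned above can also be sketched. The snippet below is a hypothetical illustration of the common recipe: project patch-token features onto their top principal components (e.g., three components rendered as RGB) to reveal part-level structure. The feature matrix and dimensions are assumptions for illustration only.

```python
# Hypothetical sketch of PCA over patch-token features for visualization.
# Shapes and random data are placeholders, not CRATE's actual features.
import numpy as np

def pca_components(patch_feats, n_components=3):
    """patch_feats: (num_patches, d) -> (num_patches, n_components)."""
    X = patch_feats - patch_feats.mean(axis=0)       # center features
    # SVD of the centered matrix: rows of Vt are principal directions.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T                   # project onto top PCs

rng = np.random.default_rng(0)
feats = rng.standard_normal((196, 64))               # 14x14 patches, dim 64
proj = pca_components(feats, 3)                      # can be mapped to RGB
```

Each row of `proj` can then be normalized to [0, 1] and reshaped to the patch grid for display.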
Submission Number: 24